MatchPapers

Worth-reading papers and related awesome resources on the matching task. Matching is common in many areas, such as natural language inference (NLI), question answering (QA), recommender systems (RecSys), information retrieval (IR), and advertising. This repository also covers many related research fields, including approximate nearest neighbor (ANN) search, text matching algorithms, click-through rate (CTR) prediction, learning-to-rank (LTR), and so on.

Suggestions for adding papers, repositories, and other resources are welcome!

Since I am Chinese, I mainly focus on Chinese resources. Recommendations of excellent resources in English or other languages are welcome!

Papers

Text Matching

  • Enhanced-RCNN: An Efficient Method for Learning Sentence Similarity. Shuang Peng, Hengbin Cui, Niantao Xie, Sujian Li, Jiaxing Zhang, Xiaolong Li. (WWW 2020) [paper] (https://dl.acm.org/doi/10.1145/3366423.3379998)
  • Match^2: A Matching over Matching Model for Similar Question Identification. Zizhen Wang, Yixing Fan, Jiafeng Guo, Liu Yang, Ruqing Zhang, Yanyan Lan, Xueqi Cheng, Hui Jiang, Xiaozhao Wang. (SIGIR 2020) [paper]
  • CLEAR: Contrastive Learning for Sentence Representation. Zhuofeng Wu, Sinong Wang, Jiatao Gu, Madian Khabsa, Fei Sun, Hao Ma. (CoRR 2020) [paper]
  • Using Prior Knowledge to Guide BERT's Attention in Semantic Textual Matching Tasks. Tingyu Xia, Yue Wang, Yuan Tian, Yi Chang. (WWW 2021) [paper] (https://arxiv.org/abs/2102.10934) [code]

Text Retrieval

  • DC-BERT: Decoupling Question and Document for Efficient Contextual Encoding. Yuyu Zhang, Ping Nie, Xiubo Geng, Arun Ramamurthy, Le Song, Daxin Jiang. (SIGIR 2020) [paper]
  • Dense Passage Retrieval for Open-Domain Question Answering. Vladimir Karpukhin, Barlas Oğuz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, Wen-tau Yih. (EMNLP 2020) [paper][code] - DPR
  • ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT. Omar Khattab, Matei Zaharia. (SIGIR 2020) [paper][code]
  • Poly-encoders: Transformer Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring. Samuel Humeau, Kurt Shuster, Marie-Anne Lachaux, Jason Weston. (ICLR 2020) [paper][unofficial code]
  • Pre-training Tasks for Embedding-based Large-scale Retrieval. Wei-Cheng Chang, Felix X. Yu, Yin-Wen Chang, Yiming Yang, Sanjiv Kumar. (ICLR 2020) [paper]
  • Distilling Knowledge from Reader to Retriever for Question Answering. Gautier Izacard, Edouard Grave. (ICLR 2021) [paper][code]

Sentence Embedding

  • On the Sentence Embeddings from Pre-trained Language Models. Bohan Li, Hao Zhou, Junxian He, Mingxuan Wang, Yiming Yang, Lei Li. (EMNLP 2020) [paper][code] - BERT-flow

Query Expansion

  • BERT-QE: Contextualized Query Expansion for Document Re-ranking. Zhi Zheng, Kai Hui, Ben He, Xianpei Han, Le Sun, Andrew Yates. (Findings of EMNLP 2020) [paper][code]

Recommendation System Retrieval & Matching

  • CFGAN: A Generic Collaborative Filtering Framework based on Generative Adversarial Networks. Dong-Kyu Chae, Jinsoo Kang, Sangwook Kim, Jungtae Lee. (CIKM 2018) [paper][code]
  • Multi-Interest Network with Dynamic Routing for Recommendation at Tmall. Chao Li, Zhiyuan Liu, Mengmeng Wu, Yuchi Xu, Pipei Huang, Huan Zhao, Guoliang Kang, Qiwei Chen, Wei Li, Dik Lun Lee. (CIKM 2019) [paper] - MIND
  • SDM: Sequential Deep Matching Model for Online Large-scale Recommender System. Fuyu Lv, Taiwei Jin, Changlong Yu, Fei Sun, Quan Lin, Keping Yang, Wilfred Ng. (CIKM 2019) [paper][code]
  • Learning Robust Models for e-Commerce Product Search. Thanh V. Nguyen, Nikhil Rao, Karthik Subbian. (ACL 2020) [paper] - QUARTS
  • Internal and Contextual Attention Network for Cold-start Multi-channel Matching in Recommendation. Ruobing Xie, Zhijie Qiu, Jun Rao, Yi Liu, Bo Zhang, Leyu Lin. (IJCAI 2020) [paper] - ICAN
  • Deep Retrieval: An End-to-End Learnable Structure Model for Large-Scale Recommendations. Weihao Gao, Xiangjun Fan, Jiankai Sun, Kai Jia, Wenzhi Xiao, Chong Wang, Xiaobing Liu. (CoRR 2020) [paper]

CTR

  • Deep & Cross Network for Ad Click Predictions. Ruoxi Wang, Bin Fu, Gang Fu, Mingliang Wang. (KDD 2017) [paper] - DCN
  • DCN-M: Improved Deep & Cross Network for Feature Cross Learning in Web-scale Learning to Rank Systems. Ruoxi Wang, Rakesh Shivanna, Derek Z. Cheng, Sagar Jain, Dong Lin, Lichan Hong, Ed H. Chi. (CoRR 2020) [paper]
  • Deep Session Interest Network for Click-Through Rate Prediction. Yufei Feng, Fuyu Lv, Weichen Shen, Menghan Wang, Fei Sun, Yu Zhu, Keping Yang. (IJCAI 2019) [paper][code] - DSIN
  • Behavior Sequence Transformer for E-commerce Recommendation in Alibaba. Qiwei Chen, Huan Zhao, Wei Li, Pipei Huang, Wenwu Ou. (DLP-KDD 2019) [paper] - BST
  • Deep Match to Rank Model for Personalized Click-Through Rate Prediction. Ze Lyu, Yu Dong, Chengfu Huo, Weijun Ren. (AAAI 2020) [paper][code][blog] - DMR
  • Search-based User Interest Modeling with Lifelong Sequential Behavior Data for Click-Through Rate Prediction. Qi Pi, Xiaoqiang Zhu, Guorui Zhou, Yujing Zhang, Zhe Wang, Lejian Ren, Ying Fan, Kun Gai. (CoRR 2020) [paper] - SIM
  • GateNet: Gating-Enhanced Deep Network for Click-Through Rate Prediction. Tongwen Huang, Qingyun She, Zhiqiang Wang, Junlin Zhang. (CoRR 2020) [paper]
  • Deep Feedback Network for Recommendation. Ruobing Xie, Cheng Ling, Yalong Wang, Rui Wang, Feng Xia, Leyu Lin. (IJCAI 2020) [paper][code] - DFN
  • Deep Interest with Hierarchical Attention Network for Click-Through Rate Prediction. Weinan Xu, Hengxu He, Minshi Tan, Yunming Li, Jun Lang, Dongbai Guo. (SIGIR 2020) [paper][code] - DHAN
  • MiNet: Mixed Interest Network for Cross-Domain Click-Through Rate Prediction. Wentao Ouyang, Xiuwu Zhang, Lei Zhao, Jinmei Luo, Yu Zhang, Heng Zou, Zhaojie Liu, Yanlong Du. (CIKM 2020) [paper][blog]
  • Operation-aware Neural Networks for User Response Prediction. Yi Yang, Baile Xu, Furao Shen, Jian Zhao. (Neural Networks Volume 121, January 2020) [paper] - ONN NFFM
  • CAN: Revisiting Feature Co-Action for Click-Through Rate Prediction. Guorui Zhou, Weijie Bian, Kailun Wu, Lejian Ren, Qi Pi, Yujing Zhang, Can Xiao, Xiang-Rong Sheng, Na Mou, Xinchen Luo, Chi Zhang, Xianjie Qiao, Shiming Xiang, Kun Gai, Xiaoqiang Zhu, Jian Xu. (CoRR 2020) [paper][code][blog]
  • FuxiCTR: An Open Benchmark for Click-Through Rate Prediction. Jieming Zhu, Jinyang Liu, Shuai Yang, Qi Zhang, Xiuqiang He. (CoRR 2020) [paper]

Sequential RecSys

  • BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer. Fei Sun, Jun Liu, Jian Wu, Changhua Pei, Xiao Lin, Wenwu Ou, Peng Jiang. (CIKM 2019) [paper][code]
  • Non-invasive Self-attention for Side Information Fusion in Sequential Recommendation. Chang Liu, Xiaoguang Li, Guohao Cai, Zhenhua Dong, Hong Zhu, Lifeng Shang. (AAAI 2021) [paper][Chinese blog] - NOVA-BERT

LTR

  • IRGAN: A Minimax Game for Unifying Generative and Discriminative Information Retrieval Models. Jun Wang, Lantao Yu, Weinan Zhang, Yu Gong, Yinghui Xu, Benyou Wang, Peng Zhang, Dell Zhang. (SIGIR 2017) [paper][code]

Embedding & ANN

  • Detecting Near-Duplicates for Web Crawling. Gurmeet Singh Manku, Arvind Jain, Anish Das Sarma. (WWW 2007) [paper] (http://www.wwwconference.org/www2007/papers/paper215.pdf) - Simhash
  • Product Quantization for Nearest Neighbor Search. Hervé Jégou, Matthijs Douze, Cordelia Schmid. (IEEE Transactions on Pattern Analysis and Machine Intelligence 2011) [paper] - PQ
  • Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs. Yu. A. Malkov, D. A. Yashunin. (IEEE Trans. Pattern Anal. Mach. Intell. 42(4)) [paper] - HNSW
  • The Design and Implementation of a Real Time Visual Search System on JD E-commerce Platform. Jie Li, Haifeng Liu, Chuanghua Gui, Jianyu Chen, Zhenyun Ni, Ning Wang. (Middleware Industry 2018) [paper][code]
  • ANN-Benchmarks: A Benchmarking Tool for Approximate Nearest Neighbor Algorithms. Martin Aumüller, Erik Bernhardsson, Alexander Faithfull. (Information Systems 2019) [paper][code]
  • Embedding-based Retrieval in Facebook Search. Jui-Ting Huang, Ashish Sharma, Shuying Sun, Li Xia, David Zhang, Philip Pronin, Janani Padmanabhan, Giuseppe Ottaviano, Linjun Yang. (KDD 2020) [paper]
  • Accelerating Large-Scale Inference with Anisotropic Vector Quantization. Ruiqi Guo, Philip Sun, Erik Lindgren, Quan Geng, David Simcha, Felix Chern, Sanjiv Kumar. (ICML 2020) [paper][code] - ScaNN

Architecture & System

  • Real-time Attention Based Look-alike Model for Recommender System. Yudan Liu, Kaikai Ge, Xu Zhang, Leyu Lin. (KDD 2019) [paper] - RALM
  • Applying Deep Learning To Airbnb Search. Malay Haldar, Mustafa Abdool, Prashant Ramanathan, Tao Xu, Shulin Yang, Huizhong Duan, Qing Zhang, Nick Barrow-Williams, Bradley C. Turnbull, Brendan M. Collins, Thomas Legrand. (KDD 2019) [paper]
  • MOBIUS: Towards the Next Generation of Query-Ad Matching in Baidu's Sponsored Search. Miao Fan, Jiacheng Guo, Shuai Zhu, Shuo Miao, Mingming Sun, Ping Li. (KDD 2019) [paper]
  • Embedding-based Retrieval in Facebook Search. Jui-Ting Huang, Ashish Sharma, Shuying Sun, Li Xia, David Zhang, Philip Pronin, Janani Padmanabhan, Giuseppe Ottaviano, Linjun Yang. (KDD 2020) [paper]
  • Learning to Build User-tag Profile in Recommendation System. Su Yan, Xin Chen, Ran Huo, Xu Zhang, Leyu Lin. (CIKM 2020) [paper]
  • Managing Diversity in Airbnb Search. Mustafa Abdool, Malay Haldar, Prashant Ramanathan, Tyler Sax, Lanbo Zhang, Aamir Mansawala, Shulin Yang, Thomas Legrand. (KDD 2020) [paper]

Survey/Tutorial

  • Deep Learning for Matching in Search and Recommendation. Jun Xu, Xiangnan He, Hang Li. (SIGIR 2018) [slides][paper]
  • A Survey on Knowledge Graph-Based Recommender Systems. Qingyu Guo, Fuzhen Zhuang, Chuan Qin, Hengshu Zhu, Xing Xie, Hui Xiong, Qing He. (CoRR 2020) [paper]
  • Graph Learning Approaches to Recommender Systems: A Review. Shoujin Wang, Liang Hu, Yan Wang, Xiangnan He, Quan Z. Sheng, Mehmet A. Orgun, Longbing Cao, Nan Wang, Francesco Ricci, Philip S. Yu. (CoRR 2020) [paper]
  • Adversarial Machine Learning in Recommender Systems: State of the art and Challenges. Yashar Deldjoo, Tommaso Di Noia, Felice Antonio Merra. (CoRR 2020) [paper]
  • A Comparison of Supervised Learning to Match Methods for Product Search. Fatemeh Sarvi, Nikos Voskarides, Lois Mooiman, Sebastian Schelter, Maarten de Rijke. (SIGIR 2020) [paper][code]

Repositories/Resources

ANN

Dataset

Natural Language Inference

  • Adversarial NLI: A New Benchmark for Natural Language Understanding. Yixin Nie, Adina Williams, Emily Dinan, Mohit Bansal, Jason Weston, Douwe Kiela. (ACL 2020) [paper][data][blog]
  • OCNLI: Original Chinese Natural Language Inference. Hai Hu, Kyle Richardson, Liang Xu, Lu Li, Sandra Kuebler, Lawrence S. Moss. (EMNLP 2020) [paper][data]
  • ConjNLI: Natural Language Inference Over Conjunctive Sentences. Swarnadeep Saha, Yixin Nie, Mohit Bansal. (EMNLP 2020) [paper][data]

Recommendation System

  • MIND: A Large-scale Dataset for News Recommendation. Fangzhao Wu, Ying Qiao, Jiun-Hung Chen, Chuhan Wu, Tao Qi, Jianxun Lian, Danyang Liu, Xing Xie, Jianfeng Gao, Winnie Wu, Ming Zhou. (ACL 2020) [paper][data]

Chinese Blog

README Source: ZhengZixiang/MatchPapers