Title

Learning-to-Rank with BERT in TF-Ranking

Authors

Shuguang Han, Xuanhui Wang, Mike Bendersky, Marc Najork

Abstract

This paper describes a machine learning algorithm for document (re)ranking, in which queries and documents are first encoded using BERT [1], and, on top of that, a learning-to-rank (LTR) model constructed with TF-Ranking (TFR) [2] is applied to further optimize the ranking performance. This approach proves effective on the public MS MARCO benchmark [3]. Our first two submissions achieved the best performance for the passage re-ranking task [4], and the second-best performance for the passage full-ranking task as of April 10, 2020 [5]. To leverage recent developments in pre-trained language models, we also integrated RoBERTa [6] and ELECTRA [7]. Our latest submissions improve our previous state-of-the-art re-ranking performance by 4.3% [8], and achieve the third-best performance for the full-ranking task [9] as of June 8, 2020. Both demonstrate the effectiveness of combining ranking losses with BERT representations for document ranking.
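The pipeline the abstract describes is, roughly: encode each query-document pair jointly with BERT, project the [CLS] representation to a scalar score, and train the scores of all candidates for a query together with a listwise ranking loss of the kind TF-Ranking provides. The sketch below illustrates that idea under stated assumptions; it is not the authors' implementation. The `score_pairs` and `softmax_ranking_loss` helpers and the use of the Hugging Face `bert-base-uncased` checkpoint are illustrative choices, and the loss is written out in plain TensorFlow rather than through the TF-Ranking API.

```python
# Minimal sketch (not the paper's released code): score query-passage
# pairs with BERT's [CLS] vector, then train with a listwise softmax
# ranking loss, the same family of loss that TF-Ranking provides.
import tensorflow as tf
from transformers import BertTokenizer, TFBertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = TFBertModel.from_pretrained("bert-base-uncased")
score_head = tf.keras.layers.Dense(1)  # [CLS] embedding -> scalar score

def score_pairs(query, docs):
    """Return one relevance score per candidate document for a query."""
    enc = tokenizer([query] * len(docs), docs, padding=True,
                    truncation=True, max_length=128, return_tensors="tf")
    out = bert(input_ids=enc["input_ids"],
               attention_mask=enc["attention_mask"],
               token_type_ids=enc["token_type_ids"])
    cls = out.last_hidden_state[:, 0, :]          # [list_size, hidden]
    return tf.squeeze(score_head(cls), axis=-1)   # [list_size]

def softmax_ranking_loss(labels, logits):
    """Listwise softmax cross-entropy over one query's candidate list.

    Graded relevance labels are normalized into a target distribution,
    and the loss pushes the score distribution toward that target.
    """
    labels = tf.cast(labels, tf.float32)
    target = labels / tf.maximum(tf.reduce_sum(labels), 1e-9)
    return -tf.reduce_sum(target * tf.nn.log_softmax(logits))

# Example: rank three candidate passages for one query.
docs = ["TF-Ranking provides listwise ranking losses.",
        "BERT jointly encodes the query and the passage.",
        "An unrelated passage about cooking."]
scores = score_pairs("learning to rank with BERT", docs)
loss = softmax_ranking_loss(tf.constant([1.0, 1.0, 0.0]), scores)
```

The listwise treatment of each query's candidate set is what distinguishes this setup from pointwise BERT re-ranking: the loss optimizes the relative ordering of candidates rather than an independent relevance judgment per pair. The later submissions described in the abstract swap the encoder for RoBERTa or ELECTRA while keeping the same ranking-loss structure.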
