Paper Title
Bew: Towards Answering Business-Entity-Related Web Questions
Paper Authors
Paper Abstract
We present BewQA, a system designed to answer a class of questions that we call Bew questions. Bew questions concern businesses and services such as restaurants, hotels, and movie theaters; for example, "Until what time is happy hour?". These questions are challenging because the answers are found on the open-domain Web, appear in short sentences without surrounding context, and are dynamic, since webpage information can be updated frequently. Under these conditions, existing QA systems perform poorly. BewQA is a practical approach that answers Bew queries by mining a template of business-related webpages and using the template to guide the search. We show how to extract the template automatically by leveraging aggregator websites that collect information about business entities in a domain (e.g., restaurants). We answer a given question by identifying the section of the extracted template that is most likely to contain the answer. Doing so lets us extract answers even when the answer span lacks sufficient context. Importantly, BewQA does not require any training. We crowdsource a new dataset of 1066 Bew questions and ground-truth answers in the restaurant domain. Compared to state-of-the-art QA models, BewQA improves the F1 score by 27 percentage points. Compared to a commercial search engine, BewQA correctly answers 29% more Bew questions.
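As a rough illustration of the template-guided step described in the abstract (choosing the template section most likely to contain the answer), the sketch below scores each section by word overlap with the question. The abstract does not say how BewQA scores sections, so the overlap heuristic, the section names, the cue words, and the sample page content are all assumptions made for illustration only.

```python
import re
from typing import Dict, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase a string and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def best_section(question: str, section_cues: Dict[str, str]) -> str:
    """Return the section whose cue words share the most tokens with the question.

    Assumption: plain token overlap stands in for whatever scoring the
    real system uses. Ties go to the first section in the dictionary.
    """
    q_tokens = tokenize(question)
    return max(
        section_cues,
        key=lambda name: len(q_tokens & tokenize(section_cues[name])),
    )


# Hypothetical template for the restaurant domain: section name -> cue words.
SECTION_CUES = {
    "Hours": "hours open close time happy hour until",
    "Location": "address location where street parking",
    "Menu": "menu dish price vegetarian vegan serve",
}

# Hypothetical page content, already split along the template's sections.
PAGE = {
    "Hours": "Happy hour: 4pm-7pm",
    "Location": "12 Main St",
    "Menu": "Vegan options available",
}


def answer(question: str) -> str:
    """Return the content of the section selected for the question."""
    return PAGE[best_section(question, SECTION_CUES)]


if __name__ == "__main__":
    print(answer("Until what time is happy hour?"))
```

Because the section is chosen from the template and not from the answer text itself, a short span such as "Happy hour: 4pm-7pm" can be returned even though it carries little surrounding context.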