Title
Activity report analysis with automatic single or multispan answer extraction
Authors
Abstract
In the era of IoT (Internet of Things), we are surrounded by a plethora of AI-enabled devices that can transcribe images, video, audio, and sensor signals into text descriptions. When such transcriptions are captured in activity reports for monitoring, life-logging, and anomaly detection applications, a user would typically request a summary or ask targeted questions about the sections of the report they are interested in. Depending on the context and the type of question asked, a question answering (QA) system needs to automatically determine whether the answer consists of a single text span or multiple spans. Currently available QA datasets either focus on single-span responses only (such as SQuAD [4]) or contain a low proportion of examples with multi-span answers (such as DROP [3]). To investigate automatic selection of single-span or multi-span answers in this use case, we created a new smart home environment dataset comprising questions paired with single-span or multi-span answers, depending on the question and the context queried. In addition, we propose a RoBERTa [6]-based multiple span extraction question answering (MSEQA) model that returns the appropriate answer span or spans for a given question. Our experiments show that the proposed model outperforms state-of-the-art QA models on our dataset while providing comparable performance on individual published single-span and multi-span task datasets.