Paper Title

SE3M: A Model for Software Effort Estimation Using Pre-trained Embedding Models

Paper Authors

Eliane M. De Bortoli Fávero, Dalcimar Casanova, Andrey Ricardo Pimentel

Abstract

Estimating effort based on requirement texts presents many challenges, especially in obtaining viable features to infer effort. Aiming to explore a more effective technique for representing textual requirements to infer effort estimates by analogy, this paper proposes to evaluate the effectiveness of pre-trained embedding models. For this, two embedding approaches are used: context-less and contextualized models. Generic pre-trained models for both approaches went through a fine-tuning process. The generated models were used as input to the applied deep learning architecture, with linear output. The results were very promising, showing that pre-trained embedding models can be used to estimate software effort based only on requirement texts. We highlight the results obtained by applying the pre-trained BERT model with fine-tuning on a single-project repository, which achieved a Mean Absolute Error (MAE) of 4.25 with a standard deviation of only 0.17, a very positive result compared to similar works. The main advantages of the proposed estimation method are reliability, the possibility of generalization, speed and low computational cost provided by the fine-tuning process, and the ability to infer effort for both new and existing requirements.
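As a rough illustration of the pipeline the abstract describes (requirement text → pre-trained embedding → deep network with linear output, evaluated by MAE), the sketch below uses random vectors in place of real fine-tuned BERT embeddings. The layer sizes, weight initialization, and effort scale are assumptions for demonstration only, not the authors' configuration.

```python
import numpy as np

# Stand-ins for sentence embeddings of requirement texts: in the paper these
# would come from a fine-tuned BERT (768-dim) or a context-less model; here we
# use random vectors purely to show the regression head with linear output.
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 768))       # 32 requirements, 768-dim embeddings
y = rng.uniform(1, 40, size=32)      # effort values (illustrative scale)

# A tiny dense network: one hidden ReLU layer, then a single linear unit.
W1 = rng.normal(scale=0.05, size=(768, 64)); b1 = np.zeros(64)
W2 = rng.normal(scale=0.05, size=(64, 1));   b2 = np.zeros(1)

def predict(X):
    h = np.maximum(0, X @ W1 + b1)   # hidden layer with ReLU activation
    return (h @ W2 + b2).ravel()     # linear output: unbounded effort estimate

def mae(y_true, y_pred):
    # Mean Absolute Error, the evaluation metric reported in the abstract.
    return np.abs(y_true - y_pred).mean()

print(round(mae(y, predict(X)), 2))
```

An estimate for a new requirement then costs only one embedding lookup and one forward pass, which is the speed and low-cost property the abstract emphasizes.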
