Title
Deep Gesture Generation for Social Robots Using Type-Specific Libraries
Authors
Abstract
Body language such as conversational gesture is a powerful way to ease communication. Conversational gestures not only make a speech more lively but also carry semantic meaning that helps to stress important information in the discussion. In the field of robotics, giving conversational agents (humanoid robots or virtual avatars) the ability to use gestures properly is critical, yet it remains a task of extraordinary difficulty. This is because, given only text as input, there are many possibilities and ambiguities in generating an appropriate gesture. Unlike previous works, we propose a new method that explicitly takes gesture types into account to reduce these ambiguities and generate human-like conversational gestures. Key to our proposed system is a new gesture database built on the TED dataset, which allows us to map a word to one of three types of gestures: "Imagistic" gestures, which express the content of the speech; "Beat" gestures, which emphasize words; and "No gesture." We propose a system that first maps the words in the input text to their corresponding gesture types, then generates type-specific gestures, and finally combines the generated gestures into one smooth gesture. In our comparative experiments, the effectiveness of the proposed method was confirmed in user studies with both an avatar and a humanoid robot.
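The three-stage pipeline described in the abstract (word-to-type mapping, type-specific generation, combination into one smooth gesture) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the classifier, the per-type generators, and the one-dimensional "pose" representation are all hypothetical stand-ins for the learned components the paper describes.

```python
from typing import List

GESTURE_TYPES = ("Imagistic", "Beat", "NoGesture")

def classify_word(word: str) -> str:
    # Hypothetical word-to-gesture-type mapper; the paper uses a
    # database built on the TED dataset instead of this heuristic.
    if len(word) > 6:
        return "Imagistic"   # content-bearing words
    if len(word) > 3:
        return "Beat"        # words to emphasize
    return "NoGesture"

def generate_gesture(word: str, gtype: str) -> List[float]:
    # Stub type-specific generator: one "pose frame" per character,
    # with a motion amplitude that depends on the gesture type.
    amplitude = {"Imagistic": 1.0, "Beat": 0.5, "NoGesture": 0.0}[gtype]
    return [amplitude] * max(len(word), 1)

def smooth(frames: List[float], window: int = 3) -> List[float]:
    # Moving-average blending so adjacent gestures join smoothly.
    out = []
    for i in range(len(frames)):
        lo = max(0, i - window // 2)
        hi = min(len(frames), i + window // 2 + 1)
        out.append(sum(frames[lo:hi]) / (hi - lo))
    return out

def text_to_gesture(text: str) -> List[float]:
    # Stage 1: map words to types; Stage 2: generate per-type
    # gestures; Stage 3: combine and smooth into one final gesture.
    frames: List[float] = []
    for word in text.split():
        frames.extend(generate_gesture(word, classify_word(word)))
    return smooth(frames)
```

A real system would replace the stub generators with learned motion models and the heuristic classifier with the paper's gesture database, but the control flow (classify, generate, blend) is the structure the abstract describes.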