Paper Title

Zero-Shot Multi-Modal Artist-Controlled Retrieval and Exploration of 3D Object Sets

Authors

Kristofer Schlachter, Benjamin Ahlbrand, Zhu Wang, Valerio Ortenzi, Ken Perlin

Abstract

When creating 3D content, highly specialized skills are generally needed to design and generate models of objects and other assets by hand. We address this problem through high-quality 3D asset retrieval from multi-modal inputs, including 2D sketches, images and text. We use CLIP as it provides a bridge to higher-level latent features. We use these features to perform a multi-modality fusion to address the lack of artistic control that affects common data-driven approaches. Our approach allows for multi-modal conditional feature-driven retrieval through a 3D asset database, by utilizing a combination of input latent embeddings. We explore the effects of different combinations of feature embeddings across different input types and weighting methods.
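
The retrieval idea in the abstract, combining weighted input embeddings and searching a 3D asset database, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`normalize`, `fuse`, `retrieve`), the weighted-sum fusion rule, the cosine-similarity ranking, and the toy 3-dimensional vectors standing in for CLIP embeddings are all assumptions made for illustration.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def fuse(embeddings, weights):
    """Hypothetical fusion rule: weighted sum of unit embeddings, renormalized."""
    if not embeddings or len(embeddings) != len(weights):
        raise ValueError("need one weight per embedding")
    dim = len(embeddings[0])
    fused = [0.0] * dim
    for emb, weight in zip(embeddings, weights):
        unit = normalize(emb)
        for i in range(dim):
            fused[i] += weight * unit[i]
    return normalize(fused)


def retrieve(query, database, top_k=3):
    """Rank database assets by cosine similarity to the fused query."""
    scored = []
    for name, vec in database.items():
        unit = normalize(vec)
        score = sum(q * a for q, a in zip(query, unit))
        scored.append((score, name))
    scored.sort(reverse=True)
    return [(name, score) for score, name in scored[:top_k]]


# Toy stand-ins for embeddings of a sketch, a text prompt, and stored 3D assets.
sketch_emb = [1.0, 0.0, 0.0]
text_emb = [0.0, 1.0, 0.0]
assets = {
    "chair": [0.9, 0.1, 0.0],
    "table": [0.1, 0.9, 0.0],
    "lamp": [0.0, 0.0, 1.0],
}

# Weighting the sketch more heavily steers retrieval toward sketch-like assets.
query = fuse([sketch_emb, text_emb], [0.8, 0.2])
print(retrieve(query, assets, top_k=2))
```

Changing the weights shifts which input dominates the ranking, which is one simple way an artist could control the result under these assumptions.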
