Paper title

Advancing descriptor search in materials science: feature engineering and selection strategies

Authors

Benedikt Hoock, Santiago Rigamonti, Claudia Draxl

Abstract

A main goal of data-driven materials research is to find optimal low-dimensional descriptors that allow us to predict a physical property and that can be interpreted in a human-understandable way. In this work, we advance methods to identify descriptors out of a large pool of candidate features by means of compressed sensing. To this end, we develop schemes for engineering appropriate candidate features that are based on simple basic properties of the building blocks that constitute the materials and that are able to represent a multi-component system by scalar numbers. Cross-validation-based feature-selection methods are developed for identifying the most relevant features, with a focus on high generalizability. We apply our approaches to an ab initio dataset of ternary group-IV compounds to obtain a set of descriptors for predicting lattice constants and energies of mixing. In particular, we introduce simple complexity measures in terms of the algebraic operations involved as well as the number of basic properties utilized.
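
The workflow summarized in the abstract (engineer candidate features from basic properties, then pick the most relevant one by cross-validation) can be illustrated with a minimal sketch. This is not the authors' implementation: it replaces the compressed-sensing step with a brute-force search over one-dimensional descriptors, uses synthetic data, and all names (`build_candidates`, `cv_rmse`, `select_descriptor`, the primary features `a`, `b`, `c`) are hypothetical.

```python
import itertools
import numpy as np

def build_candidates(primary):
    """Combine named primary features with a few algebraic operations.

    Assumption: the operation set (+, |-|, *, square) is illustrative only.
    """
    cands = dict(primary)
    for (na, a), (nb, b) in itertools.combinations(primary.items(), 2):
        cands[f"({na}+{nb})"] = a + b
        cands[f"|{na}-{nb}|"] = np.abs(a - b)
        cands[f"({na}*{nb})"] = a * b
    for n, a in primary.items():
        cands[f"({n}^2)"] = a ** 2
    return cands

def cv_rmse(x, y, n_folds=5, seed=0):
    """k-fold cross-validated RMSE of the linear fit y ~ c0 + c1 * x."""
    idx = np.random.default_rng(seed).permutation(len(y))
    sq_err = []
    for test in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, test)
        A = np.column_stack([np.ones(len(train)), x[train]])
        coef, *_ = np.linalg.lstsq(A, y[train], rcond=None)
        pred = coef[0] + coef[1] * x[test]
        sq_err.append((pred - y[test]) ** 2)
    return float(np.sqrt(np.mean(np.concatenate(sq_err))))

def select_descriptor(cands, y):
    """Return the candidate with the lowest cross-validated RMSE."""
    scores = {name: cv_rmse(x, y) for name, x in cands.items()}
    best = min(scores, key=scores.get)
    return best, scores

# Synthetic stand-in for a materials dataset (not real ab initio data):
# the target depends on the product of two primary features plus small noise.
rng = np.random.default_rng(42)
primary = {
    "a": rng.uniform(1, 2, 60),
    "b": rng.uniform(1, 2, 60),
    "c": rng.uniform(1, 2, 60),
}
y = 3.0 * primary["a"] * primary["b"] + 0.5 + rng.normal(0, 0.01, 60)

cands = build_candidates(primary)
best, scores = select_descriptor(cands, y)
print(best, scores[best])
```

Ranking candidates by held-out error rather than training error mirrors the emphasis on generalizability; a sparse-regression method would be needed to search higher-dimensional descriptors in a large candidate pool efficiently.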
