Paper title
ESA-Ariel Data Challenge NeurIPS 2022: Introduction to exo-atmospheric studies and presentation of the Atmospheric Big Challenge (ABC) Database
Paper authors
Paper abstract
This is an exciting era for exoplanetary exploration. The recently launched JWST and upcoming missions and facilities such as Ariel, Twinkle and the ELTs are set to bring fresh insights into the convoluted processes of planetary formation and evolution and their connections to atmospheric composition. However, with new opportunities come new challenges. The field of exoplanet atmospheres is already struggling with the volume and quality of incoming data, and machine learning (ML) techniques present themselves as a promising alternative. Developing techniques of this kind is an interdisciplinary task, one that requires domain knowledge of the field, access to relevant tools, and expert insight into the capabilities and limitations of current ML models. These stringent requirements have so far limited the development of ML in the field to a few isolated initiatives. In this paper, we present the Atmospheric Big Challenge Database (ABC Database), a carefully designed, organised and publicly available database dedicated to the study of the inverse problem in the context of exoplanetary studies. We have generated 105,887 forward models and 26,109 complementary posterior distributions produced with a nested sampling algorithm. Alongside the database, this paper provides a jargon-free introduction for non-field experts interested in diving into the intricacies of atmospheric studies. The database forms the basis for a multitude of research directions, including, but not limited to, developing rapid inference techniques, benchmarking model performance and mitigating data drift. A successful application of this database is demonstrated in the NeurIPS Ariel ML Data Challenge 2022.