Paper title
The Collective Knowledge project: making ML models more portable and reproducible with open APIs, reusable best practices and MLOps
Paper authors
Paper abstract
This article provides an overview of the Collective Knowledge technology (CK or cKnowledge). CK attempts to make it easier to reproduce ML&systems research, deploy ML models in production, and adapt them to continuously changing data sets, models, research techniques, software, and hardware. The CK concept is to decompose complex systems and ad-hoc research projects into reusable sub-components with unified APIs, CLI, and JSON meta description. Such components can be connected into portable workflows using DevOps principles combined with reusable automation actions, software detection plugins, meta packages, and exposed optimization parameters. CK workflows can automatically plug in different models, data and tools from different vendors while building, running and benchmarking research code in a unified way across diverse platforms and environments. Such workflows also help to perform whole system optimization, reproduce results, and compare them using public or private scoreboards on the CK platform (https://cKnowledge.io). For example, the modular CK approach was successfully validated with industrial partners to automatically co-design and optimize software, hardware, and machine learning models for reproducible and efficient object detection in terms of speed, accuracy, energy, size, and other characteristics. The long-term goal is to simplify and accelerate the development and deployment of ML models and systems by helping researchers and practitioners to share and reuse their knowledge, experience, best practices, artifacts, and techniques using open CK APIs.
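To make the component idea in the abstract more concrete, the following is a minimal sketch of a reusable component that carries a JSON meta description and is reached through one unified, dictionary-based entry point. It is not the CK API: the names `REGISTRY`, `register`, `access`, `run_benchmark`, the component name `object-detection`, and the placeholder values `model-a` and `dataset-a` are all hypothetical and chosen only for illustration. The convention of returning a dictionary with a `return` code is likewise an assumption of this sketch.

```python
import json

# Hypothetical registry: component name -> parsed JSON meta plus named actions.
REGISTRY = {}


def register(name, meta_json, actions):
    """Store a component with its parsed JSON meta description and its actions."""
    REGISTRY[name] = {"meta": json.loads(meta_json), "actions": actions}


def access(request):
    """Unified entry point: dispatch a dictionary request to a component action.

    Always returns a dictionary; 'return' is 0 on success and non-zero on failure.
    """
    component = REGISTRY.get(request.get("component"))
    if component is None:
        return {"return": 1, "error": "unknown component"}
    action = component["actions"].get(request.get("action"))
    if action is None:
        return {"return": 1, "error": "unknown action"}
    result = action(component["meta"], request)
    return {"return": 0, **result}


def run_benchmark(meta, request):
    """Placeholder action: report which model and dataset would be used.

    Values from the request override the defaults in the meta description,
    which is how a workflow could plug in a different model or dataset.
    """
    return {
        "program": meta["name"],
        "model": request.get("model", meta["default_model"]),
        "dataset": request.get("dataset", meta["default_dataset"]),
    }


# Hypothetical component with an illustrative JSON meta description.
register(
    "object-detection",
    '{"name": "object-detection", "default_model": "model-a", '
    '"default_dataset": "dataset-a"}',
    {"run": run_benchmark},
)

if __name__ == "__main__":
    # Swap in another model while keeping the default dataset.
    reply = access({"component": "object-detection", "action": "run",
                    "model": "model-b"})
    print(json.dumps(reply, indent=2))
```

Because every action takes and returns plain dictionaries, the same request could equally be built from command-line arguments or from a JSON file, which is the kind of uniformity the abstract attributes to CK components.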