Paper Title

Dynamic Surrogate Switching: Sample-Efficient Search for Factorization Machine Configurations in Online Recommendations

Authors

Škrlj, Blaž; Schwartz, Adi; Ferlež, Jure; Kopič, Davorin; Ziporin, Naama

Abstract

Hyperparameter optimization is the process of identifying an appropriate hyperparameter configuration for a given machine learning model with regard to a given learning task. For smaller data sets, an exhaustive search is possible; however, as data size and model complexity increase, the number of configuration evaluations becomes the main computational bottleneck. A promising paradigm for tackling this type of problem is surrogate-based optimization. The main idea underlying this paradigm is to maintain an incrementally updated model of the relation between the hyperparameter space and the output (target) space; the data for this model are obtained by evaluating the main learning engine, for example a factorization machine-based model. By learning to approximate the hyperparameter-target relation, the surrogate (machine learning) model can be used to score large numbers of hyperparameter configurations, exploring parts of the configuration space beyond the reach of direct learning engine evaluation. Commonly, a surrogate is selected prior to optimization initialization and remains the same during the search. We investigated whether dynamic switching of surrogates during the optimization itself is a sensible idea of practical relevance for selecting the most appropriate factorization machine-based models for large-scale online recommendation. We conducted benchmarks on data sets containing hundreds of millions of instances against established baselines such as Random Forest- and Gaussian process-based surrogates. The results indicate that surrogate switching can offer good performance while considering fewer learning engine evaluations.
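To make the loop described in the abstract concrete, the following is a minimal sketch of surrogate-based search with surrogate switching. It is not the authors' implementation. Everything in it is an assumption made for illustration: the toy two-parameter loss `toy_engine` stands in for an expensive learning-engine evaluation, the two nearest-neighbor surrogates stand in for the Random Forest and Gaussian process surrogates named above, and the switching rule (pick the surrogate with the lowest leave-one-out error on the configurations evaluated so far) is one plausible choice, since the abstract does not specify the rule used in the paper.

```python
import random

def toy_engine(config):
    """Hypothetical stand-in for one expensive learning-engine evaluation (lower is better)."""
    x, y = config
    return (x - 0.3) ** 2 + (y - 0.7) ** 2

def knn_predict(k, history, config):
    """Predict the loss of `config` as the mean loss of its k nearest evaluated configs."""
    def dist(other):
        return sum((a - b) ** 2 for a, b in zip(other, config))
    nearest = sorted(history, key=lambda item: dist(item[0]))[:k]
    return sum(loss for _, loss in nearest) / len(nearest)

# Candidate surrogates (assumed for illustration; not the model families from the paper).
SURROGATES = {
    "1-nn": lambda history, config: knn_predict(1, history, config),
    "3-nn": lambda history, config: knn_predict(3, history, config),
}

def loo_error(predict, history):
    """Mean leave-one-out absolute error of a surrogate on the evaluated configurations."""
    total = 0.0
    for i, (config, loss) in enumerate(history):
        rest = history[:i] + history[i + 1:]
        total += abs(predict(rest, config) - loss)
    return total / len(history)

def search(n_init=5, n_iter=15, n_candidates=200, seed=0):
    rng = random.Random(seed)

    def sample():
        return (rng.random(), rng.random())

    # Initial design: a few random configurations evaluated on the real engine.
    history = [(c, toy_engine(c)) for c in (sample() for _ in range(n_init))]
    chosen = []
    for _ in range(n_iter):
        # Switching step (assumed rule): use the surrogate that fits the history best.
        name = min(SURROGATES, key=lambda n: loo_error(SURROGATES[n], history))
        chosen.append(name)
        # Score many cheap candidates with the surrogate; evaluate only the most promising.
        # This is purely greedy: no exploration bonus is added to the surrogate score.
        candidates = [sample() for _ in range(n_candidates)]
        best = min(candidates, key=lambda c: SURROGATES[name](history, c))
        history.append((best, toy_engine(best)))
    # Best (config, loss) pair, the surrogate used at each step, and the engine-call count.
    return min(history, key=lambda item: item[1]), chosen, len(history)
```

Calling `search()` returns the best configuration found, the surrogate selected at each iteration, and the total number of engine evaluations (here `n_init + n_iter`). The point of the sketch is the cost asymmetry: each iteration scores 200 candidates with the cheap surrogate but spends only one engine evaluation.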
