Distributed representations of graphs for drug pair scoring

Paper title

Distributed representations of graphs for drug pair scoring

Authors

Paul Scherer, Pietro Liò, Mateja Jamnik

Abstract

In this paper, we study the practicality and usefulness of incorporating distributed representations of graphs into models within the context of drug pair scoring. We argue that the real-world growth and update cycles of drug pair scoring datasets subvert the limitations of transductive learning associated with distributed representations. Furthermore, we argue that the vocabulary of discrete substructure patterns induced over drug sets is not dramatically large, due to the limited set of atom types and the constraints on bonding patterns enforced by chemistry. Under this pretext, we explore the effectiveness of distributed representations of the molecular graphs of drugs in drug pair scoring tasks such as drug synergy, polypharmacy, and drug-drug interaction prediction. To achieve this, we present a methodology for learning and incorporating distributed representations of graphs within a unified framework for drug pair scoring. Subsequently, we augment a number of recent and state-of-the-art models to utilise our embeddings. We empirically show that the incorporation of these embeddings improves downstream performance of almost every model across different drug pair scoring tasks, even those the original model was not designed for. We publicly release all of our drug embeddings for the DrugCombDB, DrugComb, DrugbankDDI, and TwoSides datasets.
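
To make the idea of a "vocabulary of discrete substructure patterns induced over drug sets" concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say how patterns are extracted; this example assumes Weisfeiler-Lehman relabelling of rooted subgraphs, as used in graph2vec-style approaches. The function name `wl_substructures`, the toy heavy-atom graphs, and the hashing scheme are all hypothetical choices made for illustration. Learning embeddings from the resulting pattern counts (for example with a skip-gram objective) is not shown.

```python
from collections import Counter
import hashlib


def wl_substructures(atoms, bonds, iterations=2):
    """Count Weisfeiler-Lehman rooted-subgraph labels for one molecular graph.

    atoms: list of atom symbols, indexed by node id (hypothetical input format).
    bonds: list of (u, v) pairs describing undirected bonds.
    """
    neighbours = {i: [] for i in range(len(atoms))}
    for u, v in bonds:
        neighbours[u].append(v)
        neighbours[v].append(u)

    labels = list(atoms)            # iteration 0: the atom type itself
    patterns = Counter(labels)
    for _ in range(iterations):
        new_labels = []
        for node in range(len(atoms)):
            # New label = own label plus the sorted multiset of neighbour labels,
            # so the result does not depend on node ordering.
            neighbour_part = ",".join(sorted(labels[n] for n in neighbours[node]))
            signature = labels[node] + "|" + neighbour_part
            new_labels.append(hashlib.sha1(signature.encode()).hexdigest()[:8])
        labels = new_labels
        patterns.update(labels)
    return patterns


# Toy heavy-atom graphs (hydrogens omitted); illustrative only, not real dataset entries.
toy_drugs = {
    "ethanol": (["C", "C", "O"], [(0, 1), (1, 2)]),
    "dimethyl_ether": (["C", "O", "C"], [(0, 1), (1, 2)]),
}

drug_patterns = {name: wl_substructures(*graph) for name, graph in toy_drugs.items()}

# The vocabulary is the union of all patterns seen across the drug set.
vocabulary = set().union(*drug_patterns.values())
print(len(vocabulary), "distinct substructure patterns")
```

Because each label depends only on atom types and local bonding, molecules that share local chemistry share vocabulary entries, which is the intuition behind the claim that the vocabulary stays manageable as the drug set grows.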
