Introduction
Myeloid malignancies exhibit considerable heterogeneity with overlapping
clinical and genetic features among subtypes. We present a data-driven
approach that integrates mutational features and clinical covariates at
diagnosis within networks of their probabilistic relationships, enabling the
discovery of patient subgroups. A key strength is its ability to include
presumed causal directions in the edges linking clinical and mutational
features, and account for them aptly in the clustering. In a cohort of 1323
patients, we identify subgroups that outperform established risk
classifications in prognostic accuracy. Our approach generalises well to
unseen cohorts, with classification based on our subgroups similarly offering
advantages in predicting prognosis. Our findings suggest that mutational
patterns are often shared across myeloid malignancies, with distinct subtypes
potentially representing evolutionary stages en route to leukemia. With
pan-cancer TCGA data, we observe that our modelling framework extends
naturally to other cancer types while still offering improvements in subgroup
discovery.
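
As a rough illustration of the clustering idea only, the sketch below fits a mixture of independent Bernoulli distributions to toy binary mutation data using EM. It is a deliberately simplified stand-in, not the method summarised above: it ignores clinical covariates and the directed probabilistic relationships between features. The function name `fit_bernoulli_mixture`, the toy data and all parameter values are hypothetical.

```python
import math
import random


def fit_bernoulli_mixture(data, k, n_iter=50, seed=0):
    """Fit a k-component mixture of independent Bernoullis with EM.

    data: list of equal-length rows of 0/1 values (e.g. mutation present/absent).
    Returns (weights, probs, assignments).
    Simplifying assumption: features are independent within each cluster.
    """
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    weights = [1.0 / k] * k
    # Random initial per-cluster feature probabilities, kept away from 0 and 1.
    probs = [[rng.uniform(0.25, 0.75) for _ in range(d)] for _ in range(k)]
    resp = [[0.0] * k for _ in range(n)]

    for _ in range(n_iter):
        # E-step: posterior cluster membership per row, normalised in log space.
        for i, row in enumerate(data):
            logp = []
            for c in range(k):
                lp = math.log(weights[c])
                for j, x in enumerate(row):
                    p = probs[c][j]
                    lp += math.log(p if x else 1.0 - p)
                logp.append(lp)
            m = max(logp)
            unnorm = [math.exp(v - m) for v in logp]
            total = sum(unnorm)
            resp[i] = [u / total for u in unnorm]

        # M-step: re-estimate parameters, with add-one smoothing so that
        # no weight or probability ever reaches exactly 0 or 1.
        for c in range(k):
            nc = sum(resp[i][c] for i in range(n))
            weights[c] = (nc + 1.0) / (n + k)
            for j in range(d):
                on = sum(resp[i][c] * data[i][j] for i in range(n))
                probs[c][j] = (on + 1.0) / (nc + 2.0)

    # Hard assignment: most probable cluster for each row.
    assignments = [max(range(k), key=lambda c: r[c]) for r in resp]
    return weights, probs, assignments


# Toy cohort: 4 patients carrying mutations 0 and 1, 4 carrying mutations 2 and 3.
toy = [[1, 1, 0, 0] for _ in range(4)] + [[0, 0, 1, 1] for _ in range(4)]
weights, probs, assignments = fit_bernoulli_mixture(toy, k=2)
print(assignments)
```

A network-based mixture of the kind described in the abstract would replace the independent per-feature probabilities with a cluster-specific network over the features. The repository linked below is the place to look for the actual implementation.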
Code
https://github.com/cbg-ethz/myeloid-clustering