Abstract
Machine learning may aid the choice of optimal combinations of anticancer drugs by explaining the molecular basis of their synergy. By combining accurate models with interpretable insights, explainable machine learning promises to accelerate data-driven cancer pharmacology. However, owing to the highly correlated and high-dimensional nature of transcriptomic data, naively applying current explainable machine-learning strategies to large transcriptomic datasets leads to suboptimal outcomes. Here, by using feature attribution methods, we show that the quality of the explanations can be increased by leveraging ensembles of explainable machine-learning models. We applied the approach to a dataset of 133 combinations of 46 anticancer drugs tested in ex vivo tumour samples from 285 patients with acute myeloid leukaemia and uncovered a haematopoietic-differentiation signature underlying drug combinations with therapeutic synergy. Ensembles of machine-learning models trained to predict drug combination synergies on the basis of gene-expression data may improve the feature attribution quality of complex machine-learning models.
Data availability
The results of this study are in part based on data generated by the Cancer Target Discovery and Development (CTD2) Network (https://ocg.cancer.gov/programs/ctd2/data-portal), established by the National Cancer Institute’s Office of Cancer Genomics. Sequencing data are available in the GDC data portal under dbGaP Study Accession phs001657. The Beat AML patient sample data used in this study were obtained under an early access agreement, before final accrual, harmonization and public release of the full dataset. As such, the subset of samples included in this study may differ in sample representation, quality-control thresholds and data normalizations from those found in GDC and in the final study describing the full dataset.
The analysis of the haematopoietic signatures in an external dataset used data from the Cancer Dependency Map project (DepMap); specifically, the Genetic Dependency CRISPR assays (DepMap 21Q4 Public+Score, Chronos, ‘CRISPR_gene_effect.csv’) and the expression data (21Q4 Public, ‘CCLE_expression.csv’), as well as the metadata in the Cell Line Sample Info file (‘sample_info.csv’), all accessible from the DepMap portal (https://depmap.org/portal). Source data are provided with this paper.
Code availability
Code necessary to reproduce our experimental findings can be found in Zenodo at https://doi.org/10.5281/zenodo.7689076.
Acknowledgements
S.-I.L. discloses support for the research described in this study from the National Science Foundation (CAREER DBI-1552309 and DBI-1759487), the National Institutes of Health (R35 GM 128638 and R01 NIA AG 061132) and the American Cancer Society (127332-RSG-15-097-01-TBG). K.N. discloses support for the research described in this study from the National Institutes of Health (R37CA225655 and P01HL142494).
Author information
Contributions
J.D.J. and S.-I.L. conceived the study. J.D.J. prepared datasets, designed experiments and wrote software. S.-I.L. and K.N. jointly supervised the study. J.D.J., K.N. and S.-I.L. wrote the manuscript. A.B.D. ran experiments and prepared datasets. H.C. and S.C. helped design experiments. W.C. helped maintain the software and assisted with experiments.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Biomedical Engineering thanks María Rodríguez Martínez and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Descriptive statistics of Beat AML cohort.
Histograms showing the relative density of prior treatment regimens, age, cause of death, and prior treatment types in the cohort of 285 patients in our dataset, which consisted of 12,362 samples with paired gene expression and drug synergy measurements for 133 pairs of 46 anticancer drugs.
Extended Data Fig. 2 Feature discovery benchmark.
For each synthetic or semi-synthetic dataset (a), we trained a variety of models (b) including neural networks, GBMs, support vector machines, and elastic net regression, as well as univariate statistics (Pearson correlation). For the machine-learning models, we then used SAGE to generate global Shapley value feature attributions (c), ranked the features according to the magnitude of their attributions (d), and compared the ranked list generated by each method to the binary ground truth importance vector (e). To measure the feature discovery quality of each method, we plotted how many “true” features are found cumulatively at each point in the ranked feature list (f), then summarized the curve generated by this procedure by measuring the AUFDC. This score is then rescaled so that a score of 0 represents random performance while a score of 1 represents perfect performance.
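The ranking-and-scoring steps in this caption can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function name `aufdc`, the use of a simple sum over rank positions as the area, and the rescaling formula (observed area minus the expected random area, divided by the perfect area minus the expected random area) are all assumptions chosen to match the description that 0 represents random and 1 represents perfect performance.

```python
def aufdc(attributions, is_true_feature):
    """Rescaled area under a feature discovery curve (illustrative sketch).

    attributions    : one global attribution score per feature
    is_true_feature : binary ground-truth importance vector, same length
    """
    n = len(attributions)
    if len(is_true_feature) != n:
        raise ValueError("attributions and ground truth must have equal length")
    k = sum(1 for t in is_true_feature if t)
    if k == 0 or k == n:
        raise ValueError("need at least one true and one non-true feature")

    # Rank features by the magnitude of their attribution, largest first.
    order = sorted(range(n), key=lambda i: -abs(attributions[i]))

    # Cumulative number of true features found at each rank position.
    found, area = 0, 0.0
    for i in order:
        found += 1 if is_true_feature[i] else 0
        area += found

    # Assumed reference curves: a random ranking finds k * pos / n true
    # features on average by position pos; a perfect ranking finds min(pos, k).
    area_random = sum(k * pos / n for pos in range(1, n + 1))
    area_perfect = sum(min(pos, k) for pos in range(1, n + 1))
    return (area - area_random) / (area_perfect - area_random)


# Toy usage with made-up attributions: the two true features are ranked first.
score = aufdc([0.9, -0.8, 0.1, 0.05], [1, 1, 0, 0])
```

Under this assumed rescaling, a ranking that places every true feature last scores below 0, which makes worse-than-random attributions easy to spot.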
Extended Data Fig. 3 Predictive performance of models trained with synthetic datasets.
Predictive performance, as measured by the Pearson correlation of the predicted and true labels for the models trained in the benchmark presented in Fig. 2.
Extended Data Fig. 4 Ensembling overcomes the variability in attributions present in individual models.
To understand why ensemble models were able to attain better feature discovery performance than single models, we compared the characteristics of the attribution vectors of XGBoost models trained on bootstrap resampled versions of a correlated groups dataset with a step-function outcome. a, Heatmap of feature attributions for 20 individual XGBoost models. b, Heatmap of feature attributions for 20 ensembles of XGBoost models. c, Pairs of attribution vectors from ensembled models are more similar across bootstrap resamples of the dataset than attribution vectors from single models, as measured by cosine similarity. d, Attribution vectors from ensembled models place a larger proportion of their importance on a smaller set of features than attribution vectors from single models, as measured by the Gini coefficient of the attribution vectors, a measure of vector sparseness.
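The two summary measures named in panels c and d can be written out explicitly. The sketch below is illustrative only: the function names are hypothetical, and the Gini coefficient uses one common closed form on the sorted absolute attributions, which may differ from the exact variant used for the figure.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two attribution vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def gini_coefficient(attributions):
    """Gini coefficient of the absolute attributions (assumed formula).

    Returns 0 when importance is spread evenly over all features and
    approaches 1 as importance concentrates on a single feature.
    """
    x = sorted(abs(v) for v in attributions)
    n, total = len(x), sum(x)
    if total == 0:
        raise ValueError("Gini coefficient is undefined for an all-zero vector")
    return sum((2 * i - n - 1) * v for i, v in enumerate(x, start=1)) / (n * total)


# Toy usage with made-up attribution vectors from two bootstrap resamples.
similarity = cosine_similarity([0.7, 0.2, 0.0, 0.1], [0.6, 0.3, 0.0, 0.1])
sparseness = gini_coefficient([0.7, 0.2, 0.0, 0.1])
```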
Extended Data Fig. 5 EXPRESS improves feature attributions of deep learning models.
Comparison of feature discovery performance between individual deep learning models (gray) and ensembles of deep learning models (red) across all 12 dataset types from the synthetic benchmark. Three separate feature attribution methods are tested for each model: DeepLift, Integrated Gradients, and SHAP (in this case implemented as global attributions using the SAGE software package).
Extended Data Fig. 6 EXPRESS improves feature attributions of XGBoost models.
Comparison of feature discovery performance between individual XGBoost models (gray) and ensembles of XGBoost models (red) across all 12 dataset types from the synthetic benchmark. Three separate feature attribution methods are tested for each model: a) cover, b) gain, and c) SHAP.
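One reason attributions of an ensemble are straightforward to obtain is that Shapley values are linear in the model: the Shapley values of an averaging ensemble equal the average of its members' Shapley values. The sketch below checks this on toy functions by brute-force enumeration. It is an illustration under stated assumptions, not the EXPRESS code: the three toy models are hypothetical stand-ins for trained XGBoost or neural-network models, and "removing" a feature is modelled by replacing it with a fixed baseline value, which is only one of several possible value functions.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values by subset enumeration (feasible only for few features).

    Features outside a coalition are set to their baseline value (an assumption).
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for j in range(n):
        others = [i for i in range(n) if i != j]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[j] += weight * (value(s | {j}) - value(s))
    return phi


def averaging_ensemble(models):
    """Ensemble whose prediction is the mean of the member predictions."""
    return lambda x: sum(m(x) for m in models) / len(models)


def mean_attributions(attribution_vectors):
    """Feature-wise mean of per-model attribution vectors."""
    count = len(attribution_vectors)
    return [sum(column) / count for column in zip(*attribution_vectors)]


# Hypothetical toy models standing in for members of a trained ensemble.
models = [
    lambda x: 2 * x[0] + x[1] * x[2],
    lambda x: x[0] - x[1],
    lambda x: x[0] * x[1] + 0.5 * x[2],
]
x, baseline = [1.0, 2.0, 3.0], [0.0, 0.0, 0.0]

per_model = [shapley_values(m, x, baseline) for m in models]
averaged = mean_attributions(per_model)
direct = shapley_values(averaging_ensemble(models), x, baseline)
```

Here `averaged` and `direct` agree up to floating-point error, so an ensemble can be explained by explaining each member once and averaging the results.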
Extended Data Fig. 7 EXPRESS improves feature attributions of deep learning models on additional supplementary datasets.
Comparison of feature discovery performance between individual deep learning models (gray) and ensembles of deep learning models (red) across all 25 supplementary dataset types (see methods section on supplementary dataset types). Three separate feature attribution methods are tested for each model: DeepLift, Integrated Gradients, and SHAP (in this case implemented as SAGE). We find that for 73% of comparisons, EXPRESS improves feature discovery performance (for associated statistics, see Supplementary Dataset 25).
Extended Data Fig. 8 EXPRESS improves feature attributions of XGBoost models on additional supplementary datasets.
Comparison of feature discovery performance between individual XGBoost models (gray) and ensembles of XGBoost models (red) across all 25 supplementary dataset types (see methods section on supplementary dataset types). Three separate feature attribution methods are tested for each model: Cover, Gain, and SHAP. We find that for 76% of comparisons, EXPRESS improves feature discovery performance (for associated statistics, see Supplementary Dataset 26).
Extended Data Fig. 9 EXPRESS improves feature attributions independently of improvement in model performance.
For both XGBoost models (a, trained on the Beat AML dataset with the AND function outcome; b, trained on the Beat AML dataset with the multiplicative outcome) and deep learning models (c, trained on the Beat AML dataset with the AND function outcome; and d, trained on the Beat AML dataset with the multiplicative outcome), we see that even after controlling for the effect of model ensembles on predictive performance by stratifying models (low, intermediate, and high predictive performance), within each stratification ensemble models have significantly higher AUFDC. Significance assessed by two-sided Mann–Whitney U-test, * represents p < 0.05, ** represents p < 0.01, *** represents p < 0.001, and **** represents p < 0.0001 (full statistics in Supplementary Dataset 27). The boxes mark the quartiles (25th, 50th, and 75th percentiles) of the distribution within a given predictive performance stratification, while the whiskers extend to show the minimum and maximum of the distribution (excluding outliers).
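The stratified comparison described in this caption can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the AUFDC values are made up, the p-value uses the normal approximation to the U statistic without tie or continuity corrections (so it will differ from an exact test on small samples), and the "n.s." label for non-significant results is an added convention that the caption does not define.

```python
import math


def mann_whitney_two_sided(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Returns (U statistic of x, z-score, approximate p-value).
    """
    combined = sorted((v, group) for group, sample in enumerate((x, y)) for v in sample)
    rank_sum_x, i, n = 0.0, 0, len(combined)
    while i < n:
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of the tied ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for _, g in combined[i:j] if g == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    z = (u - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, z, p


def significance_stars(p):
    """Map a p-value to the star notation used in the caption."""
    for threshold, stars in ((0.0001, "****"), (0.001, "***"), (0.01, "**"), (0.05, "*")):
        if p < threshold:
            return stars
    return "n.s."  # assumed label; not defined in the caption


# Hypothetical AUFDC scores per predictive-performance stratum:
# (single models, ensemble models).
strata = {
    "low": ([0.10, 0.12, 0.15, 0.11, 0.14], [0.20, 0.22, 0.25, 0.21, 0.24]),
    "intermediate": ([0.30, 0.33, 0.31, 0.35, 0.32], [0.40, 0.44, 0.41, 0.45, 0.43]),
    "high": ([0.50, 0.52, 0.55, 0.51, 0.53], [0.60, 0.63, 0.61, 0.66, 0.64]),
}
for name, (single, ensemble) in strata.items():
    u, z, p = mann_whitney_two_sided(single, ensemble)
    print(name, u, round(z, 2), significance_stars(p))
```

Testing within each stratum, rather than on the pooled models, is what separates the effect of ensembling on attribution quality from its effect on predictive performance.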
Extended Data Fig. 10 Ensembling improves XGBoost attributions more than explicit regularization.
Using the synthetic datasets with real AML gene expression features, we compare the increase in AUFDC seen with explicit regularization, such as per-tree column dropout and L1 regularization, with ensembling. For the synthetic datasets with AML features and the AND true function, we see that ensembles improve AUFDC significantly more than column dropout (a, two-sided Mann–Whitney U-test, U = 2.83, P = 4.7 × 10⁻³) and L1 regularization (c, U = 3.00, P = 2.7 × 10⁻³). For the synthetic datasets with AML features and the multiplicative true function, we see that ensembles improve AUFDC significantly more than column dropout (b, U = 4.04, P = 5.25 × 10⁻⁵) and L1 regularization (d, U = 4.87, P = 1.12 × 10⁻⁶).
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Janizek, J.D., Dincer, A.B., Celik, S. et al. Uncovering expression signatures of synergistic drug responses via ensembles of explainable machine-learning models. Nat. Biomed. Eng 7, 811–829 (2023). https://doi.org/10.1038/s41551-023-01034-0
DOI: https://doi.org/10.1038/s41551-023-01034-0