Uncovering expression signatures of synergistic drug responses via ensembles of explainable machine-learning models

Abstract

Machine learning may aid the choice of optimal combinations of anticancer drugs by explaining the molecular basis of their synergy. By combining accurate models with interpretable insights, explainable machine learning promises to accelerate data-driven cancer pharmacology. However, owing to the highly correlated and high-dimensional nature of transcriptomic data, naively applying current explainable machine-learning strategies to large transcriptomic datasets leads to suboptimal outcomes. Here, by using feature attribution methods, we show that the quality of the explanations can be increased by leveraging ensembles of explainable machine-learning models. We applied the approach to a dataset of 133 combinations of 46 anticancer drugs tested in ex vivo tumour samples from 285 patients with acute myeloid leukaemia and uncovered a haematopoietic-differentiation signature underlying drug combinations with therapeutic synergy. Ensembles of machine-learning models trained to predict drug combination synergies on the basis of gene-expression data may improve the feature attribution quality of complex machine-learning models.

Fig. 1: Overview of the study design.
Fig. 2: Benchmark metric reveals the impact of nonlinearity and correlation on feature discovery.
Fig. 3: Explaining ensembles helps overcome instability in feature discovery performance for single models.
Fig. 4: Comparison of predictive performance between model classes across four stratification settings.
Fig. 5: Transcriptomic factors affecting anti-AML drug combination synergy.
Fig. 6: Transcriptomic factors affecting synergy of combinations including specific drugs.

Data availability

The results of this study are in part based on data generated by the Cancer Target Discovery and Development (CTD2) Network (https://ocg.cancer.gov/programs/ctd2/data-portal), established by the National Cancer Institute’s Office of Cancer Genomics. Sequencing data are available in the GDC data portal under dbGaP Study Accession phs001657. The Beat AML patient sample data used in this study were obtained under an early access agreement, before final accrual, harmonization and public release of the full dataset. As such, the subset of samples included in this study may differ in sample representation, quality-control thresholds and data normalizations from those found in GDC and in the final study describing the full dataset.

The analysis of the haematopoietic signatures in an external dataset used data from the Cancer Dependency Map project (DepMap); specifically, the Genetic Dependency CRISPR assays (DepMap 21Q4 Public+Score, Chronos, ‘CRISPR_gene_effect.csv’) and the expression data (21Q4 Public, ‘CCLE_expression.csv’), as well as the metadata in the Cell Line Sample Info file (‘sample_info.csv’), all accessible from the DepMap portal (https://depmap.org/portal). Source data are provided with this paper.

Code availability

Code necessary to reproduce our experimental findings can be found in Zenodo at https://doi.org/10.5281/zenodo.7689076.

Acknowledgements

S.-I.L. discloses support for the research described in this study from the National Science Foundation (CAREER DBI-1552309 and DBI-1759487), the National Institutes of Health (R35 GM 128638 and R01 NIA AG 061132) and the American Cancer Society (127332-RSG-15-097-01-TBG). K.N. discloses support for the research described in this study from the National Institutes of Health (R37CA225655 and P01HL142494).

Author information

Contributions

J.D.J. and S.-I.L. conceived the study. J.D.J. prepared datasets, designed experiments and wrote software. S.-I.L. and K.N. jointly supervised the study. J.D.J., K.N. and S.-I.L. wrote the manuscript. A.B.D. ran experiments and prepared datasets. H.C. and S.C. helped design experiments. W.C. helped maintain the software and assisted with experiments.

Corresponding authors

Correspondence to Kamila Naxerova or Su-In Lee.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Biomedical Engineering thanks María Rodríguez Martínez and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Descriptive statistics of Beat AML cohort.

Histograms showing the relative density of prior treatment regimens, age, cause of death, and prior treatment types in the cohort of 285 patients in our dataset, which consisted of 12,362 samples with paired gene expression and drug synergy measurements for 133 pairs of 46 anticancer drugs.

Extended Data Fig. 2 Feature discovery benchmark.

For each synthetic or semi-synthetic dataset (a), we trained a variety of models (b) including neural networks, GBMs, support vector machines, and elastic net regression, as well as univariate statistics (Pearson correlation). For the machine-learning models, we then used SAGE to generate global Shapley value feature attributions (c), ranked the features according to the magnitude of their attributions (d), and compared the ranked list generated by each method to the binary ground truth importance vector (e). To measure the feature discovery quality of each method, we plotted how many “true” features are found cumulatively at each point in the ranked feature list (f), then summarized the curve generated by this procedure by measuring the AUFDC. This score is then rescaled so that a score of 0 represents random performance while a score of 1 represents perfect performance.
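The ranking and scoring steps (d–f) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `rescaled_aufdc` is hypothetical, and the rescaling is an assumption, taken here as a linear map that sends the expected area of a random ranking to 0 and the area of a perfect ranking to 1.

```python
def rescaled_aufdc(attributions, is_true_feature):
    """Area under the cumulative feature-discovery curve, rescaled.

    attributions: one attribution score per feature (sign is ignored).
    is_true_feature: 0/1 ground-truth vector of the same length.
    Assumed rescaling: 0 = expected random ranking, 1 = perfect ranking.
    """
    n = len(attributions)
    k = sum(is_true_feature)
    if n != len(is_true_feature) or k == 0 or k == n:
        raise ValueError("need matching lengths and a mix of true/false features")

    # Rank features by attribution magnitude, largest first.
    order = sorted(range(n), key=lambda i: abs(attributions[i]), reverse=True)

    # Walk down the ranked list, accumulating how many true features were found.
    found = 0
    area = 0
    for i in order:
        found += is_true_feature[i]
        area += found

    # Expected area for a random ordering: k * rank / n true features at each rank.
    random_area = k * (n + 1) / 2
    # Best case: the k true features occupy the first k ranks.
    perfect_area = k * (k + 1) / 2 + k * (n - k)
    return (area - random_area) / (perfect_area - random_area)


# Two true features out of four; one is ranked first, the other third.
score = rescaled_aufdc([0.9, 0.1, 0.8, 0.0], [1, 1, 0, 0])
print(score)
```

Under this rescaling a ranking worse than chance receives a negative score, which the caption does not address; whether the original benchmark clips such values is not stated.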

Extended Data Fig. 3 Predictive performance of models trained with synthetic datasets.

Predictive performance, as measured by the Pearson correlation of the predicted and true labels for the models trained in the benchmark presented in Fig. 2.

Extended Data Fig. 4 Ensembling overcomes the variability in attributions present in individual models.

To understand why ensemble models were able to attain better feature discovery performance than single models, we compared the characteristics of the attribution vectors of XGBoost models trained on bootstrap resampled versions of a correlated groups dataset with a step-function outcome. a, Heatmap of feature attributions for 20 individual XGBoost models. b, Heatmap of feature attributions for 20 ensembles of XGBoost models. c, Pairs of attribution vectors from ensembled models are more similar across bootstrap resamples of the dataset than attribution vectors from single models, as measured by cosine similarity. d, Attribution vectors from ensembled models place a larger proportion of their importance on a smaller set of features than attribution vectors from single models, as measured by the Gini coefficient of the attribution vectors, a measure of vector sparseness.
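The two summary measures used in panels c and d can be written down compactly. The sketch below is illustrative only: the function names are hypothetical, the Gini coefficient is computed on absolute attribution values with the standard sorted-rank formula (the exact variant used in the study is not specified here), and the ensemble attribution is assumed to be the element-wise mean of the member models' attribution vectors.

```python
import math


def ensemble_attribution(attribution_vectors):
    """Element-wise mean of the attribution vectors of several models (assumed)."""
    n_models = len(attribution_vectors)
    return [sum(column) / n_models for column in zip(*attribution_vectors)]


def cosine_similarity(u, v):
    """Cosine of the angle between two attribution vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def gini_coefficient(attributions):
    """Gini coefficient of absolute attributions: 0 = uniform, near 1 = sparse."""
    values = sorted(abs(a) for a in attributions)
    n = len(values)
    total = sum(values)
    if total == 0:
        return 0.0
    # Rank-weighted sum over the values sorted in ascending order.
    weighted = sum(rank * v for rank, v in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Toy attribution vectors from two models fitted to different resamples.
model_a = [1.0, 0.0, 0.4, 0.0]
model_b = [0.8, 0.3, 0.0, 0.0]
averaged = ensemble_attribution([model_a, model_b])
print(cosine_similarity(model_a, model_b), gini_coefficient(averaged))
```

For a vector of length n the largest attainable value of this Gini coefficient is (n − 1)/n, reached when all importance sits on a single feature.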

Extended Data Fig. 5 EXPRESS improves feature attributions of deep learning models.

Comparison of feature discovery performance between individual deep learning models (gray) and ensembles of deep learning models (red) across all 12 dataset types from the synthetic benchmark. Three separate feature attribution methods are tested for each model: DeepLift, Integrated Gradients, and SHAP (in this case implemented as global attributions using the SAGE software package).

Extended Data Fig. 6 EXPRESS improves feature attributions of XGBoost models.

Comparison of feature discovery performance between individual XGBoost models (gray) and ensembles of XGBoost models (red) across all 12 dataset types from the synthetic benchmark. Three separate feature attribution methods are tested for each model: a) cover, b) gain, and c) SHAP.

Extended Data Fig. 7 EXPRESS improves feature attributions of deep learning models on additional supplementary datasets.

Comparison of feature discovery performance between individual deep learning models (gray) and ensembles of deep learning models (red) across all 25 supplementary dataset types (see methods section on supplementary dataset types). Three separate feature attribution methods are tested for each model: DeepLift, Integrated Gradients, and SHAP (in this case implemented as SAGE). We find that for 73% of comparisons, EXPRESS improves feature discovery performance (for associated statistics, see Supplementary Dataset 25).

Extended Data Fig. 8 EXPRESS improves feature attributions of XGBoost models on additional supplementary datasets.

Comparison of feature discovery performance between individual XGBoost models (gray) and ensembles of XGBoost models (red) across all 25 supplementary dataset types (see methods section on supplementary dataset types). Three separate feature attribution methods are tested for each model: Cover, Gain, and SHAP. We find that for 76% of comparisons, EXPRESS improves feature discovery performance (for associated statistics, see Supplementary Dataset 26).

Extended Data Fig. 9 EXPRESS improves feature attributions independently of improvement in model performance.

For both XGBoost models (a, trained on the Beat AML dataset with the AND function outcome; b, trained on the Beat AML dataset with the multiplicative outcome) and deep learning models (c, trained on the Beat AML dataset with the AND function outcome; d, trained on the Beat AML dataset with the multiplicative outcome), ensemble models have significantly higher AUFDC within each stratum, even after controlling for the effect of ensembling on predictive performance by stratifying models into low, intermediate, and high predictive performance. Significance was assessed by two-sided Mann-Whitney U-test; * represents p < 0.05, ** represents p < 0.01, *** represents p < 0.001, and **** represents p < 0.0001 (full statistics in Supplementary Dataset 27). The boxes mark the quartiles (25th, 50th, and 75th percentiles) of the distribution within a given predictive performance stratification, while the whiskers extend to show the minimum and maximum of the distribution (excluding outliers).
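The significance test and star notation in this caption can be illustrated with a short standard-library sketch. It is not the authors' code: the function names are hypothetical, the P value uses the large-sample normal approximation without tie or continuity correction (an assumption made to keep the example dependency-free), and the AUFDC values are made-up toy numbers.

```python
import math


def mann_whitney_u(sample_a, sample_b):
    """Two-sided Mann-Whitney U-test via the normal approximation.

    Returns (U for sample_a, approximate two-sided P value).
    Assumes no tie or continuity correction, so small samples are only rough.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # U counts the pairs in which a value from sample_a exceeds one from sample_b;
    # ties contribute one half.
    u = 0.0
    for x in sample_a:
        for y in sample_b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mean_u) / sd_u
    # Two-sided tail probability of a standard normal: erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u, p_value


def significance_stars(p_value):
    """Map a P value to the star notation used in the caption."""
    for threshold, stars in ((1e-4, "****"), (1e-3, "***"), (1e-2, "**"), (5e-2, "*")):
        if p_value < threshold:
            return stars
    return ""


# Toy AUFDC values for single models and ensembles within one performance stratum.
single_models = [0.41, 0.38, 0.45, 0.40, 0.36, 0.43]
ensembles = [0.52, 0.55, 0.49, 0.58, 0.51, 0.54]
u_stat, p = mann_whitney_u(ensembles, single_models)
print(u_stat, p, significance_stars(p))
```

In practice a statistics library would normally be used for the exact small-sample distribution; the hand-written version is kept only so the arithmetic behind U and the star thresholds is visible.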

Extended Data Fig. 10 Ensembling improves XGBoost attributions more than explicit regularization.

Using the synthetic datasets with real AML gene expression features, we compare the increase in AUFDC obtained with explicit regularization (per-tree column dropout or L1 regularization) to that obtained with ensembling. For the synthetic datasets with AML features and the AND true function, we see that ensembles improve AUFDC significantly more than column dropout (a, two-sided Mann-Whitney U-test, U = 2.83, P = \(4.7 \times 10^{-3}\)) and L1 regularization (c, U = 3.00, P = \(2.7 \times 10^{-3}\)). For the synthetic datasets with AML features and the multiplicative true function, we see that ensembles improve AUFDC significantly more than column dropout (b, U = 4.04, P = \(5.25 \times 10^{-5}\)) and L1 regularization (d, U = 4.87, P = \(1.12 \times 10^{-6}\)).

Supplementary information

Source data

Data

Additional supplementary datasets.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Janizek, J.D., Dincer, A.B., Celik, S. et al. Uncovering expression signatures of synergistic drug responses via ensembles of explainable machine-learning models. Nat. Biomed. Eng 7, 811–829 (2023). https://doi.org/10.1038/s41551-023-01034-0

  • DOI: https://doi.org/10.1038/s41551-023-01034-0
