Synthetic augmentation of cancer cell line multi-omic datasets using unsupervised deep learning
Abstract Integrating diverse types of biological data is essential for a holistic understanding of cancer biology, yet it remains challenging due to data heterogeneity, complexity, and sparsity. Addressing this, our study introduces an unsupervised deep learning model, MOSA (Multi-Omic Synthetic Aug...
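Main Authors: | Zhaoxiang Cai, Sofia Apolinário, Ana R. Baião, Clare Pacini, Miguel D. Sousa, Susana Vinga, Roger R. Reddel, Phillip J. Robinson, Mathew J. Garnett, Qing Zhong, Emanuel Gonçalves |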
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2024-11-01 |
Series: | Nature Communications |
Online Access: | https://doi.org/10.1038/s41467-024-54771-4 |
_version_ | 1823861804943540224 |
---|---|
author | Zhaoxiang Cai, Sofia Apolinário, Ana R. Baião, Clare Pacini, Miguel D. Sousa, Susana Vinga, Roger R. Reddel, Phillip J. Robinson, Mathew J. Garnett, Qing Zhong, Emanuel Gonçalves |
author_sort | Zhaoxiang Cai |
collection | DOAJ |
description | Abstract Integrating diverse types of biological data is essential for a holistic understanding of cancer biology, yet it remains challenging due to data heterogeneity, complexity, and sparsity. Addressing this, our study introduces an unsupervised deep learning model, MOSA (Multi-Omic Synthetic Augmentation), specifically designed to integrate and augment the Cancer Dependency Map (DepMap). Harnessing orthogonal multi-omic information, this model successfully generates molecular and phenotypic profiles, resulting in an increase of 32.7% in the number of multi-omic profiles and thereby generating a complete DepMap for 1523 cancer cell lines. The synthetically enhanced data increases statistical power, uncovering less studied mechanisms associated with drug resistance, and refines the identification of genetic associations and clustering of cancer cell lines. By applying SHapley Additive exPlanations (SHAP) for model interpretation, MOSA reveals multi-omic features essential for cell clustering and biomarker identification related to drug and gene dependencies. This understanding is crucial for developing much-needed effective strategies to prioritize cancer targets. |
format | Article |
id | doaj-art-58b951e3cf64437ebfa82518119ab27f |
institution | Kabale University |
issn | 2041-1723 |
language | English |
publishDate | 2024-11-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Nature Communications |
spelling | doaj-art-58b951e3cf64437ebfa82518119ab27f; 2025-02-09T12:43:49Z; eng; Nature Portfolio; Nature Communications; 2041-1723; 2024-11-01; vol. 15, no. 1, pp. 1-12; 10.1038/s41467-024-54771-4; Synthetic augmentation of cancer cell line multi-omic datasets using unsupervised deep learning; Zhaoxiang Cai (ProCan®, Children’s Medical Research Institute, Faculty of Medicine and Health, The University of Sydney); Sofia Apolinário (INESC-ID); Ana R. Baião (INESC-ID); Clare Pacini (Wellcome Sanger Institute, Wellcome Genome Campus); Miguel D. Sousa (INESC-ID); Susana Vinga (INESC-ID); Roger R. Reddel (ProCan®, Children’s Medical Research Institute, Faculty of Medicine and Health, The University of Sydney); Phillip J. Robinson (ProCan®, Children’s Medical Research Institute, Faculty of Medicine and Health, The University of Sydney); Mathew J. Garnett (Wellcome Sanger Institute, Wellcome Genome Campus); Qing Zhong (ProCan®, Children’s Medical Research Institute, Faculty of Medicine and Health, The University of Sydney); Emanuel Gonçalves (INESC-ID); https://doi.org/10.1038/s41467-024-54771-4 |
title | Synthetic augmentation of cancer cell line multi-omic datasets using unsupervised deep learning |
url | https://doi.org/10.1038/s41467-024-54771-4 |