Digitaldlsorter: Deep-Learning on scRNA-Seq to Deconvolute Gene Expression Data

Torroja, Carlos and Sanchez-Cabo, Fatima (2019) Digitaldlsorter: Deep-Learning on scRNA-Seq to Deconvolute Gene Expression Data. Frontiers in Genetics, 10. ISSN 1664-8021

pubmed-zip/versions/2/package-entries/fgene-10-00978.pdf - Published Version


Abstract

The development of single-cell transcriptome sequencing has allowed researchers to dig into the role of individual cell types in a plethora of disease scenarios. It also extends to the whole transcriptome what was previously possible only for a few tens of antibodies in cell population analysis. More importantly, it allows resolving the long-standing question of whether the changes observed in a particular bulk experiment are a consequence of changes in cell type proportions or of aberrant behavior of a particular cell type. However, single-cell experiments are still complex to perform and expensive to sequence, so bulk RNA-Seq experiments remain more common. scRNA-Seq data are providing highly relevant information for the characterization of the immune cell repertoire in different diseases ranging from cancer to atherosclerosis. In particular, as scRNA-Seq becomes more widely used, new types of immune cell populations emerge, and their role in the genesis and evolution of disease opens new avenues for personalized immune therapies. Immunotherapies have already proven successful in a variety of tumors such as breast, colon and melanoma, and their value in other types of disease is currently being explored. From a statistical perspective, single-cell data are particularly interesting due to their high dimensionality, overcoming the limitations of the “skinny matrix” that traditional bulk RNA-Seq experiments yield. With the technological advances that enable sequencing hundreds of thousands of cells, scRNA-Seq data have become especially suitable for the application of Machine Learning algorithms such as Deep Learning (DL). We present here a DL-based method to enumerate and quantify the immune infiltration in colorectal and breast cancer bulk RNA-Seq samples starting from scRNA-Seq.
Our method makes use of a Deep Neural Network (DNN) model that allows quantification not only of lymphocytes as a general population but also of specific CD8+, CD4Tmem, CD4Th and CD4Tregs subpopulations, as well as B-cells and stromal content. Moreover, the signatures are built from scRNA-Seq data from the tumor, preserving the specific characteristics of the tumor microenvironment, as opposed to other approaches in which cells were isolated from blood. Our method was applied to synthetic bulk RNA-Seq and to samples from the TCGA project, yielding very accurate results in terms of quantification and survival prediction.

Item Type: Article
Subjects: STM Digital Library > Medical Science
Depositing User: Unnamed user with email support@stmdigitallib.com
Date Deposited: 01 Feb 2023 07:44
Last Modified: 21 May 2024 12:13
URI: http://archive.scholarstm.com/id/eprint/344
