TY - JOUR
T1 - Harnessing the biological complexity of Big Data from LINCS Gene Expression Signatures
AU - Musa, Aliyu
AU - Tripathi, Shailesh
AU - Kandhavelu, Meenakshisundaram
AU - Dehmer, Matthias
AU - Emmert-Streib, Frank
N1 - Publisher Copyright:
© 2018 Musa et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2018/8
Y1 - 2018/8
N2 - Gene expression profiling using transcriptional drug perturbations is useful for many biomedical discovery studies, including drug repurposing, elucidation of drug mechanisms of action (MoA), and many other pharmacogenomic applications. However, limited data availability across cell types has severely hindered progress in these areas. To fill this gap, the LINCS program recently generated almost 1.3 million profiles for over 40,000 drug and genetic perturbations in over 70 different human cell types, including meta information about the experimental conditions and cell lines. Unfortunately, Big Data such as those generated by the ongoing LINCS program do not yield easy insights but pose considerable challenges for their analysis. In this paper, we address some of these challenges. Specifically, first, we study the gene expression signature profiles from all cell lines and their perturbagents in order to obtain insights into the distributional characteristics of available conditions. Second, we investigate the differential expression of genes for all cell lines, obtaining an understanding of condition-dependent differential expression that manifests the biological complexity of perturbagents. As a result, our analysis helps the experimental design of follow-up studies, e.g., by selecting appropriate cell lines.
AB - Gene expression profiling using transcriptional drug perturbations is useful for many biomedical discovery studies, including drug repurposing, elucidation of drug mechanisms of action (MoA), and many other pharmacogenomic applications. However, limited data availability across cell types has severely hindered progress in these areas. To fill this gap, the LINCS program recently generated almost 1.3 million profiles for over 40,000 drug and genetic perturbations in over 70 different human cell types, including meta information about the experimental conditions and cell lines. Unfortunately, Big Data such as those generated by the ongoing LINCS program do not yield easy insights but pose considerable challenges for their analysis. In this paper, we address some of these challenges. Specifically, first, we study the gene expression signature profiles from all cell lines and their perturbagents in order to obtain insights into the distributional characteristics of available conditions. Second, we investigate the differential expression of genes for all cell lines, obtaining an understanding of condition-dependent differential expression that manifests the biological complexity of perturbagents. As a result, our analysis helps the experimental design of follow-up studies, e.g., by selecting appropriate cell lines.
KW - Big Data
KW - Cell Line
KW - Databases, Genetic
KW - Humans
KW - Pharmacogenetics/methods
KW - Pharmacogenomic Variants
KW - Software
KW - Stress, Physiological/genetics
KW - Transcriptome
UR - http://www.scopus.com/inward/record.url?scp=85052825236&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0201937
DO - 10.1371/journal.pone.0201937
M3 - Article
C2 - 30157183
VL - 13
JO - PLoS ONE
JF - PLoS ONE
IS - 8
M1 - e0201937
ER -