Search | VHL Regional Portal

The evolution of computational research in a data-centric world.

Deshpande, Dhrithi; Chhugani, Karishma; Ramesh, Tejasvene; Pellegrini, Matteo; Shiffman, Sagiv; Abedalthagafi, Malak S; Alqahtani, Saleh; Ye, Jimmie; Liu, Xiaole Shirley; Leek, Jeffrey T; Brazma, Alvis; Ophoff, Roel A; Rao, Gauri; Butte, Atul J; Moore, Jason H; Katritch, Vsevolod; Mangul, Serghei.

Cell; 187(17): 4449-4457, 2024 Aug 22.

Article in English | MEDLINE | ID: mdl-39178828

ABSTRACT

Computational data-centric research techniques play a prevalent and multi-disciplinary role in life science research. In the past, scientists in wet labs generated the data, and computational researchers focused on creating tools for the analysis of those data. Computational researchers are now becoming more independent and taking leadership roles within biomedical projects, leveraging the increased availability of public data. We are now able to generate vast amounts of data, and the challenge has shifted from data generation to data analysis. Here we discuss the pitfalls, challenges, and opportunities facing the field of data-centric research in biology. We discuss the evolving perception of computational data-driven research and its rise as an independent domain in biomedical research while also addressing the significant collaborative opportunities that arise from integrating computational research with experimental and translational biology. Additionally, we discuss the future of data-centric research and its applications across various areas of the biomedical field.

Subject(s)

Biomedical Research, Computational Biology, Computational Biology/methods, Humans

On the cross-population generalizability of gene expression prediction models.

Keys, Kevin L; Mak, Angel C Y; White, Marquitta J; Eckalbar, Walter L; Dahl, Andrew W; Mefford, Joel; Mikhaylova, Anna V; Contreras, María G; Elhawary, Jennifer R; Eng, Celeste; Hu, Donglei; Huntsman, Scott; Oh, Sam S; Salazar, Sandra; Lenoir, Michael A; Ye, Jimmie C; Thornton, Timothy A; Zaitlen, Noah; Burchard, Esteban G; Gignoux, Christopher R.

PLoS Genet; 16(8): e1008927, 2020 Aug.

Article in English | MEDLINE | ID: mdl-32797036

ABSTRACT

The genetic control of gene expression is a core component of human physiology. For the past several years, transcriptome-wide association studies have leveraged large datasets of linked genotype and RNA sequencing information to create a powerful gene-based test of association that has been used in dozens of studies. While numerous discoveries have been made, the populations in the training data are overwhelmingly of European descent, and little is known about the generalizability of these models to other populations. Here, we test for cross-population generalizability of gene expression prediction models using a dataset of African American individuals with RNA-Seq data in whole blood. We find that the default models trained in large datasets such as GTEx and DGN fare poorly in African Americans, with a notable reduction in prediction accuracy when compared to European Americans. We replicate these limitations in cross-population generalizability using the five populations in the GEUVADIS dataset. Via realistic simulations of both populations and gene expression, we show that accurate cross-population generalizability of transcriptome prediction only arises when eQTL architecture is substantially shared across populations. In contrast, models with non-identical eQTLs showed patterns similar to real-world data. Therefore, generating RNA-Seq data in diverse populations is a critical step towards multi-ethnic utility of gene expression prediction.

Subject(s)

Black or African American/genetics, Genome-Wide Association Study/methods, Models, Genetic, Transcriptome, Gene Expression Profiling/methods, Gene Expression Profiling/standards, Genome-Wide Association Study/standards, Humans, Quantitative Trait Loci, RNA-Seq/methods, RNA-Seq/standards, Reference Standards
