Enhancing reproducibility of gene expression analysis with known protein functional relationships: The concept of well-associated protein.
PLoS Comput Biol; 16(2): e1007684, 2020 02.
Article in English | MEDLINE | ID: mdl-32058996
ABSTRACT
Identification of differentially expressed genes (DEGs) is well recognized to be variable across independent replications of genome-wide transcriptional studies. Such studies are often employed early in the discovery process to characterize disease state and to prioritize novel targets aimed at addressing unmet medical need. Increasing the reproducibility of biological findings from these studies could positively impact the success rate of new clinical interventions. This work demonstrates that a statistically sound combination of gene expression data with prior biological knowledge, in the form of large protein interaction networks, can yield quantitatively more reproducible observations from studies characterizing human disease. The novel concept of Well-Associated Proteins (WAPs) introduced herein (gene products significantly associated on protein interaction networks with the differences in transcript levels between control and disease) does not require choosing a differential expression threshold and can be computed efficiently enough to enable false discovery rate estimation via permutation. Reproducibility of WAPs is shown to be on average superior to that of DEGs under easily quantifiable conditions, suggesting that they can yield a significantly more robust description of disease. Enhanced reproducibility of WAPs versus DEGs is first demonstrated with four independent data sets focused on systemic sclerosis. This finding is then validated over thousands of pairs of data sets obtained by random partitions of large studies in several other diseases. Conditions that individual data sets must satisfy to yield robust WAP scores are examined. Reproducible identification of WAPs can potentially benefit drug target selection and precision medicine studies.
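The abstract mentions false discovery rate estimation via permutation but does not give the scoring formula, so the following is only a minimal sketch of the general idea, not the authors' WAP method. It assumes a hypothetical network score (the mean absolute expression statistic of a protein's network neighbors), a hypothetical null model (shuffling statistics across genes), and a cutoff applied to the network score rather than to differential expression. All function names, the toy network, and the toy statistics are illustrative assumptions.

```python
import random


def neighbor_scores(adjacency, stat):
    """Score each protein by the mean |statistic| of its network neighbors.

    This scoring rule is an assumption made for illustration only.
    """
    scores = {}
    for protein, neighbors in adjacency.items():
        values = [abs(stat[n]) for n in neighbors if n in stat]
        scores[protein] = sum(values) / len(values) if values else 0.0
    return scores


def permutation_fdr(adjacency, stat, threshold, n_perm=200, seed=0):
    """Estimate the FDR at a score cutoff by shuffling statistics across genes."""
    rng = random.Random(seed)
    observed = neighbor_scores(adjacency, stat)
    selected = sorted(p for p, s in observed.items() if s >= threshold)

    genes = list(stat)
    values = [stat[g] for g in genes]
    null_hits = 0
    for _ in range(n_perm):
        shuffled = values[:]
        rng.shuffle(shuffled)  # break the link between genes and statistics
        null = neighbor_scores(adjacency, dict(zip(genes, shuffled)))
        null_hits += sum(1 for s in null.values() if s >= threshold)

    # Average number of proteins passing the cutoff under the null,
    # divided by the number passing in the observed data.
    expected_false = null_hits / n_perm
    # By convention, report 0.0 when nothing is selected.
    fdr = min(1.0, expected_false / len(selected)) if selected else 0.0
    return selected, fdr


# Toy network: a triangle of strongly changed genes and a quiet chain.
adjacency = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B"],
    "D": ["E"],
    "E": ["D", "F"],
    "F": ["E"],
}
stat = {"A": 3.0, "B": -2.5, "C": 2.8, "D": 0.1, "E": -0.2, "F": 0.3}

selected, fdr = permutation_fdr(adjacency, stat, threshold=2.0)
print(selected, round(fdr, 3))
```

Because the score is cheap to recompute, many permutations can be run, which is the property the abstract highlights as enabling permutation-based FDR estimation.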
Full text: 1
Database: MEDLINE
Main subject: Proteins / Computational Biology / Gene Expression Profiling / Protein Interaction Maps
Study type: Prognostic_studies / Risk_factors_studies
Limits: Humans
Language: En
Journal: PLoS Comput Biol
Journal subject: BIOLOGY / MEDICAL INFORMATICS
Publication year: 2020
Document type: Article
Affiliation country: United States