ABSTRACT
How do aberrations in widely expressed genes lead to tissue-selective hereditary diseases? Previous attempts to answer this question were limited to testing a few candidate mechanisms. To answer this question at a larger scale, we developed "Tissue Risk Assessment of Causality by Expression" (TRACE), a machine learning approach to predict genes that underlie tissue-selective diseases and selectivity-related features. TRACE utilized 4,744 biologically interpretable tissue-specific gene features that were inferred from heterogeneous omics datasets. Application of TRACE to 1,031 disease genes uncovered known and novel selectivity-related features, the most common of which was previously overlooked. Next, we created a catalog of tissue-associated risks for 18,927 protein-coding genes (https://netbio.bgu.ac.il/trace/). As proof-of-concept, we prioritized candidate disease genes identified in 48 rare-disease patients. TRACE ranked the verified disease gene among the patient's candidate genes significantly better than gene prioritization methods that rank by gene constraint or tissue expression. Thus, tissue selectivity combined with machine learning enhances genetic and clinical understanding of hereditary diseases.
Subjects
Machine Learning; Rare Diseases; Humans; Rare Diseases/genetics; Risk Assessment; Causality
ABSTRACT
Hereditary diseases tend to manifest clinically in a few selected tissues. Knowledge of those tissues is important for a better understanding of disease mechanisms, which often remain elusive. However, information on the tissues affected by each disease is not easily obtainable. Well-established resources, such as the Online Mendelian Inheritance in Man (OMIM) database and the Human Phenotype Ontology (HPO), report on a spectrum of disease manifestations, yet do not highlight the main affected tissues. The Organ-Disease Annotations (ODiseA) database contains 4,357 thoroughly curated annotations for 2,181 hereditary diseases and 45 affected tissues. Additionally, ODiseA reports 692 annotations of 635 diseases and the pathogenic tissues where they emerge. ODiseA can be queried by disease, disease gene, or affected tissue. Owing to its expansive, high-quality annotations, ODiseA serves as a valuable and unique tool for biomedical and computational researchers studying genotype-phenotype relationships of hereditary diseases. ODiseA is available at https://netbio.bgu.ac.il/odisea.
Subjects
Computational Biology; Databases, Genetic; Genetic Diseases, Inborn; Humans; Organ Specificity; Phenotype
ABSTRACT
Tissue contexts are extremely valuable when studying protein functions and their associated phenotypes. Recently, the study of proteins in tissue contexts was greatly facilitated by the availability of thousands of tissue transcriptomes. To provide access to these data, we developed the TissueNet integrative database, which displays protein-protein interactions (PPIs) in tissue contexts. Through TissueNet, users can create tissue-sensitive network views of the PPI landscape of query proteins. Unlike other tools, TissueNet output networks highlight tissue-specific and broadly expressed proteins, as well as over- and under-expressed proteins per tissue. The TissueNet v3 upgrade has a much larger dataset of proteins and PPIs, and represents 125 adult tissues and seven embryonic tissues. Thus, TissueNet provides an extensive, quantitative, and user-friendly interface to study the roles of human proteins in adulthood and embryonic stages. TissueNet v3 is freely available at https://netbio.bgu.ac.il/tissuenet3.