Results 1 - 20 of 27
1.
Genome Biol ; 23(1): 140, 2022 06 29.
Article in English | MEDLINE | ID: mdl-35768873

ABSTRACT

BACKGROUND: Coessentiality networks derived from CRISPR screens in cell lines provide a powerful framework for identifying functional modules in the cell and for inferring the roles of uncharacterized genes. However, these networks integrate signal across all underlying data and can mask strong interactions that occur in only a subset of the cell lines analyzed. RESULTS: Here, we decipher dynamic functional interactions by identifying significant cellular contexts, primarily by oncogenic mutation, lineage, and tumor type, and discovering coessentiality relationships that depend on these contexts. We recapitulate well-known gene-context interactions such as oncogene-mutation, paralog buffering, and tissue-specific essential genes, show how mutation rewires known signal transduction pathways, including RAS/RAF and IGF1R-PIK3CA, and illustrate the implications for drug targeting. We further demonstrate how context-dependent functional interactions can elucidate lineage-specific gene function, as illustrated by the maturation of proreceptors IGF1R and MET by proteases FURIN and CPD. CONCLUSIONS: This approach advances our understanding of context-dependent interactions and how they can be gleaned from these data. We provide an online resource to explore these context-dependent interactions at diffnet.hart-lab.org.


Subject(s)
Clustered Regularly Interspaced Short Palindromic Repeats , Signal Transduction , Genes, Essential , Genotype , Mutation
2.
Nucleic Acids Res ; 50(D1): D632-D639, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34747468

ABSTRACT

Network medicine has proven useful for dissecting genetic organization of complex human diseases. We have previously published HumanNet, an integrated network of human genes for disease studies. Since the release of the last version of HumanNet, many large-scale protein-protein interaction datasets have accumulated in public depositories. Additionally, the numbers of research papers and functional annotations for gene-phenotype associations have increased significantly. Therefore, updating HumanNet is a timely task for further improvement of network-based research into diseases. Here, we present HumanNet v3 (https://www.inetbio.org/humannet/, covering 99.8% of human protein coding genes) constructed by means of the expanded data with improved network inference algorithms. HumanNet v3 supports a three-tier model: HumanNet-PI (a protein-protein physical interaction network), HumanNet-FN (a functional gene network), and HumanNet-XC (a functional network extended by co-citation). Users can select a suitable tier of HumanNet for their study purpose. We showed that on disease gene predictions, HumanNet v3 outperforms both the previous HumanNet version and other integrated human gene networks. Furthermore, we demonstrated that HumanNet provides a feasible approach for selecting host genes likely to be associated with COVID-19.


Subject(s)
Algorithms , COVID-19/genetics , Communicable Diseases/genetics , Databases, Genetic , Gene Regulatory Networks , Software , COVID-19/virology , Communicable Diseases/classification , Gene Ontology , Humans , Internet , Molecular Sequence Annotation , Protein Interaction Mapping , SARS-CoV-2/pathogenicity
3.
Nat Commun ; 12(1): 6506, 2021 11 11.
Article in English | MEDLINE | ID: mdl-34764293

ABSTRACT

CRISPR knockout fitness screens in cancer cell lines reveal many genes whose loss of function causes cell death or loss of fitness or, more rarely, the opposite phenotype of faster proliferation. Here we demonstrate a systematic approach to identify these proliferation suppressors, which are highly enriched for tumor suppressor genes, and define a network of 145 such genes in 22 modules. One module contains several elements of the glycerolipid biosynthesis pathway and operates exclusively in a subset of acute myeloid leukemia cell lines. The proliferation suppressor activity of genes involved in the synthesis of saturated fatty acids, coupled with a more severe loss of fitness phenotype for genes in the desaturation pathway, suggests that these cells operate at the limit of their carrying capacity for saturated fatty acids, which we confirm biochemically. Overexpression of this module is associated with a survival advantage in juvenile leukemias, suggesting a clinically relevant subtype.


Subject(s)
Leukemia, Myeloid, Acute/metabolism , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , CRISPR-Associated Proteins/genetics , CRISPR-Associated Proteins/metabolism , Clustered Regularly Interspaced Short Palindromic Repeats/genetics , Clustered Regularly Interspaced Short Palindromic Repeats/physiology , Cyclin-Dependent Kinase Inhibitor p21/genetics , Endodeoxyribonucleases/genetics , Endodeoxyribonucleases/metabolism , Humans , Leukemia, Myeloid, Acute/genetics , Lipid Metabolism/genetics , Lipid Metabolism/physiology , Tumor Suppressor Protein p53/genetics , Tumor Suppressor p53-Binding Protein 1/genetics
4.
Proc Natl Acad Sci U S A ; 118(36), 2021 09 07.
Article in English | MEDLINE | ID: mdl-34475205

ABSTRACT

Prostate cancer is a leading cause of cancer-related mortality in men. The widespread use of androgen receptor (AR) inhibitors has generated an increased incidence of AR-negative prostate cancer, triggering the need for effective therapies for such patients. Here, analysis of public genome-wide CRISPR screens in human prostate cancer cell lines identified histone demethylase JMJD1C (KDM3C) as an AR-negative context-specific vulnerability. Secondary validation studies in multiple cell lines and organoids, including isogenic models, confirmed that small hairpin RNA (shRNA)-mediated depletion of JMJD1C potently inhibited growth specifically in AR-negative prostate cancer cells. To explore the cooperative interactions of AR and JMJD1C, we performed comparative transcriptomics of 1) isogenic AR-positive versus AR-negative prostate cancer cells, 2) AR-positive versus AR-negative prostate cancer tumors, and 3) isogenic JMJD1C-expressing versus JMJD1C-depleted AR-negative prostate cancer cells. Loss of AR or JMJD1C generates a modest tumor necrosis factor alpha (TNFα) signature, whereas combined loss of AR and JMJD1C strongly up-regulates the TNFα signature in human prostate cancer, suggesting TNFα signaling as a point of convergence for the combined actions of AR and JMJD1C. Correspondingly, AR-negative prostate cancer cells showed exquisite sensitivity to TNFα treatment and, conversely, TNFα pathway inhibition via inhibition of its downstream effector MAP4K4 partially reversed the growth defect of JMJD1C-depleted AR-negative prostate cancer cells. Given the deleterious systemic side effects of TNFα therapy in humans and the viability of JMJD1C-knockout mice, the identification of JMJD1C inhibition as a specific vulnerability in AR-negative prostate cancer may provide an alternative drug target for prostate cancer patients progressing on AR inhibitor therapy.


Subject(s)
Jumonji Domain-Containing Histone Demethylases/genetics , Oxidoreductases, N-Demethylating/genetics , Prostatic Neoplasms/genetics , Receptors, Androgen/metabolism , Apoptosis/drug effects , Cell Line, Tumor , Databases, Genetic , Histone Demethylases/metabolism , Humans , Intracellular Signaling Peptides and Proteins/genetics , Intracellular Signaling Peptides and Proteins/metabolism , Jumonji Domain-Containing Histone Demethylases/metabolism , Male , Oxidoreductases, N-Demethylating/metabolism , Promoter Regions, Genetic/drug effects , Prostate/pathology , Protein Serine-Threonine Kinases/genetics , Receptors, Androgen/genetics , Signal Transduction/drug effects , Transcriptional Activation/drug effects , Tumor Necrosis Factor-alpha/metabolism
5.
Genome Med ; 13(1): 2, 2021 01 06.
Article in English | MEDLINE | ID: mdl-33407829

ABSTRACT

BACKGROUND: Identifying essential genes in genome-wide loss-of-function screens is a critical step in functional genomics and cancer target finding. We previously described the Bayesian Analysis of Gene Essentiality (BAGEL) algorithm for accurate classification of gene essentiality from short hairpin RNA and CRISPR/Cas9 genome-wide genetic screens. RESULTS: We introduce an updated version, BAGEL2, which employs an improved model that offers a greater dynamic range of Bayes Factors, enabling detection of tumor suppressor genes; a multi-target correction that reduces false positives from off-target CRISPR guide RNA; and the implementation of a cross-validation strategy that improves performance ~10× over the prior bootstrap resampling approach. We also describe a metric for screen quality at the replicate level and demonstrate how different algorithms handle lower quality data in substantially different ways. CONCLUSIONS: BAGEL2 substantially improves the sensitivity, specificity, and performance over BAGEL and establishes the new state of the art in the analysis of CRISPR knockout fitness screens. BAGEL2 is written in Python 3, and the source code, along with all supporting files, is available on GitHub (https://github.com/hart-lab/bagel).


Subject(s)
Algorithms , CRISPR-Cas Systems/genetics , Genes, Essential , Genetic Testing , Bayes Theorem , Cell Line, Tumor , Data Accuracy , Humans , Likelihood Functions , Regression Analysis
6.
Genome Biol ; 21(1): 262, 2020 10 15.
Article in English | MEDLINE | ID: mdl-33059726

ABSTRACT

BACKGROUND: Pooled library CRISPR/Cas9 knockout screening across hundreds of cell lines has identified genes whose disruption leads to fitness defects, a critical step in identifying candidate cancer targets. However, the number of essential genes detected from these monogenic knockout screens is low compared to the number of constitutively expressed genes in a cell. RESULTS: Through a systematic analysis of screen data in cancer cell lines generated by the Cancer Dependency Map, we observe that half of all constitutively expressed genes are never detected in any CRISPR screen and that these never-essentials are highly enriched for paralogs. We investigated functional buffering among approximately 400 candidate paralog pairs using CRISPR/enCas12a dual-gene knockout screening in three cell lines. We observe 24 synthetic lethal paralog pairs that have escaped detection by monogenic knockout screens at stringent thresholds. Nineteen of 24 (79%) synthetic lethal interactions are present in at least two out of three cell lines and 14 of 24 (58%) are present in all three cell lines tested, including alternate subunits of stable protein complexes as well as functionally redundant enzymes. CONCLUSIONS: Together, these observations strongly suggest that functionally redundant paralogs represent a targetable set of genetic dependencies that are systematically under-represented among cell-essential genes in monogenic CRISPR-based loss of function screens.


Subject(s)
CRISPR-Cas Systems , Genes, Essential , Neoplasms/genetics , A549 Cells , CRISPR-Associated Protein 9 , HT29 Cells , Humans
7.
Nature ; 586(7827): 120-126, 2020 10.
Article in English | MEDLINE | ID: mdl-32968282

ABSTRACT

The genetic circuits that allow cancer cells to evade destruction by the host immune system remain poorly understood [1-3]. Here, to identify a phenotypically robust core set of genes and pathways that enable cancer cells to evade killing mediated by cytotoxic T lymphocytes (CTLs), we performed genome-wide CRISPR screens across a panel of genetically diverse mouse cancer cell lines that were cultured in the presence of CTLs. We identify a core set of 182 genes across these mouse cancer models, the individual perturbation of which increases either the sensitivity or the resistance of cancer cells to CTL-mediated toxicity. Systematic exploration of our dataset using genetic co-similarity reveals the hierarchical and coordinated manner in which genes and pathways act in cancer cells to orchestrate their evasion of CTLs, and shows that discrete functional modules that control the interferon response and tumour necrosis factor (TNF)-induced cytotoxicity are dominant sub-phenotypes. Our data establish a central role for genes that were previously identified as negative regulators of the type-II interferon response (for example, Ptpn2, Socs1 and Adar1) in mediating CTL evasion, and show that the lipid-droplet-related gene Fitm2 is required for maintaining cell fitness after exposure to interferon-γ (IFNγ). In addition, we identify the autophagy pathway as a conserved mediator of the evasion of CTLs by cancer cells, and show that this pathway is required to resist cytotoxicity induced by the cytokines IFNγ and TNF. Through the mapping of cytokine- and CTL-based genetic interactions, together with in vivo CRISPR screens, we show how the pleiotropic effects of autophagy control cancer-cell-intrinsic evasion of killing by CTLs and we highlight the importance of these effects within the tumour microenvironment. Collectively, these data expand our knowledge of the genetic circuits that are involved in the evasion of the immune system by cancer cells, and highlight genetic interactions that contribute to phenotypes associated with escape from killing by CTLs.


Subject(s)
Genome/genetics , Genomics , Neoplasms/genetics , Neoplasms/immunology , T-Lymphocytes, Cytotoxic/immunology , Tumor Escape/genetics , Tumor Escape/immunology , Animals , Autophagy , Cell Line, Tumor , Female , Genes, Neoplasm/genetics , Humans , Interferon-gamma/immunology , Male , Mice , NF-kappa B/metabolism , Reproducibility of Results , Signal Transduction
8.
Bioinformatics ; 36(5): 1584-1589, 2020 03 01.
Article in English | MEDLINE | ID: mdl-31599923

ABSTRACT

MOTIVATION: Owing to advanced DNA sequencing and genome assembly technology, the number of species with sequenced genomes is rapidly increasing. The aim of the recently launched Earth BioGenome Project is to sequence genomes of all eukaryotic species on Earth over the next 10 years, making it feasible to obtain genomic blueprints of the majority of animal and plant species by this time. Genetic models of the sequenced species will later be subject to functional annotation, and a comprehensive molecular network should facilitate functional analysis of individual genes and pathways. However, network databases are lagging behind genome sequencing projects, as even the largest network database provides gene networks for less than 10% of sequenced eukaryotic genomes, and the knowledge gap between genomes and interactomes continues to widen. RESULTS: We present BiomeNet, a database of 95 scored networks comprising over 8 million co-functional links, which can build and analyze gene networks for any species with a sequenced genome. BiomeNet transfers functional interactions between orthologous proteins from source networks to the target species within minutes and automatically constructs gene networks with quality comparable to that of existing networks. BiomeNet enables assembly of first-in-species gene networks not available through other databases, which are highly predictive of diverse biological processes, and can also provide network analysis by extracting subnetworks for individual biological processes and by network-based gene prioritization. These data indicate that BiomeNet could enhance the benefits of decoding the genomes of various species, thus improving our understanding of Earth's biodiversity. AVAILABILITY AND IMPLEMENTATION: BiomeNet is freely available at http://kobic.re.kr/biomenet/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Databases, Genetic , Genome , Animals , Gene Regulatory Networks , Genomics , Sequence Analysis, DNA
9.
Life Sci Alliance ; 2(2), 2019 04.
Article in English | MEDLINE | ID: mdl-30979825

ABSTRACT

Genetic interactions mediate the emergence of phenotype from genotype. The systematic survey of genetic interactions in yeast showed that genes operating in the same biological process have highly correlated genetic interaction profiles, and this observation has been exploited to infer gene function in model organisms. Such assays of digenic perturbations in human cells are also highly informative, but are not scalable, even with CRISPR-mediated methods. As an alternative, we developed an indirect method of deriving functional interactions. We show that genes having correlated knockout fitness profiles across diverse, non-isogenic cell lines are analogous to genes having correlated genetic interaction profiles across isogenic query strains and similarly imply shared biological function. We assembled genes with correlated fitness profiles across 276 high-quality CRISPR knockout screens in cancer cell lines into a "coessentiality network," with up to 500-fold enrichment for co-functional gene pairs, enabling strong inference of gene function and highlighting the modular organization of the cell.


Subject(s)
Gene Knockout Techniques , Gene Regulatory Networks/genetics , Neoplasms/genetics , Neoplasms/pathology , Cell Line, Tumor , Clustered Regularly Interspaced Short Palindromic Repeats/genetics , Databases, Genetic , Genes, Neoplasm/genetics , Genotype , Humans , Phenotype , Protein Biosynthesis , RNA, Small Interfering/genetics , Saccharomyces cerevisiae/genetics , Signal Transduction/genetics
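
The idea in record 9 above, linking genes whose knockout fitness profiles correlate across cell lines, can be illustrated with a minimal sketch. This is not the authors' pipeline: the gene names, the toy score matrix, the function `coessentiality_edges`, and the `min_r` cutoff are all hypothetical, and plain Pearson correlation with a fixed threshold stands in for whatever scoring and filtering the published network used.

```python
import numpy as np


def coessentiality_edges(scores, genes, min_r=0.8):
    """Return (gene_a, gene_b, r) for gene pairs whose fitness profiles
    have a Pearson correlation of at least min_r (hypothetical cutoff)."""
    # Rows are genes, columns are cell lines, so corrcoef is gene-by-gene.
    corr = np.corrcoef(scores)
    edges = []
    for i in range(len(genes)):
        for j in range(i + 1, len(genes)):
            if corr[i, j] >= min_r:
                edges.append((genes[i], genes[j], float(corr[i, j])))
    return edges


# Toy knockout fitness scores (more negative = stronger fitness defect).
# GENE_A and GENE_B are essential in the same cell lines; GENE_C is not.
genes = ["GENE_A", "GENE_B", "GENE_C"]
scores = np.array([
    [-2.0, -0.1, -1.8, 0.0, -2.2],
    [-1.9, 0.0, -2.0, -0.2, -2.1],
    [-0.1, -2.0, 0.1, -1.9, 0.0],
])

edges = coessentiality_edges(scores, genes)
for gene_a, gene_b, r in edges:
    print(f"{gene_a} -- {gene_b}: r = {r:.2f}")
```

In this toy example only the GENE_A/GENE_B pair passes the cutoff, because the two genes are required in the same subset of cell lines.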
10.
Nucleic Acids Res ; 47(D1): D573-D580, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30418591

ABSTRACT

Human gene networks have proven useful in many aspects of disease research, with numerous network-based strategies developed for generating hypotheses about gene-disease-drug associations. The ability to predict and organize genes most relevant to a specific disease has proven especially important. We previously developed a human functional gene network, HumanNet, by integrating diverse types of omics data using a Bayesian statistics framework and demonstrated its ability to retrieve disease genes. Here, we present HumanNet v2 (http://www.inetbio.org/humannet), a database of human gene networks, which was updated by incorporating new data types, extending data sources and improving network inference algorithms. HumanNet now comprises a hierarchy of human gene networks, allowing for more flexible incorporation of network information into studies. HumanNet performs well in ranking disease-linked gene sets with minimal literature-dependent biases. We observe that incorporating model organisms' protein-protein interactions does not markedly improve disease gene predictions, suggesting that many of the disease gene associations are now captured directly in human-derived datasets. With an improved interactive user interface for disease network analysis, we expect HumanNet will be a useful resource for network medicine.


Subject(s)
Databases, Genetic , Gene Regulatory Networks , Algorithms , Disease/genetics , Humans , User-Computer Interface
11.
Cell Syst ; 5(4): 314-316, 2017 10 25.
Article in English | MEDLINE | ID: mdl-29073370

ABSTRACT

Hemizygous deletion of a gene in tumor cells frequently causes reduced expression of its encoded mRNA and protein, as well as reduced protein (but not mRNA) expression of other members in the same protein complex.


Subject(s)
Breast Neoplasms , Humans , RNA, Messenger , Sequence Deletion
12.
Methods Mol Biol ; 1611: 183-198, 2017.
Article in English | MEDLINE | ID: mdl-28451980

ABSTRACT

The mouse, Mus musculus, is a popular model organism for the study of human genes involved in development, immunology, and disease phenotypes. Despite recent revolutions in gene-knockout technologies in mouse, identification of candidate genes for functions of interest can further accelerate the discovery of novel gene functions. The collaborative nature of genetic functions allows for the inference of gene functions based on the principle of guilt-by-association. Genome-scale co-functional networks could therefore provide functional predictions for genes via network analysis. We recently constructed such a network for mouse (MouseNet), which interconnects over 88% of protein-coding genes with 788,080 functional relationships. The companion web server (www.inetbio.org/mousenet) enables researchers with no bioinformatics expertise to generate predictions that facilitate discovery of novel gene functions. In this chapter, we present the theoretical framework for MouseNet, as well as step-by-step instructions and technical tips for functional prediction of genes and pathways in mouse and other model vertebrates.


Subject(s)
Computational Biology/methods , Gene Regulatory Networks/genetics , Software , Vertebrates/genetics , Animals , Databases, Genetic , Mice
13.
Nucleic Acids Res ; 45(D1): D1082-D1089, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27492285

ABSTRACT

Soybean (Glycine max) is a legume crop with substantial economic value, providing a source of oil and protein for humans and livestock. More than 50% of edible oils consumed globally are derived from this crop. Soybean plants are also important for soil fertility, as they fix atmospheric nitrogen by symbiosis with microorganisms. The latest soybean genome annotation (version 2.0) lists 56 044 coding genes, yet their functional contributions to crop traits remain mostly unknown. Co-functional networks have proven useful for identifying genes that are involved in a particular pathway or phenotype with various network algorithms. Here, we present SoyNet (available at www.inetbio.org/soynet), a database of co-functional networks for G. max and a companion web server for network-based functional predictions. SoyNet maps 1 940 284 co-functional links between 40 812 soybean genes (72.8% of the coding genome), which were inferred from 21 distinct types of genomics data including 734 microarrays and 290 RNA-seq samples from soybean. SoyNet provides a new route to functional investigation of the soybean genome, elucidating genes and pathways of agricultural importance.


Subject(s)
Databases, Genetic , Gene Expression Regulation, Plant , Gene Regulatory Networks , Genomics/methods , Glycine max/genetics , Signal Transduction , Evolution, Molecular , Metabolic Networks and Pathways/genetics , Phenotype , Glycine max/metabolism
15.
Nucleic Acids Res ; 45(D1): D389-D396, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27679477

ABSTRACT

The use of high-throughput array and sequencing technologies has produced unprecedented amounts of gene expression data in central public depositories, including the Gene Expression Omnibus (GEO). The immense amount of expression data in GEO provides both vast research opportunities and data analysis challenges. Co-expression analysis of high-dimensional expression data has proven effective for the study of gene functions, and several co-expression databases have been developed. Here, we present a new co-expression database, COEXPEDIA (www.coexpedia.org), which is distinct from other co-expression databases in three aspects: (i) it contains only co-functional co-expressions that passed a rigorous statistical assessment for functional association, (ii) the co-expressions were inferred from individual studies, each of which was designed to investigate gene functions with respect to a particular biomedical context such as a disease, and (iii) the co-expressions are associated with medical subject headings (MeSH) that provide biomedical information for anatomical, disease, and chemical relevance. COEXPEDIA currently contains approximately eight million co-expressions inferred from 384 and 248 GEO series for humans and mice, respectively. We describe how these MeSH-associated co-expressions enable the identification of diseases and drugs previously unknown to be related to a gene or a gene group of interest.


Subject(s)
Computational Biology/methods , Databases, Genetic , Medical Subject Headings , Gene Expression Profiling/methods , Gene Expression Regulation , Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Humans , Software
16.
Sci Rep ; 6: 33038, 2016 09 09.
Article in English | MEDLINE | ID: mdl-27609711

ABSTRACT

Transplantation of mesenchymal stem cells (MSCs) was reported to improve functional outcomes in a rat model of ischemic stroke, and subsequent studies suggest that MSC-derived microvesicles (MVs) can replace the beneficial effects of MSCs. Here, we evaluated three different MSC-derived MVs, including MVs from untreated MSCs (MSC-MVs), MVs from MSCs treated with normal rat brain extract (NBE-MSC-MVs), and MVs from MSCs treated with stroke-injured rat brain extract (SBE-MSC-MVs), and tested their effects on ischemic brain injury induced by permanent middle cerebral artery occlusion (pMCAO) in rats. NBE-MSC-MVs and SBE-MSC-MVs had significantly greater efficacy than MSC-MVs for ameliorating ischemic brain injury with improved functional recovery. We found similar profiles of key signalling proteins in NBE-MSC-MVs and SBE-MSC-MVs, which account for their similar therapeutic efficacies. Immunohistochemical analyses suggest that brain-extract-treated MSC-MVs reduce inflammation, enhance angiogenesis, and increase endogenous neurogenesis in the rat brain. We performed mass spectrometry proteomic analyses and found that the total proteomes of brain-extract-treated MSC-MVs are highly enriched for known vesicular proteins. Notably, MSC-MV proteins upregulated by brain extracts tend to be modular for tissue repair pathways. We suggest that MSC-MV proteins stimulated by the brain microenvironment are paracrine effectors that enhance MSC therapy for stroke injury.


Subject(s)
Brain Ischemia/therapy , Brain , Cell-Derived Microparticles , Complex Mixtures/pharmacology , Mesenchymal Stem Cells , Recovery of Function , Stroke/therapy , Animals , Brain Chemistry , Brain Ischemia/metabolism , Brain Ischemia/pathology , Brain Ischemia/physiopathology , Complex Mixtures/chemistry , Disease Models, Animal , Male , Rats , Rats, Sprague-Dawley , Stroke/pathology , Stroke/physiopathology
17.
Genome Biol ; 17(1): 129, 2016 06 23.
Article in English | MEDLINE | ID: mdl-27333808

ABSTRACT

A major challenge for distinguishing cancer-causing driver mutations from inconsequential passenger mutations is the long tail of infrequently mutated genes in cancer genomes. Here, we present and evaluate a method for prioritizing cancer genes accounting not only for mutations in individual genes but also in their neighbors in functional networks, MUFFINN (MUtations For Functional Impact on Network Neighbors). This pathway-centric method shows high sensitivity compared with gene-centric analyses of mutation data. Notably, only a marginal decrease in performance is observed when using 10% of TCGA patient samples, suggesting the method may potentiate cancer genome projects with small patient populations.


Subject(s)
DNA Mutational Analysis/methods , Neoplasm Proteins/genetics , Neoplasms/genetics , Signal Transduction/genetics , Computational Biology , Databases, Genetic , Genome, Human , Humans , Mutation , Oncogenes/genetics , Software
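
The principle in record 17 above, scoring a gene by its own mutations together with those of its network neighbors, can be sketched as follows. This is a simplified illustration under stated assumptions, not the published MUFFINN scoring: the function `neighbor_aware_scores`, the `mode` options, the gene names, and the counts are hypothetical, and no normalization by gene length or network degree is attempted.

```python
def neighbor_aware_scores(mutations, network, mode="sum"):
    """Score each gene as its own mutation count plus a neighbor term.

    mutations: dict mapping gene -> mutation count (hypothetical data).
    network:   dict mapping gene -> list of neighboring genes.
    mode:      "sum" adds all neighbors' counts; "max" adds only the
               most mutated neighbor (both are illustrative choices).
    """
    scores = {}
    for gene, neighbors in network.items():
        own = mutations.get(gene, 0)
        neighbor_counts = [mutations.get(n, 0) for n in neighbors]
        if mode == "max":
            bonus = max(neighbor_counts, default=0)
        else:
            bonus = sum(neighbor_counts)
        scores[gene] = own + bonus
    return scores


# G2 is rarely mutated itself but sits next to the heavily mutated G1.
mutations = {"G1": 50, "G2": 1, "G3": 0, "G4": 2}
network = {"G1": ["G2"], "G2": ["G1", "G3"], "G3": ["G2"], "G4": []}

scores = neighbor_aware_scores(mutations, network)
for gene in sorted(scores, key=scores.get, reverse=True):
    print(gene, scores[gene])
```

Here G2 outranks G4 even though G4 carries more mutations of its own, which is the kind of long-tail gene a purely gene-centric count would overlook.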
18.
Nucleic Acids Res ; 44(D1): D848-54, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26527726

ABSTRACT

The laboratory mouse, Mus musculus, is one of the most important animal tools in biomedical research. Functional characterization of mouse genes has therefore been a long-standing goal in mammalian and human genetics. Although large-scale knockout phenotyping is in progress through international collaborative efforts, a large portion of the mouse genome is still poorly characterized for cellular functions and associations with disease phenotypes. A genome-scale functional network of mouse genes, MouseNet, was previously developed in the context of the MouseFunc competition, which allowed only limited input data for network inferences. Here, we present an improved mouse co-functional network, MouseNet v2 (available at http://www.inetbio.org/mousenet), which covers 17 714 genes (>88% of the coding genome) with 788 080 links, along with a companion web server for network-assisted functional hypothesis generation. The network database has been substantially improved by a large expansion of genomics data. For example, the MouseNet v2 database contains 183 co-expression networks inferred from 8154 public microarray samples. We demonstrated that MouseNet v2 is predictive for mammalian phenotypes as well as human diseases, which suggests its usefulness in the discovery of novel disease genes and the dissection of disease pathways. Furthermore, the MouseNet v2 database provides functional networks for eight other vertebrate models used in various research fields.


Subject(s)
Databases, Genetic , Gene Regulatory Networks , Mice/genetics , Animals , Cattle , Disease/genetics , Dogs , Genomics , Humans , Phenotype , Rats
19.
Sci Rep ; 5: 17875, 2015 Dec 07.
Article in English | MEDLINE | ID: mdl-26639839

ABSTRACT

The success of clinical genomics using next generation sequencing (NGS) requires the accurate and consistent identification of personal genome variants. Assorted variant calling methods have been developed, which show low concordance between their calls. Hence, a systematic comparison of the variant callers could give important guidance to NGS-based clinical genomics. Recently, a set of high-confidence variant calls for one individual (NA12878) has been published by the Genome in a Bottle (GIAB) consortium, enabling performance benchmarking of different variant calling pipelines. Based on the gold standard reference variant calls from GIAB, we compared the performance of thirteen variant calling pipelines, testing combinations of three read aligners (BWA-MEM, Bowtie2, and Novoalign) and four variant callers (Genome Analysis Tool Kit HaplotypeCaller (GATK-HC), Samtools mpileup, Freebayes, and Ion Proton Variant Caller (TVC)), for twelve data sets for the NA12878 genome sequenced by different platforms including Illumina2000, Illumina2500, and Ion Proton, with various exome capture systems and exome coverage. We observed different biases toward specific types of SNP genotyping errors by the different variant callers. The results of our study provide useful guidelines for reliable variant identification from deep sequencing of personal genomes.


Subject(s)
Exome/genetics , Genetic Variation , Genotyping Techniques/methods , Genotyping Techniques/standards , Base Sequence , High-Throughput Nucleotide Sequencing , Humans , Polymorphism, Single Nucleotide/genetics , Reference Standards
20.
Sci Rep ; 5: 11432, 2015 Jun 12.
Article in English | MEDLINE | ID: mdl-26066708

ABSTRACT

The reconstruction of transcriptional regulatory networks (TRNs) is a long-standing challenge in human genetics. Numerous computational methods have been developed to infer regulatory interactions between human transcription factors (TFs) and target genes from high-throughput data, and their performance evaluation requires gold-standard interactions. Here we present a database of literature-curated human TF-target interactions, TRRUST (transcriptional regulatory relationships unravelled by sentence-based text-mining, http://www.grnpedia.org/trrust), which currently contains 8,015 interactions between 748 TF genes and 1,975 non-TF genes. A sentence-based text-mining approach was employed for efficient manual curation of regulatory interactions from approximately 20 million Medline abstracts. To the best of our knowledge, TRRUST is the largest publicly available database of literature-curated human TF-target interactions to date. TRRUST also has several useful features: i) information about the mode of regulation; ii) tests for target modularity of a query TF; iii) tests for TF cooperativity of a query target; iv) inferences about cooperating TFs of a query TF; and v) prioritization of pathways and diseases associated with a query TF. We observed high enrichment of TF-target pairs in TRRUST for top-scored interactions inferred from high-throughput data, which suggests that TRRUST provides a reliable benchmark for the computational reconstruction of human TRNs.


Subject(s)
Data Mining , Databases, Genetic , Transcription, Genetic , Transcriptome , Data Curation , Humans