1.
Brief Bioinform ; 24(6)2023 09 22.
Article En | MEDLINE | ID: mdl-37779249

To contain infectious diseases, it is crucial to determine the origin and transmission routes of the pathogen, as well as how the virus evolves. With the development of genome sequencing technology, genome epidemiology has emerged as a powerful approach for investigating the source and transmission of pathogens. In this study, we first presented the rationale for genomic tracing of SARS-CoV-2 and the challenges we currently face. Identifying the most genetically similar reference sequence to the query sequence is a critical step in genome tracing, typically achieved using either a phylogenetic tree or a sequence similarity search. However, these methods become inefficient or computationally prohibitive when dealing with tens of millions of sequences in the reference database, as we encountered during the COVID-19 pandemic. To address this challenge, we developed a novel genomic tracing algorithm capable of processing 6 million SARS-CoV-2 sequences in less than a minute. Instead of constructing a giant phylogenetic tree, we devised a weighted scoring system based on mutation characteristics to quantify sequence similarity. The developed method demonstrated superior performance compared to previous methods. Additionally, an online platform was developed to facilitate genomic tracing and visualization of the spatiotemporal distribution of sequences. The method will be a valuable addition to standard epidemiological investigations, enabling more efficient genomic tracing. Furthermore, the computational framework can be easily adapted to other pathogens, paving the way for routine genomic tracing of infectious diseases.


COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , COVID-19/epidemiology , COVID-19/genetics , Phylogeny , Pandemics , Genome, Viral , Genomics/methods
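The idea in the abstract above of ranking reference sequences by a weighted mutation score, rather than building a tree, can be illustrated with a minimal sketch. The abstract does not give the actual weighting scheme, so everything here is an assumption: sequences are represented as sets of mutation labels, the rarity-based weights are hypothetical, and the function names are invented for illustration.

```python
import math
from collections import Counter


def mutation_weights(reference_db):
    """Hypothetical weighting: rarer mutations are more informative.

    reference_db maps a sequence ID to its set of mutation labels.
    """
    n = len(reference_db)
    counts = Counter(m for muts in reference_db.values() for m in muts)
    # A mutation carried by every reference still gets a weight of 1.0.
    return {m: 1.0 - math.log(c / n) for m, c in counts.items()}


def similarity(query, ref, weights, default_weight=1.0):
    """Reward shared mutations and penalize mutations present in only one set."""
    gain = sum(weights.get(m, default_weight) for m in query & ref)
    penalty = sum(weights.get(m, default_weight) for m in query ^ ref)
    return gain - penalty


def closest_references(query, reference_db, top_k=1):
    """Return the IDs of the top_k best-scoring reference sequences."""
    weights = mutation_weights(reference_db)
    ranked = sorted(
        reference_db,
        key=lambda rid: similarity(query, reference_db[rid], weights),
        reverse=True,
    )
    return ranked[:top_k]
```

Because each comparison is a set operation on a handful of mutations, the cost grows linearly with the number of references, which is the general reason a scoring approach can scale where tree construction cannot.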
2.
Comput Struct Biotechnol J ; 21: 3841-3853, 2023.
Article En | MEDLINE | ID: mdl-37564101

Background: Esophageal cancers are primarily categorized as esophageal squamous cell carcinoma (ESCC) and esophageal adenocarcinoma (EAC). While various (epi)genomic alterations associated with tumor development in ESCC and EAC have been documented, a comprehensive comparison of the transcriptomes in these two cancer subtypes remains lacking. Methods: We collected 551 gene expression profiles from publicly available sources, including normal, ESCC, and EAC tissues or cell lines. Subsequently, we conducted a systematic analysis to compare the transcriptomes of these samples at various levels, including gene expression, promoter activity, alternative splicing (AS), alternative polyadenylation (APA), and gene fusion. Results: Seven distinct cluster gene expression patterns were identified among the differentially expressed genes in normal, ESCC, and EAC tissues. These patterns were enriched in the PI3K-Akt signaling pathway and the activation of extracellular matrix organization and exhibited repression of epidermal development. Notably, we observed additional genes or unique expression levels enriched in these shared pathways and biological processes related to tumor development and immune activation. In addition to the differentially expressed genes, there was an enrichment of lncRNA co-expression networks and downregulation of promoter activity associated with the repression of epidermal development in both ESCC and EAC. This indicates a common feature between these two cancer subtypes. Furthermore, differential AS and APA patterns in ESCC and EAC appear to partially affect the expression of host genes associated with bacterial or viral infections in these subtypes. No gene fusions were observed between ESCC and EAC, thus highlighting the distinct molecular mechanisms underlying these two cancer subtypes. Conclusions: We conducted a comprehensive comparison of ESCC and EAC transcriptomes and uncovered shared and distinct transcriptomic signatures at multiple levels. These findings suggest that ESCC and EAC may exhibit common and unique mechanisms involved in tumorigenesis.

3.
Brief Bioinform ; 24(3)2023 05 19.
Article En | MEDLINE | ID: mdl-37170752

Haplotype networks are graphs used to represent evolutionary relationships between a set of taxa and are characterized by intuitiveness in analyzing genealogical relationships of closely related genomes. We here propose a novel algorithm termed McAN that considers mutation spectrum history (mutations in an ancestral haplotype should be contained in its descendant haplotypes), node size (corresponding to sample count for a given node) and sampling time when constructing a haplotype network. We show that McAN is two orders of magnitude faster than state-of-the-art algorithms without losing accuracy, making it suitable for analysis of a large number of sequences. Based on our algorithm, we developed an online web server and offline tool for haplotype network construction, community lineage determination, and interactive network visualization. We demonstrate that McAN is highly suitable for analyzing and visualizing massive genomic data and helps to enhance the understanding of genome evolution. Availability: Source code is written in C/C++ and available at https://github.com/Theory-Lun/McAN and https://ngdc.cncb.ac.cn/biocode/tools/BT007301 under the MIT license. The web server is available at https://ngdc.cncb.ac.cn/bit/hapnet/. The SARS-CoV-2 dataset is available at https://ngdc.cncb.ac.cn/ncov/. Contact: songshh@big.ac.cn (Song S), zhaowm@big.ac.cn (Zhao W), baoym@big.ac.cn (Bao Y), zhangzhang@big.ac.cn (Zhang Z), ybxue@big.ac.cn (Xue Y).


COVID-19 , SARS-CoV-2 , Humans , Haplotypes , SARS-CoV-2/genetics , COVID-19/genetics , Algorithms , Genomics , Software
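The three criteria named in the abstract above (ancestor mutations contained in the descendant, node size, and sampling time) can be illustrated with a small parent-selection sketch. This is not McAN's actual implementation, which is written in C/C++. The `Haplotype` record, the tie-breaking order, and the function names are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Haplotype:
    name: str
    mutations: frozenset  # mutation labels relative to a reference genome
    sample_count: int     # node size
    first_sampled: date   # earliest sampling time


def choose_parent(child, haplotypes):
    """Pick a parent whose mutations are a proper subset of the child's.

    Assumed tie-breaking: fewest extra mutations, then larger node,
    then earlier sampling date.
    """
    candidates = [
        h for h in haplotypes
        if h.mutations < child.mutations
        and h.first_sampled <= child.first_sampled
    ]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda h: (
            len(child.mutations - h.mutations),
            -h.sample_count,
            h.first_sampled,
        ),
    )


def build_network(haplotypes):
    """Map each haplotype name to its chosen parent name (None for roots)."""
    network = {}
    for h in haplotypes:
        parent = choose_parent(h, haplotypes)
        network[h.name] = parent.name if parent is not None else None
    return network
```

This naive version compares every pair of haplotypes, so it is quadratic in the number of nodes and is only meant to show how the three criteria can interact.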
4.
Nucleic Acids Res ; 51(D1): D1249-D1256, 2023 01 06.
Article En | MEDLINE | ID: mdl-36350608

The CRISPR-Cas base editing (BE) system is a powerful tool to expand the scope and efficiency of genome editing with single-nucleotide resolution. The editing efficiency, product purity, and off-target effect differ among various BE systems. Herein, we developed CRISPRbase (http://crisprbase.maolab.org) by integrating 1 252 935 records of base editing outcomes in more than 50 cell types from 17 species. CRISPRbase helps to evaluate the putative editing precision of different BE systems by integrating multiple annotations, functional predictions and a blasting system for single-guide RNA sequences. We systematically assessed the editing window, editing efficiency and product purity of various BE systems. Intensive efforts were focused on increasing the editing efficiency and product purity of base editors since the byproduct could be detrimental in certain applications. Remarkably, more than half of cancer-related off-target mutations were non-synonymous and extremely damaging to protein functions in most common tumor types. Luckily, most of these cancer-related mutations were passenger mutations (4840/5703, 84.87%) rather than cancer driver mutations (863/5703, 15.13%), indicating a weak effect of off-target mutations on carcinogenesis. In summary, CRISPRbase is a powerful and convenient tool to study the outcomes of different base editors and help researchers choose appropriate BE designs for functional studies.


Gene Editing , Neoplasms , Humans , CRISPR-Cas Systems/genetics , Mutation , Neoplasms/genetics
5.
Brief Bioinform ; 23(2)2022 03 10.
Article En | MEDLINE | ID: mdl-35037014

Optimal methods could effectively improve the accuracy of predicting and identifying candidate driver genes. Various computational methods based on mutational frequency, network and function approaches have been developed to identify mutation driver genes in cancer genomes. However, a comprehensive evaluation of the performance levels of network-, function- and frequency-based methods is lacking. In the present study, we assessed and compared eight performance criteria for eight network-based, one function-based and three frequency-based algorithms using eight benchmark datasets. Under different conditions, the performance of approaches varied in terms of network, measurement and sample size. The frequency-based driverMAPS and network-based HotNet2 methods showed the best overall performance. Network-based algorithms using protein-protein interaction networks outperformed the function- and the frequency-based approaches. Precision, F1 score and Matthews correlation coefficient were low for most approaches. Thus, most of these algorithms require stringent cutoffs to correctly distinguish driver and non-driver genes. We constructed a website named Cancer Driver Catalog (http://159.226.67.237/sun/cancer_driver/), wherein we integrated the gene scores predicted by the foregoing software programs. This resource provides valuable guidance for cancer researchers and clinical oncologists prioritizing cancer driver gene candidates by using an optimal tool.


Neoplasms , Oncogenes , Algorithms , Computational Biology/methods , Gene Regulatory Networks , Humans , Mutation , Neoplasms/genetics , Software
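Three of the performance criteria mentioned in the abstract above (precision, F1 score and Matthews correlation coefficient) follow directly from a confusion matrix of predicted versus benchmark driver genes. The sketch below uses the standard textbook definitions. The function name and the zero-division convention (returning 0.0) are choices made here, not taken from the study.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute precision, recall, F1 and MCC from confusion-matrix counts.

    tp/fp: predicted drivers that are / are not benchmark drivers.
    tn/fn: predicted non-drivers that are not / are benchmark drivers.
    Undefined ratios (zero denominators) are reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "mcc": mcc}
```

With many more non-drivers than drivers, a permissive cutoff inflates the false positives and pulls all three values down, which is consistent with the abstract's remark that stringent cutoffs are needed.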
6.
Nucleic Acids Res ; 50(D1): D72-D82, 2022 01 07.
Article En | MEDLINE | ID: mdl-34792166

Rapid advances in high-throughput sequencing technologies have led to the discovery of thousands of extrachromosomal circular DNAs (eccDNAs) in the human genome. Loss-of-function experiments are difficult to conduct on circular and linear chromosomes, as they usually overlap. Hence, it is challenging to interpret the molecular functions of eccDNAs. Here, we present CircleBase (http://circlebase.maolab.org), an integrated resource and analysis platform used to curate and interpret eccDNAs in multiple cell types. CircleBase identifies putative functional eccDNAs by incorporating sequencing datasets, computational predictions, and manual annotations. It classifies them into six sections including targeting genes, epigenetic regulations, regulatory elements, chromatin accessibility, chromatin interactions, and genetic variants. The eccDNA targeting and regulatory networks are displayed by informative visualization tools and then prioritized. Functional enrichment analyses revealed that the top-ranked cancer cell eccDNAs were enriched in oncogenic pathways such as the Ras and PI3K-Akt signaling pathways. In contrast, eccDNAs from healthy individuals were not significantly enriched. CircleBase provides a user-friendly interface for searching, browsing, and analyzing eccDNAs in various cell/tissue types. Thus, it is useful to screen for potential functional eccDNAs and interpret their molecular mechanisms in human cancers and other diseases.


Chromosomes/genetics , DNA, Circular/genetics , Databases, Genetic , Extrachromosomal Inheritance/genetics , Cell Lineage/genetics , Cytoplasm/genetics , Genome, Human/genetics , High-Throughput Nucleotide Sequencing , Humans
8.
Am J Respir Crit Care Med ; 204(12): 1379-1390, 2021 12 15.
Article En | MEDLINE | ID: mdl-34534435

Rationale: Alteration of the human respiratory microbiota has been observed in coronavirus disease (COVID-19). How the microbiota is associated with the prognosis in COVID-19 is unclear. Objectives: To characterize the features and dynamics of the respiratory microbiota and its associations with clinical features in patients with COVID-19. Methods: We conducted metatranscriptome sequencing on 588 longitudinal oropharyngeal swab specimens collected from 192 patients with COVID-19 (including 39 deceased patients) and 95 healthy controls from the same geographic area. Meanwhile, the concentration of 27 cytokines and chemokines in plasma was measured for patients with COVID-19. Measurements and Main Results: The upper respiratory tract (URT) microbiota in patients with COVID-19 differed from that in healthy controls, whereas deceased patients possessed a more distinct microbiota, both on admission and before discharge/death. The alteration of URT microbiota showed a significant correlation with the concentration of proinflammatory cytokines and mortality. Specifically, Streptococcus-dominated microbiota was enriched in recovered patients, and showed high temporal stability and resistance against pathogens. In contrast, the microbiota in deceased patients was more susceptible to secondary infections and became more deviated from the norm after admission. Moreover, the abundance of S. parasanguinis on admission was significantly correlated with prognosis in nonsevere patients (lower vs. higher abundance, odds ratio, 7.80; 95% CI, 1.70-42.05). Conclusions: URT microbiota dysbiosis is a remarkable manifestation of COVID-19; its association with mortality suggests it may reflect the interplay between pathogens, symbionts, and the host immune status. Whether URT microbiota could be used as a biomarker for diagnosis and prognosis of respiratory diseases merits further investigation.


COVID-19/microbiology , COVID-19/mortality , Microbiota , Respiratory Tract Infections/microbiology , Respiratory Tract Infections/mortality , Adult , Aged , COVID-19/epidemiology , Female , Humans , Male , Middle Aged , Prognosis , SARS-CoV-2
9.
Comput Struct Biotechnol J ; 19: 2416-2422, 2021.
Article En | MEDLINE | ID: mdl-34025933

Addiction, a disorder of maladaptive brain plasticity, is associated with changes in the expression of numerous genes. High-throughput sequencing data on addictive substance-induced gene expression have become widely available. A resource for comprehensive annotation of genes that show differential expression in response to commonly abused substances is necessary. We therefore developed AddictGene by integrating gene expression, gene-gene interaction, gene-drug interaction and epigenetic regulatory annotation for over 70,156 items of differentially expressed genes (DEGs) associated with 7 commonly abused substances, including alcohol, nicotine, cocaine, morphine, heroin, methamphetamine, and amphetamine, across three species (human, mouse, rat). We also collected 1,141 addiction-related genes experimentally validated by techniques such as RT-PCR, northern blot and in situ hybridization. The easy-to-use web interface of AddictGene (http://159.226.67.237/sun/addictgedb/) allows users to search and browse multidimensional data on DEGs of interest: 1) detailed gene-specific information extracted from the original studies; 2) basic information about the specific gene extracted from NCBI; 3) SNPs associated with substance dependence and other psychiatric disorders; 4) expression alteration of the specific gene in other psychiatric disorders; 5) expression patterns of the gene of interest across 31 primary and 54 secondary human tissues; 6) functional annotation of the gene of interest; 7) epigenetic regulators involved in the alteration of specific genes, including histone modifications and DNA methylation; 8) protein-protein interaction for functional linkage with the gene of interest; 9) drug-gene interaction for potential druggability. AddictGene offers a valuable repository for researchers to study the molecular mechanisms underlying addiction, and might provide valuable insights into potential therapies for drug abuse and relapse.

10.
Clin Infect Dis ; 71(15): 713-720, 2020 07 28.
Article En | MEDLINE | ID: mdl-32129843

BACKGROUND: A novel coronavirus (CoV), severe acute respiratory syndrome (SARS)-CoV-2, has infected >75 000 individuals and spread to >20 countries. It is still unclear how fast the virus evolved and how it interacts with other microorganisms in the lung. METHODS: We have conducted metatranscriptome sequencing for bronchoalveolar lavage fluid samples from 8 patients with SARS-CoV-2, and also analyzed data from 25 patients with community-acquired pneumonia (CAP), and 20 healthy controls for comparison. RESULTS: The median number of intrahost variants was 1-4 in SARS-CoV-2-infected patients, ranging from 0 to 51 in different samples. The distribution of variants on genes was similar to that observed in the population data. However, very few intrahost variants were observed in the population as polymorphisms, implying either a bottleneck or purifying selection involved in the transmission of the virus, or a consequence of the limited diversity represented in the current polymorphism data. Although current evidence did not support the transmission of intrahost variants in a possible person-to-person spread, the risk should not be overlooked. Microbiotas in SARS-CoV-2-infected patients were similar to those in CAP, either dominated by the pathogens or with elevated levels of oral and upper respiratory commensal bacteria. CONCLUSION: SARS-CoV-2 evolves in vivo after infection, which may affect its virulence, infectivity, and transmissibility. Although how the intrahost variant spreads in the population is still elusive, it is necessary to strengthen the surveillance of the viral evolution in the population and associated clinical changes.


Coronavirus Infections/epidemiology , Coronavirus , Pandemics , Pneumonia, Viral/epidemiology , Severe Acute Respiratory Syndrome , Betacoronavirus , COVID-19 , Genetic Variation , Genomics , Humans , SARS-CoV-2
11.
Transl Psychiatry ; 10(1): 4, 2020 01 15.
Article En | MEDLINE | ID: mdl-32066658

Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder with a male-to-female prevalence of 4:1. However, the genetic mechanisms underlying this gender difference remain unclear. Mutation burden analysis, a TADA model, and co-expression and functional network analyses were performed on de novo mutations (DNMs) and corresponding candidate genes. We found that the prevalence of putative functional DNMs (loss-of-function and predicted deleterious missense mutations) in females was significantly higher than that in males, suggesting that a higher genetic load was required in females to reach the threshold for a diagnosis. We then prioritized 174 candidate genes, including 60 shared genes, 91 male-specific genes, and 23 female-specific genes. All of the three subclasses of candidate genes were significantly more frequently co-expressed in female brains than male brains, suggesting that compensation effects of the deficiency of ASD candidate genes may be more likely in females. Nevertheless, the three subclasses of candidate genes were co-expressed with each other, suggesting a convergent functional network of male and female-specific genes. Our analysis of different aspects of genetic components provides suggestive evidence supporting the female-protective effect in ASD. Moreover, further study is needed to integrate neuronal and hormonal data to elucidate the underlying gender difference in ASD.


Autism Spectrum Disorder , Autism Spectrum Disorder/genetics , Brain , DNA Mutational Analysis , Female , Humans , Male , Sex Characteristics
12.
Mol Psychiatry ; 24(11): 1720-1731, 2019 11.
Article En | MEDLINE | ID: mdl-29875476

The fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) controversially combined previously distinct subcategories of autism spectrum disorder (ASD) into a single diagnostic category. However, genetic convergences and divergences between different ASD subcategories are unclear. By retrieving 1725 exonic de novo mutations (DNMs) from 1628 subjects with autistic disorder (AD), 1873 from 1564 subjects with pervasive developmental disorder not otherwise specified (PDD-NOS), 276 from 247 subjects with Asperger's syndrome (AS), and 2077 from 2299 controls, we found that rates of putative functional DNMs (loss-of-function, predicted deleterious missense, and frameshift) in all three subcategories were significantly higher than those in controls. We then investigated the convergences and divergences of the three ASD subcategories based on four genetic aspects: whether any two ASD subcategories (1) shared significantly more genes with functional DNMs, (2) exhibited similar spatio-temporal expression patterns, (3) shared significantly more candidate genes, and (4) shared some ASD-associated functional pathways. We found that AD and PDD-NOS were broadly convergent in terms of all four genetic aspects, suggesting these two ASD subcategories may be genetically combined. AS was divergent from AD and PDD-NOS for aspects of functional DNMs and expression patterns, whereas AS and AD/PDD-NOS were convergent for aspects of candidate genes and functional pathways. Our results indicated that the three ASD subcategories present more genetic convergences than divergences, favouring DSM-5's new classification. This study suggests that specifically defined genotypes and their corresponding phenotypes should be analyzed jointly for precise diagnosis of complex disorders, such as ASD.


Autism Spectrum Disorder/classification , Autism Spectrum Disorder/genetics , Adolescent , Asperger Syndrome/genetics , Autistic Disorder/genetics , Child , Child Development Disorders, Pervasive/genetics , Child, Preschool , Databases, Genetic , Diagnostic and Statistical Manual of Mental Disorders , Female , Genotype , Humans , Male , Phenotype
13.
Nucleic Acids Res ; 47(D1): D1044-D1055, 2019 01 08.
Article En | MEDLINE | ID: mdl-30445567

Whole-exome and whole-genome sequencing have revealed millions of somatic mutations associated with different human cancers, and the vast majority of them are located outside of coding sequences, making it challenging to directly interpret their functional effects. With the rapid advances in high-throughput sequencing technologies, genome-scale long-range chromatin interactions were detected, and distal target genes of regulatory elements were determined using three-dimensional (3D) chromatin looping. Herein, we present OncoBase (http://www.oncobase.biols.ac.cn/), an integrated database for annotating 81 385 242 somatic mutations in 68 cancer types from more than 120 cancer projects by exploring their roles in distal interactions between target genes and regulatory elements. OncoBase integrates local chromatin signatures, 3D chromatin interactions in different cell types and reconstruction of enhancer-target networks using state-of-the-art algorithms. It employs informative visualization tools to display the integrated local and 3D chromatin signatures and effects of somatic mutations on regulatory elements. Enhancer-promoter interactions estimated from chromatin interactions are integrated into a network diffusion system that quantitatively prioritizes somatic mutations and target genes from a large pool. Thus, OncoBase is a useful resource for the functional annotation of regulatory noncoding regions and systematically benchmarking the regulatory effects of embedded noncoding somatic mutations in human carcinogenesis.


Computational Biology/methods , Databases, Genetic , High-Throughput Nucleotide Sequencing/methods , Mutation , Neoplasms/genetics , Regulatory Sequences, Nucleic Acid/genetics , Base Sequence , Chromatin/genetics , Gene Expression Regulation, Neoplastic , Genomics/methods , Humans , Internet , Quantitative Trait Loci/genetics , Reproducibility of Results
14.
Cell Rep ; 24(8): 2029-2041, 2018 08 21.
Article En | MEDLINE | ID: mdl-30134165

Synaptic cytoskeleton dysfunction represents a common pathogenesis in neurodevelopmental disorders, such as autism spectrum disorder (ASD). The serine/threonine kinase PAK2 is a critical regulator of cytoskeleton dynamics. However, its function within the central nervous system and its role in ASD pathogenesis remain undefined. Here, we found that Pak2 haploinsufficiency resulted in markedly decreased synapse densities, defective long-term potentiation, and autism-related behaviors in mice. Phosphorylation levels of key actin regulators LIMK1 and cofilin, together with their mediated actin polymerization, were reduced in Pak2+/- mice. We identified one de novo PAK2 nonsense mutation that impaired PAK2 function in vitro and in vivo and four de novo copy-number deletions containing PAK2 in large cohorts of patients with ASD. PAK2 deficiency extensively perturbed functional networks associated with ASD by regulating actin cytoskeleton dynamics. Our genetic and functional results demonstrate a critical role of PAK2 in brain development and autism pathogenesis.


Autism Spectrum Disorder/genetics , Chromosome Pairing/genetics , p21-Activated Kinases/genetics , Actins/genetics , Actins/metabolism , Animals , Autism Spectrum Disorder/enzymology , Cytoskeleton/enzymology , Cytoskeleton/genetics , Cytoskeleton/pathology , HEK293 Cells , Haploinsufficiency , Humans , Long-Term Potentiation , Male , Mice , Mutation, Missense , Social Behavior , Stereotyped Behavior , p21-Activated Kinases/metabolism
15.
Nucleic Acids Res ; 46(15): 7793-7804, 2018 09 06.
Article En | MEDLINE | ID: mdl-30060008

With expanding applications of next-generation sequencing in medical genetics, an increasing number of computational methods are being developed to predict the pathogenicity of missense variants. Selecting optimal methods can accelerate the identification of candidate genes. However, the performances of different computational methods under various conditions have not been completely evaluated. Here, we compared 12 performance measures of 23 methods based on three independent benchmark datasets: (i) clinical variants from the ClinVar database related to genetic diseases, (ii) somatic variants from the IARC TP53 and ICGC databases related to human cancers and (iii) experimentally evaluated PPARG variants. Some methods showed different performances under different conditions, suggesting that they were not always applicable for different conditions. Furthermore, the specificities were lower than the sensitivities for most methods (especially for the experimentally evaluated benchmark datasets), suggesting that more rigorous cutoff values are necessary to distinguish pathogenic variants. In addition, REVEL, VEST3 and the combination of both methods (i.e. ReVe) showed the best overall performances with all the benchmark data. Finally, we evaluated the performances of these methods with de novo mutations, finding that ReVe consistently showed the best performance. We have summarized the performances of different methods under various conditions, providing tentative guidance for optimal tool selection.


Computational Biology/methods , Genetic Predisposition to Disease/genetics , Mutation, Missense/genetics , Neoplasms/genetics , PPAR gamma/genetics , Tumor Suppressor Protein p53/genetics , Autistic Disorder/genetics , High-Throughput Nucleotide Sequencing , Humans , Exome Sequencing
16.
Nucleic Acids Res ; 46(D1): D1039-D1048, 2018 01 04.
Article En | MEDLINE | ID: mdl-29112736

A growing number of genomic tools and databases have been developed to facilitate the interpretation of genomic variants, particularly in coding regions. However, these tools are separately available on different websites or databases, making it challenging for general clinicians, geneticists and biologists to obtain first-hand information regarding particular variants and genes of interest. Starting with coding regions and splice sites, we artificially generated all possible single nucleotide variants (n = 110 154 363) and cataloged all reported insertions and deletions (n = 1 223 370). We then annotated these variants with respect to functional consequences from more than 60 genomic data sources to develop a database, named VarCards (http://varcards.biols.ac.cn/), by which users can conveniently search, browse and annotate the variant- and gene-level implications of given variants, including the following information: (i) functional effects; (ii) functional consequences through different in silico algorithms; (iii) allele frequencies in different populations; (iv) disease- and phenotype-related knowledge; (v) general meaningful gene-level information; and (vi) drug-gene interactions. As a case study, we successfully employed VarCards in the interpretation of de novo mutations in autism spectrum disorders. In conclusion, VarCards provides an intuitive interface of necessary information for researchers to prioritize candidate variations and genes.


Databases, Nucleic Acid , Genetic Variation , Genome, Human , Autism Spectrum Disorder/genetics , Gene Frequency , Humans , Mutation , Phenotype , Proteins/genetics , User-Computer Interface
17.
Nucleic Acids Res ; 46(D1): D64-D70, 2018 01 04.
Article En | MEDLINE | ID: mdl-29059379

Circadian rhythms govern various kinds of physiological and behavioral functions of living organisms, and disruptions of the rhythms are highly detrimental to health. Although several databases have been built for circadian genes, a resource for comprehensive post-transcriptional regulatory information of circadian RNAs and expression patterns of disease-related circadian RNAs is still lacking. Here, we developed CirGRDB (http://cirgrdb.biols.ac.cn) by integrating more than 4936 genome-wide assays, with the aim of fulfilling the growing need to understand the rhythms of life. CirGRDB presents a friendly web interface that allows users to search and browse temporal expression patterns of genes of interest in 37 human/mouse tissues or cell lines, and three clinical disorders including sleep disorder, aging and tumor. More importantly, eight kinds of potential transcriptional and post-transcriptional regulators involved in the rhythmic expression of the specific genes, including transcription factors, histone modifications, chromatin accessibility, enhancer RNAs, miRNAs, RNA-binding proteins, RNA editing and RNA methylation, can also be retrieved. Furthermore, a regulatory network can be generated based on the regulatory information. In summary, CirGRDB offers a useful repository for exploring disease-related circadian RNAs, and deciphering the transcriptional and post-transcriptional regulation of circadian rhythms.


Circadian Rhythm/genetics , Databases, Genetic , Animals , CLOCK Proteins/genetics , Circadian Clocks/genetics , Gene Expression Regulation , Gene Regulatory Networks , Genome , Genome-Wide Association Study , Histone Code , Humans , Internet , Mice , RNA/genetics , RNA/metabolism , RNA Editing , RNA Processing, Post-Transcriptional , User-Computer Interface
18.
Mol Psychiatry ; 22(9): 1282-1290, 2017 09.
Article En | MEDLINE | ID: mdl-28831199

Autism spectrum disorder (ASD) represents a set of complex neurodevelopmental disorders with large degrees of heritability and heterogeneity. We sequenced 136 microcephaly or macrocephaly (Mic-Mac)-related genes and 158 possible ASD-risk genes in 536 Chinese ASD probands and detected 22 damaging de novo mutations (DNMs) in 20 genes, including CHD8 and SCN2A, with recurrent events. Nine of the 20 genes were previously reported to harbor DNMs in ASD patients from other populations, while 11 of them were first identified in the present study. We combined genetic variations of the 294 sequenced genes from publicly available whole-exome or whole-genome sequencing studies (4167 probands plus 1786 controls) with our Chinese population (536 cases plus 1457 controls) to optimize the power of candidate-gene prioritization. As a result, we prioritized 67 ASD-candidate genes that exhibited significantly higher probabilities of haploinsufficiency and genic intolerance, and significantly interacted and co-expressed with each other, as well as with other known ASD-risk genes. Probands with DNMs or rare inherited mutations in the 67 candidate genes exhibited significantly lower intelligence quotients, supporting their strong functional impact. In addition, we prioritized 39 ASD-related Mic-Mac-risk genes, and showed their interaction and co-expression in a functional network that converged on chromatin remodeling, synapse transmission and cell cycle progression. Genes within the three functional subnetworks exhibited distinct and recognizable spatiotemporal-expression patterns in human brains and laminar-expression profiles in the developing neocortex, highlighting their important roles in brain development. Our results indicate that some Mic-Mac-risk genes are involved in ASD.


Autism Spectrum Disorder/genetics , Megalencephaly/genetics , Microcephaly/genetics , Autism Spectrum Disorder/metabolism , Brain/anatomy & histology , Brain/metabolism , Case-Control Studies , China , DNA-Binding Proteins/genetics , Exome , Female , Gene Regulatory Networks/genetics , Genetic Predisposition to Disease/genetics , Humans , Male , Mutation , Risk Factors , Transcription Factors/genetics
19.
Am J Med Genet B Neuropsychiatr Genet ; 174(5): 568-577, 2017 Jul.
Article En | MEDLINE | ID: mdl-28407358

Vitamin D deficiency is a putative environmental risk factor for autism spectrum disorder (ASD). In addition, de novo mutations (DNMs) play essential roles in ASD. However, it remains unclear whether vitamin D-related genes (VDRGs) carry a strong DNM burden. For the 943 reported VDRGs, we analyzed publicly available DNMs from 4,327 ASD probands and 3,191 controls. We identified 126 and 44 loss-of-function or deleterious missense mutations in the probands and the controls, respectively, representing a significantly higher DNM burden (p = 1.06 × 10^-5; odds ratio = 2.11). Specifically, 18 of the VDRGs were found to harbor recurrent functional DNMs in the probands, compared with only one in the controls. In addition, we found that 108 VDRGs with functional DNMs in the probands were significantly more likely to exhibit haploinsufficiency and genic intolerance (p < 0.0078). These VDRGs were also significantly interconnected and co-expressed with each other and with other known ASD-risk genes (p < 0.0014), thereby forming a functional network enriched in chromatin modification, transcriptional regulation, and neuronal function. We provide, for the first time, straightforward genetic evidence that VDRGs carry a strong DNM burden in ASD and that DNMs in VDRGs could be involved in the mechanisms underlying ASD pathogenesis.
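The burden comparison in the abstract above can be checked with simple arithmetic on the reported counts (126 mutations in 4,327 probands versus 44 in 3,191 controls). The abstract does not say how its statistic was computed, so the sketch below shows two common ways of comparing the groups. The function names are invented here, and the 2x2 version rests on the assumption, not stated in the source, that each mutation belongs to a distinct subject.

```python
def rate_ratio(events_a, n_a, events_b, n_b):
    """Ratio of per-subject event rates between group A and group B."""
    return (events_a / n_a) / (events_b / n_b)


def odds_ratio_2x2(events_a, n_a, events_b, n_b):
    """Odds ratio from a 2x2 table of carriers versus non-carriers.

    Assumption: every event is carried by a different subject, so the
    number of carriers equals the number of events.
    """
    return (events_a * (n_b - events_b)) / (events_b * (n_a - events_a))


# Counts taken from the abstract.
proband_dnms, probands = 126, 4327
control_dnms, controls = 44, 3191

burden_ratio = rate_ratio(proband_dnms, probands, control_dnms, controls)
carrier_odds_ratio = odds_ratio_2x2(
    proband_dnms, probands, control_dnms, controls
)
```

On these counts the per-subject rate ratio rounds to 2.11, which matches the reported figure, while the carrier-based 2x2 odds ratio comes out slightly higher at about 2.15. Either way, functional DNMs in VDRGs are roughly twice as frequent in probands as in controls.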

...