Results 1 - 11 of 11
1.
Mol Biol Evol ; 41(3), 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38377343

ABSTRACT

Cis-regulatory elements have an important role in human adaptation to the living environment. However, the lag in population genomic cohort studies and epigenomic studies hinders research on the adaptive selection of cis-regulatory elements in human populations. In this study, we collected data from 4,013 unrelated individuals and performed a comprehensive analysis of adaptive selection of genome-wide cis-regulatory elements in the Han Chinese. In total, 12.34% of genomic regions are under the influence of adaptive selection, where 1.00% of enhancers and 2.06% of promoters are under positive selection, and 0.06% of enhancers and 0.02% of promoters are under balancing selection. Gene ontology enrichment analysis of these cis-regulatory elements under adaptive selection reveals that many of the positively selected elements in the Han Chinese occur in pathways involved in cell-cell adhesion processes, and many of those under balancing selection are related to immune processes. Two classes of adaptive cis-regulatory elements related to cell adhesion were analyzed in depth. One comprises the adaptive enhancers derived from Neanderthal introgression, which lead to lower hyaluronidase levels in skin and confer better UV-radiation resistance on the Han Chinese. The other comprises the cis-regulatory elements regulating wound healing; the results suggest that positive selection inhibits coagulation and promotes angiogenesis and wound healing in the Han Chinese. Finally, we found that many pathogenic alleles, such as risk alleles for type 2 diabetes or schizophrenia, remain in the population due to the hitchhiking effect of positive selection. Our findings will help deepen our understanding of the adaptive evolution of genome regulation in the Han Chinese.


Subjects
Diabetes Mellitus, Type 2 , Neanderthals , Humans , Animals , Diabetes Mellitus, Type 2/genetics , Selection, Genetic , Regulatory Sequences, Nucleic Acid , Promoter Regions, Genetic , Neanderthals/genetics , China , Enhancer Elements, Genetic
2.
Sci Bull (Beijing) ; 68(20): 2391-2404, 2023 10 30.
Article in English | MEDLINE | ID: mdl-37661541

ABSTRACT

Characterizing natural selection signatures and relationships with phenotype spectra is important for understanding human evolution and both biological and pathological mechanisms. Here, we identified 24 genetic loci under recent selection by analyzing rare singletons in 3946 high-depth whole-genome sequences of Han Chinese. The loci include immune-related gene regions (MHC cluster, IGH cluster, STING1, and PSG), alcohol metabolism-related gene regions (ADH1B, ALDH2, and ALDH3B2), and the olfactory perception gene OR4C16, of which the MHC cluster, ADH1B, and ALDH2 were also identified by TOPMed and WestLake Biobank. Among the signals, the IGH cluster is particularly interesting: the favored allele of variant 14_105737776_C_T (rs117518546, IgG1-G396R) promotes immune response, but also increases the risk of the autoimmune disease systemic lupus erythematosus (SLE). It is also surprising that our newly discovered ALDH3B2 evolved in the opposite direction to ALDH2 for alcohol metabolism. Besides monogenic traits, we found that multiple complex traits experienced polygenic adaptation. In particular, multiple methods consistently revealed that lower blood pressure was favored in natural selection. Finally, we built a database named RePoS (recent positive selection, http://bigdata.ibp.ac.cn/RePoS/) to integrate and display multi-population selection signals. Our study extends our understanding of natural evolution and phenotype adaptation in Han Chinese as well as other populations.


Subjects
East Asian People , Selection, Genetic , Humans , Aldehyde Dehydrogenase, Mitochondrial/genetics , East Asian People/genetics , Phenotype , Aldehyde Oxidoreductases/genetics
3.
Nat Commun ; 14(1): 2092, 2023 04 12.
Article in English | MEDLINE | ID: mdl-37045857

ABSTRACT

Short tandem repeats (STRs) are abundant and highly mutagenic in the human genome. Many STR loci have been associated with a range of human genetic disorders. However, most population-scale studies on STR variation in humans have focused on European ancestry cohorts or are limited by sequencing depth. Here, we depicted a comprehensive map of 366,013 polymorphic STRs (pSTRs) constructed from 6487 deeply sequenced genomes, comprising 3983 Chinese samples (~31.5x, NyuWa) and 2504 samples from the 1000 Genomes Project (~33.3x, 1KGP). We found that STR mutations were affected by motif length, chromosome context, and epigenetic features. We identified 3273 and 1117 pSTRs whose repeat numbers were associated with gene expression and 3'UTR alternative polyadenylation, respectively. We also performed population analysis, investigated population-differentiated signatures, and genotyped 60 known disease-causing STRs. Overall, this study further extends the scale of STR variation in humans and propels our understanding of the semantics of STRs.


Subjects
Genome, Human , Microsatellite Repeats , Humans , Genome, Human/genetics , Genotype , Mutation , Microsatellite Repeats/genetics , Mutagenesis
4.
Nucleic Acids Res ; 50(D1): D265-D272, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34871445

ABSTRACT

Piwi-interacting RNAs (piRNAs) are a type of small noncoding RNA with various functions. piRBase is a manually curated resource focused on assisting piRNA functional analysis. piRBase release v3.0 is committed to providing more comprehensive piRNA-related information. The latest release covers >181 million unique piRNA sequences, including 440 datasets from 44 species. More disease-related piRNAs and piRNA targets have been collected and displayed. The regulatory relationships between piRNAs and targets have been visualized. In addition to reusing and expanding the content of the previous version, the latest version adds new content, including gold standard piRNA sets, piRNA clusters, piRNA variants, splicing-junction piRNAs, and piRNA expression data. In addition, the entire web interface has been redesigned to provide a better experience for users. piRBase release v3.0 is free to access, browse, search, and download at http://bigdata.ibp.ac.cn/piRBase.


Subjects
Databases, Nucleic Acid , Genome , RNA, Small Interfering/genetics , User-Computer Interface , Animals , Datasets as Topic , Humans , Internet , Molecular Sequence Annotation , Multigene Family , RNA Splicing , RNA, Small Interfering/classification , RNA, Small Interfering/metabolism
5.
Cell Rep ; 37(7): 110017, 2021 11 16.
Article in English | MEDLINE | ID: mdl-34788621

ABSTRACT

The lack of haplotype reference panels and whole-genome sequencing resources specific to the Chinese population has greatly hindered genetic studies in the world's largest population. Here, we present the NyuWa genome resource, based on deep (26.2×) sequencing of 2,999 Chinese individuals, and construct a NyuWa reference panel of 5,804 haplotypes and 19.3 million variants, which is a high-quality publicly available Chinese population-specific reference panel with thousands of samples. Compared with other panels, the NyuWa reference panel reduces the Han Chinese imputation error rate by a margin ranging from 30% to 51%. Population structure and imputation simulation tests support the applicability of one integrated reference panel for northern and southern Chinese. In addition, a total of 22,504 loss-of-function variants in coding and noncoding genes are identified, including 11,493 novel variants. These results highlight the value of the NyuWa genome resource in facilitating genetic research in Chinese and Asian populations.


Subjects
Asian People/genetics , Genome/genetics , Genomics/methods , Alleles , China , Databases, Genetic , Gene Frequency/genetics , Genome, Human/genetics , Genome-Wide Association Study/methods , Genotype , Haplotypes/genetics , High-Throughput Nucleotide Sequencing/methods , Humans , Polymorphism, Single Nucleotide , Reference Standards , Whole Genome Sequencing/standards
6.
Genomics Proteomics Bioinformatics ; 19(4): 602-610, 2021 08.
Article in English | MEDLINE | ID: mdl-34536568

ABSTRACT

Small proteins specifically refer to proteins consisting of fewer than 100 amino acids translated from small open reading frames (sORFs), which were usually missed in previous genome annotation. The significance of small proteins has been revealed in recent years, along with the discovery of their diverse functions. However, systematic annotation of small proteins is still insufficient. SmProt was specially developed to provide valuable information on small proteins for the scientific community. Here we present the update of SmProt, which emphasizes reliability of translated sORFs, genetic variants in translated sORFs, disease-specific sORF translation events or sequences, and a remarkably increased data volume. More components such as non-ATG translation initiation, function, and new sources are also included. SmProt incorporates 638,958 unique small proteins curated from 3,165,229 primary records, which were computationally predicted from 419 ribosome profiling (Ribo-seq) datasets or collected from literature and other sources from 370 cell lines or tissues in 8 species (Homo sapiens, Mus musculus, Rattus norvegicus, Drosophila melanogaster, Danio rerio, Saccharomyces cerevisiae, Caenorhabditis elegans, and Escherichia coli). In addition, small protein families identified from human microbiomes were also collected. All datasets in SmProt are free to access and available for browsing, searching, and bulk download at http://bigdata.ibp.ac.cn/SmProt/.


Subjects
Drosophila melanogaster , Ribosomes , Animals , Drosophila melanogaster/genetics , Mice , Molecular Sequence Annotation , Open Reading Frames , Proteins/metabolism , Rats , Reproducibility of Results , Ribosomes/genetics , Ribosomes/metabolism
7.
Nucleic Acids Res ; 49(D1): D165-D171, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33196801

ABSTRACT

NONCODE (http://www.noncode.org/) is a comprehensive database for the collection and annotation of noncoding RNAs, especially long non-coding RNAs (lncRNAs) in animals. NONCODEV6 is dedicated to providing the full scope of lncRNAs across plants and animals. The number of lncRNAs in NONCODEV6 has increased from 548 640 to 644 510 since the last update in 2017. The number of human lncRNAs has increased from 172 216 to 173 112. The number of mouse lncRNAs has increased from 131 697 to 131 974. The number of plant lncRNAs is 94 697. The relationships between human lncRNAs and cancer were updated with transcriptome sequencing profiles. Three important new features were also introduced in NONCODEV6: (i) updated human lncRNA-disease relationships, especially for cancer; (ii) lncRNA annotations with tissue expression profiles and predicted function in five common plants; (iii) lncRNA conservation annotation at the transcript level for 23 plant species. NONCODEV6 is accessible through http://www.noncode.org/.


Subjects
Databases, Nucleic Acid , Neoplasms/genetics , RNA, Long Noncoding/genetics , RNA, Messenger/genetics , Software , Transcriptome , Animals , Base Sequence , Conserved Sequence , Exons , Gene Expression Profiling , Humans , Internet , Mice , Molecular Sequence Annotation , Neoplasms/classification , Neoplasms/metabolism , Neoplasms/pathology , Plants/genetics , RNA, Long Noncoding/classification , RNA, Long Noncoding/metabolism , RNA, Messenger/classification , RNA, Messenger/metabolism
8.
Sci China Life Sci ; 58(7): 687-93, 2015 Jul.
Article in English | MEDLINE | ID: mdl-26100010

ABSTRACT

Influenza virus can rapidly change its antigenicity, via mutation in the hemagglutinin (HA) protein, to evade host immunity. The emergence of the novel human-infecting avian H7N9 virus in China has caused widespread concern. However, evolution of the antigenicity of this virus is not well understood. Here, we inferred the antigenic epitopes of the HA protein from all H7 viruses, based on the five well-characterized HA epitopes of the human H3N2 virus. By comparing the two major H7 phylogenetic lineages, i.e., the Eurasian lineage and the North American lineage, we found that epitopes A and B are more frequently mutated in the Eurasian lineage, while epitopes B and C are more frequently mutated in the North American lineage. Furthermore, we found that the novel H7N9 virus (derived from the Eurasian lineage) isolated in China in 2013 contains six frequently mutated sites on epitopes, including site 135, which is located in the receptor binding domain. This indicates that the novel H7N9 virus that infects humans may already have been subjected to gradual immune pressure and receptor-binding variation. Our results not only provide insights into the antigenic evolution of the H7 virus but may also help in the selection of suitable vaccine strains.


Subjects
Antigens, Viral/immunology , Computational Biology , Epitopes/immunology , Influenza A Virus, H7N9 Subtype/immunology , Hemagglutinin Glycoproteins, Influenza Virus/genetics , Humans , Mutation
9.
Bioinformatics ; 30(17): 2440-6, 2014 Sep 01.
Article in English | MEDLINE | ID: mdl-24813541

ABSTRACT

MOTIVATION: Protein domains are fundamental units of protein structure, function and evolution; thus, it is critical to gain a deep understanding of protein domain organization. Previous work has attempted to identify key residues involved in the organization of domain architecture. Because one of the most important characteristics of domain architecture is the arrangement of secondary structure elements (SSEs), here we present a picture of domain organization through an integrated consideration of SSE arrangements and residue contact networks. RESULTS: In this work, by representing SSEs as main-chain scaffolds and side-chain interfaces and through construction of residue contact networks, we have identified the SSE interfaces well packed within protein domains as SSE packing clusters. In total, 17 334 SSE packing clusters were recognized from 9015 Structural Classification of Proteins domains of <40% sequence identity. Similar SSE packing clusters were observed not only among domains of the same fold, but also among domains of different folds, indicating their roles as common scaffolds for organization of protein domains. Further analysis of 14 small single-domain proteins reveals a high correlation between the SSE packing clusters and the folding nuclei. Consistent with their important roles in domain organization, SSE packing clusters were found to be more conserved than other regions within the same proteins. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Protein Structure, Secondary , Protein Structure, Tertiary , Computational Biology/methods , Humans , Models, Molecular , Protein Kinases/chemistry
11.
PLoS One ; 9(2): e89935, 2014.
Article in English | MEDLINE | ID: mdl-24587135

ABSTRACT

Many template-based modeling (TBM) methods have been developed in recent years that allow for protein structure prediction and for the study of structure-function relationships in proteins. One major problem all TBM algorithms face, however, is their unsatisfactory performance when the proteins under consideration are low-homology targets. To improve the performance of TBM methods for such targets, a novel model evaluation method, named MEFTop, was developed here. The method focuses on evaluating topology by using two novel groups of features: secondary structure element (SSE) contact information and 3-dimensional topology information. By combining the MEFTop algorithm with FR-t5, a threading program developed by our group, we found that the modified TBM program, named FR-t5-M, exhibited significant improvements in predictive ability for low-homology protein targets. We further showed that MEFTop could be a generalized method to improve threading programs for low-homology protein targets. The software (FR-t5-M and MEFTop) is available to non-commercial users at our website: http://jianglab.ibp.ac.cn/lims/FRt5M/FRt5M.html.


Subjects
Algorithms , Models, Molecular , Proteins/chemistry , Software , Structural Homology, Protein , Protein Conformation