Results 1 - 20 of 116
1.
Nucleic Acids Res ; 52(D1): D1315-D1326, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37870452

ABSTRACT

Human endogenous retroviruses (HERVs), remnants of ancient exogenous retroviruses that infected and integrated into germ cells, comprise ∼8% of the human genome. HERVs have been implicated in numerous diseases, and extensive research has been conducted to uncover their specific roles. Despite these efforts, a comprehensive source of HERV-disease associations is still lacking. To address this gap, we introduce the HervD Atlas (https://ngdc.cncb.ac.cn/hervd/), an integrated knowledgebase of HERV-disease associations manually curated from all related published literature. In the current version, HervD Atlas collects 60 726 HERV-disease associations from 254 publications (out of 4692 screened), covering 21 790 HERVs (21 049 HERV-Terms and 741 HERV-Elements) belonging to six types, 149 diseases and 610 related/affected genes. Notably, an interactive knowledge graph integrates all the HERV-disease associations and corresponding affected genes into a comprehensive network, providing a powerful tool to uncover and deduce the complex interplay between HERVs and diseases. The HervD Atlas also features a user-friendly web interface that allows efficient browsing, searching, and downloading of all association information, research metadata, and annotation information. Overall, the HervD Atlas is an essential resource for comprehensive, up-to-date knowledge on HERV-disease research, potentially facilitating the development of novel HERV-associated diagnostic and therapeutic strategies.


Subjects
Endogenous Retroviruses , Knowledge Bases , Virus Diseases , Humans , Virus Diseases/genetics , Virus Diseases/virology , Atlases as Topic , Internet Use
2.
Nucleic Acids Res ; 52(D1): D1651-D1660, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37843152

ABSTRACT

Tropical crops are vital for tropical agriculture, with resource scarcity, functional diversity and extensive market demand, providing considerable economic benefits for the world's tropical agriculture-producing countries. The rapid development of sequencing technology has marked a milestone in tropical crop research, generating massive amounts of data that urgently need an effective platform for integration and sharing. However, existing databases cannot fully satisfy researchers' requirements because of their relatively limited integration and infrequent updates. Here, we present the Tropical Crop Omics Database (TCOD, https://ngdc.cncb.ac.cn/tcod), a comprehensive multi-omics data platform for tropical crops. TCOD integrates diverse omics data from 15 species, encompassing 34 chromosome-level de novo assemblies, 1 255 004 genes with functional annotations, 282 436 992 unique variants from 2048 WGS samples, 88 transcriptomic profiles from 1997 RNA-Seq samples and 13 381 germplasm items. Additionally, TCOD not only employs genes as a bridge to interconnect multi-omics data, enabling cross-species comparisons based on homology relationships, but also offers user-friendly online tools for efficient data mining and visualization. In short, TCOD integrates multi-species, multi-omics data and online tools, which will facilitate research on genomic selective breeding and trait biology of tropical crops.


Subjects
Agricultural Crops , Genetic Databases , Agricultural Crops/genetics , Transcriptome , Plant Genome
3.
Nucleic Acids Res ; 51(D1): D186-D191, 2023 Jan 06.
Article in English | MEDLINE | ID: mdl-36330950

ABSTRACT

LncBook, a comprehensive resource of human long non-coding RNAs (lncRNAs), has been used in a wide range of lncRNA studies across various biological contexts. Here, we present LncBook 2.0 (https://ngdc.cncb.ac.cn/lncbook), with significant updates and enhancements as follows: (i) incorporation of 119 722 new transcripts, 9632 new genes, and gene structure update of 21 305 lncRNAs; (ii) characterization of conservation features of human lncRNA genes across 40 vertebrates; (iii) integration of lncRNA-encoded small proteins; (iv) enrichment of expression and DNA methylation profiles with more biological contexts and (v) identification of lncRNA-protein interactions and improved prediction of lncRNA-miRNA interactions. Collectively, LncBook 2.0 accommodates a high-quality collection of 95 243 lncRNA genes and 323 950 transcripts and incorporates their abundant annotations at different omics levels, thereby enabling users to decipher functional significance of lncRNAs in different biological contexts.


Subjects
Molecular Sequence Annotation , Multiomics , Long Noncoding RNA , Animals , Humans , MicroRNAs/genetics , Long Noncoding RNA/metabolism
4.
Nucleic Acids Res ; 51(D1): D767-D776, 2023 Jan 06.
Article in English | MEDLINE | ID: mdl-36169225

ABSTRACT

Compared with conventional comparative genomics, recent studies in pan-genomics have provided further insights into species genomic dynamics, taxonomy and identification, pathogenicity and environmental adaptation. To better understand the genome characteristics of species of interest and to fully mine key metabolic and resistance genes along with their conservation and variation, here we present ProPan (https://ngdc.cncb.ac.cn/propan), a public database covering 23 archaeal species and 1,481 bacterial species (51,882 strains in total) for comprehensively profiling prokaryotic pan-genome dynamics. By analyzing and integrating these massive datasets, ProPan addresses three major aspects of the pan-genome dynamics of the species of interest: 1) evaluation of various species' characteristics and composition in pan-genome dynamics; 2) visualization of map association, functional annotation and presence/absence variation for the gene clusters of all included species; 3) typical characteristics of environmental adaptation, including prediction of resistance genes for 126 substances (biocides, antimicrobial drugs and metals) and evaluation of 31 metabolic cycle processes. In addition, ProPan provides a user-friendly interface, flexible retrieval and multi-level real-time statistical visualization. Taken together, ProPan will serve as a valuable resource for studies of prokaryotic pan-genome dynamics, taxonomy and identification as well as environmental adaptation.


Subjects
Genetic Databases , Genome , Prokaryotic Cells , Archaea/genetics , Bacteria/genetics , Bacterial Genome , Genomics
5.
Nucleic Acids Res ; 51(D1): D853-D860, 2023 Jan 06.
Article in English | MEDLINE | ID: mdl-36161321

ABSTRACT

Single-cell studies have delineated cellular diversity and uncovered increasing numbers of previously uncharacterized cell types in complex tissues. Thus, synthesizing growing knowledge of cellular characteristics is critical for dissecting cellular heterogeneity, developmental processes and tumorigenesis at single-cell resolution. Here, we present Cell Taxonomy (https://ngdc.cncb.ac.cn/celltaxonomy), a comprehensive and curated repository of cell types and associated cell markers encompassing a wide range of species, tissues and conditions. Combined with literature curation and data integration, the current version of Cell Taxonomy establishes a well-structured taxonomy for 3,143 cell types and houses a comprehensive collection of 26,613 associated cell markers in 257 conditions and 387 tissues across 34 species. Based on 4,299 publications and single-cell transcriptomic profiles of ∼3.5 million cells, Cell Taxonomy features multifaceted characterization for cell types and cell markers, involving quality assessment of cell markers and cell clusters, cross-species comparison, cell composition of tissues and cellular similarity based on markers. Taken together, Cell Taxonomy represents a fundamentally useful reference to systematically and accurately characterize cell types and thus lays an important foundation for deeply understanding and exploring cellular biology in diverse species.

6.
Nucleic Acids Res ; 51(D1): D994-D1002, 2023 Jan 06.
Article in English | MEDLINE | ID: mdl-36318261

ABSTRACT

Homology is fundamental to inferring genes' evolutionary processes and relationships of shared ancestry. Existing homologous gene resources vary in inference methods, homologous relationships and identifiers, which makes it difficult to choose among homology results and to map them from one resource to another. Here, we present HGD (Homologous Gene Database, https://ngdc.cncb.ac.cn/hgd), a comprehensive homolog resource integrating multi-species, multi-resources and multi-omics, as a complement to existing resources providing public and one-stop data service. Currently, HGD houses a total of 112 383 644 homologous pairs for 37 species, including 19 animals, 16 plants and 2 microorganisms. Meanwhile, HGD integrates various annotations from public resources, including 16 909 homologs with traits, 276 670 homologs with variants, 398 573 homologs with expression and 536 852 homologs with gene ontology (GO) annotations. HGD provides a wide range of omics gene function annotations to help users gain a deeper understanding of gene function.


Subjects
Genetic Databases , Animals , Molecular Sequence Annotation
7.
Nucleic Acids Res ; 51(D1): D1179-D1187, 2023 Jan 06.
Article in English | MEDLINE | ID: mdl-36243959

ABSTRACT

Transcriptome-wide association studies (TWASs), as a practical and prevalent approach for detecting the associations between genetically regulated genes and traits, are now leading to a better understanding of the complex mechanisms of genetic variants in regulating various diseases and traits. Despite the ever-increasing TWAS outputs, there is still a lack of databases curating massive public TWAS information and knowledge. To fill this gap, here we present TWAS Atlas (https://ngdc.cncb.ac.cn/twas/), an integrated knowledgebase of TWAS findings manually curated from extensive literature. In the current implementation, TWAS Atlas collects 401,266 high-quality human gene-trait associations from 200 publications, covering 22,247 genes and 257 traits across 135 tissue types. In particular, an interactive knowledge graph of the collected gene-trait associations is constructed together with single nucleotide polymorphism (SNP)-gene associations to build up comprehensive regulatory networks at multi-omics levels. In addition, TWAS Atlas, as a user-friendly web interface, efficiently enables users to browse, search and download all association information, relevant research metadata and annotation information of interest. Taken together, TWAS Atlas is of great value for promoting the utility and availability of TWAS results in explaining the complex genetic basis as well as providing new insights for human health and disease research.


Subjects
Quantitative Trait Loci , Transcriptome , Humans , Transcriptome/genetics , Genome-Wide Association Study/methods , Phenotype , Knowledge Bases , Single Nucleotide Polymorphism , Genetic Predisposition to Disease
8.
Brief Bioinform ; 23(5), 2022 Sep 20.
Article in English | MEDLINE | ID: mdl-36088550

ABSTRACT

Somatic variants act as critical players during cancer occurrence and development. Thus, an accurate and robust method to identify them is the foundation of cutting-edge cancer genome research. However, due to the low accessibility and high individual-/sample-specificity of somatic variants in tumor samples, their detection remains challenging, particularly when paired normal samples are not available as controls. To address this issue, we developed a tumor-only somatic and germline variant identification method (TSomVar) using the random forest algorithm, built on sample-specific variant datasets derived from genotype imputation, reads-mapping level annotation and functional annotation. We trained TSomVar using genomic variant datasets of three major cancer types: colorectal cancer, hepatocellular carcinoma and skin cutaneous melanoma. Compared with existing tumor-only somatic variant identification tools, TSomVar performs well in somatic variant detection, with higher accuracy and better recall on test datasets from colorectal cancer and skin cutaneous melanoma. In addition, TSomVar can accurately identify germline variants in tumor samples. Taken together, TSomVar will facilitate somatic variant exploration in cancer research.


Subjects
Colorectal Neoplasms , Melanoma , Neoplasms , Skin Neoplasms , High-Throughput Nucleotide Sequencing/methods , Humans , Melanoma/genetics , Neoplasms/genetics , Skin Neoplasms/genetics , Cutaneous Malignant Melanoma
9.
Curr Microbiol ; 81(5): 122, 2024 Mar 26.
Article in English | MEDLINE | ID: mdl-38530471

ABSTRACT

The chromosome structure of each bacterium has a unique organization pattern, which plays an important role in maintaining the spatial relationships between genes and regulating gene expression. Conversely, transcription also plays a global role in regulating the three-dimensional structure of bacterial chromosomes. Therefore, we combined RNA-Seq and Hi-C technology to explore the relationship between chromosome structure changes and transcriptional regulation in E. coli at different growth stages. Transcriptome analysis indicates that E. coli synthesizes many ribosomes and peptidoglycan in the exponential phase. In contrast, E. coli undergoes more transcriptional regulation and catabolism during the stationary phase, reflecting its adaptability to changes in environmental conditions during growth. Analysis of the Hi-C data shows that E. coli has a higher frequency of global chromosomal interaction and more defined chromosomal interaction domains (CIDs) in the exponential phase. Still, the long-distance interactions at the replication termination region are lower than in the stationary phase. Combining transcriptome and Hi-C data analysis, we conclude that highly expressed genes are more likely to be distributed in CID boundary regions during the exponential phase. At the same time, most highly expressed genes distributed in the CID boundary regions are ribosomal gene clusters, forming clearer CID boundaries during the exponential phase. Both the three-dimensional structure of the chromosome and the expression pattern are altered as E. coli grows from the exponential phase to the stationary phase, clarifying the synergy between the two regulatory aspects.


Subjects
Escherichia coli Proteins , Escherichia coli , Escherichia coli/genetics , Escherichia coli Proteins/genetics , Transcriptome , Bacterial Chromosomes/metabolism , Chromosome Structures/metabolism , Bacterial Gene Expression Regulation
10.
Nucleic Acids Res ; 50(D1): D1147-D1155, 2022 Jan 07.
Article in English | MEDLINE | ID: mdl-34643725

ABSTRACT

With the proliferation of human cancer studies using single-cell RNA sequencing (scRNA-seq), the cellular heterogeneity, immune landscape and pathogenesis of diverse cancers have been successively uncovered. The rapid growth of cancer scRNA-seq datasets over the past decade creates a pressing demand for their integration and processing to support investigations of the tumor microenvironment in various cancer types. To fill this gap, we developed a database of Cancer Single-cell Expression Map (CancerSCEM, https://ngdc.cncb.ac.cn/cancerscem), particularly focusing on a variety of human cancers. To date, CancerSCEM version 1.0 consists of 208 cancer samples across 28 studies and 20 human cancer types. A series of uniform, multiscale analyses was performed for each sample, including accurate cell type annotation, functional gene expression, cell interaction networks and survival analysis. In addition, CancerSCEM offers a user-friendly web interface for users to browse, search, analyze online and download all the metadata as well as analytical results. More importantly, the newly constructed comprehensive online analysis platform in CancerSCEM integrates seven analysis functions, with which investigators can interactively perform cancer scRNA-seq analyses. In all, CancerSCEM offers an informative and practical way to facilitate human cancer studies, and also provides insights into clinical therapy assessments.


Subjects
Genetic Databases , Neoplasms/genetics , Software , Neoplastic Gene Expression Regulation/genetics , Humans , Neoplasms/classification , RNA-Seq , Single-Cell Analysis/standards , Tumor Microenvironment/genetics
11.
Nucleic Acids Res ; 50(D1): D1016-D1024, 2022 Jan 07.
Article in English | MEDLINE | ID: mdl-34591957

ABSTRACT

Transcriptomic profiling is critical to uncovering functional elements from transcriptional and post-transcriptional aspects. Here, we present Gene Expression Nebulas (GEN, https://ngdc.cncb.ac.cn/gen/), an open-access data portal integrating transcriptomic profiles under various biological contexts. GEN features a curated collection of high-quality bulk and single-cell RNA sequencing datasets by using standardized data processing pipelines and a structured curation model. Currently, GEN houses a large number of gene expression profiles from 323 datasets (157 bulk and 166 single-cell), covering 50 500 samples and 15 540 169 cells across 30 species, which are further categorized into six biological contexts. Moreover, GEN integrates a full range of transcriptomic profiles on expression, RNA editing and alternative splicing for 10 bulk datasets, providing opportunities for users to conduct integrative analysis at both transcriptional and post-transcriptional levels. In addition, GEN provides abundant gene annotations based on value-added curation of transcriptomic profiles and delivers online services for data analysis and visualization. Collectively, GEN presents a comprehensive collection of transcriptomic profiles across multiple species, thus serving as a fundamental resource for better understanding genetic regulatory architecture and functional mechanisms from tissues to cells.


Subjects
Genetic Databases , Gene Expression Regulation/genetics , Molecular Sequence Annotation , Transcriptome/genetics , Animals , Gene Expression Profiling , Humans , Single-Cell Analysis
12.
Brief Bioinform ; 22(6), 2021 Nov 05.
Article in English | MEDLINE | ID: mdl-34402866

ABSTRACT

Genotype imputation is a statistical method for estimating missing genotypes from a denser haplotype reference panel. Existing methods usually performed well on common variants, but they may not be ideal for low-frequency and rare variants. Previous studies showed that the population similarity between study and reference panels is one of the key factors influencing the imputation accuracy. Here, we developed an imputation reference panel reconstruction method (RefRGim) using convolutional neural networks (CNNs), which can generate a study-specified reference panel for each input data based on the genetic similarity of individuals from current study and references. The CNNs were pretrained with single nucleotide polymorphism data from the 1000 Genomes Project. Our evaluations showed that genotype imputation with RefRGim can achieve higher accuracies than original reference panel, especially for low-frequency and rare variants. RefRGim will serve as an efficient reference panel reconstruction method for genotype imputation. RefRGim is freely available via GitHub: https://github.com/shishuo16/RefRGim.


Subjects
Computational Biology/methods , Genotype , Genotyping Techniques/methods , Computer Neural Networks , Software , Algorithms , Genetic Databases , Deep Learning , Population Genetics/methods , Genome-Wide Association Study/methods , Humans , Reproducibility of Results , Web Browser
13.
Brief Bioinform ; 22(4), 2021 Jul 20.
Article in English | MEDLINE | ID: mdl-33319232

ABSTRACT

Recombination is one of the most important molecular mechanisms of prokaryotic genome evolution, but its exact roles are still in debate. Here we try to infer genome-wide recombination within a species, utilizing a dataset of 149 complete genomes of Escherichia coli from diverse animal hosts and geographic origins, including 45 sequenced in-house with the single-molecule real-time platform. Two major clades identified based on physiological, clinical and ecological characteristics form distinct genetic lineages based on scarcity of interclade gene exchanges. By defining gene-based syntenies for genomic segments within and between the two clades, we build a fine-scale recombination map for this representative global E. coli population. The map suggests extensive within-clade recombination that often breaks physical linkages among individual genes but seldom interrupts the structure of genome organizational frameworks as well as primary metabolic portfolios supported by the framework integrity, possibly due to strong natural selection for both physiological compatibility and ecological fitness. In contrast, the between-clade recombination declines drastically when phylogenetic distance increases to the extent where a 10-fold reduction can be observed, establishing a firm genetic barrier between clades. Our empirical data suggest a critical role for such recombination events in the early stage of speciation where recombination rate is associated with phylogenetic distance in addition to sequence and gene variations. The extensive intraclade recombination binds sister strains into a quasisexual group and optimizes genes or alleles to streamline physiological activities, whereas the sharply declined interclade recombination splits the population into clades adaptive to divergent ecological niches.


Subjects
Escherichia coli/genetics , Molecular Evolution , Genetic Variation , Bacterial Genome , Genetic Recombination , Genetic Selection , Animals , Genome-Wide Association Study , Humans
14.
Nucleic Acids Res ; 49(D1): D962-D968, 2021 Jan 08.
Article in English | MEDLINE | ID: mdl-33045751

ABSTRACT

Expression profiles of long non-coding RNAs (lncRNAs) across diverse biological conditions provide significant insights into their biological functions, interacting targets as well as transcriptional reliability. However, a comprehensive resource that systematically characterizes the expression landscape of human lncRNAs by integrating their expression profiles across a wide range of biological conditions is still lacking. Here, we present LncExpDB (https://bigd.big.ac.cn/lncexpdb), an expression database of human lncRNAs that is devoted to providing comprehensive expression profiles of lncRNA genes, exploring their expression features and capacities, identifying featured genes with potentially important functions, and building interactions with protein-coding genes across various biological contexts/conditions. Based on comprehensive integration and stringent curation, LncExpDB currently houses expression profiles of 101 293 high-quality human lncRNA genes derived from 1977 samples of 337 biological conditions across nine biological contexts. Consequently, LncExpDB estimates lncRNA genes' expression reliability and capacities, identifies 25 191 featured genes, and further obtains 28 443 865 lncRNA-mRNA interactions. Moreover, user-friendly web interfaces enable interactive visualization of expression profiles across various conditions and easy exploration of featured lncRNAs and their interacting partners in specific contexts. Collectively, LncExpDB features comprehensive integration and curation of lncRNA expression profiles and thus will serve as a fundamental resource for functional studies on human lncRNAs.


Subjects
Computational Biology/methods , Nucleic Acid Databases , Gene Expression Profiling/methods , Gene Expression Regulation , Long Noncoding RNA/genetics , Data Curation/methods , Data Mining/methods , Humans , Internet , Molecular Sequence Annotation/methods
15.
Nucleic Acids Res ; 48(D1): D590-D598, 2020 Jan 08.
Article in English | MEDLINE | ID: mdl-31620779

ABSTRACT

Defense systems are vital weapons for prokaryotes to resist heterologous DNA and survive the constant invasion of viruses, and they are widely used in biochemistry investigation and antimicrobial drug research. So far, numerous types of defense systems have been discovered, but there is no comprehensive database to organize prokaryotic defense gene datasets. To fill this gap, we unveil the prokaryotic antiviral defense system (PADS) Arsenal (https://bigd.big.ac.cn/padsarsenal), a public database dedicated to gathering, storing, analyzing and visualizing prokaryotic defense gene datasets. The initial version of PADS Arsenal integrates 18 distinctive categories of defense system with the annotation of 6 600 264 genes retrieved from 63 701 genomes across 33 390 species of archaea and bacteria. PADS Arsenal provides various ways to retrieve information on defense system-related genes and to visualize it in multiple function modes. Moreover, an online analysis pipeline is integrated into PADS Arsenal to facilitate annotation and evolutionary analysis of defense genes. PADS Arsenal can also visualize the dynamic variation information of defense genes from pan-genome analysis. Overall, PADS Arsenal is a state-of-the-art open comprehensive resource to accelerate the research of prokaryotic defense systems.


Subjects
Archaea/genetics , Bacteria/genetics , Genetic Databases , Host-Pathogen Interactions , Software , Archaea/virology , Archaeal Viruses/pathogenicity , Bacteria/virology , Bacteriophages/pathogenicity , CRISPR-Cas Systems , DNA Restriction-Modification Enzymes
16.
Nucleic Acids Res ; 48(D1): D1174-D1180, 2020 Jan 08.
Article in English | MEDLINE | ID: mdl-31665422

ABSTRACT

Precision medicine calls upon deeper coverage of population-based sequencing and thorough gene-content and phenotype-based analysis, which lead to a population-associated genomic variation map or database. The Chinese Genomic Variation Database (CGVD; https://bigd.big.ac.cn/cgvd/) is such a database that has combined 48.30 million (M) SNVs and 5.77 M small indels, identified from 991 Chinese individuals of the Chinese Academy of Sciences Precision Medicine Initiative Project (CASPMI) and 301 Chinese individuals of the 1000 Genomes Project (1KGP). The CASPMI project includes whole-genome sequencing data (WGS, 25-30×) from ∼1000 healthy individuals of the CASPMI cohort. To facilitate the usage of such variations for pharmacogenomics studies, star-allele frequencies of the drug-related genes in the CASPMI and 1KGP populations are calculated and provided in CGVD. As one of the important database resources in BIG Data Center, CGVD will continue to collect more genomic variations and to curate structural and functional annotations to support population-based healthcare projects and studies in China and worldwide.

17.
Nucleic Acids Res ; 47(D1): D163-D169, 2019 Jan 08.
Article in English | MEDLINE | ID: mdl-30335176

ABSTRACT

The dynamics of nucleosome positioning affect chromatin state, transcription and all other biological processes occurring on genomic DNA. MNase-Seq has been used to map nucleosome positioning in eukaryotes in recent years, and nucleosome positioning data are increasing dramatically. To facilitate the usage of published data across studies, we developed a database named nucleosome positioning map (NucMap, http://bigd.big.ac.cn/nucmap). NucMap includes 798 experimental data from 477 samples across 15 species. With a series of functional modules, users can search nucleosome positioning profiles at the promoter region of each gene across all samples and perform enrichment analysis on nucleosome positioning data in all genomic regions. A nucleosome browser was built to visualize the profiles of nucleosome positioning. Users can also visualize multiple sources of omics data with the nucleosome browser and make side-by-side comparisons. All processed data in the database are freely available. NucMap is the first comprehensive nucleosome positioning platform and it will serve as an important resource to facilitate the understanding of chromatin regulation.


Subjects
Chromatin Assembly and Disassembly , Genetic Databases , Genome-Wide Association Study , Nucleosomes/metabolism , Genome-Wide Association Study/methods , Software , User-Computer Interface , Web Browser
18.
Nucleic Acids Res ; 47(D1): D793-D800, 2019 Jan 08.
Article in English | MEDLINE | ID: mdl-30371881

ABSTRACT

The domestic dog (Canis lupus familiaris) is indisputably one of man's best friends. It is also a fundamental model for many heritable human diseases. Here, we present iDog (http://bigd.big.ac.cn/idog), the first integrated resource dedicated to domestic dogs and wild canids. It incorporates a variety of omics data, including genome sequence assemblies for dhole and wolf, genomic variations extracted from hundreds of dog/wolf whole genomes, phenotype/disease traits curated from dog research communities and public resources, gene expression profiles derived from published RNA-Seq data, gene ontology for functional annotation, homolog gene information for multiple organisms and disease-related literature. Additionally, iDog integrates sequence alignment tools for data analyses and a genome browser for data visualization. iDog will not only benefit the global dog research community, but also give a large number of dog enthusiasts access to a user-friendly consolidation of dog information.


Subjects
Genetic Databases , Genome/genetics , Software , Animals , Dogs , Genomics , Humans , Molecular Sequence Annotation , Phylogeny , RNA-Seq/trends , Wolves/genetics
19.
Yi Chuan ; 43(10): 988-993, 2021 Oct 20.
Article in English | MEDLINE | ID: mdl-34702711

ABSTRACT

The Genome Sequence Archive for Human (GSA-Human) is a data repository specialized for human genetic data derived from biomedical research, and it also supports the data collection and management of National Key Research and Development Projects. GSA-Human has a data security management strategy that follows the national regulations on human genetic resources. It provides two models of data access: Open-access and Controlled-access. Open-access data are universally and freely accessible to researchers worldwide, while Controlled-access ensures that data are accessed only by authorized users with the permission of the Data Access Committee (DAC). As of July 2021, GSA-Human housed more than 5.27 PB of data from 750 datasets.

20.
Nucleic Acids Res ; 46(D1): D944-D949, 2018 Jan 04.
Article in English | MEDLINE | ID: mdl-29069473

ABSTRACT

The Genome Variation Map (GVM; http://bigd.big.ac.cn/gvm/) is a public data repository of genome variations. As a core resource in the BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, GVM is dedicated to collecting, integrating and visualizing genome variations for a wide range of species, accepts submissions of different types of genome variations from all over the world and provides free open access to all publicly available data in support of worldwide research activities. Unlike existing related databases, GVM features integration of a large number of genome variations for a broad diversity of species including human, cultivated plants and domesticated animals. Specifically, the current implementation of GVM not only houses a total of ∼4.9 billion variants for 19 species including chicken, dog, goat, human, poplar, rice and tomato, but also incorporates 8669 individual genotypes and 13 262 manually curated high-quality genotype-to-phenotype associations for non-human species. In addition, GVM provides user-friendly, intuitive web interfaces for data submission, browsing, search and visualization. Collectively, GVM serves as an important resource for archiving genomic variation data, helpful for better understanding population genetic diversity and deciphering complex mechanisms associated with different phenotypes.


Subjects
Domestic Animals/genetics , Genetic Databases , Genetic Variation , Genome , Plants/genetics , Access to Information , Animals , Base Sequence , Big Data , Data Curation , Database Management Systems , Forecasting , Population Genetics , Human Genome , Genotype , High-Throughput Nucleotide Sequencing , Humans , Species Specificity , User-Computer Interface