Results 1 - 20 of 7,068
1.
Cell ; 184(4): 1098-1109.e9, 2021 Feb 18.
Article in English | MEDLINE | ID: mdl-33606979

ABSTRACT

Bacteriophages drive evolutionary change in bacterial communities by creating gene flow networks that fuel ecological adaptations. However, the extent of viral diversity and its prevalence in the human gut remain largely unknown. Here, we introduce the Gut Phage Database, a collection of ∼142,000 non-redundant viral genomes (>10 kb) obtained by mining a dataset of 28,060 globally distributed human gut metagenomes and 2,898 reference genomes of cultured gut bacteria. Host assignment revealed that viral diversity is highest in the Firmicutes phylum and that ∼36% of viral clusters (VCs) are not restricted to a single species, creating gene flow networks across phylogenetically distinct bacterial species. Epidemiological analysis uncovered 280 globally distributed VCs found in at least 5 continents and a highly prevalent phage clade with features reminiscent of p-crAssphage. This high-quality, large-scale catalog of phage genomes will improve future virome studies and enable ecological and evolutionary analysis of human gut bacteriophages.


Subjects
Bacteriophages/genetics; Biodiversity; Gastrointestinal Microbiome; Databases, Nucleic Acid; Host Specificity; Humans; Phylogeography
2.
Cell ; 177(7): 1888-1902.e21, 2019 Jun 13.
Article in English | MEDLINE | ID: mdl-31178118

ABSTRACT

Single-cell transcriptomics has transformed our ability to characterize cell states, but deep biological understanding requires more than a taxonomic listing of clusters. As new methods arise to measure distinct cellular modalities, a key analytical challenge is to integrate these datasets to better understand cellular identity and function. Here, we develop a strategy to "anchor" diverse datasets together, enabling us to integrate single-cell measurements not only across scRNA-seq technologies, but also across different modalities. After demonstrating improvement over existing methods for integrating scRNA-seq data, we anchor scRNA-seq experiments with scATAC-seq to explore chromatin differences in closely related interneuron subsets and project protein expression measurements onto a bone marrow atlas to characterize lymphocyte populations. Lastly, we harmonize in situ gene expression and scRNA-seq datasets, allowing transcriptome-wide imputation of spatial gene expression patterns. Our work presents a strategy for the assembly of harmonized references and transfer of information across datasets.


Subjects
Databases, Nucleic Acid; Gene Expression Profiling; Sequence Analysis, RNA; Single-Cell Analysis; Software; Transcriptome; Humans
3.
Cell ; 171(5): 982-986, 2017 Nov 16.
Article in English | MEDLINE | ID: mdl-29149611

ABSTRACT

The Center for Medical Technology Policy and the Molecular Evidence Development Consortium gathered a diverse group of more than 50 stakeholders to develop consensus on a core set of data elements and values essential to understanding the clinical utility of molecularly targeted therapies in oncology.


Subjects
Health Information Management; Neoplasms/genetics; Common Data Elements; Consensus; Databases, Nucleic Acid; Genome, Human; Humans
4.
Nat Methods ; 21(6): 994-1002, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38755321

ABSTRACT

Searching vast and rapidly growing nucleotide content in resources, such as runs in the Sequence Read Archive and assemblies for whole-genome shotgun sequencing projects in GenBank, is currently impractical for most researchers. Here we present Pebblescout, a tool that navigates such content by providing indexing and search capabilities. Indexing uses dense sampling of the sequences in the resource. Search finds subjects (runs or assemblies) that have short sequence matches to a user query, with well-defined guarantees, and ranks them using the informativeness of the matches. We illustrate the functionality of Pebblescout by creating eight databases that index over 3.7 petabases. The web service of Pebblescout can be reached at https://pebblescout.ncbi.nlm.nih.gov. We show that for a wide range of query lengths, Pebblescout provides a data-driven way of finding relevant subsets of large nucleotide resources, substantially reducing the effort for downstream analysis. We also show that Pebblescout results compare favorably to MetaGraph and Sourmash.


Subjects
Software; Nucleotides/genetics; Humans; Databases, Genetic; Computational Biology/methods; Databases, Nucleic Acid; Algorithms
5.
PLoS Biol ; 22(7): e3002476, 2024 Jul.
Article in English | MEDLINE | ID: mdl-39074139

ABSTRACT

Despite the increasing number of 3D RNA structures in the Protein Data Bank, the majority of experimental RNA structures lack thorough functional annotations. As the significance of the functional roles played by noncoding RNAs becomes increasingly apparent, comprehensive annotation of RNA function is becoming a pressing concern. In response to this need, we have developed FURNA (Functions of RNAs), the first database for experimental RNA structures that aims to provide a comprehensive repository of high-quality functional annotations. These include Gene Ontology terms, Enzyme Commission numbers, ligand-binding sites, RNA families, protein-binding motifs, and cross-references to related databases. FURNA is available at https://seq2fun.dcmb.med.umich.edu/furna/ to enable quick discovery of RNA functions from their structures and sequences.


Subjects
Molecular Sequence Annotation; Nucleic Acid Conformation; RNA; RNA/metabolism; RNA/chemistry; RNA/genetics; Databases, Nucleic Acid; Binding Sites; Humans
6.
Nature ; 579(7797): 92-96, 2020 Mar.
Article in English | MEDLINE | ID: mdl-32076267

ABSTRACT

Colonization, speciation and extinction are dynamic processes that influence global patterns of species richness. Island biogeography theory predicts that the contribution of these processes to the accumulation of species diversity depends on the area and isolation of the island. Notably, there has been no robust global test of this prediction for islands where speciation cannot be ignored, because neither the appropriate data nor the analytical tools have been available. Here we address both deficiencies to reveal, for island birds, the empirical shape of the general relationships that determine how colonization, extinction and speciation rates co-vary with the area and isolation of islands. We compiled a global molecular phylogenetic dataset of birds on islands, based on the terrestrial avifaunas of 41 oceanic archipelagos worldwide (including 596 avian taxa), and applied a new analysis method to estimate the sensitivity of island-specific rates of colonization, speciation and extinction to island features (area and isolation). Our model predicts, with high explanatory power, several global relationships. We found a decline in colonization with isolation, a decline in extinction with area and an increase in speciation with area and isolation. Combining the theoretical foundations of island biogeography with the temporal information contained in molecular phylogenies proves a powerful approach to reveal the fundamental relationships that govern variation in biodiversity across the planet.


Subjects
Biodiversity; Birds/classification; Islands; Models, Biological; Animals; Databases, Nucleic Acid; Extinction, Biological; Genetic Speciation; Phylogeny; Phylogeography
7.
Nucleic Acids Res ; 52(D1): D919-D928, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37986229

ABSTRACT

Long non-coding RNAs (lncRNAs) possess a wide range of biological functions, and research has demonstrated their significance in regulating major biological processes such as development, differentiation, and immune response. The accelerating accumulation of lncRNA research has greatly expanded our understanding of lncRNA functions. Here, we introduce LncSEA 2.0 (http://bio.liclab.net/LncSEA/index.php), aiming to provide a more comprehensive set of functional lncRNAs and enhanced enrichment analysis capabilities. Compared with LncSEA 1.0, we have made the following improvements: (i) We updated the lncRNA sets for 11 categories and greatly expanded the lncRNA scope of each set. (ii) We introduced 15 new functional lncRNA categories from multiple resources. This update not only included a significant amount of downstream regulatory data for lncRNAs, but also covered numerous epigenetic regulatory data sets, including lncRNA-related transcription co-factor binding, chromatin regulator binding, and chromatin interaction data. (iii) We incorporated two new lncRNA set enrichment analysis functions based on GSEA and GSVA. (iv) We adopted the snakemake analysis pipeline to track data processing and analysis. In summary, LncSEA 2.0 offers a more comprehensive collection of lncRNA sets and a greater variety of enrichment analysis modules, assisting researchers in a more comprehensive study of the functional mechanisms of lncRNAs.


Subjects
Databases, Nucleic Acid; RNA, Long Noncoding; Databases, Nucleic Acid/standards; RNA, Long Noncoding/genetics; Data Analysis
8.
Nucleic Acids Res ; 52(D1): D265-D272, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37855663

ABSTRACT

Riboswitches are regulatory elements found in the untranslated regions (UTRs) of certain mRNA molecules. They typically comprise two distinct domains: an aptamer domain that can bind to specific small molecules, and an expression platform that controls gene expression. Riboswitches work by undergoing a conformational change upon binding to their specific ligand, thus activating or repressing the genes downstream. This mechanism allows gene expression regulation in response to metabolites or small molecules. To systematically summarise riboswitch structures and their related ligand binding functions, we present Ribocentre-switch, a comprehensive database of riboswitches that includes the following information: sequences, structures, functions, ligand binding pockets and biological applications. It encompasses 56 riboswitches and 26 orphan riboswitches from over 430 references, with a total of 89 591 sequences. It serves as a good resource for comparing different riboswitches and facilitating the identification of potential riboswitch candidates. Therefore, it may facilitate the understanding of RNA structural conformational changes in response to ligand signaling. The database is publicly available at https://riboswitch.ribocentre.org.


Subjects
Databases, Nucleic Acid; Riboswitch; Ligands; Nucleic Acid Conformation; Regulatory Sequences, Nucleic Acid; Signal Transduction
9.
Nucleic Acids Res ; 52(D1): D134-D137, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37889039

ABSTRACT

GenBank® (https://www.ncbi.nlm.nih.gov/genbank/) is a comprehensive, public database that contains 25 trillion base pairs from over 3.7 billion nucleotide sequences for 557 000 formally described species. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. Recent updates include policies for including spatio-temporal metadata, clarified documentation for GenBank data processing, enhanced foreign contamination screening tools, new processes in the Submission Portal, migration of Entrez Genome and Assembly displays into NCBI Datasets, and the impending retirement of tbl2asn, replaced by table2asn.


Subjects
Databases, Nucleic Acid; Genomics; Base Sequence; Internet; Humans
10.
Nucleic Acids Res ; 52(D1): D351-D359, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37904593

ABSTRACT

A growing interest in aptamer research, as evidenced by the increase in aptamer publications over the years, has led to calls for a go-to site for aptamer information. A comprehensive, publicly available aptamer dataset, which may be a repository for aptamer data, standardize aptamer reporting, and generate opportunities to expand current research in the field, could meet such a demand. There have been several attempts to create aptamer databases; however, most have been abandoned or removed entirely from public view. Inspired by previous efforts, we have published the UTexas Aptamer Database, https://sites.utexas.edu/aptamerdatabase, which includes a publicly available aptamer dataset and a searchable database containing a subset of all aptamer data collected to date (1990-2022). The dataset contains aptamer sequences, binding and selection information. The information is regularly reviewed internally to ensure accuracy and consistency across all entries. To support the continued curation and review of aptamer sequence information, we have implemented sustaining mechanisms, including researcher training protocols, an aptamer submission form, data stored separately from the database platform, and a growing team of researchers committed to updating the database. Currently, the UTexas Aptamer Database is the largest in terms of the number of aptamer sequences with 1,443 internally reviewed aptamer records.


Subjects
Aptamers, Nucleotide; Databases, Nucleic Acid; Datasets as Topic
11.
Nucleic Acids Res ; 52(D1): D1-D9, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-38035367

ABSTRACT

The 2024 Nucleic Acids Research database issue contains 180 papers from across biology and neighbouring disciplines. There are 90 papers reporting on new databases and 83 updates from resources previously published in the Issue. Updates from databases most recently published elsewhere account for a further seven. Nucleic acid databases include the new NAKB for structural information and updates from Genbank, ENA, GEO, Tarbase and JASPAR. The Issue's Breakthrough Article concerns NMPFamsDB for novel prokaryotic protein families and the AlphaFold Protein Structure Database has an important update. Metabolism is covered by updates from Reactome, Wikipathways and Metabolights. Microbes are covered by RefSeq, UNITE, SPIRE and P10K; viruses by ViralZone and PhageScope. Medically-oriented databases include the familiar COSMIC, Drugbank and TTD. Genomics-related resources include Ensembl, UCSC Genome Browser and Monarch. New arrivals cover plant imaging (OPIA and PlantPAD) and crop plants (SoyMD, TCOD and CropGS-Hub). The entire Database Issue is freely available online on the Nucleic Acids Research website (https://academic.oup.com/nar). Over the last year the NAR online Molecular Biology Database Collection has been updated, reviewing 1060 entries, adding 97 new resources and eliminating 388 discontinued URLs bringing the current total to 1959 databases. It is available at http://www.oxfordjournals.org/nar/database/c/.


Subjects
Computational Biology; Databases, Nucleic Acid; Databases, Genetic; Databases, Nucleic Acid/trends; Genomics; Internet; Molecular Biology/trends
12.
Nucleic Acids Res ; 52(D1): D1597-D1613, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37831097

ABSTRACT

The scope and function of RNA modifications in model plant systems have been extensively studied, resulting in the identification of an increasing number of novel RNA modifications in recent years. Researchers have gradually revealed that RNA modifications, especially N6-methyladenosine (m6A), which is one of the most abundant and commonly studied RNA modifications in plants, have important roles in physiological and pathological processes. These modifications alter the structure of RNA, which affects its molecular complementarity and binding to specific proteins, thereby resulting in various physiological effects. The increasing interest in plant RNA modifications has necessitated research into RNA modifications and associated datasets. However, there is a lack of a convenient and integrated database with comprehensive annotations and intuitive visualization of plant RNA modifications. Here, we developed the Plant RNA Modification Database (PRMD; http://bioinformatics.sc.cn/PRMD and http://rnainformatics.org.cn/PRMD) to facilitate RNA modification research. This database contains information regarding 20 plant species and provides an intuitive interface for displaying information. Moreover, PRMD offers multiple tools, including RMlevelDiff, RMplantVar, RNAmodNet and Blast (for functional analyses), and mRNAbrowse, RNAlollipop, JBrowse and Integrative Genomics Viewer (for displaying data). Furthermore, PRMD is freely available, making it useful for the rapid development and promotion of research on plant RNA modifications.


Subjects
Databases, Nucleic Acid; Plants; RNA, Plant; Data Management; Genomics; Plants/genetics; RNA, Plant/genetics
13.
Nucleic Acids Res ; 52(D1): D345-D350, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37811890

ABSTRACT

tRFtarget 1.0 (http://trftarget.net/) is a platform consolidating both computationally predicted and experimentally validated binding sites between transfer RNA-derived fragments (tRFs) and target genes (or transcripts) across multiple organisms. Here, we introduce the newly released tRFtarget 2.0, in which we integrated 6 additional tRF sources, resulting in a comprehensive collection of 2614 high-quality tRF sequences spanning 9 species, including 1944 Homo sapiens tRFs and one newly incorporated species, Rattus norvegicus. We also expanded target genes by including ribosomal RNAs, long non-coding RNAs, and coding genes >50 kb in length. The number of predicted binding sites has surged to approximately 6 billion, a 20.5-fold increase over tRFtarget 1.0. The manually curated publications relevant to tRF targets have increased to 400 and the gene-level experimental evidence has risen to 232. tRFtarget 2.0 introduces several new features, including a web-based tool that identifies potential binding sites of tRFs in users' own datasets, integration of standardized tRF IDs, and inclusion of external links to contents within the database. Additionally, we enhanced the website framework and user interface. With these improvements, tRFtarget 2.0 is more user-friendly, providing researchers with a streamlined and comprehensive platform to accelerate their research progress.


Subjects
Databases, Nucleic Acid; RNA, Transfer; Animals; Humans; Rats; RNA, Transfer/metabolism
14.
Nucleic Acids Res ; 52(D1): D92-D97, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37956313

ABSTRACT

The European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena) is maintained by the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI). The ENA is one of the three members of the International Nucleotide Sequence Database Collaboration (INSDC). It serves the bioinformatics community worldwide via the submission, processing, archiving and dissemination of sequence data. The ENA supports data types ranging from raw reads, through alignments and assemblies to functional annotation. The data is enriched with contextual information relating to samples and experimental configurations. In this article, we describe recent progress and improvements to ENA services. In particular, we focus upon three areas of work in 2023: FAIRness of ENA data, pandemic preparedness and foundational technology. For FAIRness, we have introduced minimal requirements for spatiotemporal annotation, created a metadata-based classification system, incorporated third party metadata curations with archived records, and developed a new rapid visualisation platform, the ENA Notebooks. For foundational enhancements, we have improved the INSDC data exchange and synchronisation pipelines, and invested in site reliability engineering for ENA infrastructure. In order to support genomic surveillance efforts, we have continued to provide ENA services in support of SARS-CoV-2 data mobilisation and have adapted these for broader pathogen surveillance efforts.


Subjects
Genomics; Nucleotides; Computational Biology; Databases, Nucleic Acid; Internet; Reproducibility of Results; Europe
15.
Nucleic Acids Res ; 52(D1): D33-D43, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37994677

ABSTRACT

The National Center for Biotechnology Information (NCBI) provides online information resources for biology, including the GenBank® nucleic acid sequence database and the PubMed® database of citations and abstracts published in life science journals. NCBI provides search and retrieval operations for most of these data from 35 distinct databases. The E-utilities serve as the programming interface for most of these databases. Resources receiving significant updates in the past year include PubMed, PMC, Bookshelf, SciENcv, the NIH Comparative Genomics Resource (CGR), NCBI Virus, SRA, RefSeq, foreign contamination screening tools, Taxonomy, iCn3D, ClinVar, GTR, MedGen, dbSNP, ALFA, ClinicalTrials.gov, Pathogen Detection, antimicrobial resistance resources, and PubChem. These resources can be accessed through the NCBI home page at https://www.ncbi.nlm.nih.gov.


Subjects
Databases, Genetic; National Library of Medicine (U.S.); Biotechnology/instrumentation; Databases, Nucleic Acid; Internet; United States
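
The abstract of entry 15 names the E-utilities as the programming interface to most NCBI databases. As a rough illustration only, the sketch below builds an EFetch request URL for PubMed abstracts without sending it. The helper name `build_efetch_url` is hypothetical, and the example assumes that the numeric part of the `mdl-` identifiers in this listing is the PubMed ID.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities; efetch.fcgi retrieves records by ID.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_efetch_url(pmids, rettype="abstract", retmode="text"):
    """Return an EFetch URL for the given PubMed IDs (no request is sent).

    build_efetch_url is a hypothetical helper written for this example.
    """
    if not pmids:
        raise ValueError("at least one PubMed ID is required")
    params = {
        "db": "pubmed",
        "id": ",".join(str(p) for p in pmids),
        "rettype": rettype,
        "retmode": retmode,
    }
    # safe="," keeps the ID list readable instead of percent-encoding commas.
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params, safe=',')}"


if __name__ == "__main__":
    # IDs assumed to be the PubMed IDs behind entries 9 and 15 of this listing.
    print(build_efetch_url([37889039, 37994677]))
```

Opening the printed URL in a browser, or passing it to an HTTP client, should return the plain-text abstracts. NCBI asks heavy users to identify themselves and respect rate limits, so a production script would need more than this sketch.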
16.
Nucleic Acids Res ; 52(D1): D762-D769, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37962425

ABSTRACT

The Reference Sequence (RefSeq) project at the National Center for Biotechnology Information (NCBI) contains over 315 000 bacterial and archaeal genomes and 236 million proteins with up-to-date and consistent annotation. In the past 3 years, we have expanded the diversity of the RefSeq collection by including the best quality metagenome-assembled genomes (MAGs) submitted to INSDC (DDBJ, ENA and GenBank), while maintaining its quality by adding validation checks. Assemblies are now more stringently evaluated for contamination and for completeness of annotation prior to acceptance into RefSeq. MAGs now account for over 17 000 assemblies in RefSeq, split over 165 orders and 362 families. Changes in the Prokaryotic Genome Annotation Pipeline (PGAP), which is used to annotate nearly all RefSeq assemblies, include better detection of protein-coding genes. Nearly 83% of RefSeq proteins are now named by a curated Protein Family Model, a 4.7% increase over the past three years. In addition to literature citations, Enzyme Commission numbers, and gene symbols, Gene Ontology terms are now assigned to 48% of RefSeq proteins, allowing for easier multi-genome comparison. RefSeq is found at https://www.ncbi.nlm.nih.gov/refseq/. PGAP is available as a stand-alone tool able to produce GenBank-ready files at https://github.com/ncbi/pgap.


Subjects
Archaea; Bacteria; Databases, Nucleic Acid; Metagenome; Archaea/genetics; Bacteria/genetics; Databases, Nucleic Acid/standards; Databases, Nucleic Acid/trends; Genome, Archaeal/genetics; Genome, Bacterial/genetics; Internet; Molecular Sequence Annotation; Proteins/genetics
17.
Nucleic Acids Res ; 52(D1): D229-D238, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37843123

ABSTRACT

We describe the Mitochondrial and Nuclear rRNA fragment database (MINRbase), a knowledge repository aimed at facilitating the study of ribosomal RNA-derived fragments (rRFs). MINRbase provides interactive access to the profiles of 130 238 expressed rRFs arising from the four human nuclear rRNAs (18S, 5.8S, 28S, 5S), two mitochondrial rRNAs (12S, 16S) or four spacers of 45S pre-rRNA. We compiled these profiles by analyzing 11 632 datasets, including the GEUVADIS and The Cancer Genome Atlas (TCGA) repositories. MINRbase offers a user-friendly interface that lets researchers issue complex queries based on one or more criteria, such as parental rRNA identity, nucleotide sequence, rRF minimum abundance and metadata keywords (e.g. tissue type, disease). A 'summary' page for each rRF provides a granular breakdown of its expression by tissue type, disease, sex, ancestry and other variables; it also allows users to create publication-ready plots at the click of a button. MINRbase has already allowed us to generate support for three novel observations: the internal spacers of 45S are prolific producers of abundant rRFs; many abundant rRFs straddle the known boundaries of rRNAs; rRF production is regimented and depends on 'personal attributes' (sex, ancestry) and 'context' (tissue type, tissue state, disease). MINRbase is available at https://cm.jefferson.edu/MINRbase/.


Subjects
Databases, Nucleic Acid; RNA, Mitochondrial; RNA, Ribosomal; Humans; Base Sequence; Mitochondria/genetics; Ribosomes; RNA, Mitochondrial/genetics; RNA, Ribosomal/genetics
18.
Nucleic Acids Res ; 52(D1): D203-D212, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37811871

ABSTRACT

With recent progress in mapping N7-methylguanosine (m7G) RNA methylation sites, tens of thousands of experimentally validated m7G sites have been discovered in various species, shedding light on the significant role of m7G modification in regulating numerous biological processes including disease pathogenesis. An integrated resource that enables the sharing, annotation and customized analysis of m7G data will greatly facilitate m7G studies under various physiological contexts. We previously developed the m7GHub database to host mRNA m7G sites identified in the human transcriptome. Here, we present m7GHub v.2.0, an updated resource for a comprehensive collection of m7G modifications in various types of RNA across multiple species: an m7GDB database containing 430 898 putative m7G sites identified in 23 species, collected from both widely applied next-generation sequencing (NGS) and the emerging Oxford Nanopore direct RNA sequencing (ONT) techniques; an m7GDiseaseDB hosting 156 206 m7G-associated variants (involving addition or removal of an m7G site), including 3238 disease-relevant m7G-SNPs that may function through epitranscriptome disturbance; and two enhanced analysis modules to perform interactive analyses on the collections of m7G sites (m7GFinder) and functional variants (m7GSNPer). We expect that m7GHub v.2.0 will serve as a valuable centralized resource for studying m7G modification. It is freely accessible at: www.rnamd.org/m7GHub2.


Subjects
Databases, Nucleic Acid; High-Throughput Nucleotide Sequencing; RNA Processing, Post-Transcriptional; Humans; Data Interpretation, Statistical; Guanosine/genetics
19.
Nucleic Acids Res ; 52(D1): D1327-D1332, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37650649

ABSTRACT

MicroRNAs (miRNAs) are a class of important small non-coding RNAs with critical molecular functions in almost all biological processes, and thus, they play important roles in disease diagnosis and therapy. The Human MicroRNA Disease Database (HMDD) represents an important and comprehensive resource for biomedical researchers in miRNA-related medicine. Here, we introduce HMDD v4.0, which curates 53530 miRNA-disease association entries from the literature. In comparison to HMDD v3.0 released five years ago, HMDD v4.0 contains 1.5 times more entries. In addition, some new categories have been curated, including exosomal miRNAs implicated in diseases, virus-encoded miRNAs involved in human diseases, and entries containing miRNA-circRNA interactions. We also curated sex-biased miRNAs in diseases. Furthermore, in a case study, disease similarity analysis successfully revealed that sex-biased miRNAs related to developmental anomalies are associated with a number of human diseases with sex bias. HMDD can be freely visited at http://www.cuilab.cn/hmdd.


Subjects
Databases, Nucleic Acid; Disease; MicroRNAs; Humans; MicroRNAs/genetics; Disease/genetics
20.
Nucleic Acids Res ; 52(D1): D52-D60, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37739414

ABSTRACT

Recent studies have demonstrated the important regulatory role of circRNAs, but an in-depth understanding of the comprehensive landscape of circRNAs across various species is still lacking. The current circRNA databases are often species-restricted or based on outdated datasets. To address this challenge, we have developed the circAtlas 3.0 database, which contains a rich collection of 2674 circRNA sequencing datasets, curated to delineate the landscape of circRNAs within 33 distinct tissues spanning 10 vertebrate species. Notably, circAtlas 3.0 represents a substantial advancement over its precursor, circAtlas 2.0, with the number of cataloged circRNAs rising from 1 007 087 to 3 179 560, of which 2 527 528 have been reconstructed into full-length isoforms. circAtlas 3.0 also introduces several notable enhancements, including: (i) integration of both Illumina and Nanopore sequencing datasets to detect circRNAs of extended lengths; (ii) employment of a standardized nomenclature scheme for circRNAs, providing information on the host gene and full-length circular exons; (iii) inclusion of clinical cancer samples to explore the biological function of circRNAs within the context of cancer and (iv) links to other useful resources to enable user-friendly analysis of target circRNAs. The updated circAtlas 3.0 provides an important platform for exploring the evolution and biological implications of vertebrate circRNAs, and is freely available at http://circatlas.biols.ac.cn and https://ngdc.cncb.ac.cn/circatlas.


Subjects
Databases, Nucleic Acid; Neoplasms; RNA, Circular; Animals; Humans; Neoplasms/genetics; Vertebrates/genetics