Results 1 - 20 of 54
1.
Appl Environ Microbiol ; 89(1): e0167022, 2023 01 31.
Article in English | MEDLINE | ID: mdl-36519847

ABSTRACT

Metagenomic sequencing is a swift and powerful tool to ascertain the presence of an organism of interest in a sample. However, sequencing coverage of the organism of interest can be insufficient due to an inundation of reads from irrelevant organisms in the sample. Here, we report a nuclease-based approach to rapidly enrich for DNA from certain organisms, including enterobacteria, based on their differential endogenous modification patterns. We exploit the ability of taxon-specific methylated motifs to resist the action of cognate methylation-sensitive restriction endonucleases that thereby digest unwanted, unmethylated DNA. Subsequently, we use a distributive exonuclease or electrophoretic separation to deplete or exclude the digested fragments, thus enriching for undigested DNA from the organism of interest. As a proof of concept, we apply this method to enrich for the enterobacteria Escherichia coli and Salmonella enterica by 11- to 142-fold from mock metagenomic samples and validate this approach as a versatile means to enrich for genomes of interest in metagenomic samples. IMPORTANCE Pathogens that contaminate the food supply or spread through other means can cause outbreaks that bring devastating repercussions to the health of a populace. Investigations to trace the source of these outbreaks are initiated rapidly but can be drawn out due to the labored methods of pathogen isolation. Metagenomic sequencing can alleviate this hurdle but is often insufficiently sensitive. The approach and implementations detailed here provide a rapid means to enrich for many pathogens involved in foodborne outbreaks, thereby improving the utility of metagenomic sequencing as a tool in outbreak investigations. Additionally, this approach provides a means to broadly enrich for otherwise minute levels of modified DNA, which may escape unnoticed in metagenomic samples.


Subjects
DNA Restriction Enzymes, Bacterial DNA, Escherichia coli, Metagenomics, Salmonella enterica, DNA, Escherichia coli/genetics, Escherichia coli/isolation & purification, High-Throughput Nucleotide Sequencing, Metagenome, Metagenomics/methods, Salmonella enterica/genetics, Salmonella enterica/isolation & purification, Bacterial DNA/genetics
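
As a rough illustration of the 11- to 142-fold enrichment figures quoted in the abstract above, the sketch below computes fold enrichment as the ratio of the target organism's read fraction after treatment to its read fraction before treatment. The abstract does not say how enrichment was calculated, so this definition, the function name `fold_enrichment`, and the read counts are assumptions for illustration only.

```python
def fold_enrichment(target_before, total_before, target_after, total_after):
    """Ratio of the target read fraction after enrichment to the fraction before.

    Assumed definition: the source abstract does not specify its formula.
    """
    if total_before <= 0 or total_after <= 0 or target_before <= 0:
        raise ValueError("read counts must be positive")
    fraction_before = target_before / total_before
    fraction_after = target_after / total_after
    return fraction_after / fraction_before


# Hypothetical read counts, not data from the study.
example = fold_enrichment(
    target_before=1_000, total_before=1_000_000,
    target_after=50_000, total_after=1_000_000,
)
print(f"fold enrichment: {example:.1f}")
```

Under this definition, a value above 1 means the target organism makes up a larger share of the reads after treatment than before.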
2.
Nucleic Acids Res ; 45(D1): D37-D42, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899564

ABSTRACT

GenBank® (www.ncbi.nlm.nih.gov/genbank/) is a comprehensive database that contains publicly available nucleotide sequences for 370 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or the NCBI Submission Portal. GenBank staff assign accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Nucleotide database, which links to related information such as taxonomy, genomes, protein sequences and structures, and biomedical journal literature in PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. Recent updates include changes to policies regarding sequence identifiers, an improved 16S submission wizard, targeted loci studies, the ability to submit methylation and BioNano mapping files, and a database of anti-microbial resistance genes.


Subjects
Nucleic Acid Databases, DNA Sequence Analysis, Animals, DNA Methylation, Bacterial Genome, Genomics, Humans, 16S Ribosomal RNA/genetics, beta-Lactamases/genetics
3.
Nucleic Acids Res ; 44(D1): D67-72, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26590407

ABSTRACT

GenBank(®) (www.ncbi.nlm.nih.gov/genbank/) is a comprehensive database that contains publicly available nucleotide sequences for over 340 000 formally described species. Recent developments include a new starting page for submitters, a shift toward using accession.version identifiers rather than GI numbers, a wizard for submitting 16S rRNA sequences, and an Identical Protein Report to address growing issues of data redundancy. GenBank organizes the sequence data received from individual laboratories and large-scale sequencing projects into 18 divisions, and GenBank staff assign unique accession.version identifiers upon data receipt. Most submitters use the web-based BankIt or standalone Sequin programs. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the nuccore, nucest, and nucgss databases of the Entrez retrieval system, which integrates these records with a variety of other data including taxonomy nodes, genomes, protein structures, and biomedical journal literature in PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP.


Subjects
Nucleic Acid Databases, DNA Sequence Analysis, Proteins/genetics, 16S Ribosomal RNA/genetics
4.
Nucleic Acids Res ; 43(Database issue): D30-5, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25414350

ABSTRACT

GenBank(®) (http://www.ncbi.nlm.nih.gov/genbank/) is a comprehensive database that contains publicly available nucleotide sequences for over 300 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole-genome shotgun and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and GenBank staff assign accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP.


Subjects
Nucleic Acid Databases, Bacteria/classification, Genomics, Internet, DNA Sequence Analysis, Protein Sequence Analysis
5.
Nucleic Acids Res ; 42(Database issue): D32-7, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24217914

ABSTRACT

GenBank is a comprehensive database that contains publicly available nucleotide sequences for over 280,000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole-genome shotgun and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and GenBank staff assign accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the National Center for Biotechnology Information (NCBI) Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI home page: www.ncbi.nlm.nih.gov.


Subjects
Nucleic Acid Databases, DNA Sequence Analysis, Bacteria/classification, Bacteria/genetics, High-Throughput Nucleotide Sequencing, Internet, Molecular Sequence Annotation
6.
Nucleic Acids Res ; 41(Database issue): D36-42, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23193287

ABSTRACT

GenBank® (http://www.ncbi.nlm.nih.gov) is a comprehensive database that contains publicly available nucleotide sequences for almost 260 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole-genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and GenBank staff assigns accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI home page: www.ncbi.nlm.nih.gov.


Subjects
Base Sequence, Nucleic Acid Databases, Genomics, High-Throughput Nucleotide Sequencing, Internet, Molecular Sequence Annotation, DNA Sequence Analysis
7.
J Virol ; 87(3): 1400-10, 2013 Feb.
Article in English | MEDLINE | ID: mdl-23115287

ABSTRACT

Individuals >60 years of age had the lowest incidence of infection, with ~25% of these people having preexisting, cross-reactive antibodies to novel 2009 H1N1 influenza. Many people <60 years old also had preexisting antibodies to novel H1N1. These observations are puzzling because the seasonal H1N1 viruses circulating during the last 60 years were not antigenically similar to novel H1N1. We therefore hypothesized that a sequence of exposures to antigenically different seasonal H1N1 viruses can elicit an antibody response that protects against novel 2009 H1N1. Ferrets were preinfected with seasonal H1N1 viruses and assessed for cross-reactive antibodies to novel H1N1. Serum from infected ferrets was assayed for cross-reactivity to both seasonal and novel 2009 H1N1 strains. These results were compared to those of ferrets that were sequentially infected with H1N1 viruses isolated prior to 1957 or more-recently isolated viruses. Following seroconversion, ferrets were challenged with novel H1N1 influenza virus and assessed for viral titers in the nasal wash, morbidity, and mortality. There was no hemagglutination inhibition (HAI) cross-reactivity in ferrets infected with any single seasonal H1N1 influenza virus, with limited protection to challenge. However, sequential H1N1 influenza infections reduced the incidence of disease and elicited cross-reactive antibodies to novel H1N1 isolates. The amount and duration of virus shedding and the frequency of transmission following novel H1N1 challenge were reduced. Exposure to multiple seasonal H1N1 influenza viruses, and not to any single H1N1 influenza virus, elicits a breadth of antibodies that neutralize novel H1N1 even though the host was never exposed to the novel H1N1 influenza viruses.


Subjects
Influenza A Virus H1N1 Subtype/immunology, Orthomyxoviridae Infections/immunology, Orthomyxoviridae Infections/virology, Animals, Antiviral Antibodies/blood, Cross Reactions, Animal Disease Models, Ferrets, Hemagglutination Inhibition Tests, Nasal Cavity/virology, Orthomyxoviridae Infections/mortality, Orthomyxoviridae Infections/pathology, Survival Analysis, Viral Load, Virus Shedding
9.
Nucleic Acids Res ; 40(Database issue): D48-53, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22144687

ABSTRACT

GenBank® is a comprehensive database that contains publicly available nucleotide sequences for more than 250,000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole-genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI home page: www.ncbi.nlm.nih.gov.


Subjects
Nucleic Acid Databases, DNA Sequence Analysis, Genomics, High-Throughput Nucleotide Sequencing, Internet, Molecular Sequence Annotation, RNA Sequence Analysis, User-Computer Interface
10.
Nucleic Acids Res ; 40(Database issue): D13-25, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22140104

ABSTRACT

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Website. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Probe, Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.


Subjects
Databases as Topic, Genetic Databases, Protein Databases, Gene Expression, Genomics, Internet, Molecular Models, National Library of Medicine (U.S.), Periodicals as Topic, PubMed, Sequence Alignment, DNA Sequence Analysis, Protein Sequence Analysis, RNA Sequence Analysis, Small Molecule Libraries, United States
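
The abstract above lists the Entrez Programming Utilities among the NCBI resources. As a minimal sketch of how a programmatic Entrez search might be addressed, the following builds an ESearch request URL for the `nuccore` (Nucleotide) database. The helper name `build_esearch_url` and the example query are illustrative assumptions; the code only constructs the URL and makes no network request.

```python
from urllib.parse import urlencode

# Base URL of the public E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db, term, retmax=20):
    """Build an ESearch request URL (hypothetical helper; no request is sent)."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Illustrative query only: nucleotide records for one organism.
print(build_esearch_url("nuccore", "Salmonella enterica[Organism]"))
```

Keeping URL construction separate from the request makes the query easy to inspect before any traffic is sent to NCBI.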
11.
medRxiv ; 2024 May 16.
Article in English | MEDLINE | ID: mdl-38903069

ABSTRACT

Whole-genome sequencing of bacterial pathogens is used by public health agencies to link cases of food poisoning caused by the same source of contamination. The vast majority of these appear to be sporadic cases associated with small contamination episodes and do not trigger investigations. We analyzed clusters of sequenced clinical isolates of Salmonella, Escherichia coli, Campylobacter, and Listeria that differ by only a small number of mutations to provide a new understanding of the underlying contamination episodes. These analyses provide new evidence that the youngest age groups have greater susceptibility to infection from Salmonella, Escherichia coli, and Campylobacter than older age groups. This age bias is weaker for the common Salmonella serovar Enteritidis than Salmonella in general. Analysis of these clusters reveals significant regional variations in relative frequencies of Salmonella serovars across the United States. A large fraction of the contamination episodes causing sickness appear to have long duration. For example, 50% of the Salmonella cases are in clusters that persist for almost three years. For all four pathogen species, the majority of the cases were part of genetic clusters with illnesses in multiple states and likely to be caused by contaminated commercially distributed foods. The vast majority of Salmonella cases among infants < 6 months of age appear to be caused by cross-contamination from foods consumed by older age groups or by environmental bacteria rather than infant formula contaminated at production sites.

12.
Nucleic Acids Res ; 39(Database issue): D32-7, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21071399

ABSTRACT

GenBank® is a comprehensive database that contains publicly available nucleotide sequences for more than 380,000 organisms named at the genus level or lower, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system that integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage: www.ncbi.nlm.nih.gov.


Subjects
Nucleic Acid Databases, Expressed Sequence Tags, Genomics, High-Throughput Nucleotide Sequencing, Metagenomics, Molecular Sequence Annotation, Software
13.
Nucleic Acids Res ; 39(Database issue): D38-51, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21097890

ABSTRACT

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Electronic PCR, OrfFinder, Splign, ProSplign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), IBIS, Biosystems, Peptidome, OMSSA, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.


Subjects
Genetic Databases, Protein Databases, Gene Expression, Genomics, National Library of Medicine (U.S.), Tertiary Protein Structure, PubMed, Sequence Alignment, DNA Sequence Analysis, RNA Sequence Analysis, Software, Systems Integration, United States
14.
Eur J Sport Sci ; 23(12): 2340-2348, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37424300

ABSTRACT

Using a large database of continuous glucose monitoring (CGM) data, this study aimed to gain insights into the association between pre-exercise food ingestion timing and reactive hypoglycemia. A group of 6,761 users self-reported 48,799 pre-exercise food ingestion events and logged minute-by-minute CGM, which was used to detect reactive hypoglycemia (<70 mg/dL) in the first 30 min of exercise. A linear and a non-linear binomial logistic regression model were used to investigate the association between food ingestion timing and the probability of experiencing reactive hypoglycemia. An analysis of variance was conducted to compare the predictive ability of the models. On average, reactive hypoglycemia was detected in 8.34 ± 3.04% of the total events, with <15% of individuals experiencing hypoglycemia in >20% of their events. The majority of the reactive hypoglycemia events were found with pre-exercise food timing between ∼30 and ∼90 min, with a peak at ∼60 min. The non-linear model showed significantly higher accuracy (62.05 vs 45.1%) and F-score (0.75 vs 0.59) than the linear model (P < 0.0001). These results support the notion of an unfavourable 30-to-90 min pre-exercise food ingestion time window which can significantly impact the likelihood of reactive hypoglycemia in some individuals.


Large datasets of self-reported continuous glucose monitoring and food events are used here for the first time to get insights into reactive hypoglycemia, a condition often regarded as negative for endurance performance events. Using a binomial non-linear logistic regression model, the association between pre-exercise food ingestion timing and reactive hypoglycemia revealed the presence of an unfavourable window, when reactive hypoglycemia is more likely to occur. Results confirm an individual predisposition to reactive hypoglycemia and, for 8 in 100 individuals, the pre-exercise food ingestion timing can meaningfully impact the likelihood of experiencing reactive hypoglycemia.


Subjects
Type 1 Diabetes Mellitus, Hypoglycemia, Humans, Blood Glucose, Blood Glucose Self-Monitoring/methods, Eating
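
The detection rule stated in the abstract above (glucose below 70 mg/dL within the first 30 min of exercise) can be sketched as follows. This is a simplified reading of that rule, not the study's actual pipeline: it assumes one CGM reading per minute with the first reading taken at exercise onset, and the function names and sample traces are hypothetical.

```python
HYPO_THRESHOLD_MG_DL = 70  # threshold stated in the abstract
WINDOW_MINUTES = 30        # first 30 min of exercise, per the abstract


def has_reactive_hypoglycemia(cgm_mg_dl):
    """Return True if any reading in the window is below the threshold.

    Assumes one reading per minute, with index 0 at exercise onset.
    """
    return any(g < HYPO_THRESHOLD_MG_DL for g in cgm_mg_dl[:WINDOW_MINUTES])


def event_rate(events):
    """Fraction of events flagged as reactive hypoglycemia."""
    if not events:
        raise ValueError("at least one event is required")
    return sum(has_reactive_hypoglycemia(e) for e in events) / len(events)


# Hypothetical traces, not data from the study.
events = [
    [95] * 40,                # never below threshold
    [90, 80, 69] + [85] * 30, # dips below threshold at minute 2
    [100] * 30 + [60] * 5,    # dips only after the 30-minute window
]
print(f"flagged fraction: {event_rate(events):.2f}")
```

Only the second trace is flagged here, because the third drops below the threshold after the 30-minute window has closed.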
15.
Front Microbiol ; 14: 1212863, 2023.
Article in English | MEDLINE | ID: mdl-37396378

ABSTRACT

Outbreaks of cyclosporiasis, an enteric illness caused by the parasite Cyclospora cayetanensis, have been associated with consumption of various types of fresh produce. Although a method is in use for genotyping C. cayetanensis from clinical specimens, the very low abundance of C. cayetanensis in food and environmental samples presents a greater challenge. To complement epidemiological investigations, a molecular surveillance tool is needed for use in genetic linkage of food vehicles to cyclosporiasis illnesses, estimation of the scope of outbreaks or clusters of illness, and determination of geographical areas involved. We developed a targeted amplicon sequencing (TAS) assay that incorporates a further enrichment step to gain the requisite sensitivity for genotyping C. cayetanensis contaminating fresh produce samples. The TAS assay targets 52 loci, 49 of which are located in the nuclear genome, and encompasses 396 currently known SNP sites. The performance of the TAS assay was evaluated using lettuce, basil, cilantro, salad mix, and blackberries inoculated with C. cayetanensis oocysts. A minimum of 24 markers were haplotyped even at low contamination levels of 10 oocysts in 25 g leafy greens. The artificially contaminated fresh produce samples were included in a genetic distance analysis based on haplotype presence/absence with publicly available C. cayetanensis whole genome sequence assemblies. Oocysts from two different sources were used for inoculation, and samples receiving the same oocyst preparation clustered together, but separately from the other group, demonstrating the utility of the assay for genetically linking samples. Clinical fecal samples with low parasite loads were also successfully genotyped. This work represents a significant advance in the ability to genotype C. cayetanensis contaminating fresh produce along with greatly expanding the genomic diversity included for genetic clustering of clinical specimens.

16.
Nucleic Acids Res ; 38(Database issue): D46-51, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19910366

ABSTRACT

GenBank is a comprehensive database that contains publicly available nucleotide sequences for more than 300,000 organisms named at the genus level or lower, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Molecular Biology Laboratory Nucleotide Sequence Database in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bi-monthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI homepage: www.ncbi.nlm.nih.gov.


Subjects
Computational Biology/methods, Genetic Databases, Nucleic Acid Databases, Algorithms, Animals, Computational Biology/trends, Protein Databases, Expressed Sequence Tags, Bacterial Genome, Plant Genome, Viral Genome, Humans, Information Storage and Retrieval/methods, Internet, National Institutes of Health (U.S.), National Library of Medicine (U.S.), Software, United States
17.
Nucleic Acids Res ; 38(Database issue): D5-16, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19910364

ABSTRACT

In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, Reference Sequence, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Entrez Probe, GENSAT, Online Mendelian Inheritance in Man, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool, Biosystems, Peptidome, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.


Subjects
Computational Biology/methods, Genetic Databases, Nucleic Acid Databases, Algorithms, Animals, Computational Biology/trends, Protein Databases, Bacterial Genome, Viral Genome, Humans, Information Storage and Retrieval/methods, Internet, National Institutes of Health (U.S.), National Library of Medicine (U.S.), Software, United States
18.
Proc Natl Acad Sci U S A ; 106(18): 7273-80, 2009 May 05.
Article in English | MEDLINE | ID: mdl-19351897

ABSTRACT

The evolutionary rates of protein-coding genes in an organism span, approximately, 3 orders of magnitude and show a universal, approximately log-normal distribution in a broad variety of species from prokaryotes to mammals. This universal distribution implies a steady-state process, with identical distributions of evolutionary rates among genes that are gained and genes that are lost. A mathematical model of such process is developed under the single assumption of the constancy of the distributions of the propensities for gene loss (PGL). This model predicts that genes of different ages, that is, genes with homologs detectable at different phylogenetic depths, substantially differ in those variables that correlate with PGL. We computationally partition protein-coding genes from humans, flies, and Aspergillus fungus into age classes, and show that genes of different ages retain the universal log-normal distribution of evolutionary rates, with a shift toward higher rates in "younger" classes but also with a substantial overlap. The only exception involves human primate-specific genes that show a heavy tail of rapidly evolving genes, probably owing to gene annotation artifacts. As predicted, the gene age classes differ in characteristics correlated with PGL. Compared with "young" genes (e.g., mammal-specific human ones), "old" genes (e.g., eukaryote-specific), on average, are longer, are expressed at a higher level, possess a higher intron density, evolve slower on the short time scale, and are subject to stronger purifying selection. Thus, genome evolution fits a simple model with approximately uniform rates of gene gain and loss, without major bursts of genomic innovation.


Subjects
Molecular Evolution, Genes, Genetic Models, Proteins/genetics, Animals, Eukaryotic Cells/metabolism, Horizontal Gene Transfer, Genome, Humans
19.
Nature ; 437(7062): 1162-6, 2005 Oct 20.
Article in English | MEDLINE | ID: mdl-16208317

ABSTRACT

Influenza viruses are remarkably adept at surviving in the human population over a long timescale. The human influenza A virus continues to thrive even among populations with widespread access to vaccines, and continues to be a major cause of morbidity and mortality. The virus mutates from year to year, making the existing vaccines ineffective on a regular basis, and requiring that new strains be chosen for a new vaccine. Less-frequent major changes, known as antigenic shift, create new strains against which the human population has little protective immunity, thereby causing worldwide pandemics. The most recent pandemics include the 1918 'Spanish' flu, one of the most deadly outbreaks in recorded history, which killed 30-50 million people worldwide, the 1957 'Asian' flu, and the 1968 'Hong Kong' flu. Motivated by the need for a better understanding of influenza evolution, we have developed flexible protocols that make it possible to apply large-scale sequencing techniques to the highly variable influenza genome. Here we report the results of sequencing 209 complete genomes of the human influenza A virus, encompassing a total of 2,821,103 nucleotides. In addition to increasing markedly the number of publicly available, complete influenza virus genomes, we have discovered several anomalies in these first 209 genomes that demonstrate the dynamic nature of influenza transmission and evolution. This new, large-scale sequencing effort promises to provide a more comprehensive picture of the evolution of influenza viruses and of their pattern of transmission through human and animal populations. All data from this project are being deposited, without delay, in public archives.


Subjects
Molecular Evolution, Viral Genome, Influenza A Virus/genetics, Human Influenza/virology, Mutagenesis/genetics, Animals, Influenza Virus Hemagglutinin Glycoproteins/genetics, Influenza Virus Hemagglutinin Glycoproteins/immunology, 20th Century History, 21st Century History, Humans, Influenza A Virus/classification, Influenza A Virus/isolation & purification, Influenza A Virus/physiology, Influenza Vaccines/history, Influenza Vaccines/immunology, Human Influenza/epidemiology, Human Influenza/transmission, Human Influenza/veterinary, Mutation/genetics, Neuraminidase/genetics, Neuraminidase/metabolism, New York/epidemiology, Phylogeny, Public Sector, Reassortant Viruses/genetics, Sequence Analysis, Time Factors, Virus Replication
20.
Nucleic Acids Res ; 37(20): 6799-810, 2009 Nov.
Article in English | MEDLINE | ID: mdl-19745054

ABSTRACT

In a wide range of genomes, it was observed that the usage of synonymous codons is biased toward specific codons and codon patterns. Factors that are implicated in the selection for codon usage include facilitation of fast and accurate translation. There are two types of translational errors: missense errors and processivity errors. There is considerable evidence in support of the hypothesis that codon usage is optimized to minimize missense errors. In contrast, little is known about the relationship between codon usage and frameshifting errors, an important form of processivity errors, which appear to occur at frequencies comparable to the frequencies of missense errors. Based on the recently proposed pause-and-slip model of frameshifting, we developed the Frameshifting Robustness Score (FRS). We used this measure to test if the pattern of codon usage indicates optimization against frameshifting errors. We found that the FRS values of protein-coding sequences from four analyzed genomes (the bacteria Bacillus subtilis and Escherichia coli, and the yeasts Saccharomyces cerevisiae and Schizosaccharomyces pombe) were typically higher than expected by chance. Other properties of FRS patterns observed in B. subtilis, S. cerevisiae and S. pombe, such as the tendency of FRS to increase from the 5'- to 3'-end of protein-coding sequences, were also consistent with the hypothesis of optimization against frameshifting errors in translation. For E. coli, the results of different tests were less consistent, suggestive of a much weaker optimization, if any. Collectively, the results fit the concept of selection against mistranslation-induced protein misfolding being one of the factors shaping the evolution of both coding and non-coding sequences.


Subjects
Codon, Molecular Evolution, Protein Biosynthesis, Frameshift Mutation, Genetic Models