Results 1 - 11 of 11
1.
PLoS One ; 13(10): e0202513, 2018.
Article in English | MEDLINE | ID: mdl-30339683

ABSTRACT

Overlapping genes represent a fascinating evolutionary puzzle, since they encode two functionally unrelated proteins from the same DNA sequence. They originate by a mechanism of overprinting, in which point mutations in an existing frame allow the expression (the "birth") of a completely new protein from a second frame. In viruses, in which overlapping genes are abundant, these new proteins often play a critical role in infection, yet they are frequently overlooked during genome annotation. This results in erroneous interpretation of mutational studies and in a significant waste of resources. Therefore, overlapping genes need to be correctly detected, especially since they are now thought to be abundant also in eukaryotes. Developing better detection methods and conducting systematic evolutionary studies require a large, reliable benchmark dataset of known cases. We thus assembled a high-quality dataset of 80 viral overlapping genes whose expression is experimentally proven. Many of them were not present in databases. We found that overall, overlapping genes differ significantly from non-overlapping genes in their nucleotide and amino acid composition. In particular, the proteins they encode are enriched in high-degeneracy amino acids and depleted in low-degeneracy ones, which may alleviate the evolutionary constraints acting on overlapping genes. Principal component analysis revealed that the vast majority of overlapping genes follow a similar composition bias, despite their heterogeneity in length and function. Six proven mammalian overlapping genes also followed this bias. We propose that this apparently near-universal composition bias may either favour the birth of overlapping genes, or/and result from selection pressure acting on them.


Subject(s)
Evolution, Molecular , Genes, Overlapping/genetics , Proteins/genetics , Amino Acid Sequence/genetics , Animals , Genes, Viral/genetics , Mammals/genetics , Mutation , Open Reading Frames/genetics , Principal Component Analysis
2.
Nucleic Acids Res ; 45(D1): D482-D490, 2017 Jan 04.
Article in English | MEDLINE | ID: mdl-27899678

ABSTRACT

The Virus Variation Resource is a value-added viral sequence data resource hosted by the National Center for Biotechnology Information. The resource is located at http://www.ncbi.nlm.nih.gov/genome/viruses/variation/ and includes modules for seven viral groups: influenza virus, Dengue virus, West Nile virus, Ebolavirus, MERS coronavirus, Rotavirus A and Zika virus. Each module is supported by pipelines that scan newly released GenBank records, annotate genes and proteins, parse sample descriptors, and then map them to controlled vocabulary. These processes in turn support a purpose-built search interface where users can select sequences based on standardized gene, protein and metadata terms. Once sequences are selected, a suite of tools for downloading data, multi-sequence alignment and tree building supports a variety of user-directed activities. This manuscript describes a series of features and functionalities recently added to the Virus Variation Resource.


Subject(s)
Computational Biology/methods , Disease Outbreaks , Genetic Variation , Software , Virus Diseases/epidemiology , Virus Diseases/virology , Viruses/genetics , Databases, Genetic
3.
Nucleic Acids Res ; 44(D1): D733-45, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26553804

ABSTRACT

The RefSeq project at the National Center for Biotechnology Information (NCBI) maintains and curates a publicly available database of annotated genomic, transcript, and protein sequence records (http://www.ncbi.nlm.nih.gov/refseq/). The RefSeq project leverages the data submitted to the International Nucleotide Sequence Database Collaboration (INSDC) against a combination of computation, manual curation, and collaboration to produce a standard set of stable, non-redundant reference sequences. The RefSeq project augments these reference sequences with current knowledge including publications, functional features and informative nomenclature. The database currently represents sequences from more than 55,000 organisms (>4800 viruses, >40,000 prokaryotes and >10,000 eukaryotes; RefSeq release 71), ranging from a single record to complete genomes. This paper summarizes the current status of the viral, prokaryotic, and eukaryotic branches of the RefSeq project, reports on improvements to data access and details efforts to further expand the taxonomic representation of the collection. We also highlight diverse functional curation initiatives that support multiple uses of RefSeq data including taxonomic validation, genome annotation, comparative genomics, and clinical testing. We summarize our approach to utilizing available RNA-Seq and other data types in our manual curation process for vertebrate, plant, and other species, and describe a new direction for prokaryotic genomes and protein name management.


Subject(s)
Databases, Genetic , Genomics , Animals , Cattle , Gene Expression Profiling , Genome, Fungal , Genome, Human , Genome, Microbial , Genome, Plant , Genome, Viral , Genomics/standards , Humans , Invertebrates/genetics , Mice , Molecular Sequence Annotation , Nematoda/genetics , Phylogeny , RNA, Long Noncoding/genetics , Rats , Reference Standards , Sequence Analysis, Protein , Sequence Analysis, RNA , Vertebrates/genetics
4.
Nucleic Acids Res ; 43(Database issue): D571-7, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25428358

ABSTRACT

Recent technological innovations have ignited an explosion in virus genome sequencing that promises to fundamentally alter our understanding of viral biology and profoundly impact public health policy. Yet, any potential benefits from the billowing cloud of next generation sequence data hinge upon well implemented reference resources that facilitate the identification of sequences, aid in the assembly of sequence reads and provide reference annotation sources. The NCBI Viral Genomes Resource is a reference resource designed to bring order to this sequence shockwave and improve usability of viral sequence data. The resource can be accessed at http://www.ncbi.nlm.nih.gov/genome/viruses/ and catalogs all publicly available virus genome sequences and curates reference genome sequences. As the number of genome sequences has grown, so too have the difficulties in annotating and maintaining reference sequences. The rapid expansion of the viral sequence universe has forced a recalibration of the data model to better provide extant sequence representation and enhanced reference sequence products to serve the needs of the various viral communities. This, in turn, has placed increased emphasis on leveraging the knowledge of individual scientific communities to identify important viral sequences and develop well annotated reference virus genome sets.


Subject(s)
Databases, Nucleic Acid , Genome, Viral , High-Throughput Nucleotide Sequencing , Internet , Molecular Sequence Annotation , Software , Viruses/classification
5.
Viruses ; 6(11): 4760-99, 2014 Nov 24.
Article in English | MEDLINE | ID: mdl-25421896

ABSTRACT

In 2014, Ebola virus (EBOV) was identified as the etiological agent of a large and still expanding outbreak of Ebola virus disease (EVD) in West Africa and a much more confined EVD outbreak in Middle Africa. Epidemiological and evolutionary analyses confirmed that all cases of both outbreaks are connected to a single introduction each of EBOV into human populations and that both outbreaks are not directly connected. Coding-complete genomic sequence analyses of isolates revealed that the two outbreaks were caused by two novel EBOV variants, and initial clinical observations suggest that neither of them should be considered strains. Here we present consensus decisions on naming for both variants (West Africa: "Makona", Middle Africa: "Lomela") and provide database-compatible full, shortened, and abbreviated names that are in line with recently established filovirus sub-species nomenclatures.


Subject(s)
Ebolavirus/classification , Hemorrhagic Fever, Ebola/virology , Terminology as Topic , Democratic Republic of the Congo/epidemiology , Disease Outbreaks , Ebolavirus/genetics , Ebolavirus/isolation & purification , Guinea/epidemiology , Hemorrhagic Fever, Ebola/epidemiology , Humans , Phylogeny , RNA, Viral/genetics , Sequence Analysis, DNA
6.
Viruses ; 6(9): 3663-82, 2014 Sep 26.
Article in English | MEDLINE | ID: mdl-25256396

ABSTRACT

Sequence determination of complete or coding-complete genomes of viruses is becoming common practice for supporting the work of epidemiologists, ecologists, virologists, and taxonomists. Sequencing duration and costs are rapidly decreasing, sequencing hardware is under modification for use by non-experts, and software is constantly being improved to simplify sequence data management and analysis. Thus, analysis of virus disease outbreaks on the molecular level is now feasible, including characterization of the evolution of individual virus populations in single patients over time. The increasing accumulation of sequencing data creates a management problem for the curators of commonly used sequence databases and an entry retrieval problem for end users. Therefore, utilizing the data to their fullest potential will require setting nomenclature and annotation standards for virus isolates and associated genomic sequences. The National Center for Biotechnology Information's (NCBI's) RefSeq is a non-redundant, curated database for reference (or type) nucleotide sequence records that supplies source data to numerous other databases. Building on recently proposed templates for filovirus variant naming, we report consensus decisions from a majority of past and currently active filovirus experts on the eight filovirus type variants and isolates to be represented in RefSeq, their final designations, and their associated sequences.


Subject(s)
Databases, Nucleic Acid , Filoviridae/genetics , Evolution, Molecular , Filoviridae/classification , Humans , Selection, Genetic
7.
Virus Res ; 160(1-2): 256-63, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21762736

ABSTRACT

Viruses are most frequently discovered because they cause disease in organisms of importance to humans. To expand knowledge of plant-associated viruses beyond these narrow constraints, non-cultivated plants of the Tallgrass Prairie Preserve, Osage County, Oklahoma, USA were systematically surveyed for evidence of the presence of viruses. This report discusses viruses of the family Tombusviridae putatively identified by the survey. Evidence of two carmoviruses, a tombusvirus, a panicovirus and an unclassifiable tombusvirid was found. The complete genome sequence was obtained for putative TGP carmovirus 1 from the legume Lespedeza procumbens, and the virus was detected in several other plant species including the fern Pellaea atropurpurea. Phylogenetic analysis of the sequence and partial sequence of a related virus supported strongly the placement of these viruses in the genus Carmovirus. Polymorphisms in the sequences suggested existence of two populations of TGP carmovirus 1 in the study area and year-to-year variations in infection by TGP carmovirus 3.


Subject(s)
Plant Diseases/virology , Tombusviridae/classification , Tombusviridae/isolation & purification , Cluster Analysis , Lespedeza/virology , Models, Molecular , Molecular Sequence Data , Nucleic Acid Conformation , Oklahoma , Phylogeny , Pteridaceae/virology , RNA, Viral/genetics , Sequence Analysis, DNA , Tombusviridae/genetics
8.
J Gen Virol ; 91(Pt 1): 74-86, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19759238

ABSTRACT

Viral particles in stool samples from wild-living chimpanzees were analysed using random PCR amplification and sequencing. Sequences encoding proteins distantly related to the replicase protein of single-stranded circular DNA viruses were identified. Inverse PCR was used to amplify and sequence multiple small circular DNA viral genomes. The viral genomes were related in size and genome organization to vertebrate circoviruses and plant geminiviruses but with a different location for the stem-loop structure involved in rolling circle DNA replication. The replicase genes of these viruses were most closely related to those of the much smaller (approximately 1 kb) plant nanovirus circular DNA chromosomes. Because the viruses have characteristics of both animal and plant viruses, we named them chimpanzee stool-associated circular viruses (ChiSCV). Further metagenomic studies of animal samples will greatly increase our knowledge of viral diversity and evolution.


Subject(s)
Animals, Wild/virology , DNA Virus Infections/veterinary , DNA Viruses/isolation & purification , DNA, Circular/genetics , DNA, Viral/genetics , Feces/virology , Pan troglodytes/virology , Amino Acid Sequence , Animals , Circovirus/genetics , DNA Virus Infections/virology , DNA Viruses/genetics , Geminiviridae/genetics , Genes, Viral , Models, Molecular , Molecular Sequence Data , Nanovirus/genetics , Nucleic Acid Conformation , Phylogeny , Polymerase Chain Reaction/methods , Sequence Alignment , Sequence Analysis, DNA , Sequence Homology
9.
J Virol ; 83(22): 12002-6, 2009 Nov.
Article in English | MEDLINE | ID: mdl-19759142

ABSTRACT

A novel picornavirus genome was sequenced, showing 42.6%, 35.2%, and 44.6% of deduced amino acid identities corresponding to the P1, P2, and P3 regions, respectively, of the Aichi virus. Divergent strains of this new virus, which we named salivirus, were detected in 18 stool samples from Nigeria, Tunisia, Nepal, and the United States. A statistical association was seen between virus shedding and unexplained cases of gastroenteritis in Nepal (P = 0.0056). Viruses with approximately 90% nucleotide similarity, named klassevirus, were also recently reported in three cases of unexplained diarrhea from the United States and Australia and in sewage from Spain, reflecting a global distribution and supporting a pathogenic role for this new group of picornaviruses.


Subject(s)
Gastroenteritis/virology , Picornaviridae Infections/virology , Picornaviridae/genetics , Amino Acid Sequence , Base Sequence , Genome, Viral/genetics , Humans , Molecular Sequence Data , Phylogeny , Reverse Transcriptase Polymerase Chain Reaction , Viral Proteins/genetics
10.
J Virol ; 83(9): 4642-51, 2009 May.
Article in English | MEDLINE | ID: mdl-19211756

ABSTRACT

We analyzed viral nucleic acids in stool samples collected from 35 South Asian children with nonpolio acute flaccid paralysis (AFP). Sequence-independent reverse transcription and PCR amplification of capsid-protected, nuclease-resistant viral nucleic acids were followed by DNA sequencing and sequence similarity searches. Limited Sanger sequencing (35 to 240 subclones per sample) identified an average of 1.4 distinct eukaryotic viruses per sample, while pyrosequencing yielded 2.6 viruses per sample. In addition to bacteriophage and plant viruses, we detected known enteric viruses, including rotavirus, adenovirus, picobirnavirus, and human enterovirus species A (HEV-A) to HEV-C, as well as numerous other members of the Picornaviridae family, including parechovirus, Aichi virus, rhinovirus, and human cardiovirus. The viruses with the most divergent sequences relative to those of previously reported viruses included members of a novel Picornaviridae genus and four new viral species (members of the Dicistroviridae, Nodaviridae, and Circoviridae families and the Bocavirus genus). Samples from six healthy contacts of AFP patients were similarly analyzed and also contained numerous viruses, particularly HEV-C, including a potentially novel Enterovirus genotype. Determining the prevalences and pathogenicities of the novel genotypes, species, genera, and potential new viral families identified in this study in different demographic groups will require further studies with different demographic and patient groups, now facilitated by knowledge of these viral genomes.


Subject(s)
Feces/virology , Genome, Viral/genetics , Neurosyphilis/virology , Acute Disease , Adolescent , Asia/epidemiology , Case-Control Studies , Child , Child, Preschool , Enterovirus/classification , Enterovirus/genetics , Enterovirus Infections/epidemiology , Enterovirus Infections/virology , Female , Health , Humans , Infant , Male , Neurosyphilis/blood , Neurosyphilis/epidemiology , Phylogeny , Sequence Analysis, DNA
11.
J Virol ; 83(9): 4631-41, 2009 May.
Article in English | MEDLINE | ID: mdl-19193786

ABSTRACT

Cardioviruses cause enteric infections in mice and rats which when disseminated have been associated with myocarditis, type 1 diabetes, encephalitis, and multiple sclerosis-like symptoms. Cardioviruses have also been detected at lower frequencies in other mammals. The Cardiovirus genus within the Picornaviridae family is currently made up of two viral species, Theilovirus and Encephalomyocarditis virus. Until recently, only a single strain of cardioviruses (Vilyuisk virus within the Theilovirus species) associated with a geographically restricted and prevalent encephalitis-like condition had been reported to occur in humans. A second theilovirus-related cardiovirus (Saffold virus [SAFV]) was reported in 2007 and subsequently found in respiratory secretions from children with respiratory problems and in stools of both healthy and diarrheic children. Using viral metagenomics, we identified RNA fragments related to SAFV in the stools of Pakistani and Afghani children with nonpolio acute flaccid paralysis (AFP). We sequenced three near-full-length genomes, showing the presence of divergent strains of SAFV and preliminary evidence of a distant recombination event between the ancestors of the Theiler-like viruses of rats and those of human SAFV. Further VP1 sequencing showed the presence of five new SAFV genotypes, doubling the reported genetic diversity of human and animal theiloviruses combined. Both AFP patients and healthy children in Pakistan were found to be excreting SAFV at high frequencies of 9 and 12%, respectively. Further studies are needed to examine the roles of these highly common and diverse SAFV genotypes in nonpolio AFP and other human diseases.


Subject(s)
Cardiovirus Infections/epidemiology , Cardiovirus Infections/virology , Cardiovirus/genetics , Cardiovirus/isolation & purification , Genetic Variation/genetics , Intestinal Diseases/epidemiology , Intestinal Diseases/virology , Acute Disease , Amino Acid Sequence , Animals , Asia/epidemiology , Capsid Proteins/chemistry , Capsid Proteins/classification , Capsid Proteins/genetics , Capsid Proteins/metabolism , Cardiovirus/classification , Cardiovirus/metabolism , Case-Control Studies , Child, Preschool , Genome, Viral/genetics , Genotype , Health , Humans , Molecular Sequence Data , Muscle Hypotonia/virology , Phylogeny , Recombination, Genetic/genetics , Sequence Alignment , Sequence Analysis , Sequence Homology, Amino Acid