1.
Nucleic Acids Res ; 47(D1): D84-D88, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30395270

ABSTRACT

The European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena), provided by EMBL-EBI, has for more than three decades been responsible for archiving the world's public sequencing data and presenting this important resource to the scientific community to support and accelerate the global research effort. Here, we outline ENA services and content in 2018 and provide an overview of a selection of focus areas of development work: extending data coordination services around ENA, sequence submissions through template expansion, early pre-submission validation tools and our move towards a new browser and retrieval infrastructure.


Subject(s)
Computational Biology/methods , Databases, Nucleic Acid , Genomics/methods , Europe , Genome , Humans , Molecular Sequence Annotation , Search Engine , Software , Transcriptome , User-Computer Interface , Web Browser
2.
Nucleic Acids Res ; 46(D1): D36-D40, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29140475

ABSTRACT

For 35 years the European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena) has been responsible for making the world's public sequencing data available to the scientific community. Advances in sequencing technology have driven exponential growth in the volume of data to be processed and stored and a substantial broadening of the user community. Here, we outline ENA services and content in 2017 and provide insight into a selection of current key areas of development in ENA driven by challenges arising from the above growth.


Subject(s)
Databases, Nucleic Acid , Computational Biology , Databases, Nucleic Acid/trends , Europe , High-Throughput Nucleotide Sequencing , Humans , Information Storage and Retrieval , Internet , Molecular Sequence Annotation
3.
Plasmid ; 91: 61-67, 2017 05.
Article in English | MEDLINE | ID: mdl-28365184

ABSTRACT

Good annotation of plasmid genomes is essential to maximise the value of the rapidly increasing volume of plasmid sequences. This short review highlights some of the current issues and suggests some ways forward. Where a well-studied related plasmid system exists we recommend that new annotation adheres to the convention already established for that system, so long as it is based on sound principles and solid experimental evidence, even if some of the new genes are more similar to homologues in different systems. Where a well-established model does not exist we provide generic gene names that reflect likely biochemical activity rather than overall purpose particularly, for example, where genes clearly belong to a type IV secretion system but it is not known whether they function in conjugative transfer or virulence. We also recommend that annotators use a whole system naming approach to avoid ending up with an illogical mixture of names from other systems based on the highest scoring match from a BLAST search. In addition, where function has not been experimentally established we recommend using just the locus tag, rather than a function-related gene name, while recording possible functions as notes rather than in a provisional name.


Subject(s)
Conjugation, Genetic , DNA, Bacterial/genetics , Molecular Sequence Annotation/methods , Plasmids/chemistry , Plasmids/classification , Chromosome Mapping , DNA Replication , DNA Transposable Elements , DNA, Bacterial/metabolism , Drug Resistance, Microbial/genetics , Gram-Negative Bacteria/drug effects , Gram-Negative Bacteria/genetics , Gram-Negative Bacteria/metabolism , Gram-Positive Bacteria/drug effects , Gram-Positive Bacteria/genetics , Gram-Positive Bacteria/metabolism , Plasmids/metabolism , Sequence Analysis, DNA , Terminology as Topic
4.
Nucleic Acids Res ; 45(D1): D32-D36, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899630

ABSTRACT

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) offers a rich platform for data sharing, publishing and archiving and a globally comprehensive data set for onward use by the scientific community. With a broad scope spanning raw sequencing reads, genome assemblies and functional annotation, the resource provides extensive data submission, search and download facilities across web and programmatic interfaces. Here, we outline ENA content and major access modalities, highlight major developments in 2016 and outline a number of examples of data reuse from ENA.


Subject(s)
Databases, Nucleic Acid , Sequence Analysis, DNA , Sequence Analysis, RNA , Genomics , Internet , Molecular Sequence Annotation
5.
Article in English | MEDLINE | ID: mdl-26861660

ABSTRACT

Discoverability of sequence data in primary data archives is proportional to the richness of contextual information associated with the data. Here, we describe an exercise in the improvement of contextual information surrounding sample records associated with metagenomics sequence reads available in the European Nucleotide Archive. We outline the annotation process and summarize findings of this effort aimed at increasing usability of publicly available environmental data. Furthermore, we emphasize the benefits of such an exercise and detail its costs. We conclude that such a third party annotation approach is expensive and has value as an element of curation, but should form only part of a more sustainable submitter-driven approach. Database URL: http://www.ebi.ac.uk/ena.


Subject(s)
Computational Biology/economics , Databases, Nucleic Acid/economics , Metagenomics , Data Collection , Ecosystem , Europe , Geography , Humans , Microbiota , Molecular Sequence Annotation , Semantics , Sequence Analysis
6.
Nucleic Acids Res ; 44(D1): D58-66, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26615190

ABSTRACT

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is a repository for the submission, maintenance and presentation of nucleotide sequence data and related sample and experimental information. In this article we report on ENA in 2015 regarding general activity, notable published data sets and major achievements. This is followed by a focus on sustainable biocuration of functional annotation, an area which has particularly felt the pressure of sequencing growth. The importance of functional annotation, how it can be submitted and the shifting role of the biocurator in the context of increasing volumes of data are all discussed.


Subject(s)
Databases, Nucleic Acid , Molecular Sequence Annotation , Sequence Analysis, DNA , Sequence Analysis, RNA , Data Curation
7.
J Med Microbiol ; 64(8): 869-878, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26272054

ABSTRACT

Plasmid-mediated quinolone resistance (PMQR) refers to a family of closely related genes that confer decreased susceptibility to fluoroquinolones. PMQR genes are generally associated with integrons and/or plasmids that carry additional antimicrobial resistance genes active against a range of antimicrobials. In Ho Chi Minh City (HCMC), Vietnam, we have previously shown a high frequency of PMQR genes within commensal Enterobacteriaceae. However, there are limited available sequence data detailing the genetic context in which the PMQR genes reside, and a lack of understanding of how these genes spread across the Enterobacteriaceae. Here, we aimed to determine the genetic background facilitating the spread and maintenance of qnrS1, the dominant PMQR gene circulating in HCMC. We sequenced three qnrS1-carrying plasmids in their entirety to understand the genetic context of these qnrS1-embedded plasmids and also the association of qnrS1-mediated quinolone resistance with other antimicrobial resistance phenotypes. Annotation of the three qnrS1-containing plasmids revealed a qnrS1-containing transposon with a closely related structure. We screened 112 qnrS1-positive commensal Enterobacteriaceae isolated in the community and in a hospital in HCMC to detect the common transposon structure. We found the same transposon structure to be present in 71.4 % (45/63) of qnrS1-positive hospital isolates and in 36.7 % (18/49) of qnrS1-positive isolates from the community. The resulting sequence analysis of the qnrS1 environment suggested that qnrS1 genes are widely distributed and are mobilized on elements with a common genetic background. Our data add additional insight into mechanisms that facilitate resistance to multiple antimicrobials in Gram-negative bacteria in Vietnam.


Subject(s)
DNA Transposable Elements , Drug Resistance, Multiple, Bacterial , Enterobacteriaceae/genetics , Plasmids , Anti-Bacterial Agents/pharmacology , Community-Acquired Infections/microbiology , Cross Infection/microbiology , DNA, Bacterial/chemistry , DNA, Bacterial/genetics , Enterobacteriaceae/drug effects , Enterobacteriaceae/isolation & purification , Enterobacteriaceae Infections/microbiology , Humans , Interspersed Repetitive Sequences , Molecular Sequence Data , Quinolones/pharmacology , Sequence Analysis, DNA , Vietnam
8.
Nucleic Acids Res ; 43(Database issue): D23-9, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25404130

ABSTRACT

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is Europe's primary resource for nucleotide sequence information. With the growing volume and diversity of public sequencing data comes the need for increased sophistication in data organisation, presentation and search services so as to maximise its discoverability and usability. In response to this, ENA has been introducing and improving checklists for use during submission and expanding its search facilities to provide targeted search results. Here, we give a brief update on ENA content and some major developments undertaken in data submission services during 2014. We then describe in more detail the services we offer for data discovery and retrieval.


Subject(s)
Databases, Nucleic Acid , Base Sequence , Genomics , Molecular Sequence Annotation , Sequence Analysis
9.
Genome Announc ; 2(3)2014 May 29.
Article in English | MEDLINE | ID: mdl-24874665

ABSTRACT

The complete genomes of two virulent phages infecting Citrobacter rodentium are reported here for the first time. Both bacteriophages were isolated from local sewage treatment plant effluents. Genome analyses revealed a close relationship between both phages and allowed their classification as members of the Autographivirinae subfamily in the T7-like genus.

10.
Nucleic Acids Res ; 42(Database issue): D38-43, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24214989

ABSTRACT

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is a repository for the world public domain nucleotide sequence data output. ENA content covers a spectrum of data types including raw reads, assembly data and functional annotation. ENA has faced a dramatic growth in genome assembly submission rates, data volumes and complexity of datasets. This has prompted a broad reworking of assembly submission services, for which we now reach the end of a major programme of work and many enhancements have already been made available over the year to components of the submission service. In this article, we briefly review ENA content and growth over 2013, describe our rapidly developing services for genome assembly information and outline further major developments over the last year.


Subject(s)
Databases, Nucleic Acid , Genomics , Europe , Internet
11.
ISME J ; 7(1): 148-60, 2013 Jan.
Article in English | MEDLINE | ID: mdl-22955231

ABSTRACT

Bacteriovorax marinus SJ is a predatory delta-proteobacterium isolated from a marine environment. The genome sequence of this strain provides an interesting contrast to that of the terrestrial predatory bacterium Bdellovibrio bacteriovorus HD100. Based on their predatory lifestyle, Bacteriovorax were originally designated as members of the genus Bdellovibrio but subsequently were re-assigned to a new genus and family based on genetic and phenotypic differences. B. marinus attaches to gram-negative bacteria, penetrates through the cell wall to form a bdelloplast, in which it replicates, as shown using microscopy. Bacteriovorax is distinct, as it shares only 30% of its gene products with its closest sequenced relatives. Remarkably, 34% of predicted genes over 500 nt in length were completely unique with no significant matches in the databases. As expected, Bacteriovorax shares several characteristic loci with the other delta-proteobacteria. A geneset shared between Bacteriovorax and Bdellovibrio that is not conserved among other delta-proteobacteria such as Myxobacteria (which destroy prey bacteria externally via lysis), or the non-predatory Desulfo-bacteria and Geobacter species was identified. These 291 gene orthologues common to both Bacteriovorax and Bdellovibrio may be the key indicators of host-interaction predatory-specific processes required for prey entry. The locus from Bdellovibrio bacteriovorus is implicated in the switch from predatory to prey/host-independent growth. Although the locus is conserved in B. marinus, the sequence has only limited similarity. The results of this study advance understanding of both the similarities and differences between Bdellovibrio and Bacteriovorax and confirm the distant relationship between the two and their separation into different families.


Subject(s)
Bdellovibrio/genetics , Deltaproteobacteria/genetics , Seawater/microbiology , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Bdellovibrio/classification , Bdellovibrio/physiology , Deltaproteobacteria/classification , Deltaproteobacteria/physiology , Food Chain , Gene Expression Regulation, Bacterial , Genome, Bacterial , Molecular Sequence Data , Phylogeny , Plasmids
12.
Nucleic Acids Res ; 41(Database issue): D30-5, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23203883

ABSTRACT

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena/) collects, maintains and presents comprehensive nucleic acid sequence and related information as part of the permanent public scientific record. Here, we provide brief updates on ENA content developments and major service enhancements in 2012 and describe in more detail two important areas of development and policy that are driven by ongoing growth in sequencing technologies. First, we describe the ENA data warehouse, a resource for which we provide a programmatic entry point to integrated content across the breadth of ENA. Second, we detail our plans for the deployment of CRAM data compression technology in ENA.


Subject(s)
Base Sequence , Databases, Nucleic Acid , Data Compression , Genomics , High-Throughput Nucleotide Sequencing , Internet , User-Computer Interface
13.
Nucleic Acids Res ; 40(Database issue): D43-7, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22080548

ABSTRACT

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena), Europe's primary nucleotide sequence resource, captures and presents globally comprehensive nucleic acid sequence and associated information. Covering the spectrum from raw data to assembled and functionally annotated genomes, the ENA has witnessed a dramatic growth resulting from advances in sequencing technology and ever broadening application of the methodology. During 2011, we have continued to operate and extend the broad range of ENA services. In particular, we have released major new functionality in our interactive web submission system, Webin, through developments in template-based submissions for annotated sequences and support for raw next-generation sequence read submissions.


Subject(s)
Databases, Nucleic Acid , Sequence Analysis, DNA , Sequence Analysis, RNA , Genomics , High-Throughput Nucleotide Sequencing , Internet , Molecular Sequence Annotation , Software , User-Computer Interface
14.
Nucleic Acids Res ; 39(Database issue): D28-31, 2011 Jan.
Article in English | MEDLINE | ID: mdl-20972220

ABSTRACT

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is Europe's primary nucleotide-sequence repository. The ENA consists of three main databases: the Sequence Read Archive (SRA), the Trace Archive and EMBL-Bank. The objective of ENA is to support and promote the use of nucleotide sequencing as an experimental research platform by providing data submission, archive, search and download services. In this article, we outline these services and describe major changes and improvements introduced during 2010. These include extended EMBL-Bank and SRA-data submission services, extended ENA Browser functionality, support for submitting data to the European Genome-phenome Archive (EGA) through SRA, and the launch of a new sequence similarity search service.


Subject(s)
Base Sequence , Databases, Nucleic Acid , Europe , High-Throughput Nucleotide Sequencing , Molecular Sequence Annotation
15.
Microbiology (Reading) ; 156(Pt 11): 3255-3269, 2010 Nov.
Article in English | MEDLINE | ID: mdl-20829291

ABSTRACT

Comparison of the complete genome sequence of Bacteroides fragilis 638R, originally isolated in the USA, was made with two previously sequenced strains isolated in the UK (NCTC 9343) and Japan (YCH46). The presence of 10 loci containing genes associated with polysaccharide (PS) biosynthesis, each including a putative Wzx flippase and Wzy polymerase, was confirmed in all three strains, despite a lack of cross-reactivity between NCTC 9343 and 638R surface PS-specific antibodies by immunolabelling and microscopy. Genomic comparisons revealed an exceptional level of PS biosynthesis locus diversity. Of the 10 divergent PS-associated loci apparent in each strain, none is similar between NCTC 9343 and 638R. YCH46 shares one locus with NCTC 9343, confirmed by mAb labelling, and a second different locus with 638R, making a total of 28 divergent PS biosynthesis loci amongst the three strains. The lack of expression of the phase-variable large capsule (LC) in strain 638R, observed in NCTC 9343, is likely to be due to a point mutation that generates a stop codon within a putative initiating glycosyltransferase, necessary for the expression of the LC in NCTC 9343. Other major sequence differences were observed to arise from different numbers and variety of inserted extra-chromosomal elements, in particular prophages. Extensive horizontal gene transfer has occurred within these strains, despite the presence of a significant number of divergent DNA restriction and modification systems that act to prevent acquisition of foreign DNA. The level of amongst-strain diversity in PS biosynthesis loci is unprecedented.


Subject(s)
Bacterial Capsules/genetics , Bacteroides fragilis/genetics , Genetic Variation , Genome, Bacterial , Bacterial Capsules/biosynthesis , Bacteroides fragilis/isolation & purification , Comparative Genomic Hybridization , DNA, Bacterial/genetics , Humans , Molecular Sequence Data , Sequence Analysis, DNA
16.
PLoS Negl Trop Dis ; 4(6): e702, 2010 Jun 08.
Article in English | MEDLINE | ID: mdl-20544028

ABSTRACT

BACKGROUND: Plasmid mediated antimicrobial resistance in the Enterobacteriaceae is a global problem. The rise of CTX-M class extended spectrum beta lactamases (ESBLs) has been well documented in industrialized countries. Vietnam is representative of a typical transitional middle income country where the spectrum of infectious diseases combined with the spread of drug resistance is shifting and bringing new healthcare challenges. METHODOLOGY: We collected hospital admission data from the pediatric population attending the hospital for tropical diseases in Ho Chi Minh City with Shigella infections. Organisms were cultured from all enrolled patients and subjected to antimicrobial susceptibility testing. Those that were ESBL positive were subjected to further investigation. These investigations included PCR amplification for common ESBL genes, plasmid investigation, conjugation, microarray hybridization and DNA sequencing of a bla(CTX-M) encoding plasmid. PRINCIPAL FINDINGS: We show that two different bla(CTX-M) genes are circulating in this bacterial population in this location. Sequence of one of the ESBL plasmids shows that rather than the gene being integrated into a preexisting MDR plasmid, the bla(CTX-M) gene is located on a relatively simple conjugative plasmid. The sequenced plasmid (pEG356) carried the bla(CTX-M-24) gene on an ISEcp1 element and demonstrated considerable sequence homology with other IncFI plasmids. SIGNIFICANCE: The rapid dissemination, spread of antimicrobial resistance and changing population of Shigella spp. concurrent with economic growth are pertinent to many other countries undergoing similar development. Third generation cephalosporins are commonly used empiric antibiotics in Ho Chi Minh City. We recommend that these agents should not be considered for therapy of dysentery in this setting.


Subject(s)
Bacterial Proteins/genetics , Drug Resistance, Bacterial/genetics , Dysentery, Bacillary/epidemiology , Plasmids/genetics , Shigella/genetics , beta-Lactamases/genetics , Adolescent , Ceftriaxone/pharmacology , Child , Child, Preschool , Cluster Analysis , Humans , Infant , Infant, Newborn , Polymerase Chain Reaction , Sequence Analysis, DNA , Shigella/drug effects , Shigella/pathogenicity , Vietnam
17.
J Bacteriol ; 192(2): 525-38, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19897651

ABSTRACT

Citrobacter rodentium (formerly Citrobacter freundii biotype 4280) is a highly infectious pathogen that causes colitis and transmissible colonic hyperplasia in mice. In common with enteropathogenic and enterohemorrhagic Escherichia coli (EPEC and EHEC, respectively), C. rodentium exploits a type III secretion system (T3SS) to induce attaching and effacing (A/E) lesions that are essential for virulence. Here, we report the fully annotated genome sequence of the 5.3-Mb chromosome and four plasmids harbored by C. rodentium strain ICC168. The genome sequence revealed key information about the phylogeny of C. rodentium and identified 1,585 C. rodentium-specific (without orthologues in EPEC or EHEC) coding sequences, 10 prophage-like regions, and 17 genomic islands, including the locus for enterocyte effacement (LEE) region, which encodes a T3SS and effector proteins. Among the 29 T3SS effectors found in C. rodentium are all 22 of the core effectors of EPEC strain E2348/69. In addition, we identified a novel C. rodentium effector, named EspS. C. rodentium harbors two type VI secretion systems (T6SS) (CTS1 and CTS2), while EHEC contains only one T6SS (EHS). Our analysis suggests that C. rodentium and EPEC/EHEC have converged on a common host infection strategy through access to a common pool of mobile DNA and that C. rodentium has lost gene functions associated with a previous pathogenic niche.


Subject(s)
Citrobacter rodentium/genetics , Escherichia coli/genetics , Evolution, Molecular , Genome, Bacterial/genetics , Animals , Citrobacter rodentium/classification , Computational Biology , Humans , Male , Mice , Molecular Sequence Data , Phylogeny
18.
Nat Rev Microbiol ; 7(6): 408-9, 2009 Jun.
Article in English | MEDLINE | ID: mdl-19444245

ABSTRACT

This month's Genome Watch looks at the publication of four hyperthermophilic archaeal genomes, three of which belong to the Crenarchaeota phylum and one of which belongs to the newly defined Nanoarchaeota phylum.


Subject(s)
Genome, Archaeal/genetics , Crenarchaeota/genetics , Crenarchaeota/growth & development , Crenarchaeota/physiology , Crenarchaeota/ultrastructure , Nanoarchaeota/genetics , Nanoarchaeota/growth & development , Nanoarchaeota/physiology , Nanoarchaeota/ultrastructure
19.
Genome Biol ; 10(5): R51, 2009.
Article in English | MEDLINE | ID: mdl-19432983

ABSTRACT

BACKGROUND: Pseudomonas fluorescens are common soil bacteria that can improve plant health through nutrient cycling, pathogen antagonism and induction of plant defenses. The genome sequences of strains SBW25 and Pf0-1 were determined and compared to each other and with P. fluorescens Pf-5. A functional genomic in vivo expression technology (IVET) screen provided insight into genes used by P. fluorescens in its natural environment and an improved understanding of the ecological significance of diversity within this species. RESULTS: Comparisons of three P. fluorescens genomes (SBW25, Pf0-1, Pf-5) revealed considerable divergence: 61% of genes are shared, the majority located near the replication origin. Phylogenetic and average amino acid identity analyses showed a low overall relationship. A functional screen of SBW25 defined 125 plant-induced genes including a range of functions specific to the plant environment. Orthologues of 83 of these exist in Pf0-1 and Pf-5, with 73 shared by both strains. The P. fluorescens genomes carry numerous complex repetitive DNA sequences, some resembling Miniature Inverted-repeat Transposable Elements (MITEs). In SBW25, repeat density and distribution revealed 'repeat deserts' lacking repeats, covering approximately 40% of the genome. CONCLUSIONS: P. fluorescens genomes are highly diverse. Strain-specific regions around the replication terminus suggest genome compartmentalization. The genomic heterogeneity among the three strains is reminiscent of a species complex rather than a single species. That 42% of plant-inducible genes were not shared by all strains reinforces this conclusion and shows that ecological success requires specialized and core functions. The diversity also indicates the significant size of genetic information within the Pseudomonas pan genome.


Subject(s)
Ecosystem , Genome, Bacterial , Plants/microbiology , Pseudomonas fluorescens/genetics , Plants/metabolism , Pseudomonas fluorescens/classification , Pseudomonas fluorescens/metabolism
20.
Methods Mol Biol ; 502: 57-89, 2009.
Article in English | MEDLINE | ID: mdl-19082552

ABSTRACT

One of the most satisfying aspects of a genome sequencing project is the identification of the genes contained within it. These are of two types: those which encode tRNAs and those which produce proteins. After a general introduction on the properties of protein-encoding genes and the utility of the Basic Local Alignment Search Tool (BLASTX) to identify genes through homologs, a variety of tools are discussed by their creators. These include for genome annotation: GeneMark, Artemis, and BASys; and, for genome comparisons: Artemis Comparison Tool (ACT), Mauve, CoreGenes, and GeneOrder.


Subject(s)
Bacteriophages/genetics , Computational Biology/methods , DNA, Viral/genetics , DNA, Viral/analysis , Software