Results 1 - 20 of 72
1.
Nature ; 584(7821): 403-409, 2020 08.
Article in English | MEDLINE | ID: mdl-32760000

ABSTRACT

The tuatara (Sphenodon punctatus), the only living member of the reptilian order Rhynchocephalia (Sphenodontia), once widespread across Gondwana [1,2], is an iconic species that is endemic to New Zealand [2,3]. A key link to the now-extinct stem reptiles (from which dinosaurs, modern reptiles, birds and mammals evolved), the tuatara provides key insights into the ancestral amniotes [2,4]. Here we analyse the genome of the tuatara, which, at approximately 5 Gb, is among the largest of the vertebrate genomes yet assembled. Our analyses of this genome, along with comparisons with other vertebrate genomes, reinforce the uniqueness of the tuatara. Phylogenetic analyses indicate that the tuatara lineage diverged from that of snakes and lizards around 250 million years ago. This lineage also shows moderate rates of molecular evolution, with instances of punctuated evolution. Our genome sequence analysis identifies expansions of proteins, non-protein-coding RNA families and repeat elements, the latter of which show an amalgam of reptilian and mammalian features. The sequencing of the tuatara genome provides a valuable resource for deep comparative analyses of tetrapods, as well as for tuatara biology and conservation. Our study also provides important insights into both the technical challenges and the cultural obligations that are associated with genome sequencing.


Subject(s)
Evolution, Molecular , Genome/genetics , Phylogeny , Reptiles/genetics , Animals , Conservation of Natural Resources/trends , Female , Genetics, Population , Lizards/genetics , Male , Molecular Sequence Annotation , New Zealand , Sex Characteristics , Snakes/genetics , Synteny
3.
Nucleic Acids Res ; 49(W1): W654-W661, 2021 07 02.
Article in English | MEDLINE | ID: mdl-33744969

ABSTRACT

Experiments that are planned using accurate prediction algorithms will mitigate failures in recombinant protein production. We have developed TISIGNER (https://tisigner.com) with the aim of addressing technical challenges to recombinant protein production. We offer three web services, TIsigner (Translation Initiation coding region designer), SoDoPE (Soluble Domain for Protein Expression) and Razor, which are specialised in synonymous optimisation of recombinant protein expression, solubility and signal peptide analysis, respectively. Importantly, TIsigner, SoDoPE and Razor are linked, which allows users to switch between the tools when optimising genes of interest.


Subject(s)
Recombinant Proteins/biosynthesis , Software , Internet , Peptide Chain Initiation, Translational , Protein Sorting Signals , Recombinant Proteins/chemistry , Recombinant Proteins/genetics , Solubility
4.
PLoS Comput Biol ; 17(10): e1009461, 2021 10.
Article in English | MEDLINE | ID: mdl-34610008

ABSTRACT

Recombinant protein production is a key process in generating proteins of interest in the pharmaceutical industry and biomedical research. However, about 50% of recombinant proteins fail to be expressed in a variety of host cells. Here we show that the accessibility of translation initiation sites, modelled using mRNA base-unpairing across the Boltzmann ensemble, significantly outperforms alternative features. This approach accurately predicts the successes or failures of expression experiments, which utilised Escherichia coli cells to express 11,430 recombinant proteins from over 189 diverse species. On this basis, we develop TIsigner, which uses simulated annealing to modify up to the first nine codons of mRNAs with synonymous substitutions. We show that accessibility captures the key propensity beyond the target region (initiation sites in this case), as a modest number of synonymous changes is sufficient to tune the recombinant protein expression levels. We build a stochastic simulation model and show that higher accessibility leads to higher protein production and slower cell growth, supporting the idea of protein cost, where cell growth is constrained by protein circuits during overexpression.
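
The simulated-annealing step described in this abstract can be illustrated with a minimal sketch. It is not the TIsigner implementation: the objective used here (`toy_cost`, the GC fraction near the start of the sequence) is a hypothetical stand-in for the accessibility score the abstract refers to, and the example sequence, cooling schedule and parameter names are all assumptions made for illustration. Only the overall idea, synonymous substitutions restricted to the first nine codons and accepted or rejected by an annealing rule, comes from the abstract.

```python
import math
import random

# Standard genetic code, built from the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}
SYNONYMS = {}
for codon, aa in CODON_TO_AA.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(seq):
    """Translate a DNA coding sequence whose length is a multiple of 3."""
    return "".join(CODON_TO_AA[seq[i:i + 3]] for i in range(0, len(seq), 3))


def toy_cost(seq, window=30):
    """Hypothetical placeholder objective: GC fraction of the first `window` bases.

    This is NOT the accessibility measure from the abstract; it only gives
    the annealer something to minimise.
    """
    head = seq[:window]
    return sum(base in "GC" for base in head) / len(head)


def anneal(seq, n_codons=9, steps=500, t_start=1.0, t_end=0.01, seed=0):
    """Minimise toy_cost using synonymous changes in the first n_codons codons."""
    rng = random.Random(seed)
    current = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    current_cost = toy_cost(seq)
    best, best_cost = current[:], current_cost
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        pos = rng.randrange(min(n_codons, len(current)))
        candidate = current[:]
        candidate[pos] = rng.choice(SYNONYMS[CODON_TO_AA[current[pos]]])
        cost = toy_cost("".join(candidate))
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if cost <= current_cost or rng.random() < math.exp((current_cost - cost) / temp):
            current, current_cost = candidate, cost
            if cost < best_cost:
                best, best_cost = candidate[:], cost
    return "".join(best), best_cost


# Hypothetical 13-codon coding sequence, for illustration only.
seq = "ATGGCCGGCAGCCTGCGCGGCCCGGCGAGCACCGGTTAA"
optimised, cost = anneal(seq)
```

Because every move swaps a codon for one encoding the same amino acid, the protein sequence is unchanged by construction; only the nucleotide sequence near the start varies.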


Subject(s)
Codon, Initiator/genetics , Codon, Terminator/genetics , Recombinant Proteins/chemistry , Recombinant Proteins/genetics , Silent Mutation/genetics , Computational Biology
5.
Environ Microbiol ; 23(9): 5289-5304, 2021 09.
Article in English | MEDLINE | ID: mdl-33989447

ABSTRACT

Some Serratia entomophila isolates have been successfully exploited in biopesticides due to their ability to cause amber disease in larvae of the Aotearoa (New Zealand) endemic pasture pest, Costelytra giveni. Anti-feeding prophage and ABC toxin complex virulence determinants are encoded by a 153-kb single-copy conjugative plasmid (pADAP; amber disease-associated plasmid). Despite growing understanding of the S. entomophila pADAP model plasmid, little is known about the wider plasmid family. Here, we sequence and analyse mega-plasmids from 50 Serratia isolates that induce variable disease phenotypes in the C. giveni insect host. Mega-plasmids are highly conserved within S. entomophila, but show considerable divergence in Serratia proteamaculans, with other variants in S. liquefaciens and S. marcescens, likely reflecting niche adaptation. In this study, which reconstructs ancestral relationships for a complex mega-plasmid system, strong co-evolution between Serratia species and their plasmids was found. We identify 12 distinct mega-plasmid genotypes, all sharing a conserved gene backbone, but encoding highly variable accessory regions including virulence factors, secondary metabolite biosynthesis, nitrogen fixation genes and toxin-antitoxin systems. We show that the variable pathogenicity of Serratia isolates is largely caused by presence/absence of virulence clusters on the mega-plasmids, but notably, is augmented by external chromosomally encoded factors.


Subject(s)
Coleoptera , Animals , Larva , Plasmids/genetics , Prophages/genetics , Virulence/genetics
6.
Bioinformatics ; 36(18): 4691-4698, 2020 09 15.
Article in English | MEDLINE | ID: mdl-32559287

ABSTRACT

MOTIVATION: Recombinant protein production is a widely used technique in the biotechnology and biomedical industries, yet only a quarter of target proteins are soluble and can therefore be purified. RESULTS: We have discovered that global structural flexibility, which can be modeled by normalized B-factors, accurately predicts the solubility of 12 216 recombinant proteins expressed in Escherichia coli. We have optimized these B-factors, and derived a new set of values for solubility scoring that further improves prediction accuracy. We call this new predictor the 'Solubility-Weighted Index' (SWI). Importantly, SWI outperforms many existing protein solubility prediction tools. Furthermore, we have developed 'SoDoPE' (Soluble Domain for Protein Expression), a web interface that allows users to choose a protein region of interest for predicting and maximizing both protein expression and solubility. AVAILABILITY AND IMPLEMENTATION: The SoDoPE web server and source code are freely available at https://tisigner.com/sodope and https://github.com/Gardner-BinfLab/TISIGNER-ReactJS, respectively. The code and data for reproducing our analysis can be found at https://github.com/Gardner-BinfLab/SoDoPE_paper_2020. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Proteins , Software , Computers , Escherichia coli/genetics , Solubility
7.
PLoS Genet ; 14(5): e1007333, 2018 05.
Article in English | MEDLINE | ID: mdl-29738521

ABSTRACT

Emerging pathogens are a major threat to public health; however, understanding how pathogens adapt to new niches remains a challenge. New methods are urgently required to provide functional insights into pathogens from the massive genomic data sets now being generated from routine pathogen surveillance for epidemiological purposes. Here, we measure the burden of atypical mutations in protein-coding genes across independently evolved Salmonella enterica lineages, and use these as input to train a random forest classifier to identify strains associated with extraintestinal disease. Members of the species fall along a continuum, from pathovars that cause gastrointestinal infection and low mortality, associated with a broad host range, to those that cause invasive infection and high mortality, associated with a narrowed host range. Our random forest classifier learned to perfectly discriminate long-established gastrointestinal and invasive serovars of Salmonella. Additionally, it was able to discriminate recently emerged Salmonella Enteritidis and Typhimurium lineages associated with invasive disease in immunocompromised populations in sub-Saharan Africa, and within-host adaptation to invasive infection. We dissect the architecture of the model to identify the genes that were most informative of phenotype, revealing a common theme of degradation of metabolic pathways in extraintestinal lineages. This approach accurately identifies patterns of gene degradation and diversifying selection specific to invasive serovars that have been captured by more labour-intensive investigations, but can be readily scaled to larger analyses.


Subject(s)
Adaptation, Physiological/genetics , Bacterial Proteins/genetics , Machine Learning , Salmonella enterica/genetics , Animals , Host Specificity , Humans , Mutation , Phylogeny , Salmonella Infections/microbiology , Salmonella Infections, Animal/microbiology , Salmonella enterica/classification , Salmonella enterica/pathogenicity , Virulence/genetics
8.
Mol Biol Evol ; 35(6): 1451-1462, 2018 06 01.
Article in English | MEDLINE | ID: mdl-29617896

ABSTRACT

Mammalian diversification has coincided with a rapid proliferation of various types of noncoding RNAs, including members of both snRNAs and snoRNAs. The significance of this expansion, however, remains obscure. While some ncRNA copy-number expansions have been linked to functionally tractable effects, such events may equally likely be neutral, perhaps as a result of random retrotransposition. Hindering progress in our understanding of such observations is the difficulty in establishing function for the diverse features that have been identified in our own genome. Projects such as ENCODE and FANTOM have revealed a hidden world of genomic expression patterns, as well as a host of other potential indicators of biological function. However, such projects have been criticized, particularly by practitioners in the field of molecular evolution, where many suspect these data provide limited insight into biological function. The molecular evolution community has largely taken a skeptical view; thus, it is important to establish tests of function. We use a range of data, including data drawn from ENCODE and FANTOM, to examine the case for function for the recent copy-number expansion in mammals of six evolutionarily ancient RNA families involved in splicing and rRNA maturation. We use several criteria to assess evidence for function: conservation of sequence and structure, genomic synteny, evidence for transposition, and evidence for species-specific expression. Applying these criteria, we find that only a minority of loci show strong evidence for function and that, for the majority, we cannot reject the null hypothesis of no function.


Subject(s)
DNA Transposable Elements , Gene Dosage , Gene Expression , Mammals/genetics , RNA, Small Nuclear , Animals , Databases, Genetic , Evolution, Molecular , Genomics , Multigene Family , RNA Splicing
9.
Biochem Soc Trans ; 47(2): 527-539, 2019 04 30.
Article in English | MEDLINE | ID: mdl-30837318

ABSTRACT

Understanding how new genes originate and integrate into cellular networks is key to understanding evolution. Bacteria present unique opportunities for both the natural history and experimental study of gene origins, due to their large effective population sizes, rapid generation times, and ease of genetic manipulation. Bacterial small non-coding RNAs (sRNAs), in particular, many of which operate through a simple antisense regulatory logic, may serve as tractable models for exploring processes of gene origin and adaptation. Understanding how and on what timescales these regulatory molecules arise has important implications for understanding the evolution of bacterial regulatory networks, in particular, for the design of comparative studies of sRNA function. Here, we introduce relevant concepts from evolutionary biology and review recent work that has begun to shed light on the timescales and processes through which non-functional transcriptional noise is co-opted to provide regulatory functions. We explore possible scenarios for sRNA origin, focusing on the co-option, or exaptation, of existing genomic structures which may provide protected spaces for sRNA evolution.


Subject(s)
Bacteria/genetics , RNA, Bacterial/genetics , RNA, Small Untranslated/genetics , Evolution, Molecular , Gene Expression Regulation, Bacterial/genetics , Genome, Bacterial/genetics , Phylogeny
10.
Bioinformatics ; 33(7): 988-996, 2017 04 01.
Article in English | MEDLINE | ID: mdl-27993777

ABSTRACT

Motivation: The aim of this study is to assess the performance of RNA-RNA interaction prediction tools for all domains of life. Results: Minimum free energy (MFE) and alignment methods constitute most of the current RNA interaction prediction algorithms. The MFE tools that incorporate accessibility into the final predicted binding energy (i.e. RNAup, IntaRNA and RNAplex) have better true positive rates (TPRs) and high positive predictive values (PPVs) in all datasets compared with other methods. They can also differentiate almost half of the native interactions from background. Algorithms that include the effects of internal binding energies in their models, and alignment methods, seem to have high TPRs but relatively low associated PPVs compared to accessibility-based methods. Availability and Implementation: We shared our wrapper scripts and datasets on GitHub (github.com/UCanCompBio/RNA_Interactions_Benchmark). All parameters are documented for personal use. Contact: sinan.umu@pg.canterbury.ac.nz. Supplementary information: Supplementary data are available at Bioinformatics online.
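
For readers unfamiliar with the two metrics named in this abstract, the standard definitions are TPR = TP / (TP + FN) and PPV = TP / (TP + FP). The short sketch below computes both from a set of predicted and a set of known (native) interactions. The function name and the sRNA/mRNA pair labels are hypothetical and are not taken from the benchmark's scripts.

```python
def tpr_ppv(predicted, native):
    """Return (TPR, PPV) for predicted versus known interacting pairs."""
    predicted, native = set(predicted), set(native)
    tp = len(predicted & native)   # correctly predicted interactions
    fp = len(predicted - native)   # predictions absent from the known set
    fn = len(native - predicted)   # known interactions that were missed
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    return tpr, ppv


# Hypothetical interaction pairs, for illustration only.
native = {("sRNA_A", "mRNA_1"), ("sRNA_A", "mRNA_2"),
          ("sRNA_B", "mRNA_3"), ("sRNA_C", "mRNA_4")}
predicted = {("sRNA_A", "mRNA_1"), ("sRNA_B", "mRNA_3"), ("sRNA_B", "mRNA_9")}

tpr, ppv = tpr_ppv(predicted, native)
```

A method can score a high TPR simply by predicting many interactions, which is why the abstract reports PPV alongside it.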


Subject(s)
Algorithms , Benchmarking , RNA/metabolism , Bacteria/genetics , Databases, Genetic , Models, Theoretical , RNA/chemistry , Sequence Analysis, RNA
11.
BMC Genomics ; 18(1): 795, 2017 Oct 17.
Article in English | MEDLINE | ID: mdl-29041914

ABSTRACT

BACKGROUND: The New Zealand collembolan genus Holacanthella contains the largest species of springtails (Collembola) in the world. Using Illumina technology we have sequenced and assembled a draft genome and transcriptome from Holacanthella duospinosa (Salmon). We have used this annotated assembly to investigate the genetic basis of a range of traits critical to the evolution of the Hexapoda, the phylogenetic position of H. duospinosa and potential horizontal gene transfer events. RESULTS: Our genome assembly was ~375 Mbp in size with a scaffold N50 of ~230 Kbp and sequencing coverage of ~180×. DNA elements, LTRs, simple repeats and LINEs formed the largest components, and SINEs were very rare. Phylogenomics (370,877 amino acids) placed H. duospinosa within the Neanuridae. We recovered orthologs of the conserved genes thought to play a role in sex determination. Analysis of CpG content suggested the absence of DNA methylation, and consistent with this we were unable to detect orthologs of the DNA methyltransferase enzymes. The small subunit rRNA gene contained a possible retrotransposon. The Hox gene complex was broken over two scaffolds. For chemosensory ability, at least 15 ionotropic glutamate receptors and 18 gustatory receptors were identified. However, we were unable to identify any odorant receptors or their obligate co-receptor Orco. Twenty-three chitinase-like genes were identified from the assembly. Members of this multigene family may play roles in the digestion of fungal cell walls, a common food source for these saproxylic organisms. We also detected 59 and 96 genes that matched bacteria and fungi, respectively, in BLAST searches, but were located on scaffolds that otherwise contained arthropod genes. CONCLUSIONS: The genome of H. duospinosa contains some unusual features including a Hox complex broken over two scaffolds, in a different manner to other arthropod species, a lack of odorant receptor genes and an apparent lack of environmentally responsive DNA methylation, unlike many other arthropods. Our detection of horizontal gene transfer candidates confirms that this phenomenon is occurring across Collembola. These findings allow us to narrow down the regions of the arthropod phylogeny where key innovations have occurred that have facilitated the evolutionary success of Hexapoda.


Subject(s)
Arthropods/genetics , Evolution, Molecular , Genomics , Animals , Arthropods/growth & development , Arthropods/metabolism , Chitinases/genetics , DNA Methylation , Gene Expression Profiling , Gene Transfer, Horizontal , Molecular Sequence Annotation , Phylogeny , Sex Determination Processes/genetics
12.
Bioinformatics ; 32(23): 3566-3574, 2016 12 01.
Article in English | MEDLINE | ID: mdl-27503221

ABSTRACT

MOTIVATION: Next generation sequencing technologies have provided us with a wealth of information on genetic variation, but predicting the functional significance of this variation is a difficult task. While many comparative genomics studies have focused on gene flux and large scale changes, relatively little attention has been paid to quantifying the effects of single nucleotide polymorphisms and indels on protein function, particularly in bacterial genomics. RESULTS: We present a hidden Markov model based approach we call delta-bitscore (DBS) for identifying orthologous proteins that have diverged at the amino acid sequence level in a way that is likely to impact biological function. We benchmark this approach with several widely used datasets and apply it to a proof-of-concept study of orthologous proteomes in an investigation of host adaptation in Salmonella enterica. We highlight the value of the method in identifying functional divergence of genes, and suggest that this tool may be a better approach than the commonly used dN/dS metric for identifying functionally significant genetic changes occurring in recently diverged organisms. AVAILABILITY AND IMPLEMENTATION: A program implementing DBS for pairwise genome comparisons is freely available at: https://github.com/UCanCompBio/deltaBS. CONTACT: nicole.wheeler@pg.canterbury.ac.nz or lars.barquist@uni-wuerzburg.de. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
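
As a rough illustration of the idea behind the name, the sketch below assumes that a delta-bitscore is the difference between the bitscores two orthologous proteins obtain against the same profile HMM, and that large absolute differences flag candidate functional divergence. The abstract does not spell out the calculation, so this definition, the sign convention, the threshold and all scores are assumptions; this is not the deltaBS program. In practice the bitscores would come from a profile HMM search tool rather than hand-written dictionaries.

```python
def delta_bitscores(ref_scores, query_scores):
    """Assumed definition: DBS = reference bitscore - query bitscore.

    Both inputs map a profile HMM (family) name to the bitscore of that
    genome's ortholog; only families present in both are compared.
    """
    shared = ref_scores.keys() & query_scores.keys()
    return {fam: ref_scores[fam] - query_scores[fam] for fam in sorted(shared)}


def flag_divergent(dbs, threshold):
    """Return families whose absolute DBS meets a (hypothetical) threshold."""
    return [fam for fam, delta in dbs.items() if abs(delta) >= threshold]


# Hypothetical bitscores for orthologs in two genomes, for illustration only.
ref_scores = {"famA": 310.5, "famB": 120.0, "famC": 88.2}
query_scores = {"famA": 309.9, "famB": 95.5, "famD": 40.0}

dbs = delta_bitscores(ref_scores, query_scores)
flagged = flag_divergent(dbs, threshold=5.0)
```

Here only `famB` is flagged: its query ortholog scores 24.5 bits lower than the reference, whereas `famA` differs by well under a bit.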


Subject(s)
Genome, Bacterial , Genomics/methods , Algorithms , Bacterial Proteins/genetics , Markov Chains , Models, Theoretical , Proteome , Salmonella enterica/genetics , Software
13.
RNA Biol ; 14(3): 275-280, 2017 03 04.
Article in English | MEDLINE | ID: mdl-28067598

ABSTRACT

Toxin-antitoxin (TA) systems are gene modules that appear to be horizontally mobile across a wide range of prokaryotes. It has been proposed that type I TA systems, with an antisense RNA-antitoxin, are less mobile than other TAs that rely on direct toxin-antitoxin binding, but no direct comparisons have been made. We searched for type I, II and III toxin families using iterative searches with profile hidden Markov models across phyla and replicons. The distribution of type I toxin families was comparatively narrow, but these patterns weakened with recently discovered families. We discuss how the function and phenotypes of TA systems as well as biases in our search methods may account for differences in their distribution.


Subject(s)
Antitoxins/genetics , Bacterial Toxins/genetics , Gene Expression Regulation, Bacterial , RNA, Antisense/genetics , Databases, Genetic , Gene Transfer, Horizontal , Interspersed Repetitive Sequences , Multigene Family , Operon , Phylogeny
14.
Nucleic Acids Res ; 43(2): 691-8, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25520192

ABSTRACT

RNA performs a diverse array of important functions across all cellular life. These functions include important roles in translation, building translational machinery and maturing messenger RNA. More recent discoveries include the miRNAs and bacterial sRNAs that regulate gene expression, the thermosensors, riboswitches and other cis-regulatory elements that help prokaryotes sense their environment, and eukaryotic piRNAs that suppress transposition. However, there can be a long period between the initial discovery of an RNA and determining its function. We present a bioinformatic approach to characterize RNA motifs, which are critical components of many RNA structure-function relationships. These motifs can, in some instances, provide researchers with functional hypotheses for uncharacterized RNAs. Moreover, we introduce a new profile-based database of RNA motifs, RMfam, and illustrate some applications for investigating the evolution and functional characterization of RNA. All the data and scripts associated with this work are available from: https://github.com/ppgardne/RMfam.


Subject(s)
Databases, Nucleic Acid , Molecular Sequence Annotation , RNA/chemistry , Evolution, Molecular , Models, Statistical , Nucleotide Motifs , Sequence Alignment , Sequence Analysis, RNA
15.
Nucleic Acids Res ; 43(Database issue): D130-7, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25392425

ABSTRACT

The Rfam database (available at http://rfam.xfam.org) is a collection of non-coding RNA families represented by manually curated sequence alignments, consensus secondary structures and annotation gathered from corresponding Wikipedia, taxonomy and ontology resources. In this article, we detail updates and improvements to the Rfam data and website for the Rfam 12.0 release. We describe the upgrade of our search pipeline to use Infernal 1.1 and demonstrate its improved homology detection ability by comparison with the previous version. The new pipeline is easier for users to apply to their own data sets, and we illustrate its ability to annotate RNAs in genomic and metagenomic data sets of various sizes. Rfam has been expanded to include 260 new families, including the well-studied large subunit ribosomal RNA family, and for the first time includes information on short sequence- and structure-based RNA motifs present within families.


Subject(s)
Databases, Nucleic Acid , RNA, Untranslated/chemistry , Genomics , Internet , Molecular Sequence Annotation , Nucleic Acid Conformation , Nucleotide Motifs , RNA, Long Noncoding/chemistry , RNA, Untranslated/classification , Software
16.
PLoS Comput Biol ; 10(10): e1003907, 2014 Oct.
Article in English | MEDLINE | ID: mdl-25357249

ABSTRACT

Noncoding RNAs are integral to a wide range of biological processes, including translation, gene regulation, host-pathogen interactions and environmental sensing. While genomics is now a mature field, our capacity to identify noncoding RNA elements in bacterial and archaeal genomes is hampered by the difficulty of de novo identification. The emergence of new technologies for characterizing transcriptome outputs, notably RNA-seq, is improving noncoding RNA identification and expression quantification. However, a major challenge is to robustly distinguish functional outputs from transcriptional noise. To establish whether annotation of existing transcriptome data has effectively captured all functional outputs, we analysed over 400 publicly available RNA-seq datasets spanning 37 different Archaea and Bacteria. Using comparative tools, we identify close to a thousand highly-expressed candidate noncoding RNAs. However, our analyses reveal that capacity to identify noncoding RNA outputs is strongly dependent on phylogenetic sampling. Surprisingly, and in stark contrast to protein-coding genes, the phylogenetic window for effective use of comparative methods is perversely narrow: aggregating public datasets only produced one phylogenetic cluster where these tools could be used to robustly separate unannotated noncoding RNAs from a null hypothesis of transcriptional noise. Our results show that for the full potential of transcriptomics data to be realized, a change in experimental design is paramount: effective transcriptomics requires phylogeny-aware sampling.


Subject(s)
Gene Expression Profiling/methods , RNA, Untranslated/classification , RNA, Untranslated/genetics , Transcriptome/genetics , Archaea/genetics , Bacteria/genetics , Cluster Analysis , Computational Biology , Databases, Genetic , Phylogeny , RNA, Archaeal/chemistry , RNA, Archaeal/classification , RNA, Archaeal/genetics , RNA, Bacterial/chemistry , RNA, Bacterial/classification , RNA, Bacterial/genetics , RNA, Untranslated/chemistry
17.
Nucleic Acids Res ; 41(8): 4549-64, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23470992

ABSTRACT

Salmonella Typhi and Typhimurium diverged only ∼50 000 years ago, yet have very different host ranges and pathogenicity. Despite the availability of multiple whole-genome sequences, the genetic differences that have driven these changes in phenotype are only beginning to be understood. In this study, we use transposon-directed insertion-site sequencing to probe differences in gene requirements for competitive growth in rich media between these two closely related serovars. We identify a conserved core of 281 genes that are required for growth in both serovars, 228 of which are essential in Escherichia coli. We are able to identify active prophage elements through the requirement for their repressors. We also find distinct differences in requirements for genes involved in cell surface structure biogenesis and iron utilization. Finally, we demonstrate that transposon-directed insertion-site sequencing is not only applicable to the protein-coding content of the cell but also has sufficient resolution to generate hypotheses regarding the functions of non-coding RNAs (ncRNAs) as well. We are able to assign probable functions to a number of cis-regulatory ncRNA elements, as well as to infer likely differences in trans-acting ncRNA regulatory networks.


Subject(s)
DNA Transposable Elements , Mutagenesis, Insertional , Salmonella typhi/genetics , Salmonella typhimurium/genetics , Bacterial Proteins/genetics , Gene Library , Genes, Bacterial , RNA, Small Untranslated/genetics , RNA, Untranslated/genetics , Salmonella typhi/growth & development , Salmonella typhimurium/growth & development
18.
Nucleic Acids Res ; 41(Database issue): D226-32, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23125362

ABSTRACT

The Rfam database (available via the website at http://rfam.sanger.ac.uk and through our mirror at http://rfam.janelia.org) is a collection of non-coding RNA families, primarily RNAs with a conserved RNA secondary structure, including both RNA genes and mRNA cis-regulatory elements. Each family is represented by a multiple sequence alignment, predicted secondary structure and covariance model. Here we discuss updates to the database in the latest release, Rfam 11.0, including the introduction of genome-based alignments for large families, the introduction of the Rfam Biomart as well as other user interface improvements. Rfam is available under the Creative Commons Zero license.


Subject(s)
Databases, Nucleic Acid , RNA, Untranslated/chemistry , RNA, Untranslated/classification , Base Sequence , Genomics , Internet , Molecular Sequence Annotation , Nucleic Acid Conformation , RNA, Untranslated/genetics , Sequence Alignment , User-Computer Interface
19.
Nucleic Acids Res ; 40(Database issue): D9-12, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22144683

ABSTRACT

Wikipedia, the online encyclopedia, is the most famous wiki in use today. It contains over 3.7 million pages of content, with many pages written on scientific subject matters that include peer-reviewed citations, yet are written in an accessible manner and generally reflect the consensus opinion of the community. In this, the 19th Annual Database Issue of Nucleic Acids Research, there are 11 articles that describe the use of a wiki in relation to a biological database. In this commentary, we discuss how biological databases can be integrated with Wikipedia, thereby utilising the pre-existing infrastructure, tools and, above all, large community of authors (or Wikipedians). The limitations to the content that can be included in Wikipedia are highlighted, with examples drawn from articles found in this issue and other wiki-based resources, indicating why other wiki solutions are necessary. We discuss the merits of using open wikis, like Wikipedia, versus other models, with particular reference to potential vandalism. Finally, we raise the question about the future role of dedicated database biocurators in the context of the thousands of crowdsourced, community annotations that are now being stored in wikis.


Subject(s)
Databases, Factual , Encyclopedias as Topic , Internet , Systems Integration
20.
RNA ; 17(11): 1941-6, 2011 Nov.
Article in English | MEDLINE | ID: mdl-21940779

ABSTRACT

During the last decade there has been a great increase in the number of noncoding RNA genes identified, including new classes such as microRNAs and piRNAs. There is also a large growth in the amount of experimental characterization of these RNA components. Despite this growth in information, it is still difficult for researchers to access RNA data, because key data resources for noncoding RNAs have not yet been created. The most pressing omission is the lack of a comprehensive RNA sequence database, much like UniProt, which provides a comprehensive set of protein knowledge. In this article we propose the creation of a new open public resource that we term RNAcentral, which will contain a comprehensive collection of RNA sequences and fill an important gap in the provision of biomedical databases. We envision RNA researchers from all over the world joining a federated RNAcentral network, contributing specialized knowledge and databases. RNAcentral would centralize key data that are currently held across a variety of databases, allowing researchers instant access to a single, unified resource. This resource would facilitate the next generation of RNA research and help drive further discoveries, including those that improve food production and human and animal health. We encourage additional RNA database resources and research groups to join this effort. We aim to obtain international network funding to further this endeavor.


Subject(s)
Databases, Nucleic Acid , RNA/chemistry , Animals , Base Sequence , Humans