Results 1 - 20 of 30
1.
Nucleic Acids Res ; 51(D1): D753-D759, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36477304

ABSTRACT

The MGnify platform (https://www.ebi.ac.uk/metagenomics) facilitates the assembly, analysis and archiving of microbiome-derived nucleic acid sequences. The platform provides access to taxonomic assignments and functional annotations for nearly half a million analyses covering metabarcoding, metatranscriptomic, and metagenomic datasets, which are derived from a wide range of different environments. Over the past 3 years, MGnify has not only grown in terms of the number of datasets contained but also increased the breadth of analyses provided, such as the analysis of long-read sequences. The MGnify protein database now exceeds 2.4 billion non-redundant sequences predicted from metagenomic assemblies. This collection is now organised into a relational database making it possible to understand the genomic context of the protein through navigation back to the source assembly and sample metadata, marking a major improvement. To extend beyond the functional annotations already provided in MGnify, we have applied deep learning-based annotation methods. The technology underlying MGnify's Application Programming Interface (API) and website has been upgraded, and we have enabled the ability to perform downstream analysis of the MGnify data through the introduction of a coupled Jupyter Lab environment.


Subject(s)
Microbiota, Sequence Analysis, Genomics/methods, Metagenome, Metagenomics/methods, Microbiota/genetics, Software, Sequence Analysis/methods
2.
Brief Bioinform ; 22(2): 642-663, 2021 03 22.
Article in English | MEDLINE | ID: mdl-33147627

ABSTRACT

SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) is a novel virus of the family Coronaviridae. The virus causes the infectious disease COVID-19. The biology of coronaviruses has been studied for many years. However, bioinformatics tools designed explicitly for SARS-CoV-2 have only recently been developed as a rapid reaction to the need for fast detection, understanding and treatment of COVID-19. To control the ongoing COVID-19 pandemic, it is of utmost importance to get insight into the evolution and pathogenesis of the virus. In this review, we cover bioinformatics workflows and tools for the routine detection of SARS-CoV-2 infection, the reliable analysis of sequencing data, the tracking of the COVID-19 pandemic and evaluation of containment measures, the study of coronavirus evolution, the discovery of potential drug targets and development of therapeutic strategies. For each tool, we briefly describe its use case and how it advances research specifically for SARS-CoV-2. All tools are free to use and available online, either through web applications or public code repositories. Contact: evbc@uni-jena.de.


Subject(s)
COVID-19/prevention & control, Computational Biology, SARS-CoV-2/isolation & purification, Biomedical Research, COVID-19/epidemiology, COVID-19/virology, Viral Genome, Humans, Pandemics, SARS-CoV-2/genetics
3.
Nucleic Acids Res ; 49(D1): D412-D419, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33125078

ABSTRACT

The Pfam database is a widely used resource for classifying protein sequences into families and domains. Since Pfam was last described in this journal, over 350 new families have been added in Pfam 33.1 and numerous improvements have been made to existing entries. To facilitate research on COVID-19, we have revised the Pfam entries that cover the SARS-CoV-2 proteome, and built new entries for regions that were not covered by Pfam. We have reintroduced Pfam-B which provides an automatically generated supplement to Pfam and contains 136 730 novel clusters of sequences that are not yet matched by a Pfam family. The new Pfam-B is based on a clustering by the MMseqs2 software. We have compared all of the regions in the RepeatsDB to those in Pfam and have started to use the results to build and refine Pfam repeat families. Pfam is freely available for browsing and download at http://pfam.xfam.org/.


Subject(s)
Computational Biology/statistics & numerical data, Protein Databases, Proteins/metabolism, Proteome/metabolism, Animals, COVID-19/epidemiology, COVID-19/prevention & control, COVID-19/virology, Computational Biology/methods, Epidemics, Humans, Internet, Molecular Models, Tertiary Protein Structure, Proteins/chemistry, Proteins/genetics, Proteome/classification, Proteome/genetics, Amino Acid Repetitive Sequences/genetics, SARS-CoV-2/genetics, SARS-CoV-2/physiology, Protein Sequence Analysis/methods
4.
Nucleic Acids Res ; 49(D1): D344-D354, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33156333

ABSTRACT

The InterPro database (https://www.ebi.ac.uk/interpro/) provides an integrative classification of protein sequences into families, and identifies functionally important domains and conserved sites. InterProScan is the underlying software that allows protein and nucleic acid sequences to be searched against InterPro's signatures. Signatures are predictive models which describe protein families, domains or sites, and are provided by multiple databases. InterPro combines signatures representing equivalent families, domains or sites, and provides additional information such as descriptions, literature references and Gene Ontology (GO) terms, to produce a comprehensive resource for protein classification. Founded in 1999, InterPro has become one of the most widely used resources for protein family annotation. Here, we report the status of InterPro (version 81.0) in its 20th year of operation, and its associated software, including updates to database content, the release of a new website and REST API, and performance improvements in InterProScan.


Subject(s)
Protein Databases, Proteins/chemistry, Amino Acid Sequence, COVID-19/metabolism, Internet, Molecular Sequence Annotation, Protein Domains, Protein Interaction Maps, SARS-CoV-2/metabolism, Sequence Alignment
5.
Genomics ; 114(1): 9-22, 2022 01.
Article in English | MEDLINE | ID: mdl-34798282

ABSTRACT

Genomic knowledge of the tree of life is biased to specific groups of organisms. For example, only six full genomes are currently available in the Rhizaria clade. Here, we have applied metagenomic techniques enabling the assembly of the genome of Polymyxa betae (Rhizaria, Plasmodiophorida) RES F41 isolate from unpurified zoospore holobiont and comparison with the A26-41 isolate. Furthermore, the first P. betae mitochondrial genome was assembled. The two P. betae nuclear genomes were highly similar, each with just ~10.2 k predicted protein coding genes, ~3% of which were unique to each isolate. Extending genomic comparisons revealed a greater overlap with Spongospora subterranea than with Plasmodiophora brassicae, including orthologs of the mammalian cation channel sperm-associated proteins, raising some intriguing questions about zoospore physiology. This work validates our metagenomics pipeline for eukaryote genome assembly from unpurified samples and enriches plasmodiophorid genomics, providing the first full annotation of the P. betae genome.


Subject(s)
Mitochondrial Genome, Plasmodiophorida, Genomics, Metagenomics, Plasmodiophorida/genetics
6.
Nucleic Acids Res ; 48(D1): D570-D578, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31696235

ABSTRACT

MGnify (http://www.ebi.ac.uk/metagenomics) provides a free-to-use platform for the assembly, analysis and archiving of microbiome data derived from sequencing microbial populations that are present in particular environments. Over the past 2 years, MGnify (formerly EBI Metagenomics) has more than doubled the number of publicly available analysed datasets held within the resource. Recently, an updated approach to data analysis has been unveiled (version 5.0), replacing the previous single pipeline with multiple analysis pipelines that are tailored according to the input data, and that are formally described using the Common Workflow Language, enabling greater provenance, reusability, and reproducibility. MGnify's new analysis pipelines offer additional approaches for taxonomic assertions based on ribosomal internal transcribed spacer regions (ITS1/2) and expanded protein functional annotations. Biochemical pathways and systems predictions have also been added for assembled contigs. MGnify's growing focus on the assembly of metagenomic data has also seen the number of datasets it has assembled and analysed increase six-fold. The non-redundant protein database constructed from the proteins encoded by these assemblies now exceeds 1 billion sequences. Meanwhile, a newly developed contig viewer provides fine-grained visualisation of the assembled contigs and their enriched annotations.


Subject(s)
Metagenome, Microbiota, Phylogeny, Software, Archaea/classification, Archaea/genetics, Bacteria/classification, Bacteria/genetics, Ribosomal Spacer DNA/genetics, Genetic Databases, Metagenomics/methods
7.
Nucleic Acids Res ; 47(D1): D564-D572, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30364992

ABSTRACT

Automatic annotation of protein function is routinely applied to newly sequenced genomes. While this provides a fine-grained view of an organism's functional protein repertoire, proteins more commonly function in a coordinated manner, such as in pathways or multimeric complexes. Genome Properties (GPs) define such functional entities as a series of steps, originally described by either TIGRFAMs or Pfam entries. To increase the scope of coverage, we have migrated GPs to function as a companion resource utilizing InterPro entries. Having introduced GPs-specific versioned releases, we provide software and data via a GitHub repository, and have developed a new web interface to GPs (available at https://www.ebi.ac.uk/interpro/genomeproperties). In addition to exploring each of the 1286 GPs, the website contains GPs pre-calculated for a representative set of proteomes; these results can be used to profile GPs phylogenetically via an interactive viewer. Users can upload novel data to the viewer for comparison with the pre-calculated results. Over the last year, we have added ∼700 new GPs, increasing the coverage of eukaryotic systems, as well as increasing general coverage through automatic generation of GPs from related resources. All data are freely available via the website and the GitHub repository.


Subject(s)
Protein Databases, Genome, Proteins/genetics, Microbial Genome, Metabolic Networks and Pathways/genetics, Multiprotein Complexes/genetics, Proteins/metabolism, Proteome
8.
Nucleic Acids Res ; 47(D1): D427-D432, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30357350

ABSTRACT

The last few years have witnessed significant changes in Pfam (https://pfam.xfam.org). The number of families has grown substantially to a total of 17,929 in release 32.0. New additions have been coupled with efforts to improve existing families, including refinement of domain boundaries, their classification into Pfam clans, as well as their functional annotation. We recently began to collaborate with the RepeatsDB resource to improve the definition of tandem repeat families within Pfam. We carried out a significant comparison to the structural classification database, namely the Evolutionary Classification of Protein Domains (ECOD), that led to the creation of 825 new families based on their set of uncharacterized families (EUFs). Furthermore, we also connected Pfam entries to the Sequence Ontology (SO) through mapping of the Pfam type definitions to SO terms. Since Pfam has many community contributors, we recently enabled the linking of the authorship of all Pfam entries to the corresponding authors' ORCID identifiers. This effectively permits authors to claim credit for their Pfam curation and link it to their ORCID record.


Subject(s)
Protein Databases, Proteins/classification, Molecular Sequence Annotation, Protein Domains, Proteins/chemistry, Amino Acid Repetitive Sequences
9.
Nucleic Acids Res ; 47(D1): D351-D360, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30398656

ABSTRACT

The InterPro database (http://www.ebi.ac.uk/interpro/) classifies protein sequences into families and predicts the presence of functionally important domains and sites. Here, we report recent developments with InterPro (version 70.0) and its associated software, including an 18% growth in the size of the database in terms of new InterPro entries, updates to content, the inclusion of an additional entry type, refined modelling of discontinuous domains, and the development of a new programmatic interface and website. These developments extend and enrich the information provided by InterPro, and provide greater flexibility in terms of data access. We also show that InterPro's sequence coverage has kept pace with the growth of UniProtKB, and discuss how our evaluation of residue coverage may help guide future curation activities.


Subject(s)
Protein Databases, Molecular Sequence Annotation, Animals, Genetic Databases, Gene Ontology, Humans, Internet, Multigene Family, Protein Domains/genetics, Amino Acid Sequence Homology, Software, User-Computer Interface
10.
Nucleic Acids Res ; 45(D1): D190-D199, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899635

ABSTRACT

InterPro (http://www.ebi.ac.uk/interpro/) is a freely available database used to classify protein sequences into families and to predict the presence of important domains and sites. InterProScan is the underlying software that allows both protein and nucleic acid sequences to be searched against InterPro's predictive models, which are provided by its member databases. Here, we report recent developments with InterPro and its associated software, including the addition of two new databases (SFLD and CDD), and the functionality to include residue-level annotation and prediction of intrinsic disorder. These developments enrich the annotations provided by InterPro, increase the overall number of residues annotated and allow more specific functional inferences.


Subject(s)
Computational Biology/methods, Protein Databases, Protein Interaction Domains and Motifs, Software, Humans, Molecular Sequence Annotation, Phylogeny
11.
Dev Biol ; 423(1): 1-11, 2017 03 01.
Article in English | MEDLINE | ID: mdl-28161522

ABSTRACT

The eMouseAtlas resource is an online database of 3D digital models of mouse development, an ontology of mouse embryo anatomy and a gene-expression database with about 30K spatially mapped gene-expression patterns. It is closely linked with the MGI/GXD database at the Jackson Laboratory and holds links to almost all available image-based gene-expression data for the mouse embryo. In this resource article we describe the novel web-based tools we have developed for 3D visualisation of embryo anatomy and gene expression. We show how mapping of gene expression data onto spatial models delivers a framework for capturing gene expression that enhances our understanding of development, and we review the exploratory tools utilised by the EMAGE gene expression database as a means of defining co-expression of in situ hybridisation, immunohistochemistry, and lacZ-omic expression patterns. We report on recent developments of the eHistology atlas and our use of web-services to support embedding of the online 'The Atlas of Mouse Development' in the context of other resources such as the DMDD mouse phenotype database. In addition, we discuss new developments including a cellular-resolution placental atlas, third-party atlas models, clonal analysis data and a new interactive eLearning resource for developmental processes.


Subject(s)
Atlases as Topic, Mammalian Embryo/metabolism, Embryonic Development, Artistic Anatomy, Animals, Developmental Gene Expression Regulation, Internet, Mice
12.
Development ; 142(14): 2545, 2015 Jul 15.
Article in English | MEDLINE | ID: mdl-26199410

ABSTRACT

There was an error published in Development 142, 1909-1911. Author Yogmatee Roochun was omitted. The corrected author list appears above. The authors apologise to readers for this mistake.

13.
Development ; 142(11): 1909-11, 2015 Jun 01.
Article in English | MEDLINE | ID: mdl-26015534

ABSTRACT

The Atlas of Mouse Development by Professor Mathew Kaufman is an essential text for understanding mouse developmental anatomy. This definitive and authoritative atlas is still in production and is essential for any biologist working with the mouse embryo, although the last revision dates back to 1994. Here, we announce the eHistology online resource that provides free access to high-resolution colour images digitized from the original histological sections (www.emouseatlas.org/emap/eHistology/index.php) used by Kaufman for the Atlas. The images are provided with the original annotations and plate numbering of the paper atlas and enable viewing the material to cellular resolution.


Subject(s)
Embryonic Development, Histology, Internet, Animals, Mice
14.
Int J Palliat Nurs ; 24(2): 92-95, 2018 Feb 02.
Article in English | MEDLINE | ID: mdl-29469643

ABSTRACT

BACKGROUND: There is a paucity of evidence supporting the benefits of palliative care day therapy services for patients with non-malignant diseases. Outcome measures in this setting are also lacking. AIM: To evaluate the use of the modified Measure Yourself Medical Outcome Profile 2 (MYMOP2) tool in tailoring day therapy services toward the needs of patients with non-malignant conditions. METHOD: A single system, 'before and after' design quality improvement study was conducted. Data were collected regarding outcome measures, re-referral rates and mortality. RESULT: After the introduction of the modified MYMOP2 tool, there was an improvement in the mean outcome scores for patients with non-malignant disease. Re-referral rates for these patients dropped by 28% during the follow-up period, with no change in mortality. IMPLICATIONS FOR PRACTICE: These findings suggest that using the modified MYMOP2 tool to tailor and measure the outcome of holistic day therapy services results in a more sustained improvement for patients with non-malignant disease.


Subject(s)
Medical Day Care, Health Care Outcome Assessment/methods, Palliative Care, Humans
15.
Nucleic Acids Res ; 42(Database issue): D835-44, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24265223

ABSTRACT

EMAGE (http://www.emouseatlas.org/emage/) is a freely available database of in situ gene expression patterns that allows users to perform online queries of mouse developmental gene expression. EMAGE is unique in providing both text-based descriptions of gene expression plus spatial maps of gene expression patterns. This mapping allows spatial queries to be accomplished alongside more traditional text-based queries. Here, we describe our recent progress in spatial mapping and data integration. EMAGE has developed a method of spatially mapping 3D embryo images captured using optical projection tomography, and through the use of an IIP3D viewer allows users to view arbitrary sections of raw and mapped 3D image data in the context of a web browser. EMAGE now includes enhancer data, and we have spatially mapped images from a comprehensive screen of transgenic reporter mice that detail the expression of mouse non-coding genomic DNA fragments with enhancer activity. We have integrated the eMouseAtlas anatomical atlas and the EMAGE database so that a user of the atlas can query the EMAGE database easily. In addition, we have extended the atlas framework to enable EMAGE to spatially cross-index EMBRYS whole mount in situ hybridization data. We additionally report on recent developments to the EMAGE web interface, including new query and analysis capabilities.


Subject(s)
Genetic Databases, Mammalian Embryo/metabolism, Gene Expression, Mice/genetics, Animals, Computer Graphics, Three-Dimensional Imaging, Internet, Mice/embryology, Mice/metabolism, Animal Models, Tomography/methods
16.
Mamm Genome ; 26(9-10): 431-40, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26296321

ABSTRACT

A significant proportion of developmental biology data is presented in the form of images at morphologically diverse stages of development. The curation of these datasets presents challenges different from those of sequence/text-based data. Towards this end, the eMouseAtlas project created a digital atlas of mouse embryo development as a means of understanding developmental anatomy and exploring the relationship between genes and development in a spatial context. Using the morphological staging system pioneered by Karl Theiler, the project has generated 3D models of post-implantation mouse development and used them as a spatial framework for the delineation of anatomical components and for archiving in situ gene expression data in the EMAGE database. This has allowed us to develop a unique online resource for mouse developmental biology. We describe here the underlying structure of the resource, as well as some of the tools that have been developed to allow users to mine the curated image data. These tools include our IIP3D/X3DOM viewer that allows 3D visualisation of anatomy and/or gene expression in the context of a web browser, and the eHistology resource that extends this functionality to allow visualisation of high-resolution cellular level images of histology sections. Furthermore, we review some of the informatics aspects of eMouseAtlas to provide a deeper insight into the use of the atlas and gene expression database.


Subject(s)
Computational Biology, Genetic Databases, Embryonic Development, Animals, Mammalian Embryo, Developmental Gene Expression Regulation/genetics, Internet, Mice, Software
17.
Nucleic Acids Res ; 39(Database issue): D7-10, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21097465

ABSTRACT

The present article proposes the adoption of a community-defined, uniform, generic description of the core attributes of biological databases, BioDBCore. The goals of these attributes are to provide a general overview of the database landscape, to encourage consistency and interoperability between resources and to promote the use of semantic and syntactic standards. BioDBCore will make it easier for users to evaluate the scope and relevance of available resources. This new resource will increase the collective impact of the information present in biological databases.


Subject(s)
Factual Databases/standards, Information Dissemination
18.
Soc Sci Med ; 326: 115889, 2023 06.
Article in English | MEDLINE | ID: mdl-37121071

ABSTRACT

We focused in this study on how the private experience of pain is made public through online discourse by sufferers of endometriosis. Empirically, we analyse two highly active endometriosis communities on the online social platform Reddit. Drawing on a mixed-methods design, we leverage large-scale social data, and a combination of computational and interpretive approaches for text analysis to study the role and shape of interactions relating to 'pain' for the formation of epistemic community online around endometriosis. The dataset, consisting of 70,817 forum posts and comments, was collected in May of 2021. Our study shows how pain becomes meaningful for endometriosis sufferers in relation to a multidimensional discursive space of words and concepts that are used to express it. Pain was frequently disguised, underplayed or hidden altogether, from fears of misunderstanding, medical dismissal, and embarrassment. Clearly, peer validation can be found in the relative anonymity of Reddit discussions. While the experience of pain is individual and subjective, when communities share similar experiences this reinforces patient ownership of the pain, which in turn supports the epistemic authority of the patient collective. A detailed understanding of how and why pain is discussed in online spaces has much to contribute more broadly to discussions of experiential collective knowledge production among individuals with endometriosis and other chronic illnesses.


Subject(s)
Endometriosis, Social Media, Female, Humans, Endometriosis/complications, Pain, Peer Group, Fear
19.
J Mol Biol ; 435(14): 168016, 2023 07 15.
Article in English | MEDLINE | ID: mdl-36806692

ABSTRACT

An increasingly common output arising from the analysis of shotgun metagenomic datasets is the generation of metagenome-assembled genomes (MAGs), with tens of thousands of MAGs now described in the literature. However, the discovery and comparison of these MAG collections is hampered by the lack of uniformity in their generation, annotation and storage. To address this, we have developed MGnify Genomes, a growing collection of biome-specific non-redundant microbial genome catalogues generated using MAGs and publicly available isolate genomes. Genomes within a biome-specific catalogue are organised into species clusters. For species that contain multiple conspecific genomes, the highest quality genome is selected as the representative, always prioritising an isolate genome over a MAG. The species representative sequences and annotations can be visualised on the MGnify website and the full catalogue and associated analysis outputs can be downloaded from MGnify servers. A suite of online search tools is provided allowing users to compare their own sequences, ranging from a gene to sets of genomes, against the catalogues. Seven biomes are available currently, comprising over 300,000 genomes that represent 11,048 non-redundant species, and include 36 taxonomic classes not currently represented by cultured genomes. MGnify Genomes is available at https://www.ebi.ac.uk/metagenomics/browse/genomes/.


Subject(s)
Microbial Genome, Metagenome, Metagenome/genetics, Metagenomics
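
The MGnify Genomes abstract in entry 19 states that, within a species cluster, the highest quality genome becomes the representative and an isolate genome is always prioritised over a MAG. A minimal Python sketch of that selection rule follows. The `Genome` record, its field names, the accessions and the single numeric `quality` score are all hypothetical illustrations; the abstract does not say how quality is actually computed, so this is not the MGnify implementation.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Genome:
    accession: str    # hypothetical identifier, for illustration only
    is_isolate: bool  # True for an isolate genome, False for a MAG
    quality: float    # hypothetical quality score; higher is better


def pick_representative(cluster: Iterable[Genome]) -> Genome:
    """Return the representative genome for one species cluster.

    Isolate genomes always outrank MAGs; among genomes of the same
    kind, the one with the higher quality score wins.
    """
    genomes = list(cluster)
    if not genomes:
        raise ValueError("a species cluster must contain at least one genome")
    # Tuples compare left to right, so is_isolate dominates quality.
    return max(genomes, key=lambda g: (g.is_isolate, g.quality))


cluster = [
    Genome("MAG_A", is_isolate=False, quality=98.5),
    Genome("ISO_B", is_isolate=True, quality=91.0),
    Genome("ISO_C", is_isolate=True, quality=95.2),
]
representative = pick_representative(cluster)
print(representative.accession)  # prints "ISO_C"
```

Sorting on the tuple `(is_isolate, quality)` keeps the isolate-first rule strict: the MAG in the example has the highest score but is still passed over because isolates are present in the cluster.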
20.
Mamm Genome ; 23(9-10): 514-24, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22847374

ABSTRACT

eMouseAtlas (www.emouseatlas.org) is a comprehensive online resource to visualise mouse development and investigate gene expression in the mouse embryo. We have recently deployed a completely redesigned Mouse Anatomy Atlas website (www.emouseatlas.org/emap/ema) that allows users to view 3D embryo reconstructions, delineated anatomy, and high-resolution histological sections. A new feature of the website is the IIP3D web tool that allows a user to view arbitrary sections of 3D embryo reconstructions using a web browser. This feature provides interactive access to very high-volume 3D images via a tiled pan-and-zoom style interface and circumvents the need to download large image files for visualisation. eMouseAtlas additionally includes EMAGE (Edinburgh Mouse Atlas of Gene Expression) (www.emouseatlas.org/emage), a freely available, curated online database of in situ gene expression patterns, where gene expression domains extracted from raw data images are spatially mapped into atlas embryo models. In this way, EMAGE introduces a spatial dimension to transcriptome data and allows exploration of the spatial similarity between gene expression patterns. New features of the EMAGE interface allow complex queries to be built, and users can view and compare multiple gene expression patterns. EMAGE now includes mapping of 3D gene expression domains captured using the imaging technique optical projection tomography. 3D mapping uses WlzWarp, an open-source software tool developed by eMouseAtlas.


Subject(s)
Atlases as Topic, Mice/genetics, Transcriptome, Animals