1.
PLoS Comput Biol ; 18(2): e1009870, 2022 Feb.
Article in English | MEDLINE | ID: mdl-35196325

ABSTRACT

Protozoan parasites cause diverse diseases with large global impacts. Research on the pathogenesis and biology of these organisms is limited by economic and experimental constraints. Accordingly, studies of one parasite are frequently extrapolated to infer knowledge about another parasite, across and within genera. Model in vitro or in vivo systems are frequently used to enhance experimental manipulability, but these systems generally use species related to, yet distinct from, the clinically relevant causal pathogen. Characterization of functional differences among parasite species is confined to post hoc or single target studies, limiting the utility of this extrapolation approach. To address this challenge and to accelerate parasitology research broadly, we present a functional comparative analysis of 192 genomes, representing every high-quality, publicly-available protozoan parasite genome including Plasmodium, Toxoplasma, Cryptosporidium, Entamoeba, Trypanosoma, Leishmania, Giardia, and other species. We generated an automated metabolic network reconstruction pipeline optimized for eukaryotic organisms. These metabolic network reconstructions serve as biochemical knowledgebases for each parasite, enabling qualitative and quantitative comparisons of metabolic behavior across parasites. We identified putative differences in gene essentiality and pathway utilization to facilitate the comparison of experimental findings and discovered that phylogeny is not the sole predictor of metabolic similarity. This knowledgebase represents the largest collection of genome-scale metabolic models for both pathogens and eukaryotes; with this resource, we can predict species-specific functions, contextualize experimental results, and optimize selection of experimental systems for fastidious species.


Subject(s)
Cryptosporidiosis , Cryptosporidium , Parasites , Plasmodium , Animals , Cryptosporidiosis/genetics , Cryptosporidium/genetics , Eukaryota/genetics , Genome, Protozoan/genetics , Parasites/genetics , Plasmodium/genetics
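
The abstract for record 1 mentions quantitative comparisons of metabolic networks across parasites without naming a metric. As one possible illustration only, the sketch below compares hypothetical reaction sets with the Jaccard index. The organism names, the reaction identifiers, and the choice of Jaccard similarity are all assumptions made for this example and are not taken from the paper.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two collections: |intersection| / |union|."""
    a, b = set(a), set(b)
    union = a | b
    # Two empty sets are treated as identical by convention.
    return len(a & b) / len(union) if union else 1.0


# Hypothetical reaction content of three reconstructions (illustrative only).
models = {
    "parasite_A": {"R1", "R2", "R3"},
    "parasite_B": {"R2", "R3", "R4"},
    "parasite_C": {"R5"},
}

# Pairwise similarity for every unordered pair of models.
pairwise = {
    (x, y): jaccard(models[x], models[y])
    for x, y in combinations(sorted(models), 2)
}

for pair, score in pairwise.items():
    print(pair, round(score, 3))
```

A matrix of such scores could then be set against a phylogenetic distance matrix to check whether related species also have similar reaction content.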
2.
BMC Genomics ; 23(1): 299, 2022 Apr 12.
Article in English | MEDLINE | ID: mdl-35413804

ABSTRACT

BACKGROUND: Epigenome analysis relies on defined sets of genomic regions output by widely used assays such as ChIP-seq and ATAC-seq. Statistical analysis and visualization of genomic region sets is essential to answer biological questions in gene regulation. As the epigenomics community continues generating data, there will be an increasing need for software tools that can efficiently deal with more abundant and larger genomic region sets. Here, we introduce GenomicDistributions, an R package for fast and easy summarization and visualization of genomic region data. RESULTS: GenomicDistributions offers a broad selection of functions to calculate properties of genomic region sets, such as feature distances, genomic partition overlaps, and more. GenomicDistributions functions are meticulously optimized for best-in-class speed and generally outperform comparable functions in existing R packages. GenomicDistributions also offers plotting functions that produce editable ggplot objects. All GenomicDistributions functions follow a uniform naming scheme and can handle either single or multiple region set inputs. CONCLUSIONS: GenomicDistributions offers a fast and scalable tool for exploratory genomic region set analysis and visualization. GenomicDistributions excels in user-friendliness, flexibility of outputs, breadth of functions, and computational performance. GenomicDistributions is available from Bioconductor ( https://bioconductor.org/packages/release/bioc/html/GenomicDistributions.html ).


Subject(s)
Genomics , Software , Chromatin Immunoprecipitation Sequencing , Epigenomics , Genome
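
Record 2 lists feature distances among the properties computed for region sets. The sketch below shows the general idea in Python and is not the GenomicDistributions R API. It assumes a single chromosome, takes region midpoints and feature positions as plain integers, and uses a hypothetical function name.

```python
from bisect import bisect_left


def nearest_feature_distances(region_midpoints, feature_positions):
    """Signed distance from each midpoint to its nearest feature.

    Positive means the nearest feature lies downstream (larger coordinate),
    negative means upstream. Assumes all coordinates share one chromosome.
    """
    features = sorted(feature_positions)
    if not features:
        raise ValueError("at least one feature position is required")
    distances = []
    for mid in region_midpoints:
        i = bisect_left(features, mid)  # first feature >= mid
        candidates = []
        if i < len(features):
            candidates.append(features[i] - mid)      # downstream neighbour
        if i > 0:
            candidates.append(features[i - 1] - mid)  # upstream neighbour
        distances.append(min(candidates, key=abs))
    return distances


# Hypothetical coordinates, e.g. transcription start sites and peak midpoints.
features = [100, 500, 900]
midpoints = [120, 450, 1000]
distances = nearest_feature_distances(midpoints, features)
print(distances)
```

Sorting the features once and using binary search keeps each lookup logarithmic, which matters when region sets grow large.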
3.
BMC Bioinformatics ; 19(1): 300, 2018 Aug 14.
Article in English | MEDLINE | ID: mdl-30107777

ABSTRACT

BACKGROUND: Here, we present BALCONY, an R package for entropy/variability analysis that facilitates prompt and convenient data extraction, manipulation and visualization of protein features from multiple sequence alignments. BALCONY can work with residues dispersed across a protein sequence and map them on the corresponding alignment of homologous protein sequences. Additionally, it provides several entropy and variability scores that indicate the conservation of each residue. RESULTS: Our package allows the user to visualize evolutionary variability by locating the positions most likely to vary and to assess mutation candidates in protein engineering. CONCLUSION: In comparison to other R packages, BALCONY allows conservation/variability analysis in the context of protein structure, linking the appropriate metrics with physicochemical features of the user's choice. AVAILABILITY: CRAN project page: https://cran.r-project.org/package=BALCONY and our website: http://www.tunnelinggroup.pl/software/ for major platforms: Linux/Unix, Windows and Mac OS X.


Subject(s)
Proteins/chemistry , Sequence Alignment/methods , Software , Amino Acid Sequence , Entropy , Evolution, Molecular , Humans
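
Record 3 refers to entropy scores that indicate the conservation of each residue. A minimal Python sketch of one such score, the Shannon entropy of each alignment column, follows. It does not reproduce BALCONY's R functions. The toy alignment and the decision to count gaps as an ordinary symbol are assumptions made for illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy in bits of one alignment column.

    Gap characters are counted like any other symbol (an assumption).
    """
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropy(sequences):
    """Per-column entropy for equal-length aligned sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # zip(*sequences) iterates over alignment columns.
    return [column_entropy(col) for col in zip(*sequences)]


# Toy alignment of four hypothetical homologous sequences.
alignment = ["MKTA", "MKSA", "MRTA", "MKTG"]
scores = alignment_entropy(alignment)
for position, score in enumerate(scores, start=1):
    print(position, round(score, 3))
```

Columns with an entropy of zero are fully conserved. Higher values mark positions that vary more and may therefore tolerate mutation.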
4.
Bioinformatics ; 33(13): 2045-2046, 2017 Jul 01.
Article in English | MEDLINE | ID: mdl-28334160

ABSTRACT

MOTIVATION: The identification and tracking of molecules that enter an active-site cavity requires screening the positions of thousands of single molecules along several thousand molecular dynamics steps. To fill the existing gap between tools searching for tunnels and pathways and advanced tools employed for accelerated water flux investigations, we have developed AQUA-DUCT. RESULTS: AQUA-DUCT is an easy-to-use tool that facilitates analysis of the behaviour of molecules that penetrate any selected region in a protein. It can be used for any type of molecule, e.g. water, oxygen, carbon dioxide, organic solvents, ions. AVAILABILITY AND IMPLEMENTATION: Linux, Windows, macOS, OpenBSD, http://www.aquaduct.pl . CONTACT: a.gora@tunnelinggroup.pl or info@aquaduct.pl. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Catalytic Domain , Computational Biology/methods , Computer Simulation , Ligands , Models, Molecular , Software
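
Record 4 describes screening molecule positions across many simulation steps to find those that enter a selected region. The sketch below illustrates that bookkeeping in Python under strong simplifications and is not AQUA-DUCT's algorithm or interface. The region is assumed to be a sphere, the trajectory is a hypothetical list of per-frame dictionaries, and all names are invented for the example.

```python
import math


def inside_sphere(position, center, radius):
    """True if a 3D position lies within the spherical region."""
    return math.dist(position, center) <= radius


def track_visits(trajectory, center, radius):
    """Map each molecule id to the frame indices at which it is inside.

    trajectory: list of frames, each a dict {molecule_id: (x, y, z)}.
    Molecules that never enter the region are omitted from the result.
    """
    visits = {}
    for frame_index, frame in enumerate(trajectory):
        for molecule_id, position in frame.items():
            if inside_sphere(position, center, radius):
                visits.setdefault(molecule_id, []).append(frame_index)
    return visits


# Three hypothetical frames with two water molecules.
trajectory = [
    {"w1": (0.0, 0.0, 5.0), "w2": (9.0, 9.0, 9.0)},
    {"w1": (0.0, 0.0, 1.0), "w2": (9.0, 9.0, 9.0)},
    {"w1": (0.0, 0.0, 0.5), "w2": (8.0, 8.0, 8.0)},
]
visits = track_visits(trajectory, center=(0.0, 0.0, 0.0), radius=2.0)
print(visits)
```

The first frame index recorded for a molecule marks its entry into the region, and gaps in the list would indicate that it left and came back.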
5.
Bioinform Adv ; 2(1): vbac030, 2022.
Article in English | MEDLINE | ID: mdl-35669346

ABSTRACT

Summary: Properly and effectively managing reference datasets is an important task for many bioinformatics analyses. Refgenie is a reference asset management system that allows users to easily organize, retrieve and share such datasets. Here, we describe the integration of refgenie into the Galaxy platform. Server administrators are able to configure Galaxy to make use of reference datasets made available on a refgenie instance. In addition, a Galaxy Data Manager tool has been developed to provide a graphical interface to refgenie's remote reference retrieval functionality. A large collection of reference datasets has also been made available using the CVMFS (CernVM File System) repository from GalaxyProject.org, with mirrors across the USA, Canada, Europe and Australia, enabling easy use outside of Galaxy. Availability and implementation: The ability of Galaxy to use refgenie assets was added to the core Galaxy framework in version 22.01, which is available from https://github.com/galaxyproject/galaxy under the Academic Free License version 3.0. The refgenie Data Manager tool can be installed via the Galaxy ToolShed, with source code managed at https://github.com/BlankenbergLab/galaxy-tools-blankenberg/tree/main/data_managers/data_manager_refgenie_pull and released using an MIT license. Access to existing data is also available through CVMFS, with instructions at https://galaxyproject.org/admin/reference-data-repo/. No new data were generated or analyzed in support of this research.

6.
NAR Genom Bioinform ; 3(2): lqab036, 2021 Jun.
Article in English | MEDLINE | ID: mdl-34017945

ABSTRACT

Genome analysis relies on reference data like sequences, feature annotations, and aligner indexes. These data can be found in many versions from many sources, making it challenging to identify and assess compatibility among them. For example, how can you determine which indexes are derived from identical raw sequence files, or which annotations share a compatible coordinate system? Here, we describe a novel approach to establish identity and compatibility of reference genome resources. We approach this with three advances: first, we derive unique identifiers for each resource; second, we record parent-child relationships among resources; and third, we describe recursive identifiers that determine identity as well as compatibility of coordinate systems and sequence names. These advances facilitate portability, reproducibility, and re-use of genome reference data. Available at https://refgenie.databio.org.
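
The idea of content-derived identifiers that separate identity from coordinate-system compatibility can be sketched as follows. This is an illustration only and not the scheme used by refgenie. The truncated SHA-256 digest, the separators, the sorting by name, and the function names are all assumptions.

```python
import hashlib


def digest(text):
    """Short content-derived identifier (truncated SHA-256, an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]


def collection_identifiers(sequences):
    """Derive layered identifiers for a {name: sequence} collection.

    'coordinate_system' depends only on sequence names and lengths,
    'sequences' depends only on sequence content, and 'identity' is a
    digest of both lower-level digests (a recursive identifier).
    """
    names = sorted(sequences)
    coordinate_system = digest(
        ",".join(f"{name}:{len(sequences[name])}" for name in names)
    )
    content = digest(",".join(digest(sequences[name]) for name in names))
    return {
        "coordinate_system": coordinate_system,
        "sequences": content,
        "identity": digest(coordinate_system + "," + content),
    }


# Same names and lengths, one base changed: compatible coordinates only.
ids_a = collection_identifiers({"chr1": "ACGT", "chr2": "GG"})
ids_b = collection_identifiers({"chr1": "ACGA", "chr2": "GG"})
# Same sequences under different names: same content, other coordinates.
ids_c = collection_identifiers({"1": "ACGT", "2": "GG"})

print(ids_a["coordinate_system"] == ids_b["coordinate_system"])
print(ids_a["identity"] == ids_b["identity"])
print(ids_a["sequences"] == ids_c["sequences"])
```

Comparing the layers separately answers the two questions posed in the abstract: equal `identity` values mean the same raw sequences, while equal `coordinate_system` values alone mean annotations can still be shared.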

7.
Gigascience ; 10(12), 2021 Dec 06.
Article in English | MEDLINE | ID: mdl-34890448

ABSTRACT

BACKGROUND: Organizing and annotating biological sample data is critical in data-intensive bioinformatics. Unfortunately, metadata formats from a data provider are often incompatible with requirements of a processing tool. There is no broadly accepted standard to organize metadata across biological projects and bioinformatics tools, restricting the portability and reusability of both annotated datasets and analysis software. RESULTS: To address this, we present the Portable Encapsulated Project (PEP) specification, a formal specification for biological sample metadata structure. The PEP specification accommodates typical features of data-intensive bioinformatics projects with many biological samples. In addition to standardization, the PEP specification provides descriptors and modifiers for project-level and sample-level metadata, which improve portability across both computing environments and data processing tools. PEPs include a schema validator framework, allowing formal definition of required metadata attributes for data analysis broadly. We have implemented packages for reading PEPs in both Python and R to provide a language-agnostic interface for organizing project metadata. CONCLUSIONS: The PEP specification is an important step toward unifying data annotation and processing tools in data-intensive biological research projects. Links to tools and documentation are available at http://pep.databio.org/.


Subject(s)
Metadata , Software , Computational Biology , Documentation
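
Record 7 mentions a schema validator that formally defines the metadata attributes a tool requires. The sketch below shows the underlying check in plain Python. It does not use the PEP packages or their schema format, and the attribute names and sample table are hypothetical.

```python
def validate_samples(samples, required_attributes):
    """Return (sample_index, attribute) pairs for missing or empty values."""
    problems = []
    for index, sample in enumerate(samples):
        for attribute in required_attributes:
            if sample.get(attribute) in (None, ""):
                problems.append((index, attribute))
    return problems


# Hypothetical sample table, one dict per biological sample.
samples = [
    {"sample_name": "s1", "protocol": "ATAC-seq", "file": "s1.fastq.gz"},
    {"sample_name": "s2", "protocol": "", "file": "s2.fastq.gz"},
]
# Attributes a hypothetical processing tool declares as required.
required_attributes = ["sample_name", "protocol", "file"]

problems = validate_samples(samples, required_attributes)
print(problems)
```

Declaring requirements on the tool side and checking them before a run is what lets the same annotated project move between pipelines without ad hoc reformatting.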
8.
Gigascience ; 9(2), 2020 Feb 01.
Article in English | MEDLINE | ID: mdl-31995185

ABSTRACT

BACKGROUND: Reference genome assemblies are essential for high-throughput sequencing analysis projects. Typically, genome assemblies are stored on disk alongside related resources; e.g., many sequence aligners require the assembly to be indexed. The resulting indexes are broadly applicable for downstream analysis, so it makes sense to share them. However, there is no simple tool to do this. RESULTS: Here, we introduce refgenie, a reference genome assembly asset manager. Refgenie makes it easier to organize, retrieve, and share genome analysis resources. In addition to genome indexes, refgenie can manage any files related to reference genomes, including sequences and annotation files. Refgenie includes a command line interface and a server application that provides a RESTful API, so it is useful for both tool development and analysis. CONCLUSIONS: Refgenie streamlines sharing genome analysis resources among groups and across computing environments. Refgenie is available at https://refgenie.databio.org.


Subject(s)
Genome/genetics , Reference Standards , Software , Computational Biology , High-Throughput Nucleotide Sequencing/standards , Molecular Sequence Annotation/standards