Results 1 - 12 of 12
1.
J Proteome Res ; 20(4): 2056-2061, 2021 04 02.
Article in English | MEDLINE | ID: mdl-33625229

ABSTRACT

BioContainers is an open-source project that aims to create, store, and distribute bioinformatics software containers and packages. The BioContainers community has developed a set of guidelines to standardize software containers including the metadata, versions, licenses, and software dependencies. BioContainers supports multiple packaging and container technologies such as Conda, Docker, and Singularity. BioContainers provides over 9000 bioinformatics tools, including more than 200 proteomics and mass spectrometry tools. Here we introduce the BioContainers Registry and RESTful API to make containerized bioinformatics tools more findable, accessible, interoperable, and reusable (FAIR). The BioContainers Registry provides a fast and convenient way to find and retrieve bioinformatics tool packages and containers. By doing so, it will increase the use of bioinformatics packages and containers while promoting replicability and reproducibility in research.


Subject(s)
Computational Biology , Proteomics , Registries , Reproducibility of Results , Software
2.
Bioinformatics ; 34(12): 2116-2122, 2018 06 15.
Article in English | MEDLINE | ID: mdl-29385404

ABSTRACT

Motivation: At the same time that toxicologists express increasing concern about reproducibility in this field, the development of dedicated databases has already smoothed the path toward improving the storage and exchange of raw toxicogenomic data. Nevertheless, none provides access to analyzed and interpreted data as originally reported in scientific publications. Given the increasing demand for access to this information, we developed TOXsIgN, a repository for TOXicogenomic sIgNatures. Results: The TOXsIgN repository provides a flexible environment that facilitates online submission, storage and retrieval of toxicogenomic signatures by the scientific community. It currently hosts 754 projects that describe more than 450 distinct chemicals and their 8491 associated signatures. It also provides users with a working environment containing a powerful search engine as well as bioinformatics/biostatistics modules that enable signature comparisons or enrichment analyses. Availability and implementation: The TOXsIgN repository is freely accessible at http://toxsign.genouest.org. Website implemented in Python, JavaScript and MongoDB, with all major browsers supported. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Databases, Factual , Software , Toxicogenetics/methods , Animals , Humans
3.
F1000Res ; 7, 2018.
Article in English | MEDLINE | ID: mdl-31543945

ABSTRACT

Software containers are changing the way scientists and researchers develop, deploy and exchange scientific software. They allow labs of all sizes to easily install bioinformatics software, maintain multiple versions of the same software and combine tools into powerful analysis pipelines. However, containers and software packages should be produced under certain rules and standards in order to be reusable, compatible and easy to integrate into pipelines and analysis workflows. Here, we present a set of recommendations developed by the BioContainers Community to produce standardized bioinformatics packages and containers. These recommendations provide practical guidelines to make bioinformatics software more discoverable, reusable and transparent. They aim to guide developers, organisations, journals and funders in increasing the quality and sustainability of research software.


Subject(s)
Computational Biology , Software , Humans , Research Personnel , Workflow
4.
Article in English | MEDLINE | ID: mdl-27173522

ABSTRACT

Among the 20 000 human gene products predicted from genome annotation, about 3000 still lack validation at the protein level. We developed PepPSy, a user-friendly gene expression-based prioritization system, to help investigators determine in which human tissues they should look for an unseen protein. PepPSy can also be used by biocurators to revisit the annotation of specific categories of proteins based on the 'omics' data housed by the system. In this study, it was used to prioritize 21 dubious protein-coding genes among the 616 annotated in neXtProt for reannotation. PepPSy is freely available at http://peppsy.genouest.org. Database URL: http://peppsy.genouest.org.


Subject(s)
Databases, Genetic , Internet , Proteins/genetics , User-Computer Interface , Computational Biology , Database Management Systems , Humans , Molecular Sequence Annotation , Workflow
5.
Nucleic Acids Res ; 43(W1): W109-16, 2015 Jul 01.
Article in English | MEDLINE | ID: mdl-25883147

ABSTRACT

We report the development of the ReproGenomics Viewer (RGV), a multi- and cross-species working environment for the visualization, mining and comparison of published omics data sets for the reproductive science community. The system currently embeds 15 published data sets related to gametogenesis from nine model organisms. Data sets have been curated and conveniently organized into broad categories including biological topics, technologies, species and publications. RGV's modular design for both organisms and genomic tools enables users to upload and compare their data with that from the data sets embedded in the system in a cross-species manner. The RGV is freely available at http://rgv.genouest.org.


Subject(s)
Gametogenesis/genetics , Software , Animals , Data Mining , Female , Genomics , Humans , Internet , Male , Mice , Rats , Spermatogenesis/genetics
6.
RNA ; 21(5): 1005-17, 2015 May.
Article in English | MEDLINE | ID: mdl-25805861

ABSTRACT

A large number of regulatory RNAs (sRNAs) have been identified in a wide range of bacteria. We designed and implemented a new resource for the hundreds of sRNAs identified in Staphylococci, with primary focus on the human pathogen Staphylococcus aureus. The "Staphylococcal Regulatory RNA Database" (SRD, http://srd.genouest.org/) compiles all published data in a single interface including genetic locations, sequences and other features. SRD proposes novel and simplified identifiers for Staphylococcal regulatory RNAs (srn) based on the sRNA's genetic location in S. aureus strain N315, which served as a reference. From a set of 894 sequences and after an in-depth cleaning, SRD provides a list of 575 srn free of redundant sequences. For each sRNA, the experimental support is provided, allowing the user to assess its validity and significance individually. RNA-seq analysis performed on strains N315, NCTC8325, and Newman allowed us to provide further details, upgrade the initial annotation, and identify 159 RNA-seq independent transcribed sRNAs. The lists of 575 and 159 sRNA sequences were used to predict the number and location of srns in 18 S. aureus strains and 10 other Staphylococci. A comparison of the srn contents within 32 Staphylococcal genomes revealed a poor conservation between species. In addition, sRNA structure predictions obtained with MFold are accessible. A BLAST server and the intaRNA program, which is dedicated to target prediction, were implemented. SRD is the first sRNA database centered on a genus; it is a user-friendly and scalable resource with the possibility to submit new sequences as they appear in the literature.


Subject(s)
Databases, Nucleic Acid , Gene Expression Regulation, Bacterial/genetics , RNA, Bacterial/genetics , Staphylococcus aureus/genetics , Base Sequence , Chromosome Mapping , Computational Biology , Genome, Bacterial , Phylogeny , RNA, Small Untranslated , Sequence Analysis, RNA , Software
7.
F1000Res ; 4: 1443, 2015.
Article in English | MEDLINE | ID: mdl-26913191

ABSTRACT

Linux container technologies, as represented by Docker, provide an alternative to complex and time-consuming installation processes needed for scientific software. The ease of deployment and the process isolation they enable, as well as the reproducibility they permit across environments and versions, are among the qualities that make them interesting candidates for the construction of bioinformatic infrastructures, at any scale from single workstations to high throughput computing architectures. The Docker Hub is a public registry which can be used to distribute bioinformatic software as Docker images. However, its lack of curation and its genericity make it difficult for a bioinformatics user to find the most appropriate images. BioShaDock is a bioinformatics-focused Docker registry, which provides a local and fully controlled environment to build and publish bioinformatic software as portable Docker images. It provides a number of improvements over the base Docker registry in authentication and permissions management that enable its integration in existing bioinformatic infrastructures such as computing platforms. The metadata associated with the registered images are domain-centric, including for instance concepts defined in the EDAM ontology, a shared and structured vocabulary of commonly used terms in bioinformatics. Each registry entry also includes user-defined tags to facilitate its discovery, as well as a link to the tool description in the ELIXIR registry if it already exists. If it does not, BioShaDock will synchronize with the ELIXIR registry to create a new description there, based on the BioShaDock entry metadata. This link will help users get more information on the tool such as its EDAM operations, input and output types. This allows integration with the ELIXIR Tools and Data Services Registry, thus providing the appropriate visibility of such images to the bioinformatics community.

8.
BMC Bioinformatics ; 15 Suppl 14: S7, 2014.
Article in English | MEDLINE | ID: mdl-25472764

ABSTRACT

BACKGROUND: Computational biology comprises a wide range of technologies and approaches. Multiple technologies can be combined to create more powerful workflows if the individuals contributing the data or providing tools for its interpretation can find mutual understanding and consensus. Much conversation and joint investigation are required in order to identify and implement the best approaches. Traditionally, scientific conferences feature talks presenting novel technologies or insights, followed up by informal discussions during coffee breaks. In multi-institution collaborations, in order to reach agreement on implementation details or to transfer deeper insights in a technology and practical skills, a representative of one group typically visits the other. However, this does not scale well when the number of technologies or research groups is large. Conferences have responded to this issue by introducing Birds-of-a-Feather (BoF) sessions, which offer an opportunity for individuals with common interests to intensify their interaction. However, parallel BoF sessions often make it hard for participants to join multiple BoFs and find common ground between the different technologies, and BoFs are generally too short to allow time for participants to program together. RESULTS: This report summarises our experience with computational biology Codefests, Hackathons and Sprints, which are interactive developer meetings. They are structured to reduce the limitations of traditional scientific meetings described above by strengthening the interaction among peers and letting the participants determine the schedule and topics. These meetings are commonly run as loosely scheduled "unconferences" (self-organized identification of participants and topics for meetings) over at least two days, with early introductory talks to welcome and organize contributors, followed by intensive collaborative coding sessions. 
We summarise some prominent achievements of those meetings and describe differences in how these are organised, how their audience is addressed, and their outreach to their respective communities. CONCLUSIONS: Hackathons, Codefests and Sprints share a stimulating atmosphere that encourages participants to jointly brainstorm and tackle problems of shared interest in a self-driven proactive environment, as well as providing an opportunity for new participants to get involved in collaborative projects.


Subject(s)
Computational Biology , Cooperative Behavior , Software , Communication , Internet
9.
Biol Reprod ; 91(1): 5, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24740603

ABSTRACT

Mammalian spermatogenesis is a complex and highly orchestrated combination of processes in which male germline proliferation and differentiation result in the production of mature spermatozoa. Although recent genome-wide studies have contributed to the in-depth analysis of the male germline protein-encoding transcriptome, little effort has yet been devoted to the systematic identification of novel unannotated transcribed regions expressed during mammalian spermatogenesis. We report high-resolution expression profiling of male germ cells in rat, using next-generation sequencing technology and highly enriched testicular cell populations. Among 20 424 high-confidence transcripts reconstructed, we defined a stringent set of 1419 long multi-exonic unannotated transcripts expressed in the testis (testis-expressed unannotated transcripts [TUTs]). TUTs were divided into 7 groups with different expression patterns. Most TUTs share many of the characteristics of vertebrate long noncoding RNAs (lncRNAs). We also markedly reinforced the finding that TUTs and known lncRNAs accumulate during the meiotic and postmeiotic stages of spermatogenesis in mammals and that X-linked meiotic TUTs do not escape the silencing effects of meiotic sex chromosome inactivation. Importantly, we discovered that TUTs and known lncRNAs with a peak expression during meiosis define a distinct class of noncoding transcripts that exhibit exons twice as long as those of other transcripts. Our study provides new insights into transcriptional profiling of the male germline and represents a high-quality resource for novel loci expressed during spermatogenesis that significantly contributes to rat genome annotation.


Subject(s)
Gene Expression Profiling/methods , Spermatogenesis/genetics , Spermatozoa/cytology , Testis/cytology , Animals , Male , Oligonucleotide Array Sequence Analysis , Rats , Rats, Sprague-Dawley , Spermatozoa/metabolism , Testis/metabolism , Transcription, Genetic
10.
PLoS One ; 7(11): e50653, 2012.
Article in English | MEDLINE | ID: mdl-23209799

ABSTRACT

BACKGROUND: There has been a surge in studies linking genome structure and gene expression, with special focus on duplicated genes. Although initially duplicated from the same sequence, duplicated genes can diverge strongly over evolution and take on different functions or regulated expression. However, information on the function and expression of duplicated genes remains sparse. Identifying groups of duplicated genes in different genomes and characterizing their expression and function would therefore be of great interest to the research community. The 'Duplicated Genes Database' (DGD) was developed for this purpose. METHODOLOGY: Nine species were included in the DGD. For each species, BLAST analyses were conducted on peptide sequences corresponding to the genes mapped on the same chromosome. Groups of duplicated genes were defined based on these pairwise BLAST comparisons and the genomic location of the genes. For each group, Pearson correlations between gene expression data and semantic similarities between functional GO annotations were also computed when the relevant information was available. CONCLUSIONS: The Duplicated Genes Database provides a list of co-localised and duplicated genes for several species with the available gene co-expression level and semantic similarity value of functional annotation. Adding these data to the groups of duplicated genes provides biological information that can prove useful to gene expression analyses. The Duplicated Genes Database can be freely accessed through the DGD website at http://dgd.genouest.org.
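
The per-group co-expression step described in the METHODOLOGY can be illustrated with a minimal sketch. It assumes, purely for illustration, that a group's co-expression level is the mean of the pairwise Pearson correlations between its genes; the abstract does not say how the DGD aggregates correlations, and the function names and expression values below are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return None  # correlation is undefined for a constant profile
    return cov / sqrt(var_x * var_y)


def group_coexpression(expression, group):
    """Mean pairwise Pearson correlation over the genes of a group that have data.

    Assumption: averaging is only one possible way to summarise a group;
    it is not taken from the DGD paper. Returns None when no pair is available.
    """
    genes = [g for g in group if g in expression]
    values = [pearson(expression[a], expression[b]) for a, b in combinations(genes, 2)]
    values = [v for v in values if v is not None]
    return sum(values) / len(values) if values else None


# Hypothetical expression profiles (one value per sample); not real DGD data.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}

if __name__ == "__main__":
    print(group_coexpression(expression, ["geneA", "geneB", "geneC"]))
```

Returning None for groups without usable pairs mirrors the abstract's caveat that values were computed only "when the relevant information was available".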


Subject(s)
Databases, Genetic , Genes, Duplicate/genetics , Internet
11.
BMC Bioinformatics ; 13: 175, 2012 Jul 24.
Article in English | MEDLINE | ID: mdl-22827839

ABSTRACT

BACKGROUND: Seqcrawler has its roots in software like SRS or Lucegene. It provides an indexing platform to ease the search of data and meta-data in biological banks and it can scale to face the current flow of data. While many biological bank search tools are available on the Internet, mainly provided by large organizations to search their data, there is a lack of free and open source solutions to browse one's own set of data with a flexible query system that can scale from a single computer to a cloud system. A personal index platform will help labs and bioinformaticians to search their meta-data but also to build a larger information system with custom subsets of data. RESULTS: The software is scalable from a single computer to a cloud-based infrastructure. It has been successfully tested in a private cloud with 3 index shards (pieces of index) hosting ~400 million pieces of sequence information (whole GenBank, UniProt, PDB and others) for a total size of 600 GB in a fault-tolerant architecture (high availability). It has also been successfully integrated with software to add extra meta-data from BLAST results to enhance users' result analysis. CONCLUSIONS: Seqcrawler provides a complete open source search and store solution for labs or platforms needing to manage large amounts of data/meta-data with a flexible and customizable web interface. All components (search engine, visualization and data storage), though independent, share a common and coherent data system that can be queried with a simple HTTP interface. The solution scales easily and can also provide a high-availability infrastructure.
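
The idea of index shards ("pieces of index") mentioned in the RESULTS can be sketched with a toy in-memory example. This is not Seqcrawler's implementation: the class names, the CRC32-based routing, and the sample records are assumptions made only to show how documents can be spread over several shards while a query fans out to all of them and merges the hits.

```python
import zlib
from collections import defaultdict


class Shard:
    """One piece of the index: maps each term to the ids of documents containing it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, term):
        # .get() avoids creating empty entries for unknown terms
        return self.postings.get(term.lower(), set())


class ShardedIndex:
    """Routes each document to one shard; queries every shard and merges results."""

    def __init__(self, n_shards=3):
        self.shards = [Shard() for _ in range(n_shards)]

    def _route(self, doc_id):
        # CRC32 gives a routing that is stable across runs (unlike built-in hash())
        return self.shards[zlib.crc32(doc_id.encode("utf-8")) % len(self.shards)]

    def add(self, doc_id, text):
        self._route(doc_id).add(doc_id, text)

    def search(self, term):
        hits = set()
        for shard in self.shards:
            hits |= shard.search(term)
        return sorted(hits)


# Hypothetical records; identifiers and descriptions are made up for illustration.
index = ShardedIndex(n_shards=3)
index.add("seq1", "protein kinase human")
index.add("seq2", "kinase inhibitor")
index.add("seq3", "ribosomal protein")

if __name__ == "__main__":
    print(index.search("kinase"))
```

Because every query visits all shards, results do not depend on which shard holds a record; adding shards spreads storage and indexing load at the cost of a wider fan-out per query.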


Subject(s)
Abstracting and Indexing , Software , Internet , Search Engine , User-Computer Interface
12.
Nucleic Acids Res ; 40(Web Server issue): W458-65, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22570409

ABSTRACT

We present the gene prioritization system (GPSy), a cross-species gene prioritization system that facilitates the arduous but critical task of prioritizing genes for follow-up functional analyses. GPSy's modular design with regard to species, data sets and scoring strategies enables users to formulate queries in a highly flexible manner. Currently, the system encompasses 20 topics related to conserved biological processes, including male gamete development, which is discussed in this article. The web server-based tool is freely available at http://gpsy.genouest.org.


Subject(s)
Genes , Software , Spermatogenesis/genetics , Animals , Caenorhabditis elegans/genetics , Caenorhabditis elegans/physiology , Gene Expression , Genomics/methods , Internet , Male , Models, Animal , Molecular Sequence Annotation , RNA Interference