Results 1 - 20 of 22
1.
Bioinformatics ; 38(6): 1741-1742, 2022 03 04.
Article in English | MEDLINE | ID: mdl-34962976

ABSTRACT

SUMMARY: The assessment of novel phylogenetic models and inference methods is routinely being conducted via experiments on simulated as well as empirical data. When generating synthetic data it is often unclear how to set simulation parameters for the models and generate trees that appropriately reflect empirical model parameter distributions and tree shapes. As a solution, we present and make available a new database called 'RAxML Grove' currently comprising more than 60 000 inferred trees and respective model parameter estimates from fully anonymized empirical datasets that were analyzed using RAxML and RAxML-NG on two web servers. We also describe and make available two simple applications of RAxML Grove to exemplify its usage and highlight its utility for designing realistic simulation studies and analyzing empirical model parameter and tree shape distributions. AVAILABILITY AND IMPLEMENTATION: RAxML Grove is freely available at https://github.com/angtft/RAxMLGrove. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Computers , Software , Phylogeny , Computer Simulation , Databases, Factual
2.
Nucleic Acids Res ; 49(W1): W216-W227, 2021 07 02.
Article in English | MEDLINE | ID: mdl-33849055

ABSTRACT

The SIB Swiss Institute of Bioinformatics (https://www.sib.swiss) creates, maintains and disseminates a portfolio of reliable and state-of-the-art bioinformatics services and resources for the storage, analysis and interpretation of biological data. Through Expasy (https://www.expasy.org), the Swiss Bioinformatics Resource Portal, the scientific community worldwide freely accesses more than 160 SIB resources supporting a wide range of life science and biomedical research areas. In 2020, Expasy was redesigned through a user-centric approach, known as User-Centred Design (UCD), whose aim is to create user interfaces that are easy to use, efficient and targeted at the intended community. This approach, widely used in other fields such as marketing, e-commerce, and design of mobile applications, is still scarcely explored in bioinformatics. In total, around 50 people were actively involved, including internal stakeholders and end-users. In addition to an optimised interface that meets users' needs and expectations, the new version of Expasy provides an up-to-date and accurate description of high-quality resources based on a standardised ontology, which makes it possible to connect functionally related resources.


Subject(s)
Computational Biology , Databases, Factual , Software , User-Computer Interface
3.
Stud Health Technol Inform ; 270: 1170-1174, 2020 Jun 16.
Article in English | MEDLINE | ID: mdl-32570566

ABSTRACT

The BioMedIT project is funded by the Swiss government as an integral part of the Swiss Personalized Health Network (SPHN), aiming to provide researchers with access to a secure, powerful and versatile IT infrastructure for doing data-driven research on sensitive biomedical data while ensuring data privacy protection. The BioMedIT network gives researchers the ability to securely transfer, store, manage and process sensitive research data. The underlying BioMedIT nodes provide compute and storage capacity that can be used locally or through a federated environment. The network operates under a common Information Security Policy using state-of-the-art security techniques. It utilizes cloud computing, virtualization, compute accelerators (GPUs), big data storage as well as federation technologies to lower computational boundaries for researchers and to guarantee that sensitive data can be processed in a secure and lawful way. Building on existing expertise and research infrastructure at the partnering Swiss institutions, the BioMedIT network establishes a competitive Swiss private-cloud - a secure national infrastructure resource that can be used by researchers of Swiss universities, hospitals and other research institutions.


Subject(s)
Information Storage and Retrieval , Big Data , Cloud Computing , Computer Security , Privacy
5.
Database (Oxford) ; 2019, 2019 01 01.
Article in English | MEDLINE | ID: mdl-31697362

ABSTRACT

MOTIVATION: Data integration promises to be one of the main catalysts in enabling new insights to be drawn from the wealth of biological data available publicly. However, the heterogeneity of the different data sources, both at the syntactic and the semantic level, still poses significant challenges for achieving interoperability among biological databases. RESULTS: We introduce an ontology-based federated approach for data integration. We applied this approach to three heterogeneous data stores that span different areas of biological knowledge: (i) Bgee, a gene expression relational database; (ii) Orthologous Matrix (OMA), a Hierarchical Data Format 5 orthology data store; and (iii) UniProtKB, a Resource Description Framework (RDF) store containing protein sequence and functional information. To enable federated queries across these sources, we first defined a new semantic model for gene expression called GenEx. We then show how the relational data in Bgee can be expressed as a virtual RDF graph, instantiating GenEx, through dedicated relational-to-RDF mappings. By applying these mappings, Bgee data are now accessible through a public SPARQL endpoint. Similarly, the materialized RDF data of OMA, expressed in terms of the Orthology ontology, are made available in a public SPARQL endpoint. We identified and formally described intersection points (i.e. virtual links) among the three data sources. These allow performing joint queries across the data stores. Finally, we lay the groundwork to enable nontechnical users to benefit from the integrated data, by providing a natural language template-based search interface.


Subject(s)
Biological Ontologies , Computational Biology , Databases, Factual , Semantic Web
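
The federated querying described in the abstract above can be illustrated with the SPARQL 1.1 SERVICE keyword, which sends part of a query to a remote endpoint. The following is only a minimal sketch: the endpoint URLs passed in, the `ex:` vocabulary and its three predicates are hypothetical placeholders, not the real Bgee, OMA or UniProtKB endpoints or the GenEx and Orthology ontology terms. The shared variables (`?gene`, `?ortholog`) play the role of the "virtual links" that join the three sources.

```python
def build_federated_query(expression_endpoint: str,
                          orthology_endpoint: str,
                          protein_endpoint: str) -> str:
    """Build a SPARQL 1.1 federated query over three endpoints.

    All predicates in the ex: namespace are hypothetical and stand in for
    whatever vocabulary the real data stores expose.
    """
    return f"""PREFIX ex: <https://example.org/vocab#>
SELECT ?gene ?anatomy ?ortholog ?protein
WHERE {{
  # Gene expression source (hypothetical predicate)
  SERVICE <{expression_endpoint}> {{
    ?gene ex:expressedIn ?anatomy .
  }}
  # Orthology source, joined on the shared ?gene variable
  SERVICE <{orthology_endpoint}> {{
    ?gene ex:orthologOf ?ortholog .
  }}
  # Protein source, joined on the shared ?ortholog variable
  SERVICE <{protein_endpoint}> {{
    ?protein ex:encodedBy ?ortholog .
  }}
}}"""


if __name__ == "__main__":
    # Placeholder URLs; substitute the endpoints of the stores being queried.
    print(build_federated_query(
        "https://example.org/expression/sparql",
        "https://example.org/orthology/sparql",
        "https://example.org/protein/sparql",
    ))
```

The function only assembles the query text; sending it to a SPARQL processor that supports federation is left out, since that depends on the client library in use.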
6.
Genome Biol ; 20(1): 164, 2019 08 12.
Article in English | MEDLINE | ID: mdl-31405382

ABSTRACT

Bioinformaticians and biologists rely increasingly upon workflows for the flexible utilization of the many life science tools that are needed to optimally convert data into knowledge. We outline a pan-European enterprise to provide a catalogue (https://bio.tools) of tools and databases that can be used in these workflows. bio.tools not only lists where to find resources, but also provides a wide variety of practical information.


Subject(s)
Biological Science Disciplines , Databases, Factual , Software , Internet
7.
Nat Biotechnol ; 37(4): 480, 2019 04.
Article in English | MEDLINE | ID: mdl-30894680

ABSTRACT

In the version of this article initially published, Lena Dolman's second affiliation was given as Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK. The correct second affiliation is Ontario Institute for Cancer Research, Toronto, Ontario, Canada. The error has been corrected in the HTML and PDF versions of the article.

9.
F1000Res ; 5, 2016.
Article in English | MEDLINE | ID: mdl-27803796

ABSTRACT

The core mission of ELIXIR is to build a stable and sustainable infrastructure for biological information across Europe. At the heart of this are the data resources, tools and services that ELIXIR offers to the life-sciences community, providing stable and sustainable access to biological data. ELIXIR aims to ensure that these resources are available long-term and that the life-cycles of these resources are managed such that they support the scientific needs of the life-sciences, including biological research. ELIXIR Core Data Resources are defined as a set of European data resources that are of fundamental importance to the wider life-science community and the long-term preservation of biological data. They are complete collections of generic value to life-science, are considered an authority in their field with respect to one or more characteristics, and show high levels of scientific quality and service. Thus, ELIXIR Core Data Resources are of wide applicability and usage. This paper describes the structures, governance and processes that support the identification and evaluation of ELIXIR Core Data Resources. It identifies key indicators which reflect the essence of the definition of an ELIXIR Core Data Resource and support the promotion of excellence in resource development and operation. It describes the specific indicators in more detail and explains their application within ELIXIR's sustainability strategy and science policy actions, and in capacity building, life-cycle management and technical actions. The identification process is currently being implemented and tested for the first time. The findings and outcome will be evaluated by the ELIXIR Scientific Advisory Board in March 2017. Establishing the portfolio of ELIXIR Core Data Resources and ELIXIR Services is a key priority for ELIXIR and publicly marks the transition towards a cohesive infrastructure.

10.
F1000Res ; 5, 2016.
Article in English | MEDLINE | ID: mdl-28232860

ABSTRACT

ISMARA (ismara.unibas.ch) automatically infers the key regulators and regulatory interactions from high-throughput gene expression or chromatin state data. However, given the large sizes of current next generation sequencing (NGS) datasets, data uploading times are a major bottleneck. Additionally, for proprietary data, users may be uncomfortable with uploading entire raw datasets to an external server. Both these problems could be alleviated by providing a means by which users could pre-process their raw data locally, transferring only a small summary file to the ISMARA server. We developed a stand-alone client application that pre-processes large input files (RNA-seq or ChIP-seq data) on the user's computer for performing ISMARA analysis in a completely automated manner, including uploading of small processed summary files to the ISMARA server. This reduces file sizes by up to a factor of 1000, and upload times from many hours to mere seconds. The client application is available from ismara.unibas.ch/ISMARA/client.

11.
BMC Bioinformatics ; 16: 394, 2015 Nov 23.
Article in English | MEDLINE | ID: mdl-26597459

ABSTRACT

BACKGROUND: Available methods to simulate nucleotide or amino acid data typically use Markov models to simulate each position independently. These approaches are not appropriate to assess the performance of combinatorial and probabilistic methods that look for coevolving positions in nucleotide or amino acid sequences. RESULTS: We have developed a web-based platform that gives user-friendly access to two phylogenetic-based methods implementing the Coev model: the evaluation of coevolving scores and the simulation of coevolving positions. We have also extended the capabilities of the Coev model to allow for the generalization of the alphabet used in the Markov model, which can now analyse both nucleotide and amino acid data sets. The simulation of coevolving positions is novel and builds upon the developments of the Coev model. It allows users to simulate pairs of dependent nucleotide or amino acid positions. CONCLUSIONS: The main focus of our paper is the new simulation method we present for coevolving positions. The implementation of this method is embedded within the web platform Coev-web that is freely accessible at http://coev.vital-it.ch/, and was tested in most modern web browsers.


Subject(s)
Amino Acids/metabolism , Computational Biology/methods , Evolution, Molecular , Internet , Phylogeny , Sequence Analysis, DNA/methods , Software , Algorithms , Bayes Theorem , Humans
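
To make concrete the baseline that the abstract above contrasts with, here is a minimal sketch of simulating each nucleotide position independently along a single branch. It assumes the Jukes-Cantor (JC69) substitution model, which is a choice made for this illustration only; it is not the Coev model, and the function names are hypothetical. Under JC69, a site is unchanged after a branch of length d (expected substitutions per site) with probability 1/4 + 3/4 * exp(-4d/3), and otherwise ends in one of the three other bases with equal probability.

```python
import math
import random

NUCLEOTIDES = "ACGT"


def jc69_same_prob(branch_length: float) -> float:
    """Probability that a site holds the same base at both ends of a branch (JC69)."""
    return 0.25 + 0.75 * math.exp(-4.0 * branch_length / 3.0)


def evolve_independent(seq: str, branch_length: float, rng: random.Random) -> str:
    """Evolve every position of seq independently along one branch.

    No position influences any other, which is exactly why this kind of
    simulation cannot produce coevolving pairs of sites.
    """
    p_same = jc69_same_prob(branch_length)
    out = []
    for base in seq:
        if rng.random() < p_same:
            out.append(base)  # site ends in the same state it started in
        else:
            # site ends in one of the three other bases, chosen uniformly
            out.append(rng.choice([b for b in NUCLEOTIDES if b != base]))
    return "".join(out)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the run is repeatable
    print(evolve_independent("ACGTACGTACGT", 0.5, rng))
```

A simulator for dependent positions would instead draw the states of a pair of sites jointly, which is the gap the abstract says the Coev-based simulation fills.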
12.
Bioinformatics ; 31(18): 3051-3, 2015 Sep 15.
Article in English | MEDLINE | ID: mdl-25987568

ABSTRACT

MOTIVATION: Currently, more than 40 sequence tandem repeat detectors are published, providing heterogeneous, partly complementary, partly conflicting results. RESULTS: We present TRAL, a tandem repeat annotation library that allows running and parsing of various detection outputs, clustering of redundant or overlapping annotations, several statistical frameworks for filtering false positive annotations, and importantly a tandem repeat annotation and refinement module based on circular profile hidden Markov models (cpHMMs). Using TRAL, we evaluated the performance of a multi-step tandem repeat annotation workflow on 547 085 sequences in UniProtKB/Swiss-Prot. The researcher can use these results to predict run-times for specific datasets, and to choose annotation complexity accordingly. AVAILABILITY AND IMPLEMENTATION: TRAL is an open-source Python 3 library and is available, together with documentation and tutorials via http://www.vital-it.ch/software/tral. CONTACT: elke.schaper@isb-sib.ch.


Subject(s)
Databases, Protein , Knowledge Bases , Molecular Sequence Annotation , Software , Tandem Repeat Sequences/genetics , Amino Acid Sequence , Cluster Analysis , Documentation , Gene Library , Humans , Molecular Sequence Data
13.
Nucleic Acids Res ; 42(Web Server issue): W436-41, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24792157

ABSTRACT

The SIB Swiss Institute of Bioinformatics (www.isb-sib.ch) was created in 1998 as an institution to foster excellence in bioinformatics. It is renowned worldwide for its databases and software tools, such as UniProtKB/Swiss-Prot, PROSITE, SWISS-MODEL, STRING, etc, that are all accessible on ExPASy.org, SIB's Bioinformatics Resource Portal. This article provides an overview of the scientific and training resources SIB has consistently been offering to the life science community for more than 15 years.


Subject(s)
Computational Biology , Databases, Chemical , Software , Biological Evolution , Biostatistics , Drug Design , Genomics , Humans , Internet , Protein Conformation , Proteomics , Systems Biology
14.
Bioinformatics ; 30(8): 1129-1137, 2014 04 15.
Article in English | MEDLINE | ID: mdl-24389654

ABSTRACT

MOTIVATION: The detection of positive selection is widely used to study gene and genome evolution, but its application remains limited by the high computational cost of existing implementations. We present a series of computational optimizations for more efficient estimation of the likelihood function on large-scale phylogenetic problems. We illustrate our approach using the branch-site model of codon evolution. RESULTS: We introduce novel optimization techniques that substantially outperform both CodeML from the PAML package and our previously optimized sequential version SlimCodeML. These techniques can also be applied to other likelihood-based phylogeny software. Our implementation scales well for large numbers of codons and/or species. It can therefore analyse substantially larger datasets than CodeML. We evaluated FastCodeML on different platforms and measured average sequential speedups of FastCodeML (single-threaded) versus CodeML of up to 5.8, average speedups of FastCodeML (multi-threaded) versus CodeML on a single node (shared memory) of up to 36.9 for 12 CPU cores, and average speedups of the distributed FastCodeML versus CodeML of up to 170.9 on eight nodes (96 CPU cores in total). AVAILABILITY AND IMPLEMENTATION: ftp://ftp.vital-it.ch/tools/FastCodeML/ CONTACT: selectome@unil.ch or nicolas.salamin@unil.ch.


Subject(s)
Evolution, Molecular , Phylogeny , Selection, Genetic , Software , Algorithms , Codon , Computational Biology , Likelihood Functions
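
The speedup figures quoted in the FastCodeML abstract above follow the usual definition of speedup as baseline run time divided by optimized run time. The sketch below shows that definition and the related notion of parallel efficiency (speedup per core). The helper names are hypothetical, and dividing the abstract's "up to" averages by one another is an assumption made purely for illustration, since those maxima may come from different datasets.

```python
def speedup(baseline_seconds: float, optimized_seconds: float) -> float:
    """Speedup of an optimized run relative to a baseline run."""
    if optimized_seconds <= 0:
        raise ValueError("optimized_seconds must be positive")
    return baseline_seconds / optimized_seconds


def parallel_efficiency(speedup_vs_serial: float, cores: int) -> float:
    """Fraction of ideal linear scaling achieved on the given number of cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup_vs_serial / cores


def relative_scaling(parallel_speedup: float,
                     sequential_speedup: float,
                     cores: int) -> float:
    """Efficiency of the parallel code relative to its own single-threaded version.

    Both speedups are measured against the same external baseline, so their
    ratio approximates the gain due to parallelism alone.
    """
    return parallel_efficiency(parallel_speedup / sequential_speedup, cores)


if __name__ == "__main__":
    # Figures quoted in the abstract (speedups versus CodeML).
    sequential = 5.8
    for label, value, cores in [("shared memory", 36.9, 12),
                                ("distributed", 170.9, 96)]:
        print(label, round(relative_scaling(value, sequential, cores), 2))
```

The printed values are only a rough indication of scaling behaviour and should not be read as results reported by the authors.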
15.
Nucleic Acids Res ; 42(Database issue): D917-21, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24225318

ABSTRACT

Selectome (http://selectome.unil.ch/) is a database of positive selection, based on a branch-site likelihood test. This model estimates the number of nonsynonymous substitutions (dN) and synonymous substitutions (dS) to evaluate the variation in selective pressure (dN/dS ratio) over branches and over sites. Since the original release of Selectome, we have benchmarked and implemented a thorough quality control procedure on multiple sequence alignments, aiming to provide minimum false-positive results. We have also improved the computational efficiency of the branch-site test implementation, allowing larger data sets and more frequent updates. Release 6 of Selectome includes all gene trees from Ensembl for Primates and Glires, as well as a large set of vertebrate gene trees. A total of 6810 gene trees have some evidence of positive selection. Finally, the web interface has been improved to be more responsive and to facilitate searches and browsing.


Subject(s)
Databases, Nucleic Acid , Selection, Genetic , Genetic Variation , Genomics/standards , Humans , Internet , Quality Control , Sequence Alignment
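
The dN/dS ratio mentioned in the Selectome abstract above is commonly read as follows: a ratio above 1 suggests positive selection, a ratio below 1 suggests purifying selection, and a ratio near 1 is consistent with neutral evolution. The sketch below computes and labels the ratio from two given rates. It is not the branch-site likelihood test used by Selectome, which compares fitted models statistically; the function names and the tolerance parameter are hypothetical.

```python
def dn_ds_ratio(dn: float, ds: float) -> float:
    """Return omega = dN / dS for given nonsynonymous and synonymous rates."""
    if ds <= 0:
        raise ValueError("dS must be positive to form the ratio")
    if dn < 0:
        raise ValueError("dN cannot be negative")
    return dn / ds


def interpret_omega(omega: float, tolerance: float = 0.05) -> str:
    """Give the conventional reading of omega.

    The tolerance band around 1 is an arbitrary choice for this sketch,
    not a statistical criterion.
    """
    if omega > 1.0 + tolerance:
        return "positive selection"
    if omega < 1.0 - tolerance:
        return "purifying selection"
    return "neutral"


if __name__ == "__main__":
    omega = dn_ds_ratio(0.5, 0.25)
    print(omega, interpret_omega(omega))
```

In practice a ratio above 1 on its own is not evidence of selection; a test such as the branch-site likelihood test named in the abstract is needed to assess significance.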
16.
Stud Health Technol Inform ; 175: 59-68, 2012.
Article in English | MEDLINE | ID: mdl-22941988

ABSTRACT

One of the important questions in biological evolution is to know if certain changes along protein coding genes have contributed to the adaptation of species. This problem is known to be biologically complex and computationally very expensive. It, therefore, requires efficient Grid or cluster solutions to overcome the computational challenge. We have developed a Grid-enabled tool (gcodeml) that relies on the PAML (codeml) package to help analyse large phylogenetic datasets on both Grids and computational clusters. Although we report on results for gcodeml, our approach is applicable and customisable to related problems in biology or other scientific domains.


Subject(s)
Algorithms , DNA/genetics , Data Mining/methods , Databases, Genetic , Evolution, Molecular , Proteins/genetics , Sequence Analysis/methods , Software , User-Computer Interface
17.
Nucleic Acids Res ; 40(Web Server issue): W597-603, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22661580

ABSTRACT

ExPASy (http://www.expasy.org) has a worldwide reputation as one of the main bioinformatics resources for proteomics. It has now evolved, becoming an extensible and integrative portal accessing many scientific resources, databases and software tools in different areas of life sciences. Scientists can henceforth seamlessly access a wide range of resources in many different domains, such as proteomics, genomics, phylogeny/evolution, systems biology, population genetics, transcriptomics, etc. The individual resources (databases, web-based and downloadable software tools) are hosted in a 'decentralized' way by different groups of the SIB Swiss Institute of Bioinformatics and partner institutions. Specifically, a single web portal provides a common entry point to a wide range of resources developed and operated by different SIB groups and external institutions. The portal features a search function across 'selected' resources. Additionally, the availability and usage of resources are monitored. The portal is aimed at both expert users and people who are not familiar with a specific domain in life sciences. The new web interface provides, in particular, visual guidance for newcomers to ExPASy.


Subject(s)
Computational Biology , Proteomics , Software , Computer Graphics , Genomics , Internet , Systems Integration , User-Computer Interface
18.
Proc Natl Acad Sci U S A ; 108(14): 5679-84, 2011 Apr 05.
Article in English | MEDLINE | ID: mdl-21282665

ABSTRACT

Ants have evolved very complex societies and are key ecosystem members. Some ants, such as the fire ant Solenopsis invicta, are also major pests. Here, we present a draft genome of S. invicta, assembled from Roche 454 and Illumina sequencing reads obtained from a focal haploid male and his brothers. We used comparative genomic methods to obtain insight into the unique features of the S. invicta genome. For example, we found that this genome harbors four adjacent copies of vitellogenin. A phylogenetic analysis revealed that an ancestral vitellogenin gene first underwent a duplication that was followed by possibly independent duplications of each of the daughter vitellogenins. The vitellogenin genes have undergone subfunctionalization with queen- and worker-specific expression, possibly reflecting differential selection acting on the queen and worker castes. Additionally, we identified more than 400 putative olfactory receptors of which at least 297 are intact. This represents the largest repertoire reported so far in insects. S. invicta also harbors an expansion of a specific family of lipid-processing genes, two putative orthologs to the transformer/feminizer sex differentiation gene, a functional DNA methylation system, and a single putative telomerase ortholog. EST data indicate that this S. invicta telomerase ortholog has at least four spliceforms that differ in their use of two sets of mutually exclusive exons. Some of these and other unique aspects of the fire ant genome are likely linked to the complex social behavior of this species.


Subject(s)
Ants/genetics , Evolution, Molecular , Genome, Insect/genetics , Genomics/methods , Phylogeny , Animals , Base Sequence , Computational Biology , DNA Methylation , Expressed Sequence Tags , Hierarchy, Social , Male , Molecular Sequence Data , Receptors, Odorant/genetics , Sequence Analysis, DNA , Vitellogenins/genetics
19.
Stud Health Technol Inform ; 159: 55-63, 2010.
Article in English | MEDLINE | ID: mdl-20543426

ABSTRACT

Cloud computing has recently become very popular, and several bioinformatics applications exist already in that domain. The aim of this article is to analyse a current cloud system with respect to usability, benchmark its performance and compare its user friendliness with a conventional cluster job submission system. Given the current hype on the theme, user expectations are rather high, but current results show that neither the price/performance ratio nor the usage model is very satisfactory for large-scale embarrassingly parallel applications. However, for small to medium scale applications that require CPU time at certain peak times the cloud is a suitable alternative.


Subject(s)
Computing Methodologies , Medical Informatics Applications , Phylogeny , Software Design , Computational Biology
20.
Nucleic Acids Res ; 38(Web Server issue): W683-8, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20462862

ABSTRACT

The EMBRACE (European Model for Bioinformatics Research and Community Education) web service collection is the culmination of a 5-year project that set out to investigate issues involved in developing and deploying web services for use in the life sciences. The project concluded that in order for web services to achieve widespread adoption, standards must be defined for the choice of web service technology, for semantically annotating both service function and the data exchanged, and a mechanism for discovering services must be provided. Building on this, the project developed: EDAM, an ontology for describing life science web services; BioXSD, a schema for exchanging data between services; and a centralized registry (http://www.embraceregistry.net) that collects together around 1000 services developed by the consortium partners. This article presents the current status of the collection and its associated recommendations and standards definitions.


Subject(s)
Computational Biology , Software , Biological Science Disciplines , Information Dissemination , Internet , Registries , Systems Integration