Results 1 - 20 of 22
1.
Bioinformatics ; 38(6): 1741-1742, 2022 03 04.
Article in English | MEDLINE | ID: mdl-34962976

ABSTRACT

SUMMARY: The assessment of novel phylogenetic models and inference methods is routinely being conducted via experiments on simulated as well as empirical data. When generating synthetic data it is often unclear how to set simulation parameters for the models and generate trees that appropriately reflect empirical model parameter distributions and tree shapes. As a solution, we present and make available a new database called 'RAxML Grove' currently comprising more than 60 000 inferred trees and respective model parameter estimates from fully anonymized empirical datasets that were analyzed using RAxML and RAxML-NG on two web servers. We also describe and make available two simple applications of RAxML Grove to exemplify its usage and highlight its utility for designing realistic simulation studies and analyzing empirical model parameter and tree shape distributions. AVAILABILITY AND IMPLEMENTATION: RAxML Grove is freely available at https://github.com/angtft/RAxMLGrove. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Computers ; Software ; Phylogeny ; Computer Simulation ; Databases, Factual
2.
Nucleic Acids Res ; 49(W1): W216-W227, 2021 07 02.
Article in English | MEDLINE | ID: mdl-33849055

ABSTRACT

The SIB Swiss Institute of Bioinformatics (https://www.sib.swiss) creates, maintains and disseminates a portfolio of reliable and state-of-the-art bioinformatics services and resources for the storage, analysis and interpretation of biological data. Through Expasy (https://www.expasy.org), the Swiss Bioinformatics Resource Portal, the scientific community worldwide freely accesses more than 160 SIB resources supporting a wide range of life science and biomedical research areas. In 2020, Expasy was redesigned through a user-centric approach, known as User-Centred Design (UCD), whose aim is to create user interfaces that are easy to use, efficient and targeted at the intended community. This approach, widely used in other fields such as marketing, e-commerce, and design of mobile applications, is still scarcely explored in bioinformatics. In total, around 50 people were actively involved, including internal stakeholders and end-users. In addition to an optimised interface that meets users' needs and expectations, the new version of Expasy provides an up-to-date and accurate description of high-quality resources based on a standardised ontology, which allows functionally related resources to be connected.


Subjects
Computational Biology ; Databases, Factual ; Software ; User-Computer Interface
4.
Bioinformatics ; 31(18): 3051-3, 2015 Sep 15.
Article in English | MEDLINE | ID: mdl-25987568

ABSTRACT

MOTIVATION: Currently, more than 40 sequence tandem repeat detectors are published, providing heterogeneous, partly complementary, partly conflicting results. RESULTS: We present TRAL, a tandem repeat annotation library that allows running and parsing of various detection outputs, clustering of redundant or overlapping annotations, several statistical frameworks for filtering false positive annotations, and, importantly, a tandem repeat annotation and refinement module based on circular profile hidden Markov models (cpHMMs). Using TRAL, we evaluated the performance of a multi-step tandem repeat annotation workflow on 547 085 sequences in UniProtKB/Swiss-Prot. The researcher can use these results to predict run-times for specific datasets, and to choose annotation complexity accordingly. AVAILABILITY AND IMPLEMENTATION: TRAL is an open-source Python 3 library and is available, together with documentation and tutorials, via http://www.vital-it.ch/software/tral. CONTACT: elke.schaper@isb-sib.ch.


Subjects
Databases, Protein ; Knowledge Bases ; Molecular Sequence Annotation ; Software ; Tandem Repeat Sequences/genetics ; Amino Acid Sequence ; Cluster Analysis ; Documentation ; Gene Library ; Humans ; Molecular Sequence Data
5.
Nucleic Acids Res ; 42(Database issue): D917-21, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24225318

ABSTRACT

Selectome (http://selectome.unil.ch/) is a database of positive selection, based on a branch-site likelihood test. This model estimates the number of nonsynonymous substitutions (dN) and synonymous substitutions (dS) to evaluate the variation in selective pressure (dN/dS ratio) over branches and over sites. Since the original release of Selectome, we have benchmarked and implemented a thorough quality control procedure on multiple sequence alignments, aiming to minimize false-positive results. We have also improved the computational efficiency of the branch-site test implementation, allowing larger data sets and more frequent updates. Release 6 of Selectome includes all gene trees from Ensembl for Primates and Glires, as well as a large set of vertebrate gene trees. A total of 6810 gene trees have some evidence of positive selection. Finally, the web interface has been improved to be more responsive and to facilitate searches and browsing.
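
As a rough illustration of the dN/dS ratio mentioned in this abstract, the sketch below forms the ratio from two already-estimated values and applies the conventional reading: a ratio above 1 is consistent with positive selection, below 1 with purifying selection. It is not the branch-site likelihood test that Selectome relies on; the function names and the example values are assumptions made purely for illustration.

```python
def dn_ds_ratio(dn: float, ds: float) -> float:
    """Return omega = dN/dS for already-estimated substitution values."""
    if dn < 0:
        raise ValueError("dN must be non-negative")
    if ds <= 0:
        raise ValueError("dS must be positive to form the ratio")
    return dn / ds


def interpret_omega(omega: float, tolerance: float = 1e-9) -> str:
    """Map omega to the conventional qualitative reading."""
    if omega > 1 + tolerance:
        return "consistent with positive selection"
    if omega < 1 - tolerance:
        return "consistent with purifying selection"
    return "consistent with neutral evolution"


if __name__ == "__main__":
    # Hypothetical values, not taken from Selectome.
    for dn, ds in [(0.5, 0.25), (0.05, 0.25), (0.25, 0.25)]:
        omega = dn_ds_ratio(dn, ds)
        print(f"dN={dn}, dS={ds}, omega={omega:.2f}: {interpret_omega(omega)}")
```

A single gene-wide ratio hides variation among sites and lineages, which is why the abstract describes evaluating the ratio over branches and over sites.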


Subjects
Databases, Nucleic Acid ; Selection, Genetic ; Genetic Variation ; Genomics/standards ; Humans ; Internet ; Quality Control ; Sequence Alignment
6.
Nucleic Acids Res ; 42(Web Server issue): W436-41, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24792157

ABSTRACT

The SIB Swiss Institute of Bioinformatics (www.isb-sib.ch) was created in 1998 as an institution to foster excellence in bioinformatics. It is renowned worldwide for its databases and software tools, such as UniProtKB/Swiss-Prot, PROSITE, SWISS-MODEL, STRING, etc, that are all accessible on ExPASy.org, SIB's Bioinformatics Resource Portal. This article provides an overview of the scientific and training resources SIB has consistently been offering to the life science community for more than 15 years.


Subjects
Computational Biology ; Databases, Chemical ; Software ; Biological Evolution ; Biostatistics ; Drug Design ; Genomics ; Humans ; Internet ; Protein Conformation ; Proteomics ; Systems Biology
7.
BMC Bioinformatics ; 16: 394, 2015 Nov 23.
Article in English | MEDLINE | ID: mdl-26597459

ABSTRACT

BACKGROUND: Available methods to simulate nucleotide or amino acid data typically use Markov models to simulate each position independently. These approaches are not appropriate to assess the performance of combinatorial and probabilistic methods that look for coevolving positions in nucleotide or amino acid sequences. RESULTS: We have developed a web-based platform that gives user-friendly access to two phylogenetic-based methods implementing the Coev model: the evaluation of coevolving scores and the simulation of coevolving positions. We have also extended the capabilities of the Coev model to allow for the generalization of the alphabet used in the Markov model, which can now analyse both nucleotide and amino acid data sets. The simulation of coevolving positions is novel and builds upon the developments of the Coev model. It allows users to simulate pairs of dependent nucleotide or amino acid positions. CONCLUSIONS: The main focus of our paper is the new simulation method we present for coevolving positions. The implementation of this method is embedded within the web platform Coev-web that is freely accessible at http://coev.vital-it.ch/, and was tested in most modern web browsers.
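
To make the baseline in the BACKGROUND concrete, here is a minimal sketch of independent-site simulation along a single branch. It uses the Jukes-Cantor (JC69) nucleotide model as a stand-in and is not the Coev model; the function name and parameters are assumptions for illustration only.

```python
import math
import random

NUCLEOTIDES = "ACGT"


def evolve_independently(sequence: str, distance: float, rng: random.Random) -> str:
    """Evolve every site on its own under JC69.

    `distance` is the branch length in expected substitutions per site.
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    # JC69 probability that a site shows the same base at both ends of the branch.
    p_same = 0.25 + 0.75 * math.exp(-4.0 * distance / 3.0)
    evolved = []
    for base in sequence:
        if base not in NUCLEOTIDES:
            raise ValueError(f"unexpected character: {base!r}")
        if rng.random() < p_same:
            evolved.append(base)
        else:
            # Under JC69 the three other bases are equally likely.
            evolved.append(rng.choice([b for b in NUCLEOTIDES if b != base]))
    return "".join(evolved)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so that runs are repeatable
    print(evolve_independently("ACGTACGTACGT", 0.3, rng))
```

Because each site is drawn separately, no pair of columns can carry a coevolution signal by construction, which is the limitation the abstract points out; a simulator for dependent pairs would instead draw the two positions of a pair jointly.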


Subjects
Amino Acids/metabolism ; Computational Biology/methods ; Evolution, Molecular ; Internet ; Phylogeny ; Sequence Analysis, DNA/methods ; Software ; Algorithms ; Bayes Theorem ; Humans
8.
Bioinformatics ; 30(8): 1129-1137, 2014 04 15.
Article in English | MEDLINE | ID: mdl-24389654

ABSTRACT

MOTIVATION: The detection of positive selection is widely used to study gene and genome evolution, but its application remains limited by the high computational cost of existing implementations. We present a series of computational optimizations for more efficient estimation of the likelihood function on large-scale phylogenetic problems. We illustrate our approach using the branch-site model of codon evolution. RESULTS: We introduce novel optimization techniques that substantially outperform both CodeML from the PAML package and our previously optimized sequential version SlimCodeML. These techniques can also be applied to other likelihood-based phylogeny software. Our implementation scales well for large numbers of codons and/or species. It can therefore analyse substantially larger datasets than CodeML. We evaluated FastCodeML on different platforms and measured average sequential speedups of FastCodeML (single-threaded) versus CodeML of up to 5.8, average speedups of FastCodeML (multi-threaded) versus CodeML on a single node (shared memory) of up to 36.9 for 12 CPU cores, and average speedups of the distributed FastCodeML versus CodeML of up to 170.9 on eight nodes (96 CPU cores in total). AVAILABILITY AND IMPLEMENTATION: ftp://ftp.vital-it.ch/tools/FastCodeML/ CONTACT: selectome@unil.ch or nicolas.salamin@unil.ch.
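
The reported speedups can be put on a common footing with a little arithmetic. The sketch below divides each parallel speedup by the single-threaded one to estimate how much of the gain comes from parallelism, and then divides by the core count to get a parallel efficiency. This is only a rough reading: the figures are "up to" values that may come from different datasets, and the function names are assumptions.

```python
# Maximum average speedups versus CodeML quoted in the abstract: (speedup, CPU cores).
REPORTED_SPEEDUPS = {
    "single-threaded": (5.8, 1),
    "multi-threaded, one node": (36.9, 12),
    "distributed, eight nodes": (170.9, 96),
}


def scaling_over_sequential(speedup: float, sequential_speedup: float) -> float:
    """Share of the speedup attributable to parallelism rather than to the sequential optimizations."""
    return speedup / sequential_speedup


def parallel_efficiency(speedup: float, sequential_speedup: float, cores: int) -> float:
    """Scaling over the sequential version divided by the number of cores used."""
    return scaling_over_sequential(speedup, sequential_speedup) / cores


if __name__ == "__main__":
    sequential, _ = REPORTED_SPEEDUPS["single-threaded"]
    for label, (speedup, cores) in REPORTED_SPEEDUPS.items():
        scaling = scaling_over_sequential(speedup, sequential)
        efficiency = parallel_efficiency(speedup, sequential, cores)
        print(f"{label}: {scaling:.1f}x over sequential, efficiency {efficiency:.2f}")
```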


Subjects
Evolution, Molecular ; Phylogeny ; Selection, Genetic ; Software ; Algorithms ; Codon ; Computational Biology ; Likelihood Functions
9.
Nucleic Acids Res ; 40(Web Server issue): W597-603, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22661580

ABSTRACT

ExPASy (http://www.expasy.org) has a worldwide reputation as one of the main bioinformatics resources for proteomics. It has now evolved, becoming an extensible and integrative portal accessing many scientific resources, databases and software tools in different areas of life sciences. Scientists can now seamlessly access a wide range of resources in many different domains, such as proteomics, genomics, phylogeny/evolution, systems biology, population genetics, transcriptomics, etc. The individual resources (databases, web-based and downloadable software tools) are hosted in a 'decentralized' way by different groups of the SIB Swiss Institute of Bioinformatics and partner institutions. Specifically, a single web portal provides a common entry point to a wide range of resources developed and operated by different SIB groups and external institutions. The portal features a search function across 'selected' resources. Additionally, the availability and usage of resources are monitored. The portal is aimed at both expert users and people who are not familiar with a specific domain in life sciences. The new web interface provides, in particular, visual guidance for newcomers to ExPASy.


Subjects
Computational Biology ; Proteomics ; Software ; Computer Graphics ; Genomics ; Internet ; Systems Integration ; User-Computer Interface
10.
Proc Natl Acad Sci U S A ; 108(14): 5679-84, 2011 Apr 05.
Article in English | MEDLINE | ID: mdl-21282665

ABSTRACT

Ants have evolved very complex societies and are key ecosystem members. Some ants, such as the fire ant Solenopsis invicta, are also major pests. Here, we present a draft genome of S. invicta, assembled from Roche 454 and Illumina sequencing reads obtained from a focal haploid male and his brothers. We used comparative genomic methods to obtain insight into the unique features of the S. invicta genome. For example, we found that this genome harbors four adjacent copies of vitellogenin. A phylogenetic analysis revealed that an ancestral vitellogenin gene first underwent a duplication that was followed by possibly independent duplications of each of the daughter vitellogenins. The vitellogenin genes have undergone subfunctionalization with queen- and worker-specific expression, possibly reflecting differential selection acting on the queen and worker castes. Additionally, we identified more than 400 putative olfactory receptors of which at least 297 are intact. This represents the largest repertoire reported so far in insects. S. invicta also harbors an expansion of a specific family of lipid-processing genes, two putative orthologs to the transformer/feminizer sex differentiation gene, a functional DNA methylation system, and a single putative telomerase ortholog. EST data indicate that this S. invicta telomerase ortholog has at least four spliceforms that differ in their use of two sets of mutually exclusive exons. Some of these and other unique aspects of the fire ant genome are likely linked to the complex social behavior of this species.


Subjects
Ants/genetics ; Evolution, Molecular ; Genome, Insect/genetics ; Genomics/methods ; Phylogeny ; Animals ; Base Sequence ; Computational Biology ; DNA Methylation ; Expressed Sequence Tags ; Hierarchy, Social ; Male ; Molecular Sequence Data ; Receptors, Odorant/genetics ; Sequence Analysis, DNA ; Vitellogenins/genetics
11.
Nucleic Acids Res ; 38(Web Server issue): W683-8, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20462862

ABSTRACT

The EMBRACE (European Model for Bioinformatics Research and Community Education) web service collection is the culmination of a 5-year project that set out to investigate issues involved in developing and deploying web services for use in the life sciences. The project concluded that in order for web services to achieve widespread adoption, standards must be defined for the choice of web service technology, for semantically annotating both service function and the data exchanged, and a mechanism for discovering services must be provided. Building on this, the project developed: EDAM, an ontology for describing life science web services; BioXSD, a schema for exchanging data between services; and a centralized registry (http://www.embraceregistry.net) that collects together around 1000 services developed by the consortium partners. This article presents the current status of the collection and its associated recommendations and standards definitions.


Subjects
Computational Biology ; Software ; Biological Science Disciplines ; Information Dissemination ; Internet ; Registries ; Systems Integration
12.
Stud Health Technol Inform ; 175: 59-68, 2012.
Article in English | MEDLINE | ID: mdl-22941988

ABSTRACT

One of the important questions in biological evolution is to know if certain changes along protein coding genes have contributed to the adaptation of species. This problem is known to be biologically complex and computationally very expensive. It, therefore, requires efficient Grid or cluster solutions to overcome the computational challenge. We have developed a Grid-enabled tool (gcodeml) that relies on the PAML (codeml) package to help analyse large phylogenetic datasets on both Grids and computational clusters. Although we report on results for gcodeml, our approach is applicable and customisable to related problems in biology or other scientific domains.


Subjects
Algorithms ; DNA/genetics ; Data Mining/methods ; Databases, Genetic ; Evolution, Molecular ; Proteins/genetics ; Sequence Analysis/methods ; Software ; User-Computer Interface
13.
Brief Bioinform ; 9(6): 493-505, 2008 Nov.
Article in English | MEDLINE | ID: mdl-18621748

ABSTRACT

Programmatic access to data and tools through the web using so-called web services has an important role to play in bioinformatics. In this article, we discuss the most popular approaches based on SOAP/WS-I and REST and describe our experiences, as a cross section of the community, with providing and using web services in the context of biological sequence analysis. We briefly review the main technological approaches as well as best practice hints that are useful for both users and developers. Finally, syntactic and semantic data integration issues with multiple web services are discussed.


Subjects
Computational Biology ; Databases, Genetic ; Information Storage and Retrieval/methods ; Internet/statistics & numerical data ; Sequence Analysis/methods ; Database Management Systems ; Humans ; Systems Integration ; User-Computer Interface
14.
Stud Health Technol Inform ; 159: 55-63, 2010.
Article in English | MEDLINE | ID: mdl-20543426

ABSTRACT

Cloud computing has recently become very popular, and several bioinformatics applications exist already in that domain. The aim of this article is to analyse a current cloud system with respect to usability, benchmark its performance and compare its user friendliness with a conventional cluster job submission system. Given the current hype on the theme, user expectations are rather high, but current results show that neither the price/performance ratio nor the usage model is very satisfactory for large-scale embarrassingly parallel applications. However, for small to medium scale applications that require CPU time at certain peak times the cloud is a suitable alternative.


Subjects
Computing Methodologies ; Medical Informatics Applications ; Phylogeny ; Software Design ; Computational Biology
15.
Stud Health Technol Inform ; 270: 1170-1174, 2020 Jun 16.
Article in English | MEDLINE | ID: mdl-32570566

ABSTRACT

The BioMedIT project is funded by the Swiss government as an integral part of the Swiss Personalized Health Network (SPHN), aiming to provide researchers with access to a secure, powerful and versatile IT infrastructure for doing data-driven research on sensitive biomedical data while ensuring data privacy protection. The BioMedIT network gives researchers the ability to securely transfer, store, manage and process sensitive research data. The underlying BioMedIT nodes provide compute and storage capacity that can be used locally or through a federated environment. The network operates under a common Information Security Policy using state-of-the-art security techniques. It utilizes cloud computing, virtualization, compute accelerators (GPUs), big data storage as well as federation technologies to lower computational boundaries for researchers and to guarantee that sensitive data can be processed in a secure and lawful way. Building on existing expertise and research infrastructure at the partnering Swiss institutions, the BioMedIT network establishes a competitive Swiss private-cloud - a secure national infrastructure resource that can be used by researchers of Swiss universities, hospitals and other research institutions.


Subjects
Information Storage and Retrieval ; Big Data ; Cloud Computing ; Computer Security ; Privacy
16.
Database (Oxford) ; 2019, 2019 01 01.
Article in English | MEDLINE | ID: mdl-31697362

ABSTRACT

MOTIVATION: Data integration promises to be one of the main catalysts in enabling new insights to be drawn from the wealth of biological data available publicly. However, the heterogeneity of the different data sources, both at the syntactic and the semantic level, still poses significant challenges for achieving interoperability among biological databases. RESULTS: We introduce an ontology-based federated approach for data integration. We applied this approach to three heterogeneous data stores that span different areas of biological knowledge: (i) Bgee, a gene expression relational database; (ii) Orthologous Matrix (OMA), a Hierarchical Data Format 5 orthology DS; and (iii) UniProtKB, a Resource Description Framework (RDF) store containing protein sequence and functional information. To enable federated queries across these sources, we first defined a new semantic model for gene expression called GenEx. We then show how the relational data in Bgee can be expressed as a virtual RDF graph, instantiating GenEx, through dedicated relational-to-RDF mappings. By applying these mappings, Bgee data are now accessible through a public SPARQL endpoint. Similarly, the materialized RDF data of OMA, expressed in terms of the Orthology ontology, is made available in a public SPARQL endpoint. We identified and formally described intersection points (i.e. virtual links) among the three data sources. These allow performing joint queries across the data stores. Finally, we lay the groundwork to enable nontechnical users to benefit from the integrated data, by providing a natural language template-based search interface.
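
The joint queries described in this abstract can be pictured with the SPARQL 1.1 `SERVICE` keyword, which sends part of a query to a remote endpoint. The sketch below only assembles such a query string. The endpoint URLs, the `ex:` vocabulary and the function names are hypothetical placeholders, not the actual Bgee, OMA or UniProtKB endpoints or the GenEx and Orthology terms.

```python
from typing import Mapping

# Hypothetical vocabulary, used only to keep the example self-contained.
PREFIX = "PREFIX ex: <https://example.org/vocab#>"


def service_block(endpoint: str, pattern: str) -> str:
    """Wrap a triple pattern in a SPARQL 1.1 SERVICE clause."""
    return f"  SERVICE <{endpoint}> {{\n    {pattern}\n  }}"


def build_federated_query(endpoints: Mapping[str, str], gene_label: str) -> str:
    """Assemble one query spanning three stores, joined on shared variables."""
    for key in ("expression", "orthology", "protein"):
        if key not in endpoints:
            raise KeyError(f"missing endpoint: {key}")
    if '"' in gene_label or "\\" in gene_label:
        raise ValueError("gene_label must not contain quotes or backslashes")
    blocks = [
        service_block(endpoints["expression"],
                      f'?gene ex:label "{gene_label}" ; ex:expressedIn ?anatomy .'),
        service_block(endpoints["orthology"],
                      "?gene ex:orthologousTo ?ortholog ."),
        service_block(endpoints["protein"],
                      "?protein ex:encodedBy ?ortholog ."),
    ]
    lines = [PREFIX, "SELECT ?anatomy ?ortholog ?protein WHERE {", *blocks, "}"]
    return "\n".join(lines)


if __name__ == "__main__":
    hypothetical_endpoints = {
        "expression": "https://example.org/expression/sparql",
        "orthology": "https://example.org/orthology/sparql",
        "protein": "https://example.org/protein/sparql",
    }
    print(build_federated_query(hypothetical_endpoints, "LCT"))
```

In this sketch the shared variables `?gene` and `?ortholog` stand in for the intersection points (virtual links) that the abstract says were identified among the three sources.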


Subjects
Biological Ontologies ; Computational Biology ; Databases, Factual ; Semantic Web
17.
Genome Biol ; 20(1): 164, 2019 08 12.
Article in English | MEDLINE | ID: mdl-31405382

ABSTRACT

Bioinformaticians and biologists rely increasingly upon workflows for the flexible utilization of the many life science tools that are needed to optimally convert data into knowledge. We outline a pan-European enterprise to provide a catalogue (https://bio.tools) of tools and databases that can be used in these workflows. bio.tools not only lists where to find resources, but also provides a wide variety of practical information.


Subjects
Biological Science Disciplines ; Databases, Factual ; Software ; Internet
18.
Nat Biotechnol ; 37(4): 480, 2019 04.
Article in English | MEDLINE | ID: mdl-30894680

ABSTRACT

In the version of this article initially published, Lena Dolman's second affiliation was given as Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK. The correct second affiliation is Ontario Institute for Cancer Research, Toronto, Ontario, Canada. The error has been corrected in the HTML and PDF versions of the article.

19.
F1000Res ; 5, 2016.
Article in English | MEDLINE | ID: mdl-28232860

ABSTRACT

ISMARA (ismara.unibas.ch) automatically infers the key regulators and regulatory interactions from high-throughput gene expression or chromatin state data. However, given the large sizes of current next generation sequencing (NGS) datasets, data uploading times are a major bottleneck. Additionally, for proprietary data, users may be uncomfortable with uploading entire raw datasets to an external server. Both these problems could be alleviated by providing a means by which users could pre-process their raw data locally, transferring only a small summary file to the ISMARA server. We developed a stand-alone client application that pre-processes large input files (RNA-seq or ChIP-seq data) on the user's computer for performing ISMARA analysis in a completely automated manner, including uploading of small processed summary files to the ISMARA server. This reduces file sizes by up to a factor of 1000, and upload times from many hours to mere seconds. The client application is available from ismara.unibas.ch/ISMARA/client.

20.
F1000Res ; 5, 2016.
Article in English | MEDLINE | ID: mdl-27803796

ABSTRACT

The core mission of ELIXIR is to build a stable and sustainable infrastructure for biological information across Europe. At the heart of this are the data resources, tools and services that ELIXIR offers to the life-sciences community, providing stable and sustainable access to biological data. ELIXIR aims to ensure that these resources are available long-term and that the life-cycles of these resources are managed such that they support the scientific needs of the life-sciences, including biological research. ELIXIR Core Data Resources are defined as a set of European data resources that are of fundamental importance to the wider life-science community and the long-term preservation of biological data. They are complete collections of generic value to life-science, are considered an authority in their field with respect to one or more characteristics, and show high levels of scientific quality and service. Thus, ELIXIR Core Data Resources are of wide applicability and usage. This paper describes the structures, governance and processes that support the identification and evaluation of ELIXIR Core Data Resources. It identifies key indicators which reflect the essence of the definition of an ELIXIR Core Data Resource and support the promotion of excellence in resource development and operation. It describes the specific indicators in more detail and explains their application within ELIXIR's sustainability strategy and science policy actions, and in capacity building, life-cycle management and technical actions. The identification process is currently being implemented and tested for the first time. The findings and outcome will be evaluated by the ELIXIR Scientific Advisory Board in March 2017. Establishing the portfolio of ELIXIR Core Data Resources and ELIXIR Services is a key priority for ELIXIR and publicly marks the transition towards a cohesive infrastructure.
