1.
Bioinformatics ; 36(14): 4203-4205, 2020 08 15.
Article in English | MEDLINE | ID: mdl-32415960

ABSTRACT

MOTIVATION: Molecular docking aims to predict the conformation of small molecules (ligands) within an identified binding site (BS) in a target protein (receptor). Protein-ligand docking plays an important role in modern drug discovery and in biochemistry for protein engineering. However, efficient docking analysis of proteins requires prior knowledge of the BS, which is not always available. The process that covers BS identification and protein-ligand docking usually requires combining different programs, each of which requires several input parameters. This is further aggravated when factoring in computational demands, such as CPU time. Therefore, these types of simulation experiments can become a complex process for researchers without a background in computer science. RESULTS: To overcome these problems, we have designed an automatic computational workflow (WF) to process protein-ligand complexes, which runs from the identification of the possible BS positions to the prediction of the experimental binding modes and affinities of the ligand. This open-access WF runs under the Galaxy platform, which integrates public domain software. The results of the proposed method are in close agreement with state-of-the-art docking software. AVAILABILITY AND IMPLEMENTATION: Software is available at: https://pistacho.ac.uma.es/galaxy-bitlab. CONTACT: euv@uma.es. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Proteins, Software, Ligands, Molecular Docking Simulation, Protein Binding, Proteins/metabolism, Workflow
2.
Bioinformatics ; 34(5): 869-870, 2018 03 01.
Article in English | MEDLINE | ID: mdl-29069310

ABSTRACT

Motivation: Nearly 10 years have passed since the first mobile apps appeared. Given that bioinformatics is a web-based world and that mobile devices are endowed with web browsers, it seemed natural that bioinformatics would transition from personal computers to mobile devices, but nothing could be further from the truth. The transition demands new paradigms, designs and novel implementations. Results: Through an in-depth analysis of the requirements of existing bioinformatics applications, we designed and deployed an easy-to-use, web-based, lightweight mobile client. This client is able to browse, select, automatically compose interface parameters, invoke services and monitor the execution of Web Services using the service's metadata stored in catalogs or repositories. Availability and implementation: mORCA is available at http://bitlab-es.com/morca/app as a web-app. It is also available in the App Store by Apple and the Play Store by Google. The software will be available for at least 2 years. Contact: ortrelles@uma.es. Supplementary information: Source code, final web-app, training material and documentation are available at http://bitlab-es.com/morca.


Subjects
Cell Phone, Computational Biology/methods, Mobile Applications, Metadata, Web Browser
3.
BMC Genomics ; 19(1): 56, 2018 01 16.
Article in English | MEDLINE | ID: mdl-29338691

ABSTRACT

BACKGROUND: Technical advances in mobile devices such as smartphones and tablets have produced an extraordinary increase in their use around the world, and these devices have become part of our daily lives. The possibility of carrying these devices in a pocket, particularly mobile phones, has enabled ubiquitous access to Internet resources. Furthermore, in the life sciences world there has been a vast proliferation of data types and services that end up as Web Services. This suggests the need for research into mobile clients to deal with life sciences applications for effective usage and exploitation. RESULTS: After analysing the current features in existing bioinformatics applications managing Web Services, we have devised, implemented, and deployed an easy-to-use web-based lightweight mobile client. This client is able to browse, select, compose parameters, invoke, and monitor the execution of Web Services stored in catalogues or central repositories. The client is also able to deal with huge amounts of data between external storage mounts. In addition, we also present a validation use case, which illustrates the usage of the application while executing, monitoring, and exploring the results of a registered workflow. The software is available in the Apple Store and Android Market and the source code is publicly available on GitHub. CONCLUSIONS: Mobile devices are becoming increasingly important in the scientific world due to their strong potential impact on scientific applications. Bioinformatics should not fall behind this trend. We present an original software client that deals with the intrinsic limitations of such devices and propose different guidelines to provide location-independent access to computational resources in bioinformatics and biomedicine. Its modular design makes it easily expandable with the inclusion of new repositories, tools, types of visualization, etc.


Subjects
Computational Biology, Handheld Computers, Software, Internet, User-Computer Interface, Workflow
4.
Brief Bioinform ; 17(3): 368-79, 2016 05.
Article in English | MEDLINE | ID: mdl-26272945

ABSTRACT

It is becoming clear that most human diseases have a complex etiology that cannot be explained by single nucleotide polymorphisms (SNPs) or simple additive combinations; the general consensus is that they are caused by combinations of multiple genetic variations. The limited success of some genome-wide association studies is partly a result of this focus on single genetic markers. A more promising approach is to take into account epistasis, by considering the association of multiple SNP interactions with disease. However, as genomic data continues to grow in resolution, and genome and exome sequencing become more established, the number of combinations of variants to consider increases rapidly. Two potential solutions should be considered: the use of high-performance computing, which allows us to consider a larger number of variables, and heuristics to make the solution more tractable, essential in the case of genome sequencing. In this review, we look at different computational methods to analyse epistatic interactions within disease-related genetic data sets created by microarray technology. We also review efforts to use epistatic analysis results to produce biomarkers for diagnostic tests and give our views on future directions in this field in light of advances in sequencing technology and variants in non-coding regions.
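
To make the growth in the number of variant combinations concrete, the following minimal Python sketch counts how many distinct SNP sets exist for a given interaction order. The function name and the panel size of 500,000 markers are illustrative assumptions, not figures taken from the review.

```python
from math import comb


def count_snp_combinations(n_snps: int, order: int) -> int:
    """Number of distinct SNP sets of size `order` drawn from `n_snps` markers."""
    return comb(n_snps, order)


if __name__ == "__main__":
    # Hypothetical panel size, chosen only to show the scale of the search space.
    n_snps = 500_000
    for order in (1, 2, 3):
        print(f"order {order}: {count_snp_combinations(n_snps, order):,} candidate sets")
```

Even pairwise interactions on a panel of this size give on the order of 10^11 candidate sets, which is why the abstract points to high-performance computing and heuristics.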


Subjects
Genome, Algorithms, Genetic Epistasis, Genome-Wide Association Study, Humans, Single Nucleotide Polymorphism
5.
BMC Genomics ; 17(Suppl 8): 802, 2016 10 25.
Article in English | MEDLINE | ID: mdl-27801291

ABSTRACT

BACKGROUND: The field of metagenomics, defined as the direct genetic analysis of uncultured samples of genomes contained within an environmental sample, is gaining increasing popularity. The aim of studies of metagenomics is to determine the species present in an environmental community and identify changes in the abundance of species under different conditions. Current metagenomic analysis software faces bottlenecks due to the high computational load required to analyze complex samples. RESULTS: A computational open-source workflow has been developed for the detailed analysis of metagenomes. This workflow provides new tools and datafile specifications that facilitate the identification of differences in abundance of reads assigned to taxa (mapping), enables the detection of reads of low-abundance bacteria (producing evidence of their presence), provides new concepts for filtering spurious matches, etc. Innovative visualization ideas for improved display of metagenomic diversity are also proposed to better understand how reads are mapped to taxa. Illustrative examples are provided based on the study of two collections of metagenomes from faecal microbial communities of adult female monozygotic and dizygotic twin pairs concordant for leanness or obesity and their mothers. CONCLUSIONS: The proposed workflow provides an open environment that offers the opportunity to perform the mapping process using different reference databases. Additionally, this workflow shows the specifications of the mapping process and datafile formats to facilitate the development of new plugins for further post-processing. This open and extensible platform has been designed with the aim of enabling in-depth analysis of metagenomic samples and better understanding of the underlying biological processes.


Subjects
Computational Biology/methods, Metagenome, Metagenomics, Algorithms, Connectome, Metagenomics/methods, Molecular Sequence Annotation, Reproducibility of Results, Workflow
6.
BMC Bioinformatics ; 16: 250, 2015 Aug 11.
Article in English | MEDLINE | ID: mdl-26260162

ABSTRACT

BACKGROUND: Conventional pairwise sequence comparison software algorithms are being used to process much larger datasets than they were originally designed for. This can result in processing bottlenecks that limit software capabilities or prevent full use of the available hardware resources. Overcoming the barriers that limit the efficient computational analysis of large biological sequence datasets by retrofitting existing algorithms or by creating new applications represents a major challenge for the bioinformatics community. RESULTS: We have developed C libraries for pairwise sequence comparison within diverse architectures, ranging from commodity systems to high performance and cloud computing environments. Exhaustive tests were performed using different datasets of closely- and distantly-related sequences that span from small viral genomes to large mammalian chromosomes. The tests demonstrated that our solution is capable of generating high quality results with a linear-time response and controlled memory consumption, being comparable or faster than the current state-of-the-art methods. CONCLUSIONS: We have addressed the problem of pairwise and all-versus-all comparison of large sequences in general, greatly increasing the limits on input data size. The approach described here is based on a modular out-of-core strategy that uses secondary storage to avoid reaching memory limits during the identification of High-scoring Segment Pairs (HSPs) between the sequences under comparison. Software engineering concepts were applied to avoid intermediate result re-calculation, to minimise the performance impact of input/output (I/O) operations and to modularise the process, thus enhancing application flexibility and extendibility. 
Our computationally-efficient approach allows tasks such as the massive comparison of complete genomes, evolutionary event detection, the identification of conserved synteny blocks and inter-genome distance calculations to be performed more effectively.
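
As a rough illustration of where the identification of High-scoring Segment Pairs can start, the sketch below finds exact k-mer matches (seeds) between two sequences and groups them by diagonal. It is an in-memory Python toy with assumed names (`kmer_index`, `seed_hits`, `hits_by_diagonal`), not the authors' C libraries, and it leaves out the out-of-core storage and the extension of seeds into scored segments that the abstract describes.

```python
from collections import defaultdict


def kmer_index(seq: str, k: int) -> dict:
    """Map every k-mer in `seq` to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def seed_hits(query: str, reference: str, k: int) -> list:
    """Return (query_pos, reference_pos) pairs for every exact shared k-mer."""
    ref_index = kmer_index(reference, k)
    hits = []
    for i in range(len(query) - k + 1):
        for j in ref_index.get(query[i:i + k], ()):
            hits.append((i, j))
    return hits


def hits_by_diagonal(hits: list) -> dict:
    """Group seed hits by diagonal (reference_pos - query_pos).

    Runs of hits on one diagonal are the usual candidates for extension
    into longer ungapped segments.
    """
    diagonals = defaultdict(list)
    for i, j in hits:
        diagonals[j - i].append((i, j))
    return dict(diagonals)
```

A real implementation for chromosome-scale inputs would need to keep the index on disk or process it in sorted chunks, which is the role of the secondary-storage strategy mentioned in the conclusions.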


Subjects
Algorithms, Computational Biology/methods, Genome, Software, Animals, Bacteria/genetics, Datasets as Topic, Drosophila/genetics, Humans, Mammals/genetics, Synteny, Viruses/genetics
7.
BMC Genomics ; 13: 187, 2012 May 14.
Article in English | MEDLINE | ID: mdl-22583865

ABSTRACT

BACKGROUND: L-ascorbic acid (AsA; vitamin C) is essential for all living plants, where it functions as the main hydrosoluble antioxidant. It has diverse roles in the regulation of plant cell growth and expansion, photosynthesis, and hormone-regulated processes. AsA is also an essential component of the human diet, with tomato fruit being one of the main sources of this vitamin. To identify genes responsible for AsA content in tomato fruit, transcriptomic studies followed by clustering analysis were applied to two groups of fruits with contrasting AsA content. These fruits were identified after AsA profiling of an F8 Recombinant Inbred Line (RIL) population generated from a cross between the domesticated species Solanum lycopersicum and the wild relative Solanum pimpinellifolium. RESULTS: We found large variability in AsA content within the RIL population, with individual RILs showing up to a 4-fold difference in AsA content. Transcriptomic analysis identified genes whose expression correlated either positively (PVC genes) or negatively (NVC genes) with the AsA content of the fruits. Cluster analysis using SOTA allowed the identification of subsets of co-regulated genes mainly involved in hormone signaling, such as ethylene, ABA, gibberellin and auxin, rather than any of the known AsA biosynthetic genes. Data mining of the corresponding PVC and NVC orthologs in Arabidopsis databases identified flagellin and other ROS-producing processes as cues resulting in differential regulation of a high percentage of the genes from both groups of co-regulated genes; more specifically, 26.6% of the orthologous PVC genes and 15.5% of the orthologous NVC genes were induced and repressed, respectively, under flagellin22 treatment in Arabidopsis thaliana. CONCLUSION: The results reported here indicate that the content of AsA in red tomato fruit from our selected RILs is not correlated with the expression of genes involved in its biosynthesis.
On the contrary, the data presented here support the view that AsA content in tomato fruit co-regulates with genes involved in hormone signaling and that they are dependent on the oxidative status of the fruit.


Subjects
Ascorbic Acid/metabolism, Fruit/metabolism, Plant Genes/physiology, Solanum/metabolism, Cluster Analysis, Plant Gene Expression Regulation/genetics, Plant Gene Expression Regulation/physiology, Plant Genes/genetics, Solanum lycopersicum/genetics, Solanum lycopersicum/metabolism, Oxidation-Reduction, Solanum/genetics
9.
Nucleic Acids Res ; 38(Web Server issue): W671-6, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20525794

ABSTRACT

The productivity of any scientist is affected by cumbersome, tedious and time-consuming tasks that try to make the heterogeneous web services compatible so that they can be useful in their research. MOWServ, the bioinformatic platform offered by the Spanish National Institute of Bioinformatics, was released to provide integrated access to databases and analytical tools. Since its release, the number of available services has grown dramatically, and it has become one of the main contributors of registered services in the EMBRACE Biocatalogue. The ontology that enables most of the web-service compatibility has been curated, improved and extended. The service discovery has been greatly enhanced by Magallanes software and biodataSF. User data are securely stored on the main server by an authentication protocol that enables the monitoring of current or already-finished user's tasks, as well as the pipelining of successive data processing services. The BioMoby standard has been greatly extended with the new features included in the MOWServ, such as management of additional information (metadata such as extended descriptions, keywords and datafile examples), a qualified registry, error handling, asynchronous services and service replication. All of them have increased the MOWServ service quality, usability and robustness. MOWServ is available at http://www.inab.org/MOWServ/ and has a mirror at http://www.bitlab-es.com/MOWServ/.


Subjects
Computational Biology, Software, Protein Databases, Internet, Phylogeny, Proteins/chemistry, Proteins/classification, Systems Integration
10.
BMC Bioinformatics ; 12: 419, 2011 Oct 27.
Article in English | MEDLINE | ID: mdl-22032807

ABSTRACT

BACKGROUND: Bioinformatics is commonly featured as a well assorted list of available web resources. Although diversity of services is positive in general, the proliferation of tools, their dispersion and heterogeneity complicate the integrated exploitation of such data processing capacity. RESULTS: To facilitate the construction of software clients and make integrated use of this variety of tools, we present a modular programmatic application interface (MAPI) that provides the necessary functionality for uniform representation of Web Services metadata descriptors including their management and invocation protocols of the services which they represent. This document describes the main functionality of the framework and how it can be used to facilitate the deployment of new software under a unified structure of bioinformatics Web Services. A notable feature of MAPI is the modular organization of the functionality into different modules associated with specific tasks. This means that only the modules needed for the client have to be installed, and that the module functionality can be extended without the need for re-writing the software client. CONCLUSIONS: The potential utility and versatility of the software library has been demonstrated by the implementation of several currently available clients that cover different aspects of integrated data processing, ranging from service discovery to service invocation with advanced features such as workflows composition and asynchronous services calls to multiple types of Web Services including those registered in repositories (e.g. GRID-based, SOAP, BioMOBY, R-bioconductor, and others).


Subjects
Computational Biology/methods, Software, Internet, Libraries, Workflow
11.
Bioinformatics ; 26(4): 553-9, 2010 Feb 15.
Article in English | MEDLINE | ID: mdl-20047879

ABSTRACT

MOTIVATION: Web services technology is becoming the option of choice to deploy bioinformatics tools that are universally available. One of the major strengths of this approach is that it supports machine-to-machine interoperability over a network. However, a weakness of this approach is that various Web Services differ in their definition and invocation protocols, as well as in their communication and data formats, which presents a barrier to service interoperability. RESULTS: jORCA is a desktop client aimed at facilitating seamless integration of Web Services. It does so by making a uniform representation of the different web resources, supporting scalable service discovery, and automatic composition of workflows. Usability is at the top of the jORCA agenda; thus it is a highly customizable and extensible application that accommodates a broad range of user skills, featuring double-click invocation of services in conjunction with advanced execution control, on-the-fly data standardization, extensibility of viewer plug-ins, drag-and-drop editing capabilities, plus a file-based browsing style and organization of favourite tools. The integration of bioinformatics Web Services is made easier to support a wider range of users.


Subjects
Computational Biology/methods, Internet, Software, Information Dissemination/methods
12.
Bioinform Biol Insights ; 15: 11779322211021422, 2021.
Article in English | MEDLINE | ID: mdl-34163150

ABSTRACT

Due to major breakthroughs in sequencing technologies throughout the last decades, the time and cost per sequencing experiment have dropped drastically, overcoming the data generation barrier of the early genomic era. Such a shift has encouraged the scientific community to develop new computational methods that are able to compare large genomic sequences, thus enabling large-scale studies of genome evolution. The field of comparative genomics has proven itself invaluable for studying the evolutionary mechanisms and the forces driving genome evolution. In this context, a full genome comparison study between 2 species requires a quadratic number of comparisons in terms of the number of sequences (around 400 chromosome comparisons in the case of mammalian genomes); however, when studying conserved syntenies or evolutionary rearrangements, many sequence comparisons can be skipped because not all will contain significant signals. Subsequently, the scientific community has developed fast heuristics to perform multiple pairwise comparisons between large sequences to determine whether significant sets of conserved similarities exist. The data generation problem is no longer an issue, yet the limitations have shifted toward the analysis of such massive data. Therefore, we present XCout, a Web-based visual analytics application for multiple genome comparisons designed to improve the analysis of large-scale evolutionary studies using novel techniques in Web visualization. XCout makes it possible to work on hundreds of comparisons at once, thus reducing the time of the analysis by identifying significant signals between chromosomes across multiple species.
Among others, XCout introduces several techniques to aid in the analysis of large-scale genome rearrangements, particularly (1) an interactive heatmap interface to display comparisons using automatic color scales based on similarity thresholds to ease detection at first sight, (2) an overlay system to detect individual signal contributions between chromosomes, (3) a tracking tool to trace conserved blocks across different species to perform evolutionary studies, and (4) a search engine to search annotations throughout different species.
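
The quadratic growth in the number of comparisons is simple to check. This short Python sketch enumerates every chromosome pair between two genomes; the function name and the chromosome counts (20 per genome) are hypothetical round figures chosen to reproduce the roughly 400 comparisons quoted in the abstract.

```python
from itertools import product


def chromosome_pairs(genome_a, genome_b):
    """Every (chromosome_a, chromosome_b) pair in an all-versus-all comparison."""
    return list(product(genome_a, genome_b))


if __name__ == "__main__":
    # Hypothetical genomes with 20 chromosomes each.
    species_a = [f"A_chr{i}" for i in range(1, 21)]
    species_b = [f"B_chr{i}" for i in range(1, 21)]
    pairs = chromosome_pairs(species_a, species_b)
    print(len(pairs), "pairwise comparisons")
```

Because the count is the product of the two chromosome numbers, adding further species multiplies the workload again, which is the motivation for screening pairs before running exhaustive comparisons.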

13.
Brief Bioinform ; 9(3): 220-31, 2008 May.
Article in English | MEDLINE | ID: mdl-18238804

ABSTRACT

The BioMoby project was initiated in 2001 from within the model organism database community. It aimed to standardize methodologies to facilitate information exchange and access to analytical resources, using a consensus driven approach. Six years later, the BioMoby development community is pleased to announce the release of the 1.0 version of the interoperability framework, registry Application Programming Interface and supporting Perl and Java code-bases. Together, these provide interoperable access to over 1400 bioinformatics resources worldwide through the BioMoby platform, and this number continues to grow. Here we highlight and discuss the features of BioMoby that make it distinct from other Semantic Web Service and interoperability initiatives, and that have been instrumental to its deployment and use by a wide community of bioinformatics service providers. The standard, client software, and supporting code libraries are all freely available at http://www.biomoby.org/.


Subjects
Computational Biology/methods, Database Management Systems, Factual Databases, Information Storage and Retrieval/methods, Internet, Programming Languages, Systems Integration
14.
BMC Bioinformatics ; 10: 334, 2009 Oct 15.
Article in English | MEDLINE | ID: mdl-19832968

ABSTRACT

BACKGROUND: To aid in bioinformatics data processing and analysis, an increasing number of web-based applications are being deployed. Although this is a positive circumstance in general, the proliferation of tools makes it difficult to find the right tool, or more importantly, the right set of tools that can work together to solve real complex problems. RESULTS: Magallanes (Magellan) is a versatile, platform-independent Java library of algorithms aimed at discovering bioinformatics web services and associated data types. A second important feature of Magallanes is its ability to connect available and compatible web services into workflows that can process data sequentially to reach a desired output given a particular input. Magallanes' capabilities can be exploited either as an API or directly through a graphical user interface. The Magallanes API is freely available for academic use, and together with the Magallanes application has been tested on MS-Windows XP and Unix-like operating systems. Detailed implementation information, including user manuals and tutorials, is available at http://www.bitlab-es.com/magallanes. CONCLUSION: Different implementations of the same client (web page, desktop applications, web services, etc.) have been deployed and are currently in use in real installations such as the National Institute of Bioinformatics (Spain) and the ACGT-EU project. This shows the potential utility and versatility of the software library, including the integration of novel tools in the domain, and provides strong evidence that it facilitates the automatic discovery and composition of workflows.
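
The idea of chaining compatible services from a given input to a desired output can be sketched as a breadth-first search over data types. This Python example is only an illustration under assumed names: the `Service` record, the `compose_workflow` function and the single-input, single-output model are hypothetical and are not the Magallanes API (which is a Java library).

```python
from collections import deque
from typing import NamedTuple, Optional


class Service(NamedTuple):
    """Hypothetical service descriptor: one input data type, one output data type."""
    name: str
    input_type: str
    output_type: str


def compose_workflow(services, source_type: str, target_type: str) -> Optional[list]:
    """Return the shortest chain of service names turning source_type into
    target_type, or None when no chain exists."""
    if source_type == target_type:
        return []
    queue = deque([(source_type, [])])
    visited = {source_type}
    while queue:
        current_type, path = queue.popleft()
        for service in services:
            # Skip services that cannot consume the current type, and
            # output types that a shorter chain has already reached.
            if service.input_type != current_type or service.output_type in visited:
                continue
            new_path = path + [service.name]
            if service.output_type == target_type:
                return new_path
            visited.add(service.output_type)
            queue.append((service.output_type, new_path))
    return None
```

Breadth-first search is used here because it returns a chain with the fewest services; a registry with typed ontologies would also need to treat a subtype as compatible with its parent type, which this sketch does not model.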


Subjects
Computational Biology/methods, Information Storage and Retrieval/methods, Software, Database Management Systems, Factual Databases, Internet, Workflow
15.
BMC Bioinformatics ; 10: 16, 2009 Jan 12.
Article in English | MEDLINE | ID: mdl-19134227

ABSTRACT

BACKGROUND: Nowadays, microarray gene expression analysis is a widely used technology that scientists handle but whose final interpretation usually requires the participation of a specialist. This is because it demands some background in statistics that most users lack or have only a vague notion of. Moreover, programming skills may also be essential to analyse these data. An interactive, easy-to-use application therefore seems necessary to help researchers extract full information from data and analyse them in a simple, powerful and reliable way. RESULTS: PreP+07 is a standalone Windows XP application that presents a friendly interface for spot filtration, inter- and intra-slide normalization, duplicate resolution, dye-swapping, error removal and statistical analyses. Additionally, it contains two unique implementations of procedures (double scan and Supervised Lowess) and a complete set of graphical representations (MA plot, RG plot, QQ plot, PP plot, PN plot), and can deal with many data formats, such as tabulated text, GenePix GPR and ArrayPRO. PreP+07 performance has been compared with the equivalent functions in Bioconductor using a tomato chip with 13056 spots. The numbers of differentially expressed genes obtained from the p-values produced by PreP+07 and the Bioconductor Limma packages were statistically identical when the data set was only normalized; however, slight variability was observed when the data were both normalized and scaled. CONCLUSION: The PreP+07 implementation provides a high degree of freedom in selecting and organizing a small set of widely used data processing protocols, and can handle many data formats. Its reliability has been demonstrated, so that a laboratory researcher can carry out statistical pre-processing of his/her microarray results and obtain a list of differentially expressed genes using PreP+07 without any programming skills.
All of this gives support to scientists that have been using previous PreP releases since its first version in 2003.
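
For readers unfamiliar with the MA plot listed among the graphical representations, the sketch below computes its two coordinates from red and green channel intensities using the usual two-colour convention, M = log2(R) - log2(G) and A = (log2(R) + log2(G)) / 2. The function name is an assumption, and the abstract does not confirm that PreP+07 uses exactly these definitions.

```python
from math import log2


def ma_values(red, green):
    """Return (M, A) lists for paired, strictly positive channel intensities.

    M is the log ratio of the two channels; A is their mean log intensity.
    """
    m = [log2(r) - log2(g) for r, g in zip(red, green)]
    a = [0.5 * (log2(r) + log2(g)) for r, g in zip(red, green)]
    return m, a
```

Plotting M against A makes intensity-dependent dye bias visible as a curved trend, which is what Lowess-type normalization is meant to remove.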


Subjects
Computational Biology/methods, Oligonucleotide Array Sequence Analysis/methods, Software, Oligonucleotide Array Sequence Analysis/statistics & numerical data, Reproducibility of Results
16.
Sci Rep ; 9(1): 10274, 2019 07 16.
Article in English | MEDLINE | ID: mdl-31312019

ABSTRACT

In the last decade, a technological shift in the bioinformatics field has occurred: larger genomes can now be sequenced quickly and cost effectively, resulting in the computational need to efficiently compare large and abundant sequences. Furthermore, detecting conserved similarities across large collections of genomes remains a problem. The size of chromosomes, along with the substantial amount of noise and number of repeats found in DNA sequences (particularly in mammals and plants), leads to a scenario where executing and waiting for complete outputs is both time and resource consuming. Filtering steps, manual examination and annotation, very long execution times and a high demand for computational resources represent a few of the many difficulties faced in large genome comparisons. In this work, we provide a method designed for comparisons of considerable amounts of very long sequences that employs a heuristic algorithm capable of separating noise and repeats from conserved fragments in pairwise genomic comparisons. We provide software implementation that computes in linear time using one core as a minimum and a small, constant memory footprint. The method produces both a previsualization of the comparison and a collection of indices to drastically reduce computational complexity when performing exhaustive comparisons. Last, the method scores the comparison to automate classification of sequences and produces a list of detected synteny blocks to enable new evolutionary studies.


Subjects
Genome, Genomics/methods, Algorithms, Animals, Biological Evolution, Data Visualization, Humans, Mammals/genetics, Mice, Poaceae/genetics, Software, Synteny, Time Factors, Triticum/genetics
17.
Bioinform Biol Insights ; 13: 1177932218825127, 2019.
Article in English | MEDLINE | ID: mdl-30783378

ABSTRACT

The emergence of data acquisition technologies has shifted the bottleneck in molecular biology research from data acquisition to data analysis. Such is the case in Comparative Genomics, where sequence analysis has transitioned from genes to genomes several orders of magnitude larger. This fact has revealed the need to adapt software to work efficiently with huge experiments and to incorporate new data-analysis strategies to manage results from such studies. In previous works, we presented GECKO, software for comparing large sequences; now we address the representation, browsing, data exploration, and post-processing of the massive amount of information derived from such comparisons. GECKO-MGV is a web-based application organized as a client-server architecture. It is aimed at visual analysis of the results from both pairwise and multiple sequence comparison studies, combining a set of common commands for image exploration with improved state-of-the-art solutions. In addition, GECKO-MGV integrates different visualization analysis tools while exploiting the concept of layers to display multiple genome comparison datasets. Moreover, the software is endowed with capabilities for contacting external-proprietary and third-party services for further data post-processing and also presents a method to display a timeline of large-scale evolutionary events. As proof-of-concept, we present 2 exercises using bacterial and mammalian genomes which depict the capabilities of GECKO-MGV to perform in-depth, customizable analyses on the fly using web technologies. The first exercise is mainly descriptive and is carried out over bacterial genomes, whereas the second one aims to show the ability to deal with large sequence comparisons. In this case, we display results from the comparison of the first Homo sapiens chromosome against the first 5 chromosomes of Mus musculus.

18.
BMC Bioinformatics ; 9: 93, 2008 Feb 11.
Article in English | MEDLINE | ID: mdl-18267014

ABSTRACT

BACKGROUND: Array-based comparative genome hybridization (aCGH) is commonly used to determine the genomic content of bacterial strains. Since prokaryotes in general have less conserved genome sequences than eukaryotes, sequence divergences between the genes in the genomes used for an aCGH experiment obstruct determination of genome variations (e.g. deletions). Current normalization methods do not take into consideration sequence divergence between target and microarray features and therefore cannot distinguish whether a difference in signal is due to systematic errors in the data or to sequence divergence. RESULTS: We present supervised Lowess, or S-Lowess, an application of the subset Lowess normalization method. By using a predicted subset of array features with minimal sequence divergence between the analyzed strains for the normalization procedure, we remove systematic errors from dual-dye aCGH data in two steps: (1) determination of a subset of conserved genes (i.e. likely conserved genes, LCG); and (2) using the LCG for subset Lowess normalization. Subset Lowess determines the correction factors for systematic errors in the subset of array features and normalizes all array features using these correction factors. The performance of S-Lowess was assessed on aCGH experiments in which differentially labeled genomic DNA fragments of Lactococcus lactis IL1403 and L. lactis MG1363 strains were hybridized to IL1403 DNA microarrays. Since both genomes have been sequenced and the gene deletions identified, the success rate of different aCGH normalization methods in detecting these deletions in the MG1363 genome was determined. S-Lowess detects 97% of the deletions, whereas other aCGH normalization methods detect up to only 60% of the deletions. CONCLUSION: S-Lowess is implemented in a user-friendly web-tool accessible from http://bioinformatics.biol.rug.nl/websoftware/s-lowess. We demonstrate that it outperforms existing normalization methods and maximizes detection of genomic variation (e.g. deletions) from microbial aCGH data.
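
The two-step idea in the abstract above (fit the correction on a conserved subset, then apply it to every feature) can be illustrated with a minimal sketch. This is not the S-Lowess implementation: it assumes the data are already expressed as log-ratios M and mean log-intensities A, it takes the likely-conserved subset as a given list of indices rather than predicting it, and it uses a single tricube-weighted local linear pass without the robustness iterations of full lowess. The function names `local_linear` and `subset_lowess_normalize` and the toy values are hypothetical.

```python
import math

def local_linear(x_fit, y_fit, x0, frac=0.5):
    """Tricube-weighted local linear estimate of y at x0 (one lowess-style pass)."""
    n = len(x_fit)
    k = min(n, max(2, math.ceil(frac * n)))      # neighbours inside the window
    dist = [abs(x - x0) for x in x_fit]
    h = sorted(dist)[k - 1]                      # bandwidth: distance to k-th neighbour
    if h > 0:
        weights = [(1 - (d / h) ** 3) ** 3 if d < h else 0.0 for d in dist]
    else:
        weights = [0.0] * n
    if sum(weights) == 0.0:                      # degenerate window: uniform weights
        weights = [1.0 if d <= h else 0.0 for d in dist]
    u = [x - x0 for x in x_fit]                  # centre on x0 so the intercept is the estimate
    sw = sum(weights)
    su = sum(w * ui for w, ui in zip(weights, u))
    sy = sum(w * yi for w, yi in zip(weights, y_fit))
    suu = sum(w * ui * ui for w, ui in zip(weights, u))
    suy = sum(w * ui * yi for w, ui, yi in zip(weights, u, y_fit))
    denom = sw * suu - su * su
    if abs(denom) < 1e-12:                       # no spread in x: weighted mean
        return sy / sw
    slope = (sw * suy - su * sy) / denom
    return (sy - slope * su) / sw

def subset_lowess_normalize(m, a, subset, frac=0.5):
    """Fit the intensity-dependent bias on the subset only, then correct all features."""
    a_sub = [a[i] for i in subset]
    m_sub = [m[i] for i in subset]
    return [mi - local_linear(a_sub, m_sub, ai, frac) for mi, ai in zip(m, a)]

# Toy data (hypothetical): eight conserved features carrying only a dye bias,
# plus one feature with a strongly reduced signal that is left out of the subset.
a = [6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 9.5]
bias = [0.5 + 0.1 * ai for ai in a]
m = bias[:8] + [bias[8] - 3.0]
normalized = subset_lowess_normalize(m, a, subset=range(8), frac=0.6)
print([round(v, 3) for v in normalized])
```

In this toy case the conserved features should end up close to 0 after normalization, while the excluded feature keeps its log-ratio of about -3, because it never influenced the fitted correction.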


Subjects
Chromosome Mapping/methods , Genome, Bacterial/genetics , In Situ Hybridization/methods , Lactococcus/genetics , Oligonucleotide Array Sequence Analysis/methods , Sequence Analysis, DNA/methods , Databases, Genetic , Genetic Variation/genetics , Reproducibility of Results , Sensitivity and Specificity
19.
J Comput Biol ; 25(8): 841-849, 2018 08.
Article in English | MEDLINE | ID: mdl-30084692

ABSTRACT

The comparison and assessment of similarity across metagenomes remain an open problem. Uncultivated samples suffer from high variability, thus making it difficult for heuristic sequence comparison methods to find precise matches in reference databases. Finer methods are required to provide higher accuracy and certainty, although these come at the expense of longer computation times. Therefore, in this work, we present our software for the highly parallel, fine-grained pairwise alignment of metagenomes. First, an analysis of the computational limitations of performing coarse-grained global alignments in a parallel manner is described, and a solution is discussed and employed by our proposal. Second, we show that our development is competitive with state-of-the-art software in terms of speed and consumption of resources, while achieving more accurate results. In addition, the parallel scheme adopted is tested, showing up to 98% efficiency while using up to 64 cores. Sequential optimizations are also tested and show a speedup of 9× over our previous proposal.
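
To put the reported figures in context, the sketch below relates speedup and parallel efficiency. It assumes the conventional definitions (speedup = serial time / parallel time, efficiency = speedup / number of cores); the abstract does not state which definition was used, and the function names and timings are hypothetical.

```python
def speedup(t_serial, t_parallel):
    """Conventional speedup: how many times faster the parallel run is."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, cores):
    """Conventional parallel efficiency: speedup divided by the number of cores."""
    return speedup(t_serial, t_parallel) / cores

def implied_speedup(eff, cores):
    """Speedup implied by a given efficiency on a given number of cores."""
    return eff * cores

# Hypothetical timings (seconds) chosen only to illustrate the arithmetic.
print(efficiency(t_serial=6272.0, t_parallel=100.0, cores=64))
print(implied_speedup(0.98, 64))
```

Under these definitions, 98% efficiency on 64 cores corresponds to a speedup of roughly 62.7 relative to a single core. The 9× figure in the abstract is a separate, sequential improvement over the authors' earlier software.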


Subjects
Computational Biology/methods , Metagenome , Metagenomics/methods , Metagenomics/standards , Sequence Alignment/standards , Software , Algorithms , Humans
20.
Heliyon ; 4(12): e01057, 2018 Dec.
Article in English | MEDLINE | ID: mdl-30582061

ABSTRACT

In the last decade, bioinformatics has become an indispensable branch of modern scientific research, experiencing an explosion in financial support, developed applications and data collection. The growth of the datasets emerging from research laboratories, industry, the health sector, etc., is steadily raising the demand for computing power and storage. Processing biological data at the scale of these datasets often requires the use of High Performance Computing (HPC) resources, especially when dealing with certain types of omics data, such as genomic and metagenomic data. Such computational resources not only require substantial investments, but they also involve high maintenance costs. More importantly, in order to secure good returns on these investments, specific training needs to be put in place to ensure that waste is minimized. Furthermore, given that bioinformatics is a highly interdisciplinary field where several other domains intersect (such as biology, chemistry, physics and computer science), researchers from these areas also require bioinformatics-specific training in HPC in order to take full advantage of supercomputing centers. In this document, we describe our experience in training researchers from several different disciplines in HPC, as applied to bioinformatics under the framework of the leading European bioinformatics platform ELIXIR, and analyze both the content and outcomes of the course.
