1.
BMC Bioinformatics ; 21(1): 422, 2020 Sep 29.
Article in English | MEDLINE | ID: mdl-32993478

ABSTRACT

BACKGROUND: Antigen receptors are characterized by an extreme diversity of specificities, which poses major computational and analytical challenges, particularly in the era of high-throughput immunoprofiling by next generation sequencing (NGS). The T cell Receptor/Immunoglobulin Profiler (TRIP) tool offers the opportunity for an in-depth analysis based on the processing of the output files of the IMGT/HighV-Quest tool, a standard in NGS immunoprofiling, through a number of interoperable modules. These provide detailed information about antigen receptor gene rearrangements, including variable (V), diversity (D) and joining (J) gene usage, CDR3 amino acid and nucleotide composition and clonality of both T cell receptors (TR) and B cell receptor immunoglobulins (BcR IG), and characteristics of the somatic hypermutation within the BcR IG genes. TRIP is a web application implemented in R shiny.

RESULTS: Two sets of experiments have been performed in order to evaluate the efficiency and performance of the TRIP tool. The first used a number of synthetic datasets, ranging from 250k to 1M sequences, and established the linear response time of the tool (about 6 h for 1M sequences processed through the entire BcR IG data pipeline). The reproducibility of the tool was tested comparing the results produced by the main TRIP workflow with the results from a previous pipeline used on the Galaxy platform. As expected, no significant differences were noted between the two tools; although the preselection process seems to be stricter within the TRIP pipeline, about 0.1% more rearrangements were filtered out, with no impact on the final results.

CONCLUSIONS: TRIP is a software framework that provides analytical services on antigen receptor gene sequence data. It is accurate and contains functions for data wrangling, cleaning, analysis and visualization, enabling the user to build a pipeline tailored to their needs.
TRIP is publicly available at https://bio.tools/TRIP_-_T-cell_Receptor_Immunoglobulin_Profiler .
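
As a rough illustration of two of the quantities named in the abstract (V gene usage and clonality), the following Python sketch counts gene usage and groups rearrangements into clonotypes. It is not TRIP's code (TRIP is implemented in R shiny): the field names `v_gene` and `cdr3_aa`, the toy records, and the choice to define a clonotype as a (V gene, CDR3 amino acid sequence) pair are all assumptions made for this example, and the input is not the IMGT/HighV-Quest file format.

```python
from collections import Counter


def gene_usage(rearrangements, key):
    """Return the relative frequency of each value found under `key`."""
    counts = Counter(r[key] for r in rearrangements)
    total = sum(counts.values())
    return {gene: n / total for gene, n in counts.items()}


def clonotypes(rearrangements):
    """Count rearrangements per (V gene, CDR3 aa) pair, most frequent first.

    Defining a clonotype this way is an assumption of this sketch.
    """
    counts = Counter((r["v_gene"], r["cdr3_aa"]) for r in rearrangements)
    return counts.most_common()


# Hypothetical toy records; field names are illustrative only.
toy = [
    {"v_gene": "IGHV1-69", "cdr3_aa": "CARDYW"},
    {"v_gene": "IGHV1-69", "cdr3_aa": "CARDYW"},
    {"v_gene": "IGHV3-23", "cdr3_aa": "CAKGGW"},
    {"v_gene": "IGHV4-34", "cdr3_aa": "CARGYW"},
]

usage = gene_usage(toy, "v_gene")
top = clonotypes(toy)
```

Here `usage` maps each V gene to its share of the toy repertoire, and `top[0]` holds the dominant clonotype with its read count.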


Subjects
Immunoglobulins/metabolism ; Receptors, Antigen, T-Cell/metabolism ; User-Computer Interface ; High-Throughput Nucleotide Sequencing ; Humans ; Immunoglobulins/chemistry ; Immunoglobulins/genetics ; Receptors, Antigen, B-Cell/chemistry ; Receptors, Antigen, B-Cell/genetics ; Receptors, Antigen, B-Cell/metabolism ; Receptors, Antigen, T-Cell/chemistry ; Receptors, Antigen, T-Cell/genetics
2.
J Biomed Inform ; 43(1): 1-14, 2010 Feb.
Article in English | MEDLINE | ID: mdl-19576292

ABSTRACT

Marker gene selection has been an important research topic in the classification analysis of gene expression data. Current methods try to reduce the "curse of dimensionality" by using statistical intra-feature set calculations, or classifiers that are based on the given dataset. In this paper, we present SoFoCles, an interactive tool that enables semantic feature filtering in microarray classification problems with the use of external, well-defined knowledge retrieved from the Gene Ontology. The notion of semantic similarity is used to derive genes that are involved in the same biological path during the microarray experiment, by enriching a feature set that has been initially produced with legacy methods. Among its other functionalities, SoFoCles offers a large repository of semantic similarity methods that are used in order to derive feature sets and marker genes. The structure and functionality of the tool are discussed in detail, as well as its ability to improve classification accuracy. Through experimental evaluation, SoFoCles is shown to outperform other classification schemes in terms of classification accuracy in two real datasets using different semantic similarity computation approaches.
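
The enrichment idea described above, extending an initial feature set with genes that are semantically similar to it, can be sketched in Python as follows. This is not SoFoCles code: Jaccard overlap of annotation sets is used here only as a simple stand-in for the Gene Ontology semantic similarity measures the tool offers, and the gene names, term labels, function names and the 0.5 threshold are placeholders invented for the example.

```python
def jaccard(a, b):
    """Jaccard overlap of two annotation sets (stand-in similarity measure)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def enrich(seed_genes, annotations, threshold=0.5):
    """Add every gene whose best similarity to any seed gene meets the threshold."""
    seeds = set(seed_genes)
    enriched = set(seeds)
    for gene, terms in annotations.items():
        if gene in seeds:
            continue
        best = max(
            (jaccard(terms, annotations.get(s, ())) for s in seeds),
            default=0.0,
        )
        if best >= threshold:
            enriched.add(gene)
    return enriched


# Placeholder annotations; the term labels are not real GO identifiers.
annotations = {
    "geneA": {"term1", "term2", "term3"},
    "geneB": {"term1", "term2"},
    "geneC": {"term9"},
}

selected = enrich(["geneA"], annotations)
```

In this toy case `geneB` shares two of three terms with the seed `geneA` and is added, while `geneC` shares none and is left out.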


Subjects
Models, Genetic ; Oligonucleotide Array Sequence Analysis/methods ; Algorithms ; Animals ; Computational Biology/methods ; Computers ; Gene Expression Profiling ; Genetic Markers ; Humans ; Mice ; Models, Statistical ; Rats ; Reproducibility of Results ; Software ; User-Computer Interface
3.
Stud Health Technol Inform ; 124: 99-104, 2006.
Article in English | MEDLINE | ID: mdl-17108510

ABSTRACT

This paper proposes a novel method for aligning multiple genomic or proteomic sequences using a fuzzified Hidden Markov Model (HMM). HMMs are known to provide compelling performance among multiple sequence alignment (MSA) algorithms, yet their stochastic nature does not help them cope with the existing dependence among the sequence elements. Fuzzy HMMs are a novel type of HMM, based on fuzzy sets and fuzzy integrals, that generalizes the classical stochastic HMM by relaxing its independence assumptions. In this paper, the fuzzy HMM model for MSA is mathematically defined. New fuzzy algorithms are described for building and training fuzzy HMMs, as well as for their use in aligning multiple sequences. Fuzzy HMMs can also improve the model's capability to align multiple sequences, mainly in terms of computation time. Modeling the multiple sequence alignment procedure with fuzzy HMMs can yield a robust and time-effective solution that can be widely used in bioinformatics in various applications, such as protein classification, phylogenetic analysis and gene prediction, among others.


Subjects
Base Sequence ; Fuzzy Logic ; Markov Chains ; Greece ; Humans
4.
PLoS One ; 8(1): e52854, 2013.
Article in English | MEDLINE | ID: mdl-23341912

ABSTRACT

Phylogenetic profiles express the presence or absence of genes and their homologs across a number of reference genomes. They have emerged as an elegant representation framework for comparative genomics and have been used for the genome-wide inference and discovery of functionally linked genes or metabolic pathways. As the number of reference genomes grows, there is an acute need for faster and more accurate methods for phylogenetic profile analysis. We propose a novel, efficient method for the detection of genomic idiosyncrasies, i.e. sets of genes found in a specific genome with peculiar phylogenetic properties, such as intra-genome correlations or inter-genome relationships. Our algorithm is a four-step process: genome profiles are first defined as fuzzy vectors, then discretized to binary vectors, then de-noised, and finally compared to generate intra- and inter-genome distances for each gene profile. The method is validated with a carefully selected benchmark set of five reference genomes, using a range of approaches regarding similarity metrics and pre-processing stages for noise reduction. We demonstrate that the fuzzy profile method consistently identifies the actual phylogenetic relationship and origin of the genes under consideration in the majority of cases, while the detected outliers are found to be particular genes with peculiar phylogenetic patterns. The proposed method provides a time-efficient and highly scalable approach for phylogenetic stratification, with the detected groups of genes being either similar to their own genome profile or different from it, thus revealing atypical evolutionary histories.
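
The four steps named in the abstract (fuzzy vectors, discretization, de-noising, comparison) can be sketched in Python under several loud assumptions, none of which come from the paper: each gene carries a vector of membership values in [0, 1], one per reference genome; discretization is a fixed 0.5 threshold; de-noising simply drops all-zero profiles; a genome's own profile is the per-position majority vote of its genes; and distances are Hamming distances. All names and data are hypothetical.

```python
def discretize(profile, threshold=0.5):
    """Step 2: turn a fuzzy vector into a binary vector (assumed threshold)."""
    return tuple(1 if x >= threshold else 0 for x in profile)


def denoise(profiles):
    """Step 3: drop all-zero profiles (an assumed, deliberately simple rule)."""
    return {gene: p for gene, p in profiles.items() if any(p)}


def consensus(profiles):
    """Per-position majority vote across a genome's gene profiles (assumption)."""
    n = len(profiles)
    return tuple(1 if 2 * sum(col) >= n else 0 for col in zip(*profiles.values()))


def hamming(a, b):
    """Number of positions at which two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def gene_distances(fuzzy_by_genome, threshold=0.5):
    """Step 4: intra- and inter-genome distance for every retained gene."""
    binary = {
        genome: denoise({g: discretize(p, threshold) for g, p in genes.items()})
        for genome, genes in fuzzy_by_genome.items()
    }
    genome_profiles = {
        genome: consensus(genes) for genome, genes in binary.items() if genes
    }
    result = {}
    for genome, genes in binary.items():
        for gene, p in genes.items():
            intra = hamming(p, genome_profiles[genome])
            inter = {
                other: hamming(p, gp)
                for other, gp in genome_profiles.items()
                if other != genome
            }
            result[(genome, gene)] = (intra, inter)
    return result


# Step 1: hypothetical fuzzy vectors, one value per reference genome.
toy = {
    "genomeX": {"g1": (0.9, 0.8, 0.1), "g2": (0.7, 0.9, 0.2), "g3": (0.1, 0.2, 0.9)},
    "genomeY": {"h1": (0.1, 0.1, 0.9), "h2": (0.2, 0.0, 0.8)},
}

distances = gene_distances(toy)
```

In this toy data `g3` is far from its own genome's profile and identical to that of `genomeY`, which is the kind of atypical gene the abstract describes as an outlier.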


Subjects
Archaea/genetics ; Bacteria/genetics ; Fuzzy Logic ; Genome, Archaeal/genetics ; Genome, Bacterial/genetics ; Phylogeny ; Genes, Archaeal/genetics ; Genes, Bacterial/genetics ; Reproducibility of Results ; Species Specificity
5.
Cancer Inform ; 8: 31-44, 2009 Feb 03.
Article in English | MEDLINE | ID: mdl-19458792

ABSTRACT

The current work addresses the unification of Electronic Health Records related to cervical cancer into a single medical knowledge source, in the context of the EU-funded ASSIST research project. The project aims to facilitate research on cervical precancer and cancer through a system that virtually unifies multiple patient record repositories, physically located in different medical centers and hospitals, thus increasing flexibility by allowing the formation of study groups "on demand" and by reusing patient records in new studies. To this end, ASSIST uses semantic technologies to translate all medical entities (such as patient examination results, history, habits, genetic profile) and represent them in a common form, encoded in the ASSIST Cervical Cancer Ontology. The current paper presents the knowledge elicitation approach followed to define and represent the disease's medical concepts and rules, which constitute the basis for the ASSIST Cervical Cancer Ontology. The proposed approach constitutes a paradigm for semantic integration of heterogeneous clinical data that may be applicable to other biomedical application domains.
