1.
Bioinformatics ; 31(9): 1396-404, 2015 May 01.
Article in English | MEDLINE | ID: mdl-25573913

ABSTRACT

MOTIVATION: Alignment-based sequence similarity searches, while accurate for some types of sequences, can produce incorrect results when used on more divergent but functionally related sequences that have undergone the sequence rearrangements observed in many bacterial and viral genomes. Here, we propose a classification model that exploits the complementary nature of alignment-based and alignment-free similarity measures with the aim of improving the accuracy with which DNA and protein sequences are characterized. RESULTS: Our model classifies sequences using a combined sequence similarity score calculated by adaptively weighting the contribution of different sequence similarity measures. Weights are determined independently for each sequence in the test set and reflect the discriminatory ability of individual similarity measures in the training set. Because the similarity between some sequences is determined more accurately with one type of measure rather than another, our classifier allows different sets of weights to be associated with different sequences. Using five different similarity measures, we show that our model significantly improves the classification accuracy over the current composition- and alignment-based models, when predicting the taxonomic lineage for both short viral sequence fragments and complete viral sequences. We also show that our model can be used effectively for the classification of reads from a real metagenome dataset as well as protein sequences. AVAILABILITY AND IMPLEMENTATION: All the datasets and the code used in this study are freely available at https://collaborators.oicr.on.ca/vferretti/borozan_csss/csss.html. CONTACT: ivan.borozan@gmail.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
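
The idea of a combined similarity score can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so the code below simply assumes a normalized weighted average and a nearest-neighbour decision; the function names `combined_score` and `classify`, and the example weights, are hypothetical and are not taken from the authors' code.

```python
def combined_score(scores, weights):
    """Weighted average of per-measure similarity scores (assumed form)."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * s for w, s in zip(weights, scores)) / total


def classify(query_scores, labels, weights):
    """Return the label of the training sequence with the highest combined score.

    query_scores[i] holds the per-measure similarities between the query
    and training sequence i; weights are specific to this query.
    """
    best = max(
        range(len(labels)),
        key=lambda i: combined_score(query_scores[i], weights),
    )
    return labels[best]


if __name__ == "__main__":
    # Two training sequences scored with two hypothetical measures.
    scores = [[0.9, 0.1], [0.2, 0.8]]
    labels = ["A", "B"]
    print(classify(scores, labels, weights=[3, 1]))
    print(classify(scores, labels, weights=[1, 3]))
```

Changing the weights for a given query changes which training sequence wins, which is the behaviour the abstract attributes to per-sequence weighting.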


Subjects
Sequence Alignment , DNA Sequence Analysis/methods , Protein Sequence Analysis/methods , Algorithms , Classification/methods , Viral DNA , Metagenomics , Theoretical Models , Viruses/classification
2.
Genomics ; 102(3): 140-7, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23603536

ABSTRACT

Using sequencing information to guide clinical decision-making requires coordination of a diverse set of people and activities. In clinical genomics, the process typically includes sample acquisition, template preparation, genome data generation, analysis to identify and confirm variant alleles, interpretation of clinical significance, and reporting to clinicians. We describe a software application developed within a clinical genomics study, to support this entire process. The software application tracks patients, samples, genomic results, decisions and reports across the cohort, monitors progress and sends reminders, and works alongside an electronic data capture system for the trial's clinical and genomic data. It incorporates systems to read, store, analyze and consolidate sequencing results from multiple technologies, and provides a curated knowledge base of tumor mutation frequency (from the COSMIC database) annotated with clinical significance and drug sensitivity to generate reports for clinicians. By supporting the entire process, the application provides deep support for clinical decision making, enabling the generation of relevant guidance in reports for verification by an expert panel prior to forwarding to the treating physician.


Subjects
Medical Genetics/methods , Human Genome , Genomics/methods , Information Management , Neoplasms/genetics , Precision Medicine , Software , Genetic Variation , Genomics/economics , High-Throughput Nucleotide Sequencing , Humans , DNA Sequence Analysis , RNA Sequence Analysis
3.
Int J Cancer ; 132(7): 1547-55, 2013 Apr 01.
Article in English | MEDLINE | ID: mdl-22948899

ABSTRACT

The successes of targeted drugs with companion predictive biomarkers and the technological advances in gene sequencing have generated enthusiasm for evaluating personalized cancer medicine strategies using genomic profiling. We assessed the feasibility of incorporating real-time analysis of somatic mutations within exons of 19 genes into patient management. Blood, tumor biopsy and archived tumor samples were collected from 50 patients recruited from four cancer centers. Samples were analyzed using three technologies: targeted exon sequencing using Pacific Biosciences PacBio RS, multiplex somatic mutation genotyping using Sequenom MassARRAY and Sanger sequencing. An expert panel reviewed results prior to reporting to clinicians. A clinical laboratory verified actionable mutations. Fifty patients were recruited. Nineteen actionable mutations were identified in 16 (32%) patients. Across technologies, results were in agreement in 100% of biopsy specimens and 95% of archival specimens. Profiling results from paired archival/biopsy specimens were concordant in 30/34 (88%) patients. We demonstrated that the use of next generation sequencing for real-time genomic profiling in advanced cancer patients is feasible. Additionally, actionable mutations identified in this study were relatively stable between archival and biopsy samples, implying that cancer mutations that are good predictors of drug response may remain constant across clinical stages.


Subjects
Antineoplastic Agents/pharmacology , Clinical Trials as Topic , Neoplasm Genes/genetics , High-Throughput Nucleotide Sequencing , Neoplasms/genetics , Precision Medicine , Adult , Aged , Computational Biology , Feasibility Studies , Female , Follow-Up Studies , Humans , Male , Middle Aged , Mutation/genetics , Neoplasm Metastasis , Neoplasms/drug therapy
4.
BMC Bioinformatics ; 13: 206, 2012 Aug 17.
Article in English | MEDLINE | ID: mdl-22901030

ABSTRACT

BACKGROUND: It is now well established that nearly 20% of human cancers are caused by infectious agents, and the list of human oncogenic pathogens will grow in the future for a variety of cancer types. Whole tumor transcriptome and genome sequencing by next-generation sequencing technologies presents an unparalleled opportunity for pathogen detection and discovery in human tissues but requires development of new genome-wide bioinformatics tools. RESULTS: Here we present CaPSID (Computational Pathogen Sequence IDentification), a comprehensive bioinformatics platform for identifying, querying and visualizing both exogenous and endogenous pathogen nucleotide sequences in tumor genomes and transcriptomes. CaPSID includes a scalable, high performance database for data storage and a web application that integrates the genome browser JBrowse. CaPSID also provides useful metrics for sequence analysis of pre-aligned BAM files, such as gene and genome coverage, and is optimized to run efficiently on multiprocessor computers with low memory usage. CONCLUSIONS: To demonstrate the usefulness and efficiency of CaPSID, we carried out a comprehensive analysis of both a simulated dataset and transcriptome samples from ovarian cancer. CaPSID correctly identified all of the human and pathogen sequences in the simulated dataset, while in the ovarian dataset CaPSID's predictions were successfully validated in vitro.
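
As an illustration of the kind of genome coverage metric mentioned above, the following sketch computes the fraction of a reference covered by at least one aligned read. It is not CaPSID's implementation: it assumes alignments have already been reduced to half-open `(start, end)` intervals on a single reference, and the name `covered_fraction` is hypothetical.

```python
def covered_fraction(intervals, genome_length):
    """Fraction of positions covered by at least one half-open interval [start, end)."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    covered = 0
    current_end = 0  # rightmost position already counted
    for start, end in sorted(intervals):
        start = max(start, current_end)  # skip bases already counted
        end = min(end, genome_length)    # clip to the reference
        if end > start:
            covered += end - start
            current_end = end
    return covered / genome_length


if __name__ == "__main__":
    reads = [(0, 50), (25, 75), (90, 100)]
    print(covered_fraction(reads, 100))
```

Sorting the intervals first means overlapping reads are merged in a single pass, so each covered base is counted once.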


Subjects
Computational Biology/methods , Genetic Databases , Human Genome , Software , Transcriptome , Algorithms , Tumor Cell Line , Computer Simulation , Female , Humans , Internet , Oncogenic Viruses/genetics , Ovarian Neoplasms/genetics , Sensitivity and Specificity
5.
PLoS One ; 8(10): e76935, 2013.
Article in English | MEDLINE | ID: mdl-24204709

ABSTRACT

Next-generation sequencing technologies provide an unparalleled opportunity for the characterization and discovery of known and novel viruses. Because viruses are known to have the highest mutation rates when compared to eukaryotic and bacterial organisms, we assess the extent to which eleven well-known alignment algorithms (BLAST, BLAT, BWA, BWA-SW, BWA-MEM, BFAST, Bowtie2, Novoalign, GSNAP, SHRiMP2 and STAR) can be used for characterizing mutated and non-mutated viral sequences (including those that exhibit RNA splicing) in transcriptome samples. To evaluate aligners objectively we developed a realistic RNA-Seq simulation and evaluation framework (RiSER) and propose a new combined score to rank aligners for viral characterization in terms of their precision, sensitivity and alignment accuracy. We used RiSER to simulate both human and viral read sequences and suggest the best set of aligners for viral sequence characterization in human transcriptome samples. Our results show that significant and substantial differences exist between aligners and that a digital-subtraction-based viral identification framework can and should use different aligners for different parts of the process. We determine the extent to which mutated viral sequences can be effectively characterized and show that more sensitive aligners such as BLAST, BFAST, SHRiMP2, BWA-SW and GSNAP can accurately characterize substantially divergent viral sequences with up to 15% overall sequence mutation rate. We believe that the results presented here will be useful to researchers choosing aligners for viral sequence characterization using next-generation sequencing data.
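
Ranking aligners by precision and sensitivity can be sketched as follows. The abstract does not state the formula of the combined score, so this example substitutes the harmonic mean of precision and sensitivity (the F1 score) as a stand-in; it also omits alignment accuracy. The aligner names and counts in the usage example are invented for illustration.

```python
def precision(tp, fp):
    """Fraction of reported alignments that are correct."""
    return tp / (tp + fp) if tp + fp else 0.0


def sensitivity(tp, fn):
    """Fraction of true viral reads that were aligned."""
    return tp / (tp + fn) if tp + fn else 0.0


def f1(p, s):
    """Harmonic mean of precision and sensitivity (stand-in combined score)."""
    return 2 * p * s / (p + s) if p + s else 0.0


def rank_aligners(counts):
    """Sort aligner names by the stand-in score, best first.

    counts maps an aligner name to a (true positives, false positives,
    false negatives) tuple.
    """
    scored = {
        name: f1(precision(tp, fp), sensitivity(tp, fn))
        for name, (tp, fp, fn) in counts.items()
    }
    return sorted(scored, key=scored.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical counts, not results from the study.
    example = {"aligner_a": (90, 10, 10), "aligner_b": (50, 0, 50)}
    print(rank_aligners(example))
```

A harmonic mean penalizes an aligner that is precise but misses many reads, which is one simple way to trade off the two criteria.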


Subjects
Algorithms , Computational Biology/methods , Sequence Alignment/methods , RNA Sequence Analysis/methods , Viruses/genetics , Viral Genes/genetics , Human Genome/genetics , Viral Genome/genetics , HIV-1/genetics , Human Herpesvirus 1/genetics , High-Throughput Nucleotide Sequencing/methods , Human Papillomavirus 18/genetics , Humans , Influenza A Virus H5N1 Subtype/genetics , Internet , Mutation , Reproducibility of Results , Transcriptome/genetics