Search | Secretaria de Estado da Saúde

Genome Alert!: A standardized procedure for genomic variant reinterpretation and automated gene-phenotype reassessment in clinical routine.

Yauy, Kevin; Lecoquierre, François; Baert-Desurmont, Stéphanie; Trost, Detlef; Boughalem, Aicha; Luscan, Armelle; Costa, Jean-Marc; Geromel, Vanna; Raymond, Laure; Richard, Pascale; Coutant, Sophie; Broutin, Mélanie; Lanos, Raphael; Fort, Quentin; Cackowski, Stenzel; Testard, Quentin; Diallo, Abdoulaye; Soirat, Nicolas; Holder, Jean-Marc; Duforet-Frebourg, Nicolas; Bouge, Anne-Laure; Beaumeunier, Sacha; Bertrand, Denis; Audoux, Jerome; Genevieve, David; Mesnard, Laurent; Nicolas, Gael; Thevenon, Julien; Philippe, Nicolas.

Genet Med ; 24(6): 1316-1327, 2022 Jun.

Article in English | MEDLINE | ID: mdl-35311657

ABSTRACT

PURPOSE: Retrospective interpretation of sequenced data in light of the current literature is a major concern of the field. Such reinterpretation is manual, and both human resources and variable operating procedures are the main bottlenecks. METHODS: The Genome Alert! method automatically reports changes with potential clinical significance in variant classification between releases of the ClinVar database. Using ClinVar submissions across time, this method assigns a validity category to gene-disease associations. RESULTS: Between July 2017 and December 2019, the retrospective analysis of ClinVar submissions revealed a monthly median of 1247 changes in variant classification with potential clinical significance and 23 new gene-disease associations. Re-examination of 4929 targeted sequencing files highlighted 45 changes in variant classification, and of these classifications, 89% were expert validated, leading to 4 additional diagnoses. The Genome Alert! gene-disease association catalog provided 75 high-confidence associations not available in the OMIM morbid list, of which 20% became available in the OMIM morbid list. For more than 356 negative exome sequencing data that were reannotated for variants in these 75 genes, this elective approach led to a new diagnosis. CONCLUSION: Genome Alert! (https://genomealert.univ-grenoble-alpes.fr/) enables systematic and reproducible reinterpretation of acquired sequencing data in a clinical routine with limited human resource effect.
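The release-to-release comparison described in METHODS can be illustrated with a minimal sketch. It is not the published Genome Alert! implementation: the snapshot format (a plain mapping from variant identifier to classification), the function names, and the rule used to decide "potential clinical significance" (a classification crossing the pathogenic / non-pathogenic boundary) are all assumptions made for illustration.

```python
# Hypothetical sketch: flag reclassifications between two ClinVar-like snapshots.
# Assumption: each snapshot maps a variant identifier to one classification label.
# Assumption: a change is "potentially clinically significant" when it crosses
# the pathogenic / non-pathogenic boundary. The actual method may differ.

PATHOGENIC_SIDE = {"Pathogenic", "Likely pathogenic"}


def is_pathogenic_side(classification):
    """True if the label is on the pathogenic side of the boundary."""
    return classification in PATHOGENIC_SIDE


def significant_changes(old_release, new_release):
    """Return (variant, old label, new label) for boundary-crossing changes."""
    changes = []
    for variant_id, new_class in sorted(new_release.items()):
        old_class = old_release.get(variant_id)
        if old_class is None:
            continue  # variant is new in this release, not a reclassification
        if is_pathogenic_side(old_class) != is_pathogenic_side(new_class):
            changes.append((variant_id, old_class, new_class))
    return changes


# Toy snapshots with made-up variant identifiers.
old = {
    "var1": "Uncertain significance",
    "var2": "Pathogenic",
    "var3": "Benign",
}
new = {
    "var1": "Likely pathogenic",
    "var2": "Pathogenic",
    "var3": "Likely benign",
    "var4": "Pathogenic",
}

for change in significant_changes(old, new):
    print(change)
```

In this toy data only `var1` is reported: `var3` moves between two benign-side labels, and `var4` has no earlier classification to compare against.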

Subjects

Databases, Genetic; Genetic Variation; Genetic Variation/genetics; Genome, Human/genetics; Genomics; Humans; Phenotype; Retrospective Studies

SimBA: A methodology and tools for evaluating the performance of RNA-Seq bioinformatic pipelines.

Audoux, Jérôme; Salson, Mikaël; Grosset, Christophe F; Beaumeunier, Sacha; Holder, Jean-Marc; Commes, Thérèse; Philippe, Nicolas.

BMC Bioinformatics ; 18(1): 428, 2017 Sep 29.

Article in English | MEDLINE | ID: mdl-28969586

ABSTRACT

BACKGROUND: The evolution of next-generation sequencing (NGS) technologies has led to increased focus on RNA-Seq. Many bioinformatic tools have been developed for RNA-Seq analysis, each with unique performance characteristics and configuration parameters. Users face an increasingly complex task in understanding which bioinformatic tools are best for their specific needs and how they should be configured. In order to provide some answers to these questions, we investigate the performance of leading bioinformatic tools designed for RNA-Seq analysis and propose a methodology for systematic evaluation and comparison of performance to help users make well-informed choices. RESULTS: To evaluate RNA-Seq pipelines, we developed a suite of two benchmarking tools. SimCT generates simulated datasets that get as close as possible to specific real biological conditions, accompanied by the list of genomic incidents and mutations that have been inserted. BenchCT then compares the output of any bioinformatics pipeline that has been run against a SimCT dataset with the simulated genomic and transcriptional variations it contains, to give an accurate performance evaluation in addressing a specific biological question. We used these tools to simulate a real-world genomic medicine question involving the comparison of healthy and cancerous cells. Results revealed that performance in addressing a particular biological context varied significantly depending on the choice of tools and settings used. We also found that by combining the output of certain pipelines, substantial performance improvements could be achieved. CONCLUSION: Our research emphasizes the importance of selecting and configuring bioinformatic tools for the specific biological question being investigated to obtain optimal results. Pipeline designers, developers and users should include benchmarking in the context of their biological question as part of their design and quality control process.
Our SimBA suite of benchmarking tools provides a reliable basis for comparing the performance of RNA-Seq bioinformatics pipelines in addressing a specific biological question. We would like to see the creation of a reference corpus of data-sets that would allow accurate comparison between benchmarks performed by different groups and the publication of more benchmarks based on this public corpus. SimBA software and data-set are available at http://cractools.gforge.inria.fr/softwares/simba/.
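The kind of comparison described for BenchCT (pipeline output versus the list of simulated events) can be sketched as follows. This is not BenchCT's code or interface: the `evaluate` function, the tuple representation of a variant, and the toy data are hypothetical, and exact matching on (chromosome, position, reference, alternate) is an assumed matching rule.

```python
# Hypothetical sketch of truth-versus-prediction benchmarking on simulated data.
# Assumption: an event is identified by (chromosome, position, ref, alt) and
# counts as found only on an exact match.


def evaluate(truth, predicted):
    """Compare predicted events with the simulated ground truth."""
    truth, predicted = set(truth), set(predicted)
    tp = len(truth & predicted)   # simulated events that were found
    fp = len(predicted - truth)   # calls with no simulated counterpart
    fn = len(truth - predicted)   # simulated events that were missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall}


# Toy ground truth and two made-up pipeline outputs.
truth = {
    ("chr1", 100, "A", "T"),
    ("chr2", 200, "G", "C"),
    ("chr3", 300, "T", "G"),
    ("chr4", 400, "C", "A"),
}
pipeline_a = {
    ("chr1", 100, "A", "T"),
    ("chr2", 200, "G", "C"),
    ("chr3", 300, "T", "G"),
    ("chr9", 900, "A", "G"),  # false positive
}
pipeline_b = {
    ("chr1", 100, "A", "T"),
    ("chr2", 200, "G", "C"),
    ("chr4", 400, "C", "A"),
    ("chr8", 800, "C", "T"),  # false positive
}

print("A:", evaluate(truth, pipeline_a))
print("B:", evaluate(truth, pipeline_b))
# Two simple ways of combining pipeline outputs.
print("A and B:", evaluate(truth, pipeline_a & pipeline_b))
print("A or B:", evaluate(truth, pipeline_a | pipeline_b))
```

On this toy data the intersection trades recall for precision and the union does the opposite, which shows one way that combining outputs can shift performance. Which combination helps in practice depends on the pipelines and the biological question.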

Subjects

Computational Biology/methods; Computer Simulation; Sequence Analysis, RNA/methods; Software; Gene Fusion; Genome, Human; High-Throughput Nucleotide Sequencing/methods; Humans; INDEL Mutation/genetics; Polymorphism, Single Nucleotide/genetics

New chimeric RNAs in acute myeloid leukemia.

Rufflé, Florence; Audoux, Jerome; Boureux, Anthony; Beaumeunier, Sacha; Gaillard, Jean-Baptiste; Bou Samra, Elias; Megarbane, Andre; Cassinat, Bruno; Chomienne, Christine; Alves, Ronnie; Riquier, Sebastien; Gilbert, Nicolas; Lemaitre, Jean-Marc; Bacq-Daian, Delphine; Bougé, Anne Laure; Philippe, Nicolas; Commes, Therese.

F1000Res ; 6, 2017.

Article in English | MEDLINE | ID: mdl-29623188

ABSTRACT

Background: High-throughput next-generation sequencing (NGS) technologies enable the detection of biomarkers used for tumor classification, disease monitoring and cancer therapy. Whole-transcriptome analysis using RNA-seq is important, not only as a means of understanding the mechanisms responsible for complex diseases but also to efficiently identify novel genes/exons, splice isoforms, RNA editing, allele-specific mutations, differential gene expression and fusion transcripts or chimeric RNA (chRNA). Methods: We used Crac, a tool that uses genomic locations and local coverage to classify biological events and directly infer splice and chimeric junctions within a single read. Crac's algorithm extracts transcriptional chimeric events irrespective of annotation with high sensitivity, and CracTools was used to aggregate, annotate and filter the chRNA reads. The selected chRNA candidates were validated by real-time PCR and sequencing. In order to check the tumor-specific expression of chRNA, we analyzed a publicly available dataset using a new tag search approach. Results: We present data related to acute myeloid leukemia (AML) RNA-seq analysis. We highlight novel biological cases of chRNA, in addition to previously well-characterized leukemia chRNA. We have identified and validated 17 chRNAs among 3 AML patients: 10 from an AML patient with a translocation between chromosomes 15 and 17 (AML-t(15;17)), 4 from a patient with a normal karyotype (AML-NK), and 3 from a patient with a chromosome 16 inversion (AML-inv16). The new fusion transcripts can be classified into four groups according to the exon organization. Conclusions: All groups suggest complex but distinct synthesis mechanisms involving either collinear exons of different genes, non-collinear exons, or exons of different chromosomes. Finally, we check tumor-specific expression in a larger RNA-seq AML cohort and identify new AML biomarkers that could improve diagnosis and prognosis of AML.

On the evaluation of the fidelity of supervised classifiers in the prediction of chimeric RNAs.

Beaumeunier, Sacha; Audoux, Jérôme; Boureux, Anthony; Ruffle, Florence; Commes, Thérèse; Philippe, Nicolas; Alves, Ronnie.

BioData Min ; 9: 34, 2016.

Article in English | MEDLINE | ID: mdl-27822312

ABSTRACT

BACKGROUND: High-throughput sequencing technology and bioinformatics have identified chimeric RNAs (chRNAs), raising the possibility that chRNAs expressed specifically in disease can be used as potential biomarkers in both diagnosis and prognosis. RESULTS: The task of discriminating true chRNAs from false ones poses an interesting Machine Learning (ML) challenge. First of all, the sequencing data may contain false reads due to technical artifacts, and during the analysis process, bioinformatics tools may generate false positives due to methodological biases. Moreover, even if we succeed in obtaining a proper set of observations (enough sequencing data) about true chRNAs, chances are that the devised model will not be able to generalize beyond it. As with any other machine learning problem, the first big issue is finding good data to build models. As far as we know, there is no common benchmark data available for chRNA detection. The definition of a classification baseline is lacking in the related literature too. In this work we are moving towards benchmark data and an evaluation of the fidelity of supervised classifiers in the prediction of chRNAs. CONCLUSIONS: We proposed a modeling strategy that can be used to increase tool performance in the context of chRNA classification, based on a simulated data generator that permits continuous integration of new complex chimeric events. The pipeline incorporated a genome mutation process and simulated RNA-seq data. Reads at distinct depths were aligned and analysed by CRAC, which integrates genomic location and local coverage, allowing biological predictions at the read scale. Additionally, these reads were functionally annotated and aggregated to form chRNA events, making it possible to evaluate the performance of ML methods (classifiers) at both the read and event levels.
Ensemble learning strategies proved to be more robust for this classification problem, providing an average AUC performance of 95% (ACC=94%, Kappa=0.87). The resulting classification models were also tested on real RNA-seq data from a set of twenty-seven patients with acute myeloid leukemia (AML).
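For readers unfamiliar with the ACC and Kappa figures quoted above, the sketch below shows how accuracy and Cohen's kappa are commonly computed from a binary confusion matrix. The function name and the counts are invented for illustration and are not taken from the paper. Kappa is a chance-corrected agreement coefficient with a maximum of 1, not a percentage.

```python
# Illustrative sketch: accuracy and Cohen's kappa for a binary classifier
# (for example, true versus false chRNA). All counts below are made up.


def accuracy_and_kappa(tp, fp, fn, tn):
    """Return (accuracy, Cohen's kappa) from binary confusion-matrix counts."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n  # observed agreement, i.e. accuracy
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if expected == 1.0:
        return observed, 0.0  # degenerate case: kappa is undefined
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Balanced toy example: 100 candidates, 90 classified correctly.
acc, kappa = accuracy_and_kappa(tp=45, fp=5, fn=5, tn=45)
print(f"accuracy={acc:.2f} kappa={kappa:.2f}")
```

With balanced classes, chance agreement is 0.5, so an accuracy of 0.90 corresponds to a kappa of about 0.80. With imbalanced classes the same accuracy yields a lower kappa, which is why both values are often reported together.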
