Results 1 - 12 of 12
1.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-34015811

ABSTRACT

Formalin-fixed paraffin-embedded tissue, the most common tissue specimen stored in clinical practice, presents challenges for analysis due to formalin-induced artifacts. Here, we present Strand Orientation Bias Detector (SOBDetector), a flexible computational platform compatible with all the common somatic SNV-calling pipelines, designed to assess the probability that a given detected mutation is an artifact. The underlying predictor mechanism is based on the posterior distribution of a Bayesian logistic regression model trained on The Cancer Genome Atlas whole exomes. SOBDetector is a freely available cross-platform program, implemented in Java 1.8.
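As a rough illustration of the idea in this abstract, the sketch below scores the orientation imbalance of alt-supporting read pairs and averages a logistic prediction over posterior draws of the coefficients. It is a minimal sketch under stated assumptions, not SOBDetector's implementation: the function names, the imbalance formula, the single feature, and the posterior draws are all hypothetical.

```python
import math

def sob_score(alt_f1r2: int, alt_f2r1: int) -> float:
    """Orientation imbalance of alt-supporting read pairs, in [-1, 1].

    Assumed definition for illustration: (F1R2 - F2R1) / (F1R2 + F2R1).
    """
    total = alt_f1r2 + alt_f2r1
    if total == 0:
        return 0.0
    return (alt_f1r2 - alt_f2r1) / total

def artifact_probability(sob: float, posterior_draws) -> float:
    """Posterior predictive mean: average the logistic output over draws."""
    probs = []
    for intercept, slope in posterior_draws:
        z = intercept + slope * abs(sob)
        probs.append(1.0 / (1.0 + math.exp(-z)))
    return sum(probs) / len(probs)

# Hypothetical posterior draws of (intercept, slope); NOT values from the tool.
POSTERIOR_DRAWS = [(-4.0, 8.0), (-3.5, 7.0), (-4.5, 9.0)]

# A variant supported about equally in both orientations vs. one orientation only.
balanced = artifact_probability(sob_score(10, 9), POSTERIOR_DRAWS)
one_sided = artifact_probability(sob_score(12, 0), POSTERIOR_DRAWS)
```

With these placeholder coefficients, the one-sided variant receives a much higher artifact probability than the balanced one, which is the qualitative behaviour the abstract describes.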


Subject(s)
Artifacts , Cytological Techniques/standards , High-Throughput Nucleotide Sequencing/standards , Statistical Models , DNA Sequence Analysis/standards , Genetic Templates , Algorithms , Neoplasm DNA , Genetic Databases , High-Throughput Nucleotide Sequencing/methods , Humans , Mutation , Neoplasms/diagnosis , Neoplasms/genetics , Reproducibility of Results , DNA Sequence Analysis/methods
2.
BMC Bioinformatics ; 21(Suppl 18): 498, 2020 Dec 30.
Article in English | MEDLINE | ID: mdl-33375939

ABSTRACT

BACKGROUND: Personalized cancer vaccines are emerging as one of the most promising approaches to immunotherapy of advanced cancers. However, only a small proportion of the neoepitopes generated by somatic DNA mutations in cancer cells lead to tumor rejection. Since it is impractical to experimentally assess all candidate neoepitopes prior to vaccination, developing accurate methods for predicting tumor-rejection mediating neoepitopes (TRMNs) is critical for enabling routine clinical use of cancer vaccines. RESULTS: In this paper we introduce Positive-unlabeled Learning using AuTOml (PLATO), a general semi-supervised approach to improving accuracy of model-based classifiers. PLATO generates a set of high confidence positive calls by applying a stringent filter to model-based predictions, then rescores remaining candidates by using positive-unlabeled learning. To achieve robust performance on clinical samples with large patient-to-patient variation, PLATO further integrates AutoML hyper-parameter tuning, classification threshold selection based on spies, and support for bootstrapping. CONCLUSIONS: Experimental results on real datasets demonstrate that PLATO has improved performance compared to model-based approaches for two key steps in TRMN prediction, namely somatic variant calling from exome sequencing data and peptide identification from MS/MS data.
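The spy-based threshold selection mentioned in this abstract can be sketched as follows. This is the generic spy technique from positive-unlabeled learning, not PLATO's code: known positives ("spies") are hidden among the unlabeled candidates before training, and the threshold is set so that a chosen fraction of spies would still be recovered. The function names, the `spy_recall` parameter, and all scores are assumptions for illustration.

```python
import math

def spy_threshold(spy_scores, spy_recall=0.9):
    """Return the highest threshold that still keeps `spy_recall` of the spies.

    Spies are known positives mixed into the unlabeled set, so their scores
    show how true positives look when the model treats them as unlabeled.
    """
    if not spy_scores:
        raise ValueError("need at least one spy score")
    ranked = sorted(spy_scores, reverse=True)
    keep = max(1, math.ceil(spy_recall * len(ranked)))
    return ranked[keep - 1]

def rescore_calls(unlabeled_scores, threshold):
    """Label each unlabeled candidate positive if it scores at or above threshold."""
    return {name: score >= threshold for name, score in unlabeled_scores.items()}

# Toy classifier scores for ten spies (hypothetical values).
spy_scores = [0.95, 0.91, 0.88, 0.80, 0.76, 0.71, 0.66, 0.60, 0.52, 0.20]
threshold = spy_threshold(spy_scores, spy_recall=0.9)

# Toy scores for three unlabeled candidates (hypothetical names and values).
calls = rescore_calls({"cand_a": 0.83, "cand_b": 0.31, "cand_c": 0.52}, threshold)
```

Choosing the threshold from spies rather than from a fixed cutoff lets it adapt to each sample's score distribution, which matters when patient-to-patient variation is large.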


Subject(s)
Immunotherapy , Neoplasms/therapy , Peptides/analysis , Precision Medicine , Supervised Machine Learning , Epitopes/immunology , Epitopes/metabolism , Humans , Single Nucleotide Polymorphism , Tandem Mass Spectrometry , Exome Sequencing
3.
J Comput Biol ; 30(4): 538-551, 2023 04.
Article in English | MEDLINE | ID: mdl-36999902

ABSTRACT

High-throughput DNA and RNA sequencing are revolutionizing precision oncology, enabling personalized therapies such as cancer vaccines designed to target tumor-specific neoepitopes generated by somatic mutations expressed in cancer cells. Identification of these neoepitopes from next-generation sequencing data of clinical samples remains challenging and requires the use of complex bioinformatics pipelines. In this paper, we present GeNeo, a bioinformatics toolbox for genomics-guided neoepitope prediction. GeNeo includes a comprehensive set of tools for somatic variant calling and filtering, variant validation, and neoepitope prediction and filtering. For ease of use, GeNeo tools can be accessed via web-based interfaces deployed on a Galaxy portal publicly accessible at https://neo.engr.uconn.edu/. A virtual machine image for running GeNeo locally is also available to academic users upon request.


Subject(s)
Neoplasms , Humans , Neoplasms/genetics , Neoplasms/therapy , Precision Medicine , Genomics/methods , Computational Biology , Immunotherapy , High-Throughput Nucleotide Sequencing
4.
Leuk Res ; 135: 107419, 2023 12.
Article in English | MEDLINE | ID: mdl-37956474

ABSTRACT

Clonal hematopoiesis (CH) is defined by the presence of an expanded clonal hematopoietic cell population due to an acquired mutation conferring a selective growth advantage and is known to predispose to hematologic malignancy. In this review, we discuss sequencing methods for CH detection in bulk sequencing data and corresponding bioinformatic approaches for variant calling, filtering, and curation. We detail practical recommendations for CH calling. Finally, we discuss how improvements in CH sequencing and bioinformatic approaches will enable the characterization of CH trajectories, its impact on human health, and therapeutic approaches to mitigate its adverse effects.


Subject(s)
Clonal Hematopoiesis , Hematologic Neoplasms , Humans , Clonal Hematopoiesis/genetics , Hematopoiesis/genetics , Hematologic Neoplasms/genetics , Hematologic Neoplasms/therapy , Hematologic Neoplasms/pathology , Mutation , Clone Cells/pathology
5.
Cancers (Basel) ; 15(13)2023 Jul 01.
Article in English | MEDLINE | ID: mdl-37444566

ABSTRACT

(1) Background: Next-generation sequencing (NGS) of patients with advanced tumors is becoming an established method in Molecular Tumor Boards. However, somatic variant detection, interpretation, and report generation require in-depth knowledge of both bioinformatics and oncology. (2) Methods: MIRACUM-Pipe combines many individual tools into a seamless workflow for comprehensive analyses and annotation of NGS data including quality control, alignment, variant calling, copy number variation estimation, evaluation of complex biomarkers, and RNA fusion detection. (3) Results: MIRACUM-Pipe offers an easy-to-use, one-prompt standardized solution to analyze NGS data, including quality control, variant calling, copy number estimation, annotation, visualization, and report generation. (4) Conclusions: MIRACUM-Pipe, a versatile pipeline for NGS, can be customized according to bioinformatics and clinical needs and supports clinical decision-making with visual processing and interactive reporting.

6.
Gigascience ; 11(1)2022 01 12.
Article in English | MEDLINE | ID: mdl-35022699

ABSTRACT

BACKGROUND: The accurate detection of somatic variants from sequencing data is of key importance for cancer treatment and research. Somatic variant calling requires a high sequencing depth of the tumor sample, especially when the detection of low-frequency variants is also desired. In turn, this leads to large volumes of raw sequencing data to process and hence, large computational requirements. For example, calling the somatic variants according to the GATK best practices guidelines requires days of computing time for a typical whole-genome sequencing sample. FINDINGS: We introduce Halvade Somatic, a framework for somatic variant calling from DNA sequencing data that takes advantage of multi-node and/or multi-core compute platforms to reduce runtime. It relies on Apache Spark to provide scalable I/O and to create and manage data streams that are processed on different CPU cores in parallel. Halvade Somatic contains all required steps to process the tumor and matched normal sample according to the GATK best practices recommendations: read alignment (BWA), sorting of reads, preprocessing steps such as marking duplicate reads and base quality score recalibration (GATK), and, finally, calling the somatic variants (Mutect2). Our approach reduces the runtime on a single 36-core node to 19.5 h compared to a runtime of 84.5 h for the original pipeline, a speedup of 4.3 times. Runtime can be further decreased by scaling to multiple nodes, e.g., we observe a runtime of 1.36 h using 16 nodes, an additional speedup of 14.4 times. Halvade Somatic supports variant calling from both whole-genome sequencing and whole-exome sequencing data and also supports Strelka2 as an alternative or complementary variant calling tool. We provide a Docker image to facilitate single-node deployment. Halvade Somatic can be executed on a variety of compute platforms, including Amazon EC2 and Google Cloud. 
CONCLUSIONS: To our knowledge, Halvade Somatic is the first somatic variant calling pipeline that leverages Big Data processing platforms and provides reliable, scalable performance. Source code is freely available.
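The runtime figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the three published runtimes; the variable names and the derived overall speedup and parallel efficiency are additions for illustration and are not reported in the abstract.

```python
# Runtimes in hours, as quoted in the abstract.
baseline_h = 84.5        # original pipeline, single 36-core node
single_node_h = 19.5     # Halvade Somatic, single 36-core node
sixteen_node_h = 1.36    # Halvade Somatic, 16 nodes

# Speedup = old runtime / new runtime.
single_node_speedup = baseline_h / single_node_h        # about 4.3
multi_node_speedup = single_node_h / sixteen_node_h     # about 14.3
overall_speedup = baseline_h / sixteen_node_h           # about 62

# Fraction of ideal linear scaling achieved when going from 1 to 16 nodes.
parallel_efficiency = multi_node_speedup / 16           # about 0.9
```

The single-node figure matches the reported 4.3 times. The multi-node figure computed from the rounded runtimes is about 14.3 rather than the reported 14.4; the difference presumably reflects rounding of the published runtimes.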


Subject(s)
High-Throughput Nucleotide Sequencing , Software , High-Throughput Nucleotide Sequencing/methods , Single Nucleotide Polymorphism , DNA Sequence Analysis/methods , Exome Sequencing , Whole Genome Sequencing
7.
Front Genet ; 13: 834764, 2022.
Article in English | MEDLINE | ID: mdl-35571031

ABSTRACT

Formalin fixation of paraffin-embedded tissue samples is a well-established method for preserving tissue and is routinely used in clinical settings. Although formalin-fixed, paraffin-embedded (FFPE) tissues are deemed crucial for research and clinical applications, the fixation process results in molecular damage to nucleic acids, thus confounding their use in genome sequence analysis. Methods to improve genomic data quality from FFPE tissues have emerged, but there remains significant room for improvement. Here, we use whole-genome sequencing (WGS) data from matched Fresh Frozen (FF) and FFPE tissue samples to optimize a sensitive and precise FFPE single nucleotide variant (SNV) calling approach. We present methods to reduce the prevalence of false-positive SNVs by applying combinatorial techniques to five publicly available variant callers. We also introduce FFPolish, a novel variant classification method that efficiently classifies FFPE-specific false-positive variants. Our combinatorial and statistical techniques improve precision and F1 scores compared to the results of publicly available tools when tested individually.

8.
Methods Mol Biol ; 2493: 267-277, 2022.
Article in English | MEDLINE | ID: mdl-35751821

ABSTRACT

SCAN-SNV is a recent computational tool for somatic single-nucleotide variant (SNV) identification from single-cell DNA sequencing data. The workflow of the SCAN-SNV package is as follows. First, candidate somatic SNVs and credible heterozygous single-nucleotide polymorphisms (hSNPs) are obtained by analyzing single-cell and matched bulk sequencing data, respectively. Subsequently, SCAN-SNV estimates genome-wide allele-specific amplification balance (AB) at any position of DNA sequencing data using a probabilistic spatial statistical model. Finally, candidate somatic SNVs that are likely artifacts according to the AB predictions are removed to obtain putative mutations. This chapter provides a step-by-step practical guide to the package by explaining how to install and use the variant caller in a real-world example.


Subject(s)
Single Nucleotide Polymorphism , Alleles , Base Sequence , High-Throughput Nucleotide Sequencing , DNA Sequence Analysis
9.
Genome Biol ; 23(1): 248, 2022 11 30.
Article in English | MEDLINE | ID: mdl-36451239

ABSTRACT

We present SIEVE, a statistical method for the joint inference of somatic variants and cell phylogeny under the finite-sites assumption from single-cell DNA sequencing. SIEVE leverages raw read counts for all nucleotides and corrects the acquisition bias of branch lengths. In our simulations, SIEVE outperforms other methods in phylogenetic reconstruction and variant calling accuracy, especially in the inference of homozygous variants. Applying SIEVE to three datasets, one for triple-negative breast cancer (TNBC) and two for colorectal cancer (CRC), we find that double mutant genotypes are rare in CRC but unexpectedly frequent in the TNBC samples.


Subject(s)
Triple Negative Breast Neoplasms , Humans , Phylogeny , Base Sequence , DNA Sequence Analysis , DNA , Nucleotides
10.
BMC Med Genomics ; 12(Suppl 9): 181, 2019 12 24.
Article in English | MEDLINE | ID: mdl-31874647

ABSTRACT

BACKGROUND: The application of next-generation sequencing in cancer has revealed the genomic landscape of many tumour types and is nowadays routinely used in research and clinical settings. Multiple algorithms have been developed to detect somatic variation from sequencing data using either paired tumour-blood or tumour-only samples. Most of these methods have been developed and evaluated for the identification of somatic variation using Illumina sequencing datasets of moderate coverage. However, a comprehensive evaluation of somatic variant detection algorithms on Ion Torrent targeted deep sequencing data has not been performed. METHODS: We have applied three somatic detection algorithms, Torrent Variant Caller, MuTect2 and VarScan2, on a large cohort of ovarian cancer patients comprising 208 paired tumour-blood samples and 253 tumour-only samples sequenced deeply on the Ion Torrent Proton platform across 330 amplicons. Subsequently, the concordance and performance of the three somatic variant callers were assessed. RESULTS: We have observed low concordance across the algorithms with only 0.5% of SNV and 0.02% of INDEL calls in common across all three methods. The intersection of all methods showed better performance when assessed using correlation with known mutational signatures, overlap with COSMIC variation and by examining the variant characteristics. The Torrent Variant Caller also performed well with the advantage of not eliminating a high number of variants that could lead to high type II error. CONCLUSIONS: Our results suggest that caution should be taken when applying state-of-the-art somatic variant algorithms to Ion Torrent targeted deep sequencing data. Better quality control procedures and strategies that combine results from multiple methods should ensure that higher accuracy is achieved. This is essential to ensure that results from bioinformatics pipelines using Ion Torrent deep sequencing can be robustly applied in cancer research and in the clinic.
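The concordance measure discussed in this abstract, the share of calls common to all three callers, can be sketched with set operations. This is a minimal sketch and not the authors' evaluation code: it assumes concordance is the size of the intersection divided by the size of the union, represents each variant as a hypothetical `(chrom, pos, ref, alt)` tuple, and uses made-up toy calls.

```python
def concordance(call_sets):
    """Return (variants called by every caller, shared fraction of all calls).

    `call_sets` maps a caller name to a set of (chrom, pos, ref, alt) tuples.
    """
    if not call_sets:
        raise ValueError("need at least one call set")
    sets = list(call_sets.values())
    union = set().union(*sets)
    shared = set.intersection(*sets)
    fraction = len(shared) / len(union) if union else 0.0
    return shared, fraction

# Toy call sets; positions and alleles are invented for illustration.
calls = {
    "TorrentVariantCaller": {("chr1", 100, "C", "T"), ("chr2", 200, "G", "A"),
                             ("chr3", 300, "A", "G")},
    "MuTect2": {("chr1", 100, "C", "T"), ("chr2", 200, "G", "A")},
    "VarScan2": {("chr1", 100, "C", "T"), ("chr4", 400, "T", "C")},
}

shared, fraction = concordance(calls)
```

In this toy example one variant out of four distinct calls is shared by all three callers. Keeping only the intersection trades sensitivity for precision, which is the tension the abstract describes between the intersection and the more permissive Torrent Variant Caller.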


Subject(s)
Algorithms , Genetic Variation , High-Throughput Nucleotide Sequencing , Gene Frequency , INDEL Mutation , Single Nucleotide Polymorphism
11.
Front Oncol ; 9: 119, 2019.
Article in English | MEDLINE | ID: mdl-30949446

ABSTRACT

Archival tumor samples represent a rich resource of annotated specimens for translational genomics research. However, standard variant calling approaches require a matched normal sample from the same individual, which is often not available in the retrospective setting, making it difficult to distinguish between true somatic variants and individual-specific germline variants. Archival sections often contain adjacent normal tissue, but this tissue can include infiltrating tumor cells. As existing comparative somatic variant callers are designed to exclude variants present in the normal sample, a novel approach is required to leverage adjacent normal tissue with infiltrating tumor cells for somatic variant calling. Here we present lumosVar 2.0, a software package designed to jointly analyze multiple samples from the same patient, built upon our previous single-sample tumor-only variant caller lumosVar 1.0. The approach assumes that the allelic fractions of somatic variants and germline variants follow different patterns as tumor content and copy number state change. lumosVar 2.0 estimates allele-specific copy number and tumor sample fractions from the data, and uses a model to determine expected allelic fractions for somatic and germline variants and to classify variants accordingly. To evaluate the utility of lumosVar 2.0 to jointly call somatic variants with tumor and adjacent normal samples, we used a glioblastoma dataset with matched high and low tumor content and germline whole exome sequencing data (for true somatic variants) available for each patient. Both sensitivity and positive predictive value were improved when analyzing the high tumor and low tumor samples jointly compared to analyzing the samples individually or in-silico pooling of the two samples. Finally, we applied this approach to a set of breast and prostate archival tumor samples for which tumor blocks containing adjacent normal tissue were available for sequencing. Joint analysis using lumosVar 2.0 detected several variants, including known cancer hotspot mutations, that were not detected by standard somatic variant calling tools using the adjacent tissue as presumed normal reference. Together, these results demonstrate the utility of leveraging paired tissue samples to improve somatic variant calling when a constitutional sample is not available.
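The assumption that somatic and germline allelic fractions respond differently to tumor content can be illustrated with a standard tumor-normal mixture formula. This is a generic sketch, not lumosVar 2.0's model: it assumes a diploid normal genome, ignores subclonality, and all function and parameter names are hypothetical.

```python
NORMAL_COPY_NUMBER = 2  # assumption: normal cells are diploid at the locus

def expected_allelic_fraction(tumor_fraction, tumor_total_cn,
                              variant_copies_tumor, variant_copies_normal):
    """Expected variant allele fraction in a mixture of tumor and normal cells.

    Somatic variant: variant_copies_normal = 0.
    Heterozygous germline variant: variant_copies_normal = 1.
    """
    normal_fraction = 1.0 - tumor_fraction
    variant = (tumor_fraction * variant_copies_tumor
               + normal_fraction * variant_copies_normal)
    total = (tumor_fraction * tumor_total_cn
             + normal_fraction * NORMAL_COPY_NUMBER)
    if total == 0:
        raise ValueError("no DNA copies at this locus")
    return variant / total

# Copy-neutral locus, one variant copy in tumor cells, at two tumor contents.
somatic_high = expected_allelic_fraction(0.8, 2, 1, 0)
somatic_low = expected_allelic_fraction(0.2, 2, 1, 0)
germline_high = expected_allelic_fraction(0.8, 2, 1, 1)
germline_low = expected_allelic_fraction(0.2, 2, 1, 1)
```

Under these assumptions the somatic allelic fraction scales with tumor content (about 0.4 versus 0.1) while the heterozygous germline fraction stays near 0.5 in both samples, which is the kind of contrast that makes joint analysis of high and low tumor content samples informative.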

12.
Curr Protoc Bioinformatics ; 45: 15.5.1-8, 2014.
Article in English | MEDLINE | ID: mdl-25431635

ABSTRACT

Detecting somatic single nucleotide variants (SNVs) is an essential component of cancer research with next generation sequencing data. This protocol describes how to run the SomaticSniper somatic SNV detector and then filter the output to eliminate most false positives. It also includes support protocols detailing the compilation of the software.


Subject(s)
Computational Biology , Single Nucleotide Polymorphism , Internet