Results 1 - 8 of 8
1.
Bioinformatics; 35(8): 1299-1309, 2019 Apr 15.
Article in English | MEDLINE | ID: mdl-30192920

ABSTRACT

MOTIVATION: Low-frequency DNA mutations are often confounded with technical artifacts from sample preparation and sequencing. With unique molecular identifiers (UMIs), most of the sequencing errors can be corrected. However, errors before UMI tagging, such as DNA polymerase errors during end repair and the first PCR cycle, cannot be corrected with single-strand UMIs and impose fundamental limits to UMI-based variant calling. RESULTS: We developed smCounter2, a UMI-based variant caller for targeted sequencing data and an upgrade from the current version of smCounter. Compared to smCounter, smCounter2 features a lower detection limit (reduced from 1% to 0.5%), better overall accuracy (particularly in non-coding regions), a consistent threshold that can be applied to both deep and shallow sequencing runs, and easier use via a Docker image and code for read pre-processing. We benchmarked smCounter2 against several state-of-the-art UMI-based variant calling methods using multiple datasets and demonstrated smCounter2's superior performance in detecting somatic variants. At the core of smCounter2 is a statistical test to determine whether the allele frequency of the putative variant is significantly above the background error rate, which was carefully modeled using an independent dataset. The improved accuracy in non-coding regions was mainly achieved using novel repetitive region filters that were specifically designed for UMI data. AVAILABILITY AND IMPLEMENTATION: The entire pipeline is available at https://github.com/qiaseq/qiaseq-dna under MIT license. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
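
The abstract above describes a test of whether a variant's allele frequency is significantly above a background error rate. A minimal sketch of that general idea follows, using a plain one-sided binomial test as a stand-in. This is not smCounter2's actual model, which the abstract does not specify; the function name, the UMI counts, and the 0.1% error rate are all hypothetical.

```python
from math import exp, lgamma, log, log1p


def binomial_tail_pvalue(variant_umis: int, total_umis: int, error_rate: float) -> float:
    """Return P(X >= variant_umis) for X ~ Binomial(total_umis, error_rate).

    A small value suggests the variant UMI count is unlikely to arise
    from background errors alone (under this simplified model).
    """
    if not 0 <= variant_umis <= total_umis:
        raise ValueError("need 0 <= variant_umis <= total_umis")
    if not 0.0 < error_rate < 1.0:
        raise ValueError("error_rate must lie strictly between 0 and 1")

    def log_pmf(k: int) -> float:
        # Work in log space so large binomial coefficients do not overflow.
        log_choose = lgamma(total_umis + 1) - lgamma(k + 1) - lgamma(total_umis - k + 1)
        return log_choose + k * log(error_rate) + (total_umis - k) * log1p(-error_rate)

    # Sum the upper tail directly to avoid cancellation in 1 - lower_tail.
    tail = sum(exp(log_pmf(k)) for k in range(variant_umis, total_umis + 1))
    return min(1.0, tail)


# Hypothetical numbers: 20 variant UMIs among 4000, assumed background rate 0.1%.
p_value = binomial_tail_pvalue(20, 4000, 0.001)
print(f"p = {p_value:.3g}")
```

In practice the background rate would have to come from an independent dataset, as the abstract notes, and would likely vary by base change and sequence context.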


Subject(s)
Software , Gene Frequency , High-Throughput Nucleotide Sequencing , Mutation , Polymerase Chain Reaction , Sequence Analysis, DNA
2.
JAMA; 319(23): 2401-2409, 2018 Jun 19.
Article in English | MEDLINE | ID: mdl-29922827

ABSTRACT

Importance: Individuals genetically predisposed to pancreatic cancer may benefit from early detection. Genes that predispose to pancreatic cancer and the risks of pancreatic cancer associated with mutations in these genes are not well defined. Objective: To determine whether inherited germline mutations in cancer predisposition genes are associated with increased risks of pancreatic cancer. Design, Setting, and Participants: Case-control analysis to identify pancreatic cancer predisposition genes; longitudinal analysis of patients with pancreatic cancer for prognosis. The study included 3030 adults diagnosed as having pancreatic cancer and enrolled in a Mayo Clinic registry between October 12, 2000, and March 31, 2016, with last follow-up on June 22, 2017. Reference controls were 123 136 individuals with exome sequence data in the public Genome Aggregation Database and 53 105 in the Exome Aggregation Consortium database. Exposures: Individuals were classified based on carrying a deleterious mutation in cancer predisposition genes and having a personal or family history of cancer. Main Outcomes and Measures: Germline mutations in coding regions of 21 cancer predisposition genes were identified by sequencing of products from a custom multiplex polymerase chain reaction-based panel; associations of genes with pancreatic cancer were assessed by comparing frequency of mutations in genes of pancreatic cancer patients with those of reference controls. 
Results: Comparing 3030 case patients with pancreatic cancer (43.2% female; 95.6% non-Hispanic white; mean age at diagnosis, 65.3 [SD, 10.7] years) with reference controls, significant associations were observed between pancreatic cancer and mutations in CDKN2A (0.3% of cases and 0.02% of controls; odds ratio [OR], 12.33; 95% CI, 5.43-25.61); TP53 (0.2% of cases and 0.02% of controls; OR, 6.70; 95% CI, 2.52-14.95); MLH1 (0.13% of cases and 0.02% of controls; OR, 6.66; 95% CI, 1.94-17.53); BRCA2 (1.9% of cases and 0.3% of controls; OR, 6.20; 95% CI, 4.62-8.17); ATM (2.3% of cases and 0.37% of controls; OR, 5.71; 95% CI, 4.38-7.33); and BRCA1 (0.6% of cases and 0.2% of controls; OR, 2.58; 95% CI, 1.54-4.05). Conclusions and Relevance: In this case-control study, mutations in 6 genes associated with pancreatic cancer were found in 5.5% of all pancreatic cancer patients, including 7.9% of patients with a family history of pancreatic cancer and 5.2% of patients without a family history of pancreatic cancer. Further research is needed for replication in other populations.
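
The odds ratios reported above compare mutation carrier frequency in cases and controls. A minimal sketch of how an odds ratio and an approximate 95% confidence interval can be computed from a 2x2 table is shown below. It uses the Woolf (log) method, which is an assumption here, since the abstract does not state how its intervals were derived. The carrier counts in the example are hypothetical because the abstract reports only percentages.

```python
from math import exp, log, sqrt


def odds_ratio(case_carriers, case_total, control_carriers, control_total, z=1.96):
    """Return (OR, lower, upper) using the Woolf log-based interval."""
    a = case_carriers
    b = case_total - case_carriers
    c = control_carriers
    d = control_total - control_carriers
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells of the 2x2 table must be positive")
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return estimate, exp(log(estimate) - z * se), exp(log(estimate) + z * se)


# Hypothetical counts, for illustration only.
estimate, lower, upper = odds_ratio(60, 3000, 400, 120000)
print(f"OR = {estimate:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

With very rare carriers, as for several genes above, an exact method is usually preferable to this large-sample approximation.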


Subject(s)
Carcinoma, Pancreatic Ductal/genetics , Genetic Predisposition to Disease , Germ-Line Mutation , Pancreatic Neoplasms/genetics , Aged , Case-Control Studies , DNA, Neoplasm/analysis , Databases, Genetic , Female , Humans , Male , Middle Aged , Proportional Hazards Models , Registries , Risk , Sequence Analysis, DNA , Survival Analysis
3.
BMC Genomics; 18(1): 5, 2017 Jan 03.
Article in English | MEDLINE | ID: mdl-28049435

ABSTRACT

BACKGROUND: Detection of DNA mutations at very low allele fractions with high accuracy will significantly improve the effectiveness of precision medicine for cancer patients. To achieve this goal through next generation sequencing, researchers need a detection method that 1) captures rare mutation-containing DNA fragments efficiently in the mix of abundant wild-type DNA; 2) sequences the DNA library extensively to deep coverage; and 3) distinguishes low level true variants from amplification and sequencing errors with high accuracy. Targeted enrichment using PCR primers provides researchers with a convenient way to achieve deep sequencing for a small, yet most relevant region using benchtop sequencers. Molecular barcoding (or indexing) provides a unique solution for reducing sequencing artifacts analytically. Although different molecular barcoding schemes have been reported in recent literature, most variant calling has been done on limited targets, using simple custom scripts. The analytical performance of barcode-aware variant calling can be significantly improved by incorporating advanced statistical models. RESULTS: We present here a highly efficient, simple and scalable enrichment protocol that integrates molecular barcodes in multiplex PCR amplification. In addition, we developed smCounter, an open source, generic, barcode-aware variant caller based on a Bayesian probabilistic model. smCounter was optimized and benchmarked on two independent read sets with SNVs and indels at 5 and 1% allele fractions. Variants were called with very good sensitivity and specificity within coding regions. CONCLUSIONS: We demonstrated that we can accurately detect somatic mutations with allele fractions as low as 1% in coding regions using our enrichment protocol and variant caller.
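
Molecular barcoding, as described above, reduces sequencing artifacts by collapsing reads that share a barcode into one consensus observation. A minimal sketch of that collapsing step at a single genomic position follows. It uses a simple majority vote with a hypothetical agreement threshold. It is not smCounter's Bayesian model, which the abstract does not detail.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, min_agreement=0.8):
    """Collapse (umi, base) observations into one base per UMI.

    Returns {umi: base}, or None for a UMI whose reads disagree too much.
    """
    by_umi = defaultdict(Counter)
    for umi, base in reads:
        by_umi[umi][base] += 1

    consensus = {}
    for umi, counts in by_umi.items():
        base, n = counts.most_common(1)[0]
        # Keep the majority base only if enough reads in the family support it.
        consensus[umi] = base if n / sum(counts.values()) >= min_agreement else None
    return consensus


# Hypothetical reads: one isolated 'T' in family AAC is treated as an error.
reads = [("AAC", "A")] * 9 + [("AAC", "T")] + [("GGT", "T")] * 2
print(umi_consensus(reads))
```

A variant supported by whole UMI families (such as GGT here) is more credible than one seen in a single read inside an otherwise wild-type family.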


Subject(s)
Alleles , Base Sequence , DNA Barcoding, Taxonomic , Gene Frequency , Genetic Variation , Computational Biology/methods , Models, Statistical , Multiplex Polymerase Chain Reaction , Reproducibility of Results , Sensitivity and Specificity
4.
BMC Bioinformatics; 16: 17, 2015 Jan 28.
Article in English | MEDLINE | ID: mdl-25626454

ABSTRACT

BACKGROUND: Next-generation sequencing (NGS) is rapidly becoming common practice in clinical diagnostics and cancer research. In addition to the detection of single nucleotide variants (SNVs), information on copy number variants (CNVs) is of great interest. Several algorithms exist to detect CNVs by analyzing whole genome sequencing data or data from samples enriched by hybridization-capture. PCR-enriched amplicon-sequencing data have special characteristics that have been taken into account by only one publicly available algorithm so far. RESULTS: We describe a new algorithm named quandico to detect copy number differences based on NGS data generated following PCR-enrichment. A weighted t-test statistic was applied to calculate probabilities (p-values) of copy number changes. We assessed the performance of the method using sequencing reads generated from reference DNA with known CNVs, and we were able to detect these variants with 98.6% sensitivity and 98.5% specificity, which is significantly better than another recently described method for amplicon sequencing. The source code (R-package) of quandico is licensed under the GPLv3 and is available at https://github.com/reineckef/quandico. CONCLUSION: We demonstrated that our new algorithm is suitable to call copy number changes using data from PCR-enriched samples with high sensitivity and specificity even for single copy differences.
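
The abstract above mentions a weighted t-test statistic for copy number changes. As a rough illustration only, the sketch below computes one common form of a weighted one-sample t statistic over per-amplicon log2 coverage ratios. The exact formulation used by quandico is not given in the abstract, so the variance estimator, the effective sample size, and the example numbers are all assumptions.

```python
from math import sqrt


def weighted_t_statistic(values, weights, mu0=0.0):
    """Weighted one-sample t statistic against the null mean mu0.

    Uses reliability weights: variance is divided by V1 - V2/V1 and the
    effective sample size is V1**2 / V2 (V1 = sum w, V2 = sum w**2).
    """
    if len(values) != len(weights) or len(values) < 2:
        raise ValueError("need at least two values with matching weights")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be positive")
    v1 = sum(weights)
    v2 = sum(w * w for w in weights)
    mean = sum(w * x for w, x in zip(weights, values)) / v1
    variance = sum(w * (x - mean) ** 2 for w, x in zip(weights, values)) / (v1 - v2 / v1)
    if variance == 0:
        raise ValueError("values have zero weighted variance")
    n_eff = v1 * v1 / v2
    return (mean - mu0) / sqrt(variance / n_eff)


# Hypothetical log2(sample/reference) ratios for amplicons in one gene,
# weighted by a made-up per-amplicon reliability score.
log2_ratios = [0.55, 0.62, 0.48, 0.70]
reliability = [1.0, 2.0, 1.5, 0.5]
print(weighted_t_statistic(log2_ratios, reliability))
```

With equal weights this reduces to the ordinary one-sample t statistic; converting it to a p-value would additionally require a t-distribution with a suitable number of degrees of freedom.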


Subject(s)
Algorithms , High-Throughput Nucleotide Sequencing/methods , Polymerase Chain Reaction/methods , Sequence Analysis, DNA/methods , Case-Control Studies , DNA Copy Number Variations , Humans , Sensitivity and Specificity
5.
BMC Genomics; 15: 1073, 2014 Dec 05.
Article in English | MEDLINE | ID: mdl-25480444

ABSTRACT

BACKGROUND: Analysis of targeted amplicon sequencing data presents some unique challenges in comparison to the analysis of random fragment sequencing data. Whereas reads from randomly fragmented DNA have arbitrary start positions, the reads from amplicon sequencing have fixed start positions that coincide with the amplicon boundaries. As a result, any variants near the amplicon boundaries can cause misalignments of multiple reads that can ultimately lead to false-positive or false-negative variant calls. RESULTS: We show that amplicon boundaries are variant calling blind spots where the variant calls are highly inaccurate. We propose that an effective strategy to avoid these blind spots is to incorporate the primer bases in obtaining read alignments and post-processing of the alignments, thereby effectively moving these blind spots into the primer binding regions (which are not used for variant calling). Targeted sequencing data analysis pipelines can provide better variant calling accuracy when primer bases are retained and sequenced. CONCLUSIONS: Read bases beyond the variant site are necessary for analysis of amplicon sequencing data. Enzymatic primer digestion, if used in the target enrichment process, should leave at least a few primer bases to ensure that these bases are available during data analysis. The primer bases should only be removed immediately before the variant calling step to ensure that the variants can be called irrespective of where they occur within the amplicon insert region.
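
The recommendation above is to keep primer bases through alignment and remove them only immediately before variant calling. A minimal sketch of that late clipping step for a single read is given below. It assumes a forward-strand read, a gapless alignment, and 0-based inclusive reference coordinates; the function name and coordinates are hypothetical, and a real pipeline would operate on alignment records with CIGAR strings.

```python
def clip_primer(read_start, read_seq, primer_end):
    """Drop read bases lying at or before primer_end on the reference.

    Returns (new_start, clipped_sequence). Intended to run after
    alignment, so the primer bases have already helped anchor the read.
    """
    n_primer = max(0, min(len(read_seq), primer_end - read_start + 1))
    return read_start + n_primer, read_seq[n_primer:]


# Hypothetical read aligned at position 100 whose first 4 bases are primer.
print(clip_primer(100, "ACGTACGTAC", 103))  # (104, 'ACGTAC')
```

Clipping after alignment means a variant just inside the insert region is still flanked by aligned bases on both sides when the aligner places the read.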


Subject(s)
Computational Biology/methods , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA/methods , Computer Simulation , DNA Primers , Polymerase Chain Reaction/methods , Reproducibility of Results
6.
BMC Genomics; 15: 244, 2014 Mar 28.
Article in English | MEDLINE | ID: mdl-24678773

ABSTRACT

BACKGROUND: High-throughput sequencing is rapidly becoming common practice in clinical diagnosis and cancer research. Many algorithms have been developed for somatic single nucleotide variant (SNV) detection in matched tumor-normal DNA sequencing. Although numerous studies have compared the performance of various algorithms on exome data, there has not yet been a systematic evaluation using PCR-enriched amplicon data with a range of variant allele fractions. The recently developed gold standard variant set for the reference individual NA12878 by the NIST-led "Genome in a Bottle" Consortium (NIST-GIAB) provides a good resource to evaluate admixtures with various SNV fractions. RESULTS: Using the NIST-GIAB gold standard, we compared the performance of five popular somatic SNV calling algorithms (GATK UnifiedGenotyper followed by simple subtraction, MuTect, Strelka, SomaticSniper and VarScan2) for matched tumor-normal amplicon and exome sequencing data. CONCLUSIONS: We demonstrated that the five commonly used somatic SNV calling methods are applicable to both targeted amplicon and exome sequencing data. However, the sensitivities of these methods vary based on the allelic fraction of the mutation in the tumor sample. Our analysis can assist researchers in choosing a somatic SNV calling method suitable for their specific needs.
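
Comparing callers against a gold standard, as described above, comes down to counting true positives, false positives, and false negatives against the truth set. A minimal sketch of that scoring is shown below. The variant tuples are hypothetical, and real evaluations would also restrict the comparison to confident regions and normalize variant representation first.

```python
def score_calls(called, truth):
    """Score a call set against a truth set of (chrom, pos, ref, alt) tuples."""
    called, truth = set(called), set(truth)
    tp = len(called & truth)   # called and true
    fp = len(called - truth)   # called but not in the truth set
    fn = len(truth - called)   # true but missed
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    precision = tp / (tp + fp) if tp + fp else float("nan")
    return {"tp": tp, "fp": fp, "fn": fn,
            "sensitivity": sensitivity, "precision": precision}


# Hypothetical truth set and caller output.
truth = [("chr1", 100, "A", "T"), ("chr1", 250, "G", "C"),
         ("chr2", 40, "C", "T"), ("chr3", 900, "T", "G")]
called = [("chr1", 100, "A", "T"), ("chr2", 40, "C", "T"), ("chr5", 7, "G", "A")]
print(score_calls(called, truth))
```

Running the same scoring per allele-fraction bin would show how each caller's sensitivity changes with the mutation's allelic fraction.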


Subject(s)
Computational Biology/methods , Exome , High-Throughput Nucleotide Sequencing , Mutation , Software , Algorithms , Databases, Nucleic Acid , Genomics/methods , Humans , Point Mutation , ROC Curve , Sensitivity and Specificity
7.
Annu Int Conf IEEE Eng Med Biol Soc; 2021: 7058-7062, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34892728

ABSTRACT

In this work, we demonstrated a Smart Sleep Mask with several integrated physiological sensors such as 3-axis accelerometers, a respiratory acoustic sensor, and an eye movement sensor. In particular, using infrared optical sensors, eye movement frequency, direction, and amplitude can be directly monitored and recorded during sleep sessions. We also developed a mobile app for data storage, signal processing and data analytics. Aggregation of these signals from a single wearable device may offer ease of use and more insights for sleep monitoring and REM sleep assessment. The user-friendly mask design can enable at-home use applications in the studies of digital biomarkers for sleep disorder related neurodegenerative diseases. Examples include REM Sleep Behavior Disorder, epilepsy event detection and stroke induced facial and eye movement disorder. Clinical Relevance: Many diseases such as stroke, epilepsy, and Parkinson's disease can cause significant abnormal events during sleep or are associated with sleep disorder. A smart sleep mask may serve as a simple platform to provide various physiological signals and generate clinically meaningful insights by revealing the neurological activities during various sleep stages.


Subject(s)
REM Sleep Behavior Disorder , Humans , Polysomnography , Sleep , Sleep Stages , Sleep, REM
8.
Sci Rep; 9(1): 4810, 2019 Mar 18.
Article in English | MEDLINE | ID: mdl-30886209

ABSTRACT

For specific detection of somatic variants at very low levels, artifacts from the NGS workflow have to be eliminated. Various approaches using unique molecular identifiers (UMI) to analytically remove NGS artifacts have been described. Among them, Duplex-seq was shown to be highly effective, by leveraging the sequence complementarity of two DNA strands. However, all of the published Duplex-seq implementations so far required paired-end sequencing and, in the case of combining duplex sequencing with target enrichment, lengthy hybridization enrichment was required. We developed a simple protocol, which enabled the retrieval of duplex UMI in multiplex PCR-based enrichment and sequencing. Using this protocol and reference materials, we demonstrated the accurate detection of known SNVs at 0.1-0.2% allele fractions, aided by duplex UMI. We also observed that low level base substitution artifacts could be introduced when preparing in vitro DNA reference materials, which could limit their utility as a benchmarking tool for variant detection at very low levels. Our new targeted sequencing method offers the benefit of using duplex UMI to remove NGS artifacts in a much more simplified workflow than existing targeted duplex sequencing methods.
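
Duplex sequencing, as summarized above, relies on the two strands of one original DNA molecule agreeing with each other. A minimal sketch of that agreement check follows, taking per-strand consensus bases that are assumed to be already expressed in the same orientation. The "top" and "bottom" labels, the UMI names, and the data layout are hypothetical and do not reflect the protocol's actual file formats.

```python
def duplex_consensus(strand_calls):
    """Accept a base for a duplex UMI only when both strands agree.

    strand_calls maps duplex UMI -> {"top": base, "bottom": base}.
    Returns {umi: base}, or None when a strand is missing or disagrees.
    """
    consensus = {}
    for umi, calls in strand_calls.items():
        top = calls.get("top")
        bottom = calls.get("bottom")
        agreed = top is not None and top == bottom
        consensus[umi] = top if agreed else None
    return consensus


# Hypothetical families: u2 shows a single-strand artifact, u3 lacks one strand.
calls = {
    "u1": {"top": "T", "bottom": "T"},
    "u2": {"top": "T", "bottom": "C"},
    "u3": {"top": "A"},
}
print(duplex_consensus(calls))
```

An error introduced on one strand before or during the first amplification cycle appears in only one strand family, so it fails this check, which is what single-strand UMIs cannot offer.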


Subject(s)
DNA Mutational Analysis/methods , DNA/isolation & purification , High-Throughput Nucleotide Sequencing/methods , Multiplex Polymerase Chain Reaction/methods , Artifacts , DNA/genetics , DNA Mutational Analysis/instrumentation , High-Throughput Nucleotide Sequencing/instrumentation , Humans , Limit of Detection , Multiplex Polymerase Chain Reaction/instrumentation , Mutation , Neoplasms/diagnosis , Neoplasms/genetics , Polymorphism, Single Nucleotide , Workflow