Search | BVS CLAP/SMR-OPS/OMS

Machine learning enables detection of early-stage colorectal cancer by whole-genome sequencing of plasma cell-free DNA.

Wan, Nathan; Weinberg, David; Liu, Tzu-Yu; Niehaus, Katherine; Ariazi, Eric A; Delubac, Daniel; Kannan, Ajay; White, Brandon; Bailey, Mitch; Bertin, Marvin; Boley, Nathan; Bowen, Derek; Cregg, James; Drake, Adam M; Ennis, Riley; Fransen, Signe; Gafni, Erik; Hansen, Loren; Liu, Yaping; Otte, Gabriel L; Pecson, Jennifer; Rice, Brandon; Sanderson, Gabriel E; Sharma, Aarushi; St John, John; Tang, Catherina; Tzou, Abraham; Young, Leilani; Putcha, Girish; Haque, Imran S.

BMC Cancer ; 19(1): 832, 2019 Aug 23.

Article in English | MEDLINE | ID: mdl-31443703

ABSTRACT

BACKGROUND: Blood-based methods using cell-free DNA (cfDNA) are under development as an alternative to existing screening tests. However, early-stage detection of cancer using tumor-derived cfDNA has proven challenging because of the small proportion of cfDNA derived from tumor tissue in early-stage disease. A machine learning approach to discover signatures in cfDNA, potentially reflective of both tumor and non-tumor contributions, may represent a promising direction for the early detection of cancer. METHODS: Whole-genome sequencing was performed on cfDNA extracted from plasma samples (N = 546 colorectal cancer and 271 non-cancer controls). Reads aligning to protein-coding gene bodies were extracted, and read counts were normalized. cfDNA tumor fraction was estimated using IchorCNA. Machine learning models were trained using k-fold cross-validation and confounder-based cross-validations to assess generalization performance. RESULTS: In a colorectal cancer cohort heavily weighted towards early-stage cancer (80% stage I/II), we achieved a mean AUC of 0.92 (95% CI 0.91-0.93) with a mean sensitivity of 85% (95% CI 83-86%) at 85% specificity. Sensitivity generally increased with tumor stage and increasing tumor fraction. Stratification by age, sequencing batch, and institution demonstrated the impact of these confounders and provided a more accurate assessment of generalization performance. CONCLUSIONS: A machine learning approach using cfDNA achieved high sensitivity and specificity in a large, predominantly early-stage, colorectal cancer cohort. The possibility of systematic technical and institution-specific biases warrants similar confounder analyses in other studies. Prospective validation of this machine learning method and evaluation of a multi-analyte approach are underway.

Subject(s)

Biomarkers, Tumor; Circulating Tumor DNA; Colorectal Neoplasms/genetics; Colorectal Neoplasms/pathology; Genome, Human; Genomics; Machine Learning; Aged; Aged, 80 and over; Colorectal Neoplasms/blood; Computational Biology/methods; Female; Gene Expression Profiling; Genomics/methods; Humans; Male; Middle Aged; Neoplasm Staging; ROC Curve; Reproducibility of Results; Transcriptome
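
Illustrative note on the record above: the abstract reports sensitivity at a fixed 85% specificity and cross-validation stratified by confounders such as institution and sequencing batch, but it does not describe the implementation. The following is a minimal sketch of those two ideas using only the Python standard library. The function names, the thresholding rule, and the round-robin fold assignment are assumptions made for illustration, not the authors' method.

```python
import math


def sensitivity_at_specificity(scores, labels, target_specificity=0.85):
    """Pick a score threshold from the controls, then measure sensitivity.

    labels: 1 for cancer cases, 0 for non-cancer controls (assumed encoding).
    A sample is called positive when its score is strictly above the threshold.
    """
    controls = sorted(s for s, y in zip(scores, labels) if y == 0)
    cases = [s for s, y in zip(scores, labels) if y == 1]
    if not controls or not cases:
        raise ValueError("need at least one case and one control")
    # Smallest number of controls that must be called negative.
    k = math.ceil(target_specificity * len(controls))
    threshold = controls[k - 1] if k > 0 else float("-inf")
    specificity = sum(s <= threshold for s in controls) / len(controls)
    sensitivity = sum(s > threshold for s in cases) / len(cases)
    return threshold, specificity, sensitivity


def grouped_folds(groups, n_folds):
    """Assign whole confounder groups (e.g. institutions) to folds.

    No group is split across folds, so a model tested on one fold has
    never seen samples from that fold's groups during training.
    """
    unique = sorted(set(groups))
    fold_of_group = {g: i % n_folds for i, g in enumerate(unique)}
    folds = [[] for _ in range(n_folds)]
    for index, group in enumerate(groups):
        folds[fold_of_group[group]].append(index)
    return folds
```

Holding out whole groups in this way tends to give a more conservative performance estimate than plain k-fold splitting, which is consistent with the abstract's remark that confounder stratification gave a more accurate assessment of generalization.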

COSMOS: Python library for massively parallel workflows.

Gafni, Erik; Luquette, Lovelace J; Lancaster, Alex K; Hawkins, Jared B; Jung, Jae-Yoon; Souilmi, Yassine; Wall, Dennis P; Tonellato, Peter J.

Bioinformatics ; 30(20): 2956-8, 2014 Oct 15.

Article in English | MEDLINE | ID: mdl-24982428

ABSTRACT

SUMMARY: Efficient workflows to shepherd clinically generated genomic data through the multiple stages of a next-generation sequencing pipeline are of critical importance in translational biomedical science. Here we present COSMOS, a Python library for workflow management that allows formal description of pipelines and partitioning of jobs. In addition, it includes a user interface for tracking the progress of jobs, abstraction of the queuing system and fine-grained control over the workflow. Workflows can be created on traditional computing clusters as well as cloud-based services. AVAILABILITY AND IMPLEMENTATION: Source code is available for academic non-commercial research purposes. Links to code and documentation are provided at http://lpm.hms.harvard.edu and http://wall-lab.stanford.edu. CONTACT: dpwall@stanford.edu or peter_tonellato@hms.harvard.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Genomics/methods; High-Throughput Nucleotide Sequencing/methods; Programming Languages
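
Illustrative note on the record above: the abstract describes formal description of pipelines and partitioning of jobs, without showing the library's interface. The sketch below does not use the COSMOS API. It only illustrates the general idea of describing a workflow as a dependency graph and grouping stages into batches that could run in parallel, using Python's standard `graphlib` module (Python 3.9 or later). The stage names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each stage maps to the set of stages it depends on.
pipeline = {
    "align": set(),
    "sort": {"align"},
    "call_variants": {"sort"},
    "annotate": {"call_variants"},
    "qc_report": {"align"},
}


def execution_batches(dependencies):
    """Group stages into batches whose members have no unmet dependencies.

    Stages within one batch are independent of each other and could be
    submitted to a cluster queue at the same time.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches
```

Calling `execution_batches(pipeline)` returns the stages in dependency order, one list per batch.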

Biomedical cloud computing with Amazon Web Services.

Fusaro, Vincent A; Patil, Prasad; Gafni, Erik; Wall, Dennis P; Tonellato, Peter J.

PLoS Comput Biol ; 7(8): e1002147, 2011 Aug.

Article in English | MEDLINE | ID: mdl-21901085

ABSTRACT

In this overview to biomedical computing in the cloud, we discussed two primary ways to use the cloud (a single instance or cluster), provided a detailed example using NGS mapping, and highlighted the associated costs. While many users new to the cloud may assume that entry is as straightforward as uploading an application and selecting an instance type and storage options, we illustrated that there is substantial up-front effort required before an application can make full use of the cloud's vast resources. Our intention was to provide a set of best practices and to illustrate how those apply to a typical application pipeline for biomedical informatics, but also general enough for extrapolation to other types of computational problems. Our mapping example was intended to illustrate how to develop a scalable project and not to compare and contrast alignment algorithms for read mapping and genome assembly. Indeed, with a newer aligner such as Bowtie, it is possible to map the entire African genome using one m2.2xlarge instance in 48 hours for a total cost of approximately $48 in computation time. In our example, we were not concerned with data transfer rates, which are heavily influenced by the amount of available bandwidth, connection latency, and network availability. When transferring large amounts of data to the cloud, bandwidth limitations can be a major bottleneck, and in some cases it is more efficient to simply mail a storage device containing the data to AWS (http://aws.amazon.com/importexport/). More information about cloud computing, detailed cost analysis, and security can be found in references.

Subject(s)

Information Storage and Retrieval/methods; Internet; Software; Computational Biology; Computer Security; Information Storage and Retrieval/economics
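
Illustrative note on the record above: the abstract quotes one instance running for 48 hours at a total of approximately $48 in computation time, which implies roughly $1 per instance-hour. The sketch below shows that arithmetic and how the same rate would scale to more instances. The function names are hypothetical, and the rate is derived only from the abstract's figures, not from any current AWS price list.

```python
def implied_hourly_rate(total_cost_usd, hours, instances=1):
    """Back out the per-instance hourly rate from a quoted total cost."""
    return total_cost_usd / (hours * instances)


def estimated_cost(hourly_rate_usd, hours, instances=1):
    """Compute-only cost estimate; storage and data transfer are excluded."""
    return hourly_rate_usd * hours * instances


# Figures quoted in the abstract: one instance, 48 hours, about $48.
rate = implied_hourly_rate(48, 48)
```

For example, `estimated_cost(rate, 48, instances=4)` estimates the compute cost of a hypothetical four-instance cluster running for the same duration.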

Evaluation of cfDNA as an early detection assay for dense tissue breast cancer.

Barbirou, Mouadh; Miller, Amanda A; Gafni, Erik; Mezlini, Amel; Zidi, Asma; Boley, Nathan; Tonellato, Peter J.

Sci Rep ; 12(1): 8458, 2022 May 19.

Article in English | MEDLINE | ID: mdl-35589867

ABSTRACT

A cell-free DNA (cfDNA) assay would be a promising approach to early cancer diagnosis, especially for patients with dense tissues. Consistent cfDNA signatures have been observed for many carcinogens. Recently, investigations of cfDNA as a reliable early detection bioassay have presented a powerful opportunity for detecting dense tissue screening complications early. We performed a prospective study to evaluate the potential of characterizing cfDNA as a central element in the early detection of dense tissue breast cancer (BC). Plasma samples were collected from 32 consenting subjects with dense tissue and positive mammograms, 20 with positive biopsies and 12 with negative biopsies. After screening and before biopsy, cfDNA was extracted, and whole-genome next-generation sequencing (NGS) was performed on all samples. Copy number alteration (CNA) and single nucleotide polymorphism (SNP)/insertion/deletion (Indel) analyses were performed to characterize cfDNA. In the positive-positive subjects (cases), a total of 5 CNAs overlapped with 5 previously reported BC-related oncogenes (KSR2, MAP2K4, MSI2, CANT1 and MSI2). In addition, 1 SNP was detected in KMT2C, a BC oncogene, and 9 others were detected in or near 10 genes (SERAC1, DAGLB, MACF1, NVL, FBXW4, FANK1, KCTD4, CAVIN1; ATP6V0A1 and ZBTB20-AS1) previously associated with non-BC cancers. For the positive-negative subjects (screening), 3 CNAs were detected in BC genes (ACVR2A, CUL3 and PIK3R1), and 5 SNPs were identified in 6 non-BC cancer genes (SNIP1, TBC1D10B, PANK1, PRKCA and RUNX2; SUPT3H). This study presents evidence of the potential of using cfDNA somatic variants as dense tissue BC biomarkers from a noninvasive liquid bioassay for early cancer detection.

Subject(s)

Breast Neoplasms; Cell-Free Nucleic Acids; Adaptor Proteins, Signal Transducing/genetics; Biological Assay; Biomarkers, Tumor/genetics; Breast Neoplasms/diagnosis; Breast Neoplasms/genetics; Cell-Free Nucleic Acids/genetics; Early Detection of Cancer; Female; Humans; Mutation; Prospective Studies; RNA-Binding Proteins/genetics

Scalable detection of technically challenging variants through modified next-generation sequencing.

Rojahn, Susan; Hambuch, Tina; Adrian, Jessika; Gafni, Erik; Gileta, Alex; Hatchell, Hannah; Johnson, Britt; Kallman, Ben; Karfilis, Kate; Kautzer, Curtis; Kennemer, Michael; Kirk, Lloyd; Kvitek, Daniel; Lettes, Jessica; Macrae, Fenner; Mendez, Fernando; Paul, Joshua; Pellegrino, Maurizio; Preciado, Ronny; Risinger, Jan; Schultz, Matthew; Spurka, Lindsay; Swamy, Sajani; Truty, Rebecca; Usem, Nathan; Velenich, Andrea; Aradhya, Swaroop.

Mol Genet Genomic Med ; 10(12): e2072, 2022 Dec.

Article in English | MEDLINE | ID: mdl-36251442

ABSTRACT

BACKGROUND: Some clinically important genetic variants are not easily evaluated with next-generation sequencing (NGS) methods due to technical challenges arising from high-similarity copies (e.g., PMS2, SMN1/SMN2, GBA1, HBA1/HBA2, CYP21A2), repetitive short sequences (e.g., ARX polyalanine repeats, FMR1 AGG interruptions in CGG repeats, CFTR poly-T/TG repeats), and other complexities (e.g., MSH2 Boland inversions). METHODS: We customized our NGS processes to detect the technically challenging variants mentioned above with adaptations including target enrichment and bioinformatic masking of similar sequences. Adaptations were validated with samples of known genotypes. RESULTS: Our adaptations provided high-sensitivity and high-specificity detection for most of the variants and provided a high-sensitivity primary assay to be followed with orthogonal disambiguation for the others. The sensitivity of the NGS adaptations was 100% for all of the technically challenging variants. Specificity was 100% for those in PMS2, GBA1, SMN1/SMN2, and HBA1/HBA2, and for the MSH2 Boland inversion; 97.8%-100% for CYP21A2 variants; and 85.7% for ARX polyalanine repeats. CONCLUSIONS: NGS assays can detect technically challenging variants when chemistries and bioinformatics are jointly refined. The adaptations described support a scalable, cost-effective path to identifying all clinically relevant variants within a single sample.

Subject(s)

Fragile X Mental Retardation Protein; High-Throughput Nucleotide Sequencing; Humans; Mismatch Repair Endonuclease PMS2; Glycated Hemoglobin; MutS Homolog 2 Protein; High-Throughput Nucleotide Sequencing/methods; Genotype; Steroid 21-Hydroxylase
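
Illustrative note on the record above: the abstract states that the adaptations were validated with samples of known genotypes and reports sensitivity and specificity per variant class. The sketch below shows how such figures are conventionally computed by comparing assay calls against known truth. The function name and the boolean encoding (True when the variant is present) are assumptions for illustration, and no counts from the study are used.

```python
def validation_metrics(known, called):
    """Compare assay calls with known genotypes for one variant class.

    known, called: equal-length sequences of booleans, True meaning the
    variant is present (known) or was reported by the assay (called).
    Returns (sensitivity, specificity); a metric is NaN when undefined.
    """
    if len(known) != len(called):
        raise ValueError("known and called must have the same length")
    pairs = list(zip(known, called))
    true_pos = sum(1 for k, c in pairs if k and c)
    false_neg = sum(1 for k, c in pairs if k and not c)
    true_neg = sum(1 for k, c in pairs if not k and not c)
    false_pos = sum(1 for k, c in pairs if not k and c)
    positives = true_pos + false_neg
    negatives = true_neg + false_pos
    sensitivity = true_pos / positives if positives else float("nan")
    specificity = true_neg / negatives if negatives else float("nan")
    return sensitivity, specificity
```

A primary assay with perfect sensitivity but imperfect specificity, as reported for some variant classes, produces no missed variants but some false positives, which is why the abstract pairs it with orthogonal follow-up testing.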
