1.
Nat Methods ; 12(6): 527-30, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25938371

ABSTRACT

We developed Copy Number Segmentation by Regression Tree in Next Generation Sequencing (CONSERTING), an algorithm for detecting somatic copy-number alteration (CNA) using whole-genome sequencing (WGS) data. CONSERTING performs iterative analysis of segmentation on the basis of changes in read depth and the detection of localized structural variations, with high accuracy and sensitivity. Analysis of 43 cancer genomes from both pediatric and adult patients revealed novel oncogenic CNAs, complex rearrangements and subclonal CNAs missed by alternative approaches.
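
The idea of segmenting a genome by regression tree on read depth can be illustrated with a toy sketch. This is not the CONSERTING implementation: the abstract also describes iterative analysis and the use of localized structural variations, which are omitted here. The function names, the `min_size` and `min_gain` parameters, and the input values (hypothetical tumor/normal read-depth ratios per genomic window) are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def sse(values: List[float]) -> float:
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(values: List[float], min_size: int) -> Optional[Tuple[int, float]]:
    """Return (index, gain) for the split that most reduces SSE, or None
    when the list is too short to hold two segments of min_size windows."""
    total = sse(values)
    best = None
    for i in range(min_size, len(values) - min_size + 1):
        gain = total - sse(values[:i]) - sse(values[i:])
        if best is None or gain > best[1]:
            best = (i, gain)
    return best


def segment(values: List[float], start: int = 0, min_size: int = 2,
            min_gain: float = 0.1) -> List[Tuple[int, int, float]]:
    """Recursively split the windows, as a regression tree would, and return
    (start, end, mean) triples with end exclusive."""
    split = best_split(values, min_size)
    if split is None or split[1] < min_gain:
        return [(start, start + len(values), sum(values) / len(values))]
    i = split[0]
    left = segment(values[:i], start, min_size, min_gain)
    right = segment(values[i:], start + i, min_size, min_gain)
    return left + right


# Hypothetical depth ratios: five copy-neutral windows, then five with a loss.
ratios = [1.0] * 5 + [0.5] * 5
segments = segment(ratios)
```

With these inputs the sketch reports one breakpoint between the fifth and sixth windows. A real caller would need to handle noise, GC bias, and tumor purity, none of which this example models.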


Subject(s)
DNA Copy Number Variations/genetics , DNA/genetics , Genomics/methods , Neoplasms/genetics , Software , Adult , Algorithms , Child , Computational Biology , Gene Expression Regulation, Neoplastic , Genetic Markers , Genome , Humans
2.
Nature ; 481(7380): 157-63, 2012 Jan 11.
Article in English | MEDLINE | ID: mdl-22237106

ABSTRACT

Early T-cell precursor acute lymphoblastic leukaemia (ETP ALL) is an aggressive malignancy of unknown genetic basis. We performed whole-genome sequencing of 12 ETP ALL cases and assessed the frequency of the identified somatic mutations in 94 T-cell acute lymphoblastic leukaemia cases. ETP ALL was characterized by activating mutations in genes regulating cytokine receptor and RAS signalling (67% of cases; NRAS, KRAS, FLT3, IL7R, JAK3, JAK1, SH2B3 and BRAF), inactivating lesions disrupting haematopoietic development (58%; GATA3, ETV6, RUNX1, IKZF1 and EP300) and histone-modifying genes (48%; EZH2, EED, SUZ12, SETD2 and EP300). We also identified new targets of recurrent mutation including DNM2, ECT2L and RELN. The mutational spectrum is similar to myeloid tumours, and moreover, the global transcriptional profile of ETP ALL was similar to that of normal and myeloid leukaemia haematopoietic stem cells. These findings suggest that addition of myeloid-directed therapies might improve the poor outcome of ETP ALL.


Subject(s)
Genetic Predisposition to Disease/genetics , Mutation/genetics , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/genetics , Age of Onset , Child , DNA Copy Number Variations/genetics , Genes, ras/genetics , Genome, Human/genetics , Genomics , Hematopoiesis/genetics , Histones/metabolism , Humans , Janus Kinases/genetics , Janus Kinases/metabolism , Leukemia, Myeloid, Acute/drug therapy , Leukemia, Myeloid, Acute/genetics , Leukemia, Myeloid, Acute/pathology , Molecular Sequence Data , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/drug therapy , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/pathology , Receptors, Interleukin-7/genetics , Reelin Protein , Sequence Analysis, DNA , Signal Transduction/genetics , Stem Cells/metabolism , Stem Cells/pathology , T-Lymphocytes/metabolism , T-Lymphocytes/pathology , Translocation, Genetic/genetics
3.
JMIR Public Health Surveill ; 9: e45246, 2023 May 19.
Article in English | MEDLINE | ID: mdl-37204824

ABSTRACT

BACKGROUND: Fatal drug overdose surveillance informs prevention but is often delayed because of autopsy report processing and death certificate coding. Autopsy reports contain narrative text describing scene evidence and medical history (similar to preliminary death scene investigation reports) and may serve as early data sources for identifying fatal drug overdoses. To facilitate timely fatal overdose reporting, natural language processing was applied to narrative texts from autopsies. OBJECTIVE: This study aimed to develop a natural language processing-based model that predicts the likelihood that an autopsy report narrative describes an accidental or undetermined fatal drug overdose. METHODS: Autopsy reports of all manners of death (2019-2021) were obtained from the Tennessee Office of the State Chief Medical Examiner. The text was extracted from autopsy reports (PDFs) using optical character recognition. Three common narrative text sections were identified, concatenated, and preprocessed (bag-of-words) using term frequency-inverse document frequency scoring. Logistic regression, support vector machine (SVM), random forest, and gradient boosted tree classifiers were developed and validated. Models were trained and calibrated using autopsies from 2019 to 2020 and tested using those from 2021. Model discrimination was evaluated using the area under the receiver operating characteristic, precision, recall, F1-score, and F2-score (prioritizes recall over precision). Calibration was performed using logistic regression (Platt scaling) and evaluated using the Spiegelhalter z test. Shapley additive explanations values were generated for models compatible with this method. In a post hoc subgroup analysis of the random forest classifier, model discrimination was evaluated by forensic center, race, age, sex, and education level. RESULTS: A total of 17,342 autopsies (n=5934, 34.22% cases) were used for model development and validation. The training set included 10,215 autopsies (n=3342, 32.72% cases), the calibration set included 538 autopsies (n=183, 34.01% cases), and the test set included 6589 autopsies (n=2409, 36.56% cases). The vocabulary set contained 4002 terms. All models showed excellent performance (area under the receiver operating characteristic ≥0.95, precision ≥0.94, recall ≥0.92, F1-score ≥0.94, and F2-score ≥0.92). The SVM and random forest classifiers achieved the highest F2-scores (0.948 and 0.947, respectively). The logistic regression and random forest were calibrated (P=.95 and P=.85, respectively), whereas the SVM and gradient boosted tree classifiers were miscalibrated (P=.03 and P<.001, respectively). "Fentanyl" and "accident" had the highest Shapley additive explanations values. Post hoc subgroup analyses revealed lower F2-scores for autopsies from forensic centers D and E. Lower F2-scores were observed for the American Indian, Asian, ≤14 years, and ≥65 years subgroups, but larger sample sizes are needed to validate these findings. CONCLUSIONS: The random forest classifier may be suitable for identifying potential accidental and undetermined fatal overdose autopsies. Further validation studies should be conducted to ensure early detection of accidental and undetermined fatal drug overdoses across all subgroups.
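
The abstract reports both F1- and F2-scores and notes that F2 prioritizes recall over precision. A minimal sketch of the standard F-beta formula shows where that weighting comes from. The function name `f_beta` and the example precision and recall values are illustrative assumptions, not figures from the study.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Standard F-beta score: (1 + beta^2) * P * R / (beta^2 * P + R).

    beta = 1 gives the F1-score; beta = 2 gives the F2-score, which weights
    recall more heavily than precision.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when both inputs are zero
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Two hypothetical classifiers with mirrored precision and recall.
high_recall = f_beta(precision=0.5, recall=1.0, beta=2.0)
high_precision = f_beta(precision=1.0, recall=0.5, beta=2.0)
```

The two hypothetical classifiers have the same F1-score, but the F2-score ranks the high-recall one higher. That preference matches a surveillance setting, where a missed overdose case is costlier than a false alert.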


Subject(s)
Drug Overdose , Natural Language Processing , Humans , Autopsy , Algorithms , Random Forest