1.
Cell ; 173(4): 864-878.e29, 2018 05 03.
Article in English | MEDLINE | ID: mdl-29681454

ABSTRACT

Diversity in the genetic lesions that cause cancer is extreme. In consequence, a pressing challenge is the development of drugs that target patient-specific disease mechanisms. To address this challenge, we employed a chemistry-first discovery paradigm for de novo identification of druggable targets linked to robust patient selection hypotheses. In particular, a 200,000 compound diversity-oriented chemical library was profiled across a heavily annotated test-bed of >100 cellular models representative of the diverse and characteristic somatic lesions for lung cancer. This approach led to the delineation of 171 chemical-genetic associations, shedding light on the targetability of mechanistic vulnerabilities corresponding to a range of oncogenotypes present in patient populations lacking effective therapy. Chemically addressable addictions to ciliogenesis in TTC21B mutants and GLUT8-dependent serine biosynthesis in KRAS/KEAP1 double mutants are prominent examples. These observations indicate a wealth of actionable opportunities within the complex molecular etiology of cancer.


Subjects
Non-Small-Cell Lung Carcinoma/pathology , Cell Proliferation/drug effects , Lung Neoplasms/pathology , Small Molecule Libraries/pharmacology , Non-Small-Cell Lung Carcinoma/metabolism , Tumor Cell Line , Cytochrome P450 Family 4/deficiency , Cytochrome P450 Family 4/genetics , Drug Discovery , G1 Phase Cell Cycle Checkpoints/drug effects , Glucocorticoids/pharmacology , Facilitative Glucose Transport Proteins/antagonists & inhibitors , Facilitative Glucose Transport Proteins/genetics , Facilitative Glucose Transport Proteins/metabolism , Humans , Kelch-Like ECH-Associated Protein 1/genetics , Kelch-Like ECH-Associated Protein 1/metabolism , Lung Neoplasms/metabolism , Microtubule-Associated Proteins/genetics , Microtubule-Associated Proteins/metabolism , Mutation , NF-E2-Related Factor 2/antagonists & inhibitors , NF-E2-Related Factor 2/genetics , NF-E2-Related Factor 2/metabolism , Proto-Oncogene Proteins p21(ras)/genetics , Proto-Oncogene Proteins p21(ras)/metabolism , RNA Interference , Small Interfering RNA/metabolism , Notch2 Receptor/genetics , Notch2 Receptor/metabolism , Glucocorticoid Receptors/antagonists & inhibitors , Glucocorticoid Receptors/genetics , Glucocorticoid Receptors/metabolism , Small Molecule Libraries/chemistry , Small Molecule Libraries/metabolism
2.
Genome Res ; 32(1): 55-70, 2022 01.
Article in English | MEDLINE | ID: mdl-34903527

ABSTRACT

Human papillomavirus (HPV) causes 5% of all cancers and frequently integrates into host chromosomes. The HPV oncoproteins E6 and E7 are necessary but insufficient for cancer formation, indicating that additional secondary genetic events are required. Here, we investigate potential oncogenic impacts of virus integration. Analysis of 105 HPV-positive oropharyngeal cancers by whole-genome sequencing detects virus integration in 77%, revealing five statistically significant sites of recurrent integration near genes that regulate epithelial stem cell maintenance (i.e., SOX2, TP63, FGFR, MYC) and immune evasion (i.e., CD274). Genomic copy number hyperamplification is enriched 16-fold near HPV integrants, and the extent of focal host genomic instability increases with their local density. The frequency of genes expressed at extreme outlier levels is increased 86-fold within ±150 kb of integrants. Across 95% of tumors with integration, host gene transcription is disrupted via intragenic integrants, chimeric transcription, outlier expression, gene breaking, and/or de novo expression of noncoding or imprinted genes. We conclude that virus integration can contribute to carcinogenesis in a large majority of HPV-positive oropharyngeal cancers by inducing extensive disruption of host genome structure and gene expression.


Subjects
Alphapapillomavirus , Viral Oncogene Proteins , Oropharyngeal Neoplasms , Alphapapillomavirus/metabolism , Carcinogenesis , Humans , Viral Oncogene Proteins/genetics , Oropharyngeal Neoplasms/genetics , Papillomaviridae/genetics , Papillomaviridae/metabolism , Papillomavirus E7 Proteins/genetics , Papillomavirus E7 Proteins/metabolism , Virus Integration/genetics
3.
Blood ; 142(1): 44-61, 2023 07 06.
Article in English | MEDLINE | ID: mdl-37023372

ABSTRACT

In chronic lymphocytic leukemia (CLL), epigenetic alterations are considered to centrally shape the transcriptional signatures that drive disease evolution and underlie its biological and clinical subsets. Characterizations of epigenetic regulators, particularly histone-modifying enzymes, are very rudimentary in CLL. In efforts to establish effectors of the CLL-associated oncogene T-cell leukemia 1A (TCL1A), we identified here the lysine-specific histone demethylase KDM1A to interact with the TCL1A protein in B cells in conjunction with an increased catalytic activity of KDM1A. We demonstrate that KDM1A is upregulated in malignant B cells. Elevated KDM1A and associated gene expression signatures correlated with aggressive disease features and adverse clinical outcomes in a large prospective CLL trial cohort. Genetic Kdm1a knockdown in Eµ-TCL1A mice reduced leukemic burden and prolonged animal survival, accompanied by upregulated p53 and proapoptotic pathways. Genetic KDM1A depletion also affected milieu components (T, stromal, and monocytic cells), resulting in significant reductions in their capacity to support CLL-cell survival and proliferation. Integrated analyses of differential global transcriptomes (RNA sequencing) and H3K4me3 marks (chromatin immunoprecipitation sequencing) in Eµ-TCL1A vs iKdm1aKD;Eµ-TCL1A mice (confirmed in human CLL) implicate KDM1A as an oncogenic transcriptional repressor in CLL which alters histone methylation patterns with pronounced effects on defined cell death and motility pathways. Finally, pharmacologic KDM1A inhibition altered H3K4/9 target methylation and revealed marked anti-B-cell leukemic synergisms. Overall, we established the pathogenic role and effector networks of KDM1A in CLL via tumor-cell intrinsic mechanisms and its impacts in cells of the microenvironment. Our data also provide rationales to further investigate therapeutic KDM1A targeting in CLL.


Subjects
B-Cell Chronic Lymphocytic Leukemia , Humans , Mice , Animals , B-Cell Chronic Lymphocytic Leukemia/drug therapy , Histones/metabolism , Lysine , Prospective Studies , Histone Demethylases/genetics , Histone Demethylases/metabolism , Tumor Microenvironment
4.
Ann Surg ; 2024 May 21.
Article in English | MEDLINE | ID: mdl-38771951

ABSTRACT

OBJECTIVE: We aimed to assess the levels of MDM2-DNA within extracellular vesicles (EVs) isolated from the serum of retroperitoneal liposarcoma (RLS) patients versus healthy donors, as well as within the same patients at the time of surgery versus post-operative surveillance visits, to determine whether EV-MDM2 may serve as a possible first-ever biomarker of liposarcoma recurrence. BACKGROUND: A hallmark of well-differentiated and de-differentiated (WD/DD) retroperitoneal liposarcoma is elevated MDM2 due to genome amplification, with recurrence rates of >50% even after complete resection. Imaging technologies frequently cannot resolve recurrent WD/DD-RLS versus postoperative scarring. Early detection of recurrent lesions, for which biomarkers are lacking, would guide surveillance and treatment decisions. METHODS: WD/DD-RLS serum samples were collected both at the time of surgery and during follow-up visits from 42 patients, along with sera from healthy donors (n=14). EVs were isolated, DNA was purified, and MDM2-DNA levels were determined through q-PCR analysis. Non-parametric tests were employed to compare EV-MDM2 DNA levels from patients versus the control group, as well as the time of surgery versus post-surgery conditions. RESULTS: EV-MDM2 levels were significantly higher in WD/DD-RLS than controls (P=0.00085). Moreover, EV-MDM2 levels were remarkably decreased in WD/DD-RLS patients after resection (P=0.00036), reaching values comparable to the control group (P=0.124). During post-operative surveillance, significant increases in EV-MDM2 were observed in some patients, correlating with CT scan evidence of recurrent or persistent post-resection disease. CONCLUSIONS: Serum EV-MDM2 may serve as a potential biomarker of early recurrent or post-operatively persistent WD/DD-RLS, a disease currently lacking such determinants.

5.
Genome Res ; 31(5): 747-761, 2021 05.
Article in English | MEDLINE | ID: mdl-33707228

ABSTRACT

Acute myeloid leukemia (AML) is a molecularly complex disease characterized by heterogeneous tumor genetic profiles and involving numerous pathogenic mechanisms and pathways. Integration of molecular data types across multiple patient cohorts may advance current genetic approaches for improved subclassification and understanding of the biology of the disease. Here, we analyzed genome-wide DNA methylation in 649 AML patients using Illumina arrays and identified a configuration of 13 subtypes (termed "epitypes") using unbiased clustering. Integration of genetic data revealed that most epitypes were associated with a certain recurrent mutation (or combination) in a majority of patients, yet other epitypes were largely independent. Epitypes showed developmental blockage at discrete stages of myeloid differentiation, revealing epitypes that retain arrested hematopoietic stem-cell-like phenotypes. Detailed analyses of DNA methylation patterns identified unique patterns of aberrant hyper- and hypomethylation among epitypes, with variable involvement of transcription factors influencing promoter, enhancer, and repressed regions. Patients in epitypes with stem-cell-like methylation features showed inferior overall survival along with up-regulated stem cell gene expression signatures. We further identified a DNA methylation signature involving STAT motifs associated with FLT3-ITD mutations. Finally, DNA methylation signatures were stable at relapse for the large majority of patients, and rare epitype switching accompanied loss of the dominant epitype mutations and reversion to stem-cell-like methylation patterns. These results show that DNA methylation-based classification integrates important molecular features of AML to reveal the diverse pathogenic and biological aspects of the disease.


Subjects
DNA Methylation , Acute Myeloid Leukemia , Humans , Acute Myeloid Leukemia/metabolism , Mutation , Genetic Promoter Regions
6.
Bioinformatics ; 38(23): 5245-5252, 2022 11 30.
Article in English | MEDLINE | ID: mdl-36250792

ABSTRACT

MOTIVATION: Clustered regularly interspaced short palindromic repeats (CRISPR)-based genetic perturbation screening is a powerful tool to probe gene function. However, experimental noise, especially for lowly expressed genes, needs to be accounted for to maintain proper control of the false positive rate. METHODS: We develop a statistical method, named CRISPR screen with Expression Data Analysis (CEDA), to integrate gene expression profiles and CRISPR screen data for identifying essential genes. CEDA stratifies genes based on expression level and adopts a three-component mixture model for the log-fold change of single-guide RNAs (sgRNAs). An empirical Bayesian prior and the expectation-maximization algorithm are used for parameter estimation and false discovery rate inference. RESULTS: Taking advantage of gene expression data, CEDA identifies essential genes with higher expression. Compared to existing methods, CEDA shows comparable reliability but higher sensitivity in detecting essential genes with moderate sgRNA fold change. Therefore, using the same CRISPR data, CEDA generates an additional hit gene list. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
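
As a rough illustration of the mixture-model idea in this abstract, the sketch below fits a three-component Gaussian mixture to log-fold-change values with a plain expectation-maximization loop. It is not CEDA itself: the Gaussian components, the percentile-based initialisation, the function name `fit_three_component_mixture`, and the toy values are all assumptions made for illustration, and the expression-level stratification and empirical Bayesian prior described above are left out.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_three_component_mixture(values, n_iter=200, min_sd=1e-3):
    """Fit a 3-component 1-D Gaussian mixture by EM (illustrative only).

    Returns (weights, means, sds), each a list of three floats.
    """
    xs = sorted(values)
    n = len(xs)
    # Initialise the means at the 10th, 50th and 90th percentiles (an assumption).
    means = [xs[int(q * (n - 1))] for q in (0.1, 0.5, 0.9)]
    overall_mean = sum(xs) / n
    overall_sd = math.sqrt(sum((x - overall_mean) ** 2 for x in xs) / n) or 1.0
    sds = [overall_sd] * 3
    weights = [1.0 / 3] * 3

    for _ in range(n_iter):
        # E-step: posterior probability of each component for each value.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: update weights, means and standard deviations.
        for k in range(3):
            nk = sum(r[k] for r in resp) or 1e-300
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sds[k] = max(math.sqrt(var), min_sd)
    return weights, means, sds


# Made-up sgRNA log-fold changes: depleted, unchanged and enriched groups.
toy_lfc = [-3.1, -2.9, -3.0, -2.8, -3.2, -0.1, 0.0, 0.1, 0.05, -0.05,
           0.2, -0.2, 1.9, 2.0, 2.1]
weights, means, sds = fit_three_component_mixture(toy_lfc)
```

In a screen setting, the posterior probabilities computed in the E-step are what one would threshold to call a guide depleted; how CEDA turns them into a false discovery rate is not reproduced here.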


Subjects
Clustered Regularly Interspaced Short Palindromic Repeats , Essential Genes , Bayes Theorem , CRISPR-Cas Systems , Gene Expression , Reproducibility of Results , Small Untranslated RNA/genetics
7.
Genome Res ; 29(1): 1-17, 2019 01.
Article in English | MEDLINE | ID: mdl-30563911

ABSTRACT

Human papillomavirus (HPV) is a necessary but insufficient cause of a subset of oral squamous cell carcinomas (OSCCs) that is increasing markedly in frequency. To identify contributory, secondary genetic alterations in these cancers, we used comprehensive genomics methods to compare 149 HPV-positive and 335 HPV-negative OSCC tumor/normal pairs. Different behavioral risk factors underlying the two OSCC types were reflected in distinctive genomic mutational signatures. In HPV-positive OSCCs, the signatures of APOBEC cytosine deaminase editing, associated with anti-viral immunity, were strongly linked to overall mutational burden. In contrast, in HPV-negative OSCCs, T>C substitutions in the sequence context 5'-ATN-3' correlated with tobacco exposure. Universal expression of HPV E6*1 and E7 oncogenes was a sine qua non of HPV-positive OSCCs. Significant enrichment of somatic mutations was confirmed or newly identified in PIK3CA, KMT2D, FGFR3, FBXW7, DDX3X, PTEN, TRAF3, RB1, CYLD, RIPK4, ZNF750, EP300, CASZ1, TAF5, RBL1, IFNGR1, and NFKBIA. Of these, many affect host pathways already targeted by HPV oncoproteins, including the p53 and pRB pathways, or disrupt host defenses against viral infections, including interferon (IFN) and nuclear factor kappa B signaling. Frequent copy number changes were associated with concordant changes in gene expression. Chr 11q (including CCND1) and 14q (including DICER1 and AKT1) were recurrently lost in HPV-positive OSCCs, in contrast to their gains in HPV-negative OSCCs. High-ranking variant allele fractions implicated ZNF750, PIK3CA, and EP300 mutations as candidate driver events in HPV-positive cancers. We conclude that virus-host interactions cooperatively shape the unique genetic features of these cancers, distinguishing them from their HPV-negative counterparts.


Subjects
Squamous Cell Carcinoma , Mouth Neoplasms , Neoplasm Proteins , Viral Oncogene Proteins , Papillomavirus Infections , Squamous Cell Carcinoma/genetics , Squamous Cell Carcinoma/metabolism , Squamous Cell Carcinoma/pathology , Squamous Cell Carcinoma/virology , Female , Humans , Male , Mouth Neoplasms/genetics , Mouth Neoplasms/metabolism , Mouth Neoplasms/pathology , Mouth Neoplasms/virology , Mutation , Neoplasm Proteins/biosynthesis , Neoplasm Proteins/genetics , Viral Oncogene Proteins/biosynthesis , Viral Oncogene Proteins/genetics , Papillomaviridae/genetics , Papillomaviridae/metabolism
8.
Bioinformatics ; 37(23): 4589-4590, 2021 12 07.
Article in English | MEDLINE | ID: mdl-34601554

ABSTRACT

SUMMARY: Cytogenetics data, or karyotypes, are among the most common clinically used forms of genetic data. Karyotypes are stored as standardized text strings using the International System for Human Cytogenomic Nomenclature (ISCN). Historically, these data have not been used in large-scale computational analyses due to limitations in the ISCN text format and structure. Recently developed computational tools such as CytoGPS have enabled large-scale computational analyses of karyotypes. To further enable such analyses, we have now developed RCytoGPS, an R package that takes JSON files generated from CytoGPS.org and converts them into objects in R. This conversion facilitates the analysis and visualizations of karyotype data. In effect this tool streamlines the process of performing large-scale karyotype analyses, thus advancing the field of computational cytogenetic pathology. AVAILABILITY AND IMPLEMENTATION: Freely available at https://CRAN.R-project.org/package=RCytoGPS. The code for the underlying CytoGPS software can be found at https://github.com/i2-wustl/CytoGPS.


Subjects
Reading , Software , Humans , Karyotyping , Karyotype
9.
Bioinformatics ; 37(17): 2780-2781, 2021 Sep 09.
Article in English | MEDLINE | ID: mdl-33515233

ABSTRACT

SUMMARY: Unsupervised machine learning provides tools for researchers to uncover latent patterns in large-scale data, based on calculated distances between observations. Methods to visualize high-dimensional data based on these distances can elucidate subtypes and interactions within multi-dimensional and high-throughput data. However, researchers can select from a vast number of distance metrics and visualizations, each with its own strengths and weaknesses. The Mercator R package facilitates selection of a biologically meaningful distance from 10 metrics, together appropriate for binary, categorical and continuous data, and visualization with 5 standard and high-dimensional graphics tools. Mercator provides a user-friendly pipeline for informaticians or biologists to perform unsupervised analyses, from exploratory pattern recognition to production of publication-quality graphics. AVAILABILITY AND IMPLEMENTATION: Mercator is freely available at the Comprehensive R Archive Network (https://cran.r-project.org/web/packages/Mercator/index.html).

10.
Chem Biodivers ; 19(11): e202200657, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36216587

ABSTRACT

We present a novel model of time-series analysis to learn from electronic health record (EHR) data when infection occurred in the intensive care unit (ICU) by translating methods from proteomics and Bayesian statistics. Using 48,536 patients hospitalized in an ICU, we describe each hospital course as an 'alphabet' of 23 physician actions ('events') in temporal order. We analyze these as k-mers of length 3-12 events and apply a Bayesian model of (cumulative) relative risk (RR). The log2-transformed RR (median=0.248, mean=0.226) supported the conclusion that the events selected were individually associated with increased risk of infection. Selecting from all possible cutoffs of maximum gain (MG), MG>0.0244 predicts administration of antibiotics with PPV 82.0%, NPV 44.4%, and AUC 0.706. Our approach holds value for retrospective analysis of other clinical syndromes for which time-of-onset is critical to analysis but poorly marked in EHRs, including delirium and decompensation.
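
The k-mer step in this abstract can be pictured with a short sketch: a hospital course is treated as an ordered list of event codes, and every contiguous run of 3 to 12 events is counted. The function name `event_kmers` and the event codes are hypothetical, and the Bayesian relative-risk model applied to these counts is not reproduced.

```python
from collections import Counter


def event_kmers(events, k_min=3, k_max=12):
    """Count every contiguous run of k events, for k_min <= k <= k_max."""
    counts = Counter()
    for k in range(k_min, k_max + 1):
        # range() is empty when the course is shorter than k events.
        for start in range(len(events) - k + 1):
            counts[tuple(events[start:start + k])] += 1
    return counts


# Hypothetical hospital course; the event codes are made up for illustration.
course = ["labs", "culture", "xray", "culture", "antibiotic"]
kmers = event_kmers(course)
```

Storing each k-mer as a tuple keeps it hashable, so the same counter can be accumulated across many patients before any risk estimate is computed.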


Subjects
Electronic Health Records , Intensive Care Units , Humans , Retrospective Studies , Bayes Theorem
11.
BMC Bioinformatics ; 22(1): 100, 2021 Mar 01.
Article in English | MEDLINE | ID: mdl-33648439

ABSTRACT

BACKGROUND: There have been many recent breakthroughs in processing and analyzing large-scale data sets in biomedical informatics. For example, the CytoGPS algorithm has enabled the use of text-based karyotypes by transforming them into a binary model. However, such advances are accompanied by new problems of data sparsity, heterogeneity, and noisiness that are magnified by the large-scale multidimensional nature of the data. To address these problems, we developed the Mercator R package, which processes and visualizes binary biomedical data. We use Mercator to address biomedical questions of cytogenetic patterns relating to lymphoid hematologic malignancies, which include a broad set of leukemias and lymphomas. Karyotype data are one of the most common forms of genetic data collected on lymphoid malignancies, because karyotyping is part of the standard of care in these cancers. RESULTS: In this paper we combine the analytic power of CytoGPS and Mercator to perform a large-scale multidimensional pattern recognition study on 22,741 karyotype samples in 47 different hematologic malignancies obtained from the public Mitelman database. CONCLUSION: Our findings indicate that Mercator was able to identify both known and novel cytogenetic patterns across different lymphoid malignancies, furthering our understanding of the genetics of these diseases.


Subjects
Hematologic Diseases , Karyotyping , Neoplasms , Chromosome Aberrations , Humans , Karyotype
12.
Blood ; 134(8): 688-698, 2019 08 22.
Article in English | MEDLINE | ID: mdl-31292113

ABSTRACT

Alterations in global DNA methylation patterns are a major hallmark of cancer and represent attractive biomarkers for personalized risk stratification. Chronic lymphocytic leukemia (CLL) risk stratification studies typically focus on time to first treatment (TTFT), time to progression (TTP) after treatment, and overall survival (OS). Whereas TTFT risk stratification remains similar over time, TTP and OS have changed dramatically with the introduction of targeted therapies, such as the Bruton tyrosine kinase inhibitor ibrutinib. We have shown that genome-wide DNA methylation patterns in CLL are strongly associated with phenotypic differentiation and patient outcomes. Here, we developed a novel assay, termed methylation-iPLEX (Me-iPLEX), for high-throughput quantification of targeted panels of single cytosine guanine dinucleotides from multiple independent loci. Me-iPLEX was used to classify CLL samples into 1 of 3 known epigenetic subtypes (epitypes). We examined the impact of epitype in 1286 CLL patients from 4 independent cohorts representing a comprehensive view of CLL disease course and therapies. We found that epitype significantly predicted TTFT and OS among newly diagnosed CLL patients. Additionally, epitype predicted TTP and OS with 2 common CLL therapies: chemoimmunotherapy and ibrutinib. Epitype retained significance after stratifying by biologically related biomarkers, immunoglobulin heavy chain mutational status, and ZAP70 expression, as well as other common prognostic markers. Furthermore, among several biological traits enriched between epitypes, we found highly biased immunogenetic features, including IGLV3-21 usage in the poorly characterized intermediate-programmed CLL epitype. In summary, Me-iPLEX is an elegant method to assess epigenetic signatures, including robust classification of CLL epitypes that independently stratify patient risk at diagnosis and time of treatment.


Subjects
DNA Methylation , B-Cell Chronic Lymphocytic Leukemia/genetics , Tumor Biomarkers/genetics , Disease Progression , Genetic Epigenesis , Genetic Loci , Genetic Testing , Humans , B-Cell Chronic Lymphocytic Leukemia/diagnosis , Prognosis
13.
J Biomed Inform ; 118: 103788, 2021 06.
Article in English | MEDLINE | ID: mdl-33862229

ABSTRACT

INTRODUCTION: Clustering analyses in clinical contexts hold promise to improve the understanding of patient phenotype and disease course in chronic and acute clinical medicine. However, work remains to ensure that solutions are rigorous, valid, and reproducible. In this paper, we evaluate best practices for dissimilarity matrix calculation and clustering on mixed-type, clinical data. METHODS: We simulate clinical data to represent problems in clinical trials, cohort studies, and EHR data, including single-type datasets (binary, continuous, categorical) and 4 data mixtures. We test 5 single distance metrics (Jaccard, Hamming, Gower, Manhattan, Euclidean) and 3 mixed distance metrics (DAISY, Supersom, and Mercator) with 3 clustering algorithms (hierarchical (HC), k-medoids, self-organizing maps (SOM)). We quantitatively and visually validate by Adjusted Rand Index (ARI) and silhouette width (SW). We applied our best methods to two real-world data sets: (1) 21 features collected on 247 patients with chronic lymphocytic leukemia, and (2) 40 features collected on 6000 patients admitted to an intensive care unit. RESULTS: HC outperformed k-medoids and SOM by ARI across data types. DAISY produced the highest mean ARI for mixed data types for all mixtures except unbalanced mixtures dominated by continuous data. Compared to other methods, DAISY with HC uncovered superior, separable clusters in both real-world data sets. DISCUSSION: Selecting an appropriate mixed-type metric allows the investigator to obtain optimal separation of patient clusters and get maximum use of their data. Superior metrics for mixed-type data handle multiple data types using multiple, type-focused distances. Better subclassification of disease opens avenues for targeted treatments, precision medicine, clinical decision support, and improved patient outcomes.
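
For readers unfamiliar with the single distance metrics named in this abstract, the sketch below writes out their textbook definitions in plain Python, plus a simplified Gower distance for mixed records. These are generic formulas, not the code evaluated in the study; DAISY, Supersom and Mercator are R tools and are not reimplemented, and the `ranges` argument of `gower` is an assumption about how numeric features would be scaled.

```python
import math


def hamming(a, b):
    """Number of positions at which two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def jaccard(a, b):
    """Jaccard distance between binary vectors: 1 - |intersection| / |union|."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Straight-line distance between two numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def gower(a, b, ranges):
    """Simplified Gower distance for mixed records.

    ranges[i] is the observed range of numeric feature i, or None when
    feature i is categorical (compared by simple mismatch).
    """
    parts = []
    for x, y, r in zip(a, b, ranges):
        if r is None:
            parts.append(0.0 if x == y else 1.0)
        else:
            parts.append(abs(x - y) / r if r else 0.0)
    return sum(parts) / len(parts)
```

For example, `gower((70, "M"), (50, "F"), (40, None))` averages a range-scaled age difference with a categorical mismatch, which is the sense in which a mixed metric combines type-focused distances.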


Subjects
B-Cell Chronic Lymphocytic Leukemia , Algorithms , Cluster Analysis , Computer Simulation , Humans
14.
BMC Med Inform Decis Mak ; 21(1): 97, 2021 03 09.
Article in English | MEDLINE | ID: mdl-33750375

ABSTRACT

BACKGROUND: In the intensive care unit (ICU), delirium is a common, acute, confusional state associated with high risk for short- and long-term morbidity and mortality. Machine learning (ML) has promise to address research priorities and improve delirium outcomes. However, due to clinical and billing conventions, delirium is often inconsistently or incompletely labeled in electronic health record (EHR) datasets. Here, we identify clinical actions abstracted from clinical guidelines in EHR data that indicate risk of delirium among ICU patients. We develop a novel prediction model to label patients with delirium based on a large data set and assess model performance. METHODS: EHR data on 48,451 admissions from 2001 to 2012, available through the Medical Information Mart for Intensive Care-III (MIMIC-III) database, were used to identify features to develop our prediction models. Five binary ML classification models (Logistic Regression; Classification and Regression Trees; Random Forests; Naïve Bayes; and Support Vector Machines) were fit and ranked by Area Under the Curve (AUC) scores. We compared our best model with two models previously proposed in the literature for goodness of fit, precision, and through biological validation. RESULTS: Our best performing model with threshold reclassification for predicting delirium was based on a multiple logistic regression using the 31 clinical actions (AUC 0.83). Our model outperformed other proposed models by biological validation on clinically meaningful, delirium-associated outcomes. CONCLUSIONS: Hurdles in identifying accurate labels in large-scale datasets limit clinical applications of ML in delirium. We developed a novel labeling model for delirium in the ICU using a large, public data set. By using guideline-directed clinical actions independent from risk factors, treatments, and outcomes as model predictors, our classifier could be used as a delirium label for future clinically targeted models.
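
Since the models in this abstract are ranked by AUC, a minimal sketch of that statistic may help. It uses the standard pairwise definition (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half). The function name `auc` and the toy labels and scores are assumptions for illustration; this is not the authors' evaluation code.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    Equals the probability that a random positive scores above a random
    negative, counting ties as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = delirium) and predicted risks for six admissions.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.4, 0.5, 0.2, 0.8, 0.1]
toy_auc = auc(toy_labels, toy_scores)
```

The double loop is quadratic in the number of cases, which is fine for a small example; a rank-based formulation would be the usual choice at the scale of tens of thousands of admissions.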


Subjects
Delirium , Intensive Care Units , Bayes Theorem , Delirium/diagnosis , Electronic Health Records , Humans , Machine Learning
15.
Bioinformatics ; 35(17): 2924-2931, 2019 09 01.
Article in English | MEDLINE | ID: mdl-30689715

ABSTRACT

MOTIVATION: Clonal heterogeneity is common in many types of cancer, including chronic lymphocytic leukemia (CLL). Previous research suggests that the presence of multiple distinct cancer clones is associated with clinical outcome. Detection of clonal heterogeneity from high throughput data, such as sequencing or single nucleotide polymorphism (SNP) array data, is important for gaining a better understanding of cancer and may improve prediction of clinical outcome or response to treatment. Here, we present a new method, CloneSeeker, for inferring clonal heterogeneity from sequencing data, SNP array data, or both. RESULTS: We generated simulated SNP array and sequencing data and applied CloneSeeker along with two other methods. We demonstrate that CloneSeeker is more accurate than existing algorithms at determining the number of clones, distribution of cancer cells among clones, and mutation and/or copy numbers belonging to each clone. Next, we applied CloneSeeker to SNP array data from samples of 258 previously untreated CLL patients to gain a better understanding of the characteristics of CLL tumors and to elucidate the relationship between clonal heterogeneity and clinical outcome. We found that a significant majority of CLL patients appear to have multiple clones distinguished by copy number alterations alone. We also found that the presence of multiple clones corresponded with significantly worse survival among CLL patients. These findings may prove useful for improving the accuracy of prognosis and design of treatment strategies. AVAILABILITY AND IMPLEMENTATION: Code available on R-Forge: https://r-forge.r-project.org/projects/CloneSeeker/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
B-Cell Chronic Lymphocytic Leukemia , Single Nucleotide Polymorphism , Whole Genome Sequencing , Algorithms , DNA Copy Number Variations , Female , High-Throughput Nucleotide Sequencing , Humans , Male
16.
Bioinformatics ; 35(24): 5365-5366, 2019 12 15.
Article in English | MEDLINE | ID: mdl-31263896

ABSTRACT

SUMMARY: Karyotype data are the most common form of genetic data that is regularly used clinically. They are collected as part of the standard of care in many diseases, particularly in pediatric and cancer medicine contexts. Karyotypes are represented in a unique text-based format, with a syntax defined by the International System for Human Cytogenetic Nomenclature (ISCN). While human-readable, ISCN is not intrinsically machine-readable. This limitation has prevented the full use of complex karyotype data in discovery science use cases. To enhance the utility and value of karyotype data, we developed a tool named CytoGPS. CytoGPS first parses ISCN karyotypes into a machine-readable format. It then converts the ISCN karyotype into a binary Loss-Gain-Fusion (LGF) model, which represents all cytogenetic abnormalities as combinations of loss, gain, or fusion events, in a format that is analyzable using modern computational methods. Such data are then made available for comprehensive 'downstream' analyses that previously were not feasible. AVAILABILITY AND IMPLEMENTATION: Freely available at http://cytogps.org.
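
The binary Loss-Gain-Fusion idea in this abstract can be pictured as three bit vectors, one per event type, indexed by chromosome band. The sketch below encodes a hand-written list of events that way. It does not parse ISCN and is not CytoGPS's implementation: the function name `lgf_vector`, the three-band list, and the event list are hypothetical, chosen only to show the shape of the output.

```python
def lgf_vector(events, bands):
    """Encode (band, kind) events as concatenated loss/gain/fusion bit vectors."""
    kinds = ("loss", "gain", "fusion")
    index = {band: i for i, band in enumerate(bands)}
    bits = [0] * (len(kinds) * len(bands))
    for band, kind in events:
        # Each kind owns one block of len(bands) bits, in the order of `kinds`.
        bits[kinds.index(kind) * len(bands) + index[band]] = 1
    return bits


# Hypothetical, hand-annotated events: a loss at 13q14 and a fusion
# joining 9q34 and 22q11. A real pipeline would derive these from ISCN text.
bands = ["9q34", "13q14", "22q11"]
events = [("13q14", "loss"), ("9q34", "fusion"), ("22q11", "fusion")]
encoded = lgf_vector(events, bands)
```

Once every karyotype is a fixed-length bit vector like this, binary distance metrics and standard clustering tools can be applied across samples.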


Subjects
Chromosome Aberrations , Karyotype , Humans , Karyotyping , Neoplasms , Software
17.
Lancet Oncol ; 20(11): 1576-1586, 2019 11.
Article in English | MEDLINE | ID: mdl-31582354

ABSTRACT

BACKGROUND: Fludarabine, cyclophosphamide, and rituximab (FCR) has become a gold-standard chemoimmunotherapy regimen for patients with chronic lymphocytic leukaemia. However, the question remains of how to treat treatment-naive patients with IGHV-unmutated chronic lymphocytic leukaemia. We therefore aimed to develop and validate a gene expression signature to identify which of these patients are likely to achieve durable remissions with FCR chemoimmunotherapy. METHODS: We did a retrospective cohort study in two cohorts of treatment-naive patients (aged ≥18 years) with chronic lymphocytic leukaemia. The discovery and training cohort consisted of peripheral blood samples collected from patients treated at the University of Texas MD Anderson Cancer Center (Houston, TX, USA), who fulfilled the diagnostic criteria of the International Workshop on Chronic Lymphocytic Leukemia, had received at least three cycles of FCR chemoimmunotherapy, and had been treated between Oct 10, 2000, and Oct 26, 2006 (ie, the MDACC cohort). We did transcriptional profiling on samples obtained from the MDACC cohort to identify genes associated with time to progression. We did univariate Cox proportional hazards analyses and used significant genes to cluster IGHV-unmutated samples into two groups (intermediate prognosis and unfavourable prognosis). After using cross-validation to assess robustness, we applied the Lasso method to standardise the gene expression values to find a minimum gene signature. We validated this signature in an external cohort of treatment-naive patients with IGHV-unmutated chronic lymphocytic leukaemia enrolled on the CLL8 trial of the German Chronic Lymphocytic Leukaemia Study Group who were treated between July 21, 2003, and April 4, 2006 (ie, the CLL8 cohort). FINDINGS: The MDACC cohort consisted of 101 patients and the CLL8 cohort consisted of 109 patients. Using the MDACC cohort, we identified and developed a 17-gene expression signature that distinguished IGHV-unmutated patients who were likely to achieve a long-term remission following front-line FCR chemoimmunotherapy from those who might benefit from alternative front-line regimens (hazard ratio 3·83, 95% CI 1·94-7·59; p<0·0001). We validated this gene signature in the CLL8 cohort; patients with an unfavourable prognosis versus those with an intermediate prognosis had a cause-specific hazard ratio of 1·90 (95% CI 1·18-3·06; p=0·008). Median time to progression was 39 months (IQR 22-69) for those with an unfavourable prognosis compared with 59 months (28-84) for those with an intermediate prognosis. INTERPRETATION: We have developed a robust, reproducible 17-gene signature that identifies a subset of treatment-naive patients with IGHV-unmutated chronic lymphocytic leukaemia who might substantially benefit from treatment with FCR chemoimmunotherapy. We recommend testing the value of this gene signature in a prospective study that compares FCR treatment with newer alternative therapies as part of a randomised clinical trial. FUNDING: Chronic Lymphocytic Leukaemia Global Research Foundation and the National Institutes of Health/National Cancer Institute.


Subjects
Antineoplastic Agents, Immunological/administration & dosage , Antineoplastic Combined Chemotherapy Protocols/administration & dosage , Cyclophosphamide/administration & dosage , Gene Expression Profiling , Leukemia, Lymphocytic, Chronic, B-Cell/drug therapy , Rituximab/administration & dosage , Transcriptome , Vidarabine/analogs & derivatives , Aged , Antineoplastic Agents, Immunological/adverse effects , Antineoplastic Combined Chemotherapy Protocols/adverse effects , Cyclophosphamide/adverse effects , Disease Progression , Female , Germany , Humans , Leukemia, Lymphocytic, Chronic, B-Cell/genetics , Leukemia, Lymphocytic, Chronic, B-Cell/immunology , Leukemia, Lymphocytic, Chronic, B-Cell/pathology , Male , Middle Aged , Predictive Value of Tests , Remission Induction , Risk Assessment , Risk Factors , Rituximab/adverse effects , Texas , Time Factors , Treatment Outcome , Vidarabine/administration & dosage , Vidarabine/adverse effects
18.
BMC Bioinformatics ; 20(Suppl 24): 679, 2019 Dec 20.
Article in English | MEDLINE | ID: mdl-31861985

ABSTRACT

BACKGROUND: RNA sequencing technologies have allowed researchers to gain a better understanding of how the transcriptome affects disease. However, sequencing technologies often unintentionally introduce experimental error into RNA sequencing data. To counteract this, normalization methods are standardly applied with the intent of reducing the non-biologically derived variability inherent in transcriptomic measurements. However, the comparative efficacy of the various normalization techniques has not been tested in a standardized manner. Here we propose tests that evaluate numerous normalization techniques and apply them to a large-scale standard data set. These tests comprise a protocol that allows researchers to measure the amount of non-biological variability which is present in any data set after normalization has been performed, a crucial step in assessing the biological validity of data following normalization. RESULTS: In this study we present two tests to assess the validity of normalization methods applied to a large-scale data set collected for systematic evaluation purposes. We tested various RNASeq normalization procedures and concluded that transcripts per million (TPM) was the best performing normalization method based on its preservation of biological signal as compared to the other methods tested. CONCLUSION: Normalization is of vital importance to accurately interpret the results of genomic and transcriptomic experiments. More work, however, needs to be performed to optimize normalization methods for RNASeq data. The present effort helps pave the way for more systematic evaluations of normalization methods across different platforms. With our proposed schema researchers can evaluate their own or future normalization methods to further improve the field of RNASeq normalization.
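
As a rough illustration of the TPM normalization named in this abstract, the Python sketch below converts raw read counts to transcripts per million using the standard definition: divide each count by the transcript length in kilobases, then rescale so the values sum to one million. The function name `tpm` and the toy counts and lengths are assumptions made for illustration; they are not taken from the article, and the sketch does not reproduce the authors' evaluation tests.

```python
from typing import List, Sequence


def tpm(counts: Sequence[float], lengths_bp: Sequence[float]) -> List[float]:
    """Convert raw read counts to transcripts per million (TPM).

    counts     -- reads mapped to each transcript (hypothetical input)
    lengths_bp -- transcript lengths in base pairs, same order as counts
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    if any(length <= 0 for length in lengths_bp):
        raise ValueError("transcript lengths must be positive")

    # Step 1: reads per kilobase (corrects for transcript length).
    rpk = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]

    # Step 2: rescale so that the sample sums to one million.
    total = sum(rpk)
    if total == 0:
        # No reads at all: return zeros instead of dividing by zero.
        return [0.0] * len(rpk)
    return [value / total * 1_000_000 for value in rpk]
```

For example, `tpm([10, 20], [1000, 2000])` gives `[500000.0, 500000.0]`: the second transcript has twice the reads but is twice as long, so both receive the same TPM value.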


Subjects
RNA/genetics , Sequence Analysis, RNA/methods , Genome , Genomics , Humans , Transcriptome
19.
Proteomics ; 18(8): e1700379, 2018 04.
Article in English | MEDLINE | ID: mdl-29505696

ABSTRACT

Posttranslational histone tail modifications are known to play a role in leukemogenesis and are therapeutic targets. A global analysis of the level and patterns of expression of multiple histone-modifying proteins (HMP) in acute myeloid leukemia (AML) and the effect of different patterns of expression on outcome and prognosis has not been investigated in AML patients. Here we analyzed 20 HMP by reverse phase protein array (RPPA) in a cohort of 205 newly diagnosed AML patients. Protein levels were correlated with patient and disease characteristics, including survival and mutational state. We identified different protein clusters characterized by higher ("more on") or lower ("more off") expression of HMP, relative to normal CD34+ cells. A more on state of HMP was associated with poorer outcome compared to the normal-like and more off states. FLT3 mutated AML patients were significantly overrepresented in the more on state. DNA methylation related mutations showed no correlation with the different HMP states. In this study, we demonstrate for the first time that HMP form recurrent patterns of expression and that these significantly correlate with survival in newly diagnosed AML patients.


Subjects
Gene Expression Regulation, Leukemic , Histone Code , Leukemia, Myeloid, Acute/genetics , Adult , Aged , DNA Methylation , Female , Humans , Leukemia, Myeloid, Acute/diagnosis , Leukemia, Myeloid, Acute/metabolism , Male , Middle Aged , Prognosis , Protein Array Analysis , Protein Interaction Maps , Survival Analysis
20.
BMC Bioinformatics ; 19(1): 9, 2018 01 08.
Article in English | MEDLINE | ID: mdl-29310570

ABSTRACT

BACKGROUND: Cluster analysis is the most common unsupervised method for finding hidden groups in data. Clustering presents two main challenges: (1) finding the optimal number of clusters, and (2) removing "outliers" among the objects being clustered. Few clustering algorithms currently deal directly with the outlier problem. Furthermore, existing methods for identifying the number of clusters still have some drawbacks. Thus, there is a need for a better algorithm to tackle both challenges. RESULTS: We present a new approach, implemented in an R package called Thresher, to cluster objects in general datasets. Thresher combines ideas from principal component analysis, outlier filtering, and von Mises-Fisher mixture models in order to select the optimal number of clusters. We performed a large Monte Carlo simulation study to compare Thresher with other methods for detecting outliers and determining the number of clusters. We found that Thresher had good sensitivity and specificity for detecting and removing outliers. We also found that Thresher is the best method for estimating the optimal number of clusters when the number of objects being clustered is smaller than the number of variables used for clustering. Finally, we applied Thresher and eleven other methods to 25 sets of breast cancer data downloaded from the Gene Expression Omnibus; only Thresher consistently estimated the number of clusters to lie in the range of 4-7 that is consistent with the literature. CONCLUSIONS: Thresher is effective at automatically detecting and removing outliers. By thus cleaning the data, it produces better estimates of the optimal number of clusters when there are more variables than objects. When we applied Thresher to a variety of breast cancer datasets, it produced estimates that were both self-consistent and consistent with the literature. We expect Thresher to be useful for studying a wide variety of biological datasets.


Subjects
Cluster Analysis , Algorithms , Breast Neoplasms/metabolism , Breast Neoplasms/pathology , Female , Humans , Monte Carlo Method , Principal Component Analysis