Results 1 - 20 of 134
1.
PLoS One ; 19(6): e0300358, 2024.
Article in English | MEDLINE | ID: mdl-38848330

ABSTRACT

Clustering is an important task in biomedical science, and it is widely believed that different data sets are best clustered using different algorithms. When choosing between clustering algorithms on the same data set, researchers typically rely on global measures of quality, such as the mean silhouette width, and overlook the fine details of clustering. However, the silhouette width actually computes scores that describe how well each individual element is clustered. Inspired by this observation, we developed a novel clustering method, called SillyPutty. Unlike existing methods, SillyPutty uses the silhouette width for individual elements as a tool to optimize the mean silhouette width. This shift in perspective allows for a more granular evaluation of clustering quality, potentially addressing limitations in current methodologies. To test the SillyPutty algorithm, we first simulated a series of data sets using the Umpire R package and then used real-world data from The Cancer Genome Atlas. Using these data sets, we compared SillyPutty to several existing algorithms using multiple metrics (Silhouette Width, Adjusted Rand Index, Entropy, Normalized Within-group Sum of Square errors, and Perfect Classification Count). Our findings revealed that SillyPutty is a valid standalone clustering method, comparable in accuracy to the best existing methods. We also found that the combination of hierarchical clustering followed by SillyPutty has the best overall performance in terms of both accuracy and speed. Availability: The SillyPutty R package can be downloaded from the Comprehensive R Archive Network (CRAN).
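The idea of using per-element silhouette widths to improve the mean silhouette width can be illustrated with a small sketch. This is not the SillyPutty package's code: the abstract does not spell out the update rule, so the greedy step used here (move the element with the most negative silhouette width to the cluster it is closest to on average, and repeat) is an assumption, and all function names are hypothetical.

```python
def mean_distances(dist, labels, i):
    """Mean distance from element i to the other members of each cluster."""
    totals, counts = {}, {}
    for j, lab in enumerate(labels):
        if j == i:
            continue
        totals[lab] = totals.get(lab, 0.0) + dist[i][j]
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: totals[lab] / counts[lab] for lab in totals}


def silhouette_widths(dist, labels):
    """Silhouette width (b - a) / max(a, b) of every element."""
    widths = []
    for i, own in enumerate(labels):
        means = mean_distances(dist, labels, i)
        others = [m for lab, m in means.items() if lab != own]
        if own not in means or not others:
            # Singleton cluster, or only one cluster: width is taken as 0.
            widths.append(0.0)
            continue
        a, b = means[own], min(others)
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return widths


def reassign_until_stable(dist, labels, max_steps=1000):
    """Assumed greedy rule: repeatedly move the worst-clustered element."""
    labels = list(labels)
    for _ in range(max_steps):
        widths = silhouette_widths(dist, labels)
        worst = min(range(len(labels)), key=widths.__getitem__)
        if widths[worst] >= 0:
            break  # no element is closer to another cluster than to its own
        means = mean_distances(dist, labels, worst)
        del means[labels[worst]]  # consider only the other clusters
        labels[worst] = min(means, key=means.get)
    return labels


# Six points on a line; the third point starts in the wrong cluster.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(p - q) for q in points] for p in points]
print(reassign_until_stable(dist, [0, 0, 1, 1, 1, 1]))
```

The `max_steps` cap is a safety bound added for the sketch, since a greedy rule of this kind is not guaranteed here to converge on every input.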


Subjects
Algorithms , Cluster Analysis , Humans , Neoplasms/pathology , Software
2.
Ann Surg ; 2024 May 21.
Article in English | MEDLINE | ID: mdl-38771951

ABSTRACT

OBJECTIVE: We aimed to assess the levels of MDM2-DNA within extracellular vesicles (EVs) isolated from the serum of retroperitoneal liposarcoma (RLS) patients versus healthy donors, as well as within the same patients at the time of surgery versus post-operative surveillance visits, in order to determine whether EV-MDM2 may serve as a possible first-ever biomarker of liposarcoma recurrence. BACKGROUND: A hallmark of well-differentiated and de-differentiated (WD/DD) retroperitoneal liposarcoma is elevated MDM2 due to genome amplification, with recurrence rates of >50% even after complete resection. Imaging technologies frequently cannot resolve recurrent WD/DD-RLS versus postoperative scarring. Early detection of recurrent lesions, for which biomarkers are lacking, would guide surveillance and treatment decisions. METHODS: WD/DD-RLS serum samples were collected both at the time of surgery and during follow-up visits from 42 patients, along with sera from healthy donors (n=14). EVs were isolated, DNA purified and MDM2-DNA levels determined through q-PCR analysis. Non-parametric tests were employed to compare EV-MDM2 DNA levels from patients versus control group, as well as the time of surgery versus post-surgery conditions. RESULTS: EV-MDM2 levels were significantly higher in WD/DD-RLS than controls (P=0.00085). Moreover, EV-MDM2 levels were remarkably decreased in WD/DD-RLS patients after resection (P=0.00036), reaching values comparable to control group (P=0.124). During post-operative surveillance, significant increases in EV-MDM2 were observed in some patients, correlating with CT scan evidence of recurrent or persistent post-resection disease. CONCLUSIONS: Serum EV-MDM2 may serve as a potential biomarker of early recurrent or post-operatively persistent WD/DD-RLS, a disease currently lacking such determinants.

3.
bioRxiv ; 2023 Apr 21.
Article in English | MEDLINE | ID: mdl-37131792

ABSTRACT

Gene regulatory networks play a critical role in understanding cell states, gene expression, and biological processes. Here, we investigated the utility of transcription factors (TFs) and microRNAs (miRNAs) in creating a low-dimensional representation of cell states and predicting gene expression across 31 cancer types. We identified 28 clusters of miRNAs and 28 clusters of TFs, demonstrating that they can differentiate tissue of origin. Using a simple SVM classifier, we achieved an average accuracy of 92.8% in tissue classification. We also predicted the entire transcriptome using Tissue-Agnostic and Tissue-Aware models, with average R² values of 0.45 and 0.70, respectively. Our Tissue-Aware model, using 56 selected features, showed comparable predictive power to the widely used L1000 genes. However, the model's transportability was impacted by covariate shift, particularly inconsistent microRNA expression across datasets.

4.
Blood ; 142(1): 44-61, 2023 07 06.
Article in English | MEDLINE | ID: mdl-37023372

ABSTRACT

In chronic lymphocytic leukemia (CLL), epigenetic alterations are considered to centrally shape the transcriptional signatures that drive disease evolution and underlie its biological and clinical subsets. Characterizations of epigenetic regulators, particularly histone-modifying enzymes, are very rudimentary in CLL. In efforts to establish effectors of the CLL-associated oncogene T-cell leukemia 1A (TCL1A), we identified here the lysine-specific histone demethylase KDM1A to interact with the TCL1A protein in B cells in conjunction with an increased catalytic activity of KDM1A. We demonstrate that KDM1A is upregulated in malignant B cells. Elevated KDM1A and associated gene expression signatures correlated with aggressive disease features and adverse clinical outcomes in a large prospective CLL trial cohort. Genetic Kdm1a knockdown in Eµ-TCL1A mice reduced leukemic burden and prolonged animal survival, accompanied by upregulated p53 and proapoptotic pathways. Genetic KDM1A depletion also affected milieu components (T, stromal, and monocytic cells), resulting in significant reductions in their capacity to support CLL-cell survival and proliferation. Integrated analyses of differential global transcriptomes (RNA sequencing) and H3K4me3 marks (chromatin immunoprecipitation sequencing) in Eµ-TCL1A vs iKdm1aKD;Eµ-TCL1A mice (confirmed in human CLL) implicate KDM1A as an oncogenic transcriptional repressor in CLL which alters histone methylation patterns with pronounced effects on defined cell death and motility pathways. Finally, pharmacologic KDM1A inhibition altered H3K4/9 target methylation and revealed marked anti-B-cell leukemic synergisms. Overall, we established the pathogenic role and effector networks of KDM1A in CLL via tumor-cell intrinsic mechanisms and its impacts in cells of the microenvironment. Our data also provide rationales to further investigate therapeutic KDM1A targeting in CLL.


Subjects
B-Cell Chronic Lymphocytic Leukemia , Humans , Mice , Animals , B-Cell Chronic Lymphocytic Leukemia/drug therapy , Histones/metabolism , Lysine , Prospective Studies , Histone Demethylases/genetics , Histone Demethylases/metabolism , Tumor Microenvironment
5.
Cancer Discov ; 13(4): 910-927, 2023 04 03.
Article in English | MEDLINE | ID: mdl-36715691

ABSTRACT

The human papillomavirus (HPV) genome is integrated into host DNA in most HPV-positive cancers, but the consequences for chromosomal integrity are unknown. Continuous long-read sequencing of oropharyngeal cancers and cancer cell lines identified a previously undescribed form of structural variation, "heterocateny," characterized by diverse, interrelated, and repetitive patterns of concatemerized virus and host DNA segments within a cancer. Unique breakpoints shared across structural variants facilitated stepwise reconstruction of their evolution from a common molecular ancestor. This analysis revealed that virus and virus-host concatemers are unstable and, upon insertion into and excision from chromosomes, facilitate capture, amplification, and recombination of host DNA and chromosomal rearrangements. Evidence of heterocateny was detected in extrachromosomal and intrachromosomal DNA. These findings indicate that heterocateny is driven by the dynamic, aberrant replication and recombination of an oncogenic DNA virus, thereby extending known consequences of HPV integration to include promotion of intratumoral heterogeneity and clonal evolution. SIGNIFICANCE: Long-read sequencing of HPV-positive cancers revealed "heterocateny," a previously unreported form of genomic structural variation characterized by heterogeneous, interrelated, and repetitive genomic rearrangements within a tumor. Heterocateny is driven by unstable concatemerized HPV genomes, which facilitate capture, rearrangement, and amplification of host DNA, and promotes intratumoral heterogeneity and clonal evolution. See related commentary by McBride and White, p. 814. This article is highlighted in the In This Issue feature, p. 799.


Subjects
Oropharyngeal Neoplasms , Papillomavirus Infections , Humans , Human Papillomavirus , Gene Rearrangement , Clonal Evolution/genetics , Virus Integration/genetics , Papillomaviridae/genetics
6.
Comput Syst Oncol ; 2(2)2022 Jun.
Article in English | MEDLINE | ID: mdl-35966389

ABSTRACT

Cancer progression, including the development of intratumor heterogeneity, is inherently a spatial process. Mathematical models of tumor evolution may be a useful starting point for understanding the patterns of heterogeneity that can emerge in the presence of spatial growth. A commonly studied spatial growth model assumes that tumor cells occupy sites on a lattice and replicate into neighboring sites. Our R package SITH provides a convenient interface for exploring this model. Our efficient simulation algorithm allows users to generate 3D tumors with millions of cells in under a minute. For visualizing the distribution of mutations throughout the tumor, SITH provides interactive graphics and summary plots. Additionally, SITH can produce synthetic bulk and single-cell DNA-seq datasets by sampling from the simulated tumor. A streamlined API makes SITH a useful tool for investigating the relationship between spatial growth and intratumor heterogeneity. SITH is a part of CRAN and can be installed by running install.packages("SITH") from the R console. See https://CRAN.R-project.org/package=SITH for the user manual and package vignette.
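The lattice growth model described in this abstract (cells occupy lattice sites and replicate into neighboring sites) can be sketched in a few lines. The code below is only an illustration of that general model in Python; it does not reproduce SITH's algorithm or API, it ignores mutations and cell death, and the function name `grow_tumor` and its parameters are hypothetical.

```python
import random

# The six nearest neighbors of a site on a 3D cubic lattice.
NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def grow_tumor(n_cells, seed=0):
    """Grow a tumor of n_cells cells from a single cell at the origin.

    At each step a random existing cell tries to divide into a random
    neighboring site; the division succeeds only if that site is empty.
    Returns the occupied sites in order of birth.
    """
    rng = random.Random(seed)
    cells = [(0, 0, 0)]
    occupied = {(0, 0, 0)}
    while len(cells) < n_cells:
        x, y, z = rng.choice(cells)
        dx, dy, dz = rng.choice(NEIGHBORS)
        site = (x + dx, y + dy, z + dz)
        if site not in occupied:
            occupied.add(site)
            cells.append(site)
    return cells


tumor = grow_tumor(1000, seed=42)
print(len(tumor), "cells; first five sites:", tumor[:5])
```

Picking a dividing cell uniformly at random wastes attempts on interior cells whose neighbors are all occupied, so a sketch like this is far slower than what the abstract reports for SITH; tracking only surface cells would be one way to speed it up.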

7.
Genome Res ; 32(1): 55-70, 2022 01.
Article in English | MEDLINE | ID: mdl-34903527

ABSTRACT

Human papillomavirus (HPV) causes 5% of all cancers and frequently integrates into host chromosomes. The HPV oncoproteins E6 and E7 are necessary but insufficient for cancer formation, indicating that additional secondary genetic events are required. Here, we investigate potential oncogenic impacts of virus integration. Analysis of 105 HPV-positive oropharyngeal cancers by whole-genome sequencing detects virus integration in 77%, revealing five statistically significant sites of recurrent integration near genes that regulate epithelial stem cell maintenance (i.e., SOX2, TP63, FGFR, MYC) and immune evasion (i.e., CD274). Genomic copy number hyperamplification is enriched 16-fold near HPV integrants, and the extent of focal host genomic instability increases with their local density. The frequency of genes expressed at extreme outlier levels is increased 86-fold within ±150 kb of integrants. Across 95% of tumors with integration, host gene transcription is disrupted via intragenic integrants, chimeric transcription, outlier expression, gene breaking, and/or de novo expression of noncoding or imprinted genes. We conclude that virus integration can contribute to carcinogenesis in a large majority of HPV-positive oropharyngeal cancers by inducing extensive disruption of host genome structure and gene expression.


Subjects
Alphapapillomavirus , Viral Oncogene Proteins , Oropharyngeal Neoplasms , Alphapapillomavirus/metabolism , Carcinogenesis , Humans , Viral Oncogene Proteins/genetics , Oropharyngeal Neoplasms/genetics , Papillomaviridae/genetics , Papillomaviridae/metabolism , Papillomavirus E7 Proteins/genetics , Papillomavirus E7 Proteins/metabolism , Virus Integration/genetics
8.
J Biomed Inform ; 118: 103788, 2021 06.
Article in English | MEDLINE | ID: mdl-33862229

ABSTRACT

INTRODUCTION: Clustering analyses in clinical contexts hold promise to improve the understanding of patient phenotype and disease course in chronic and acute clinical medicine. However, work remains to ensure that solutions are rigorous, valid, and reproducible. In this paper, we evaluate best practices for dissimilarity matrix calculation and clustering on mixed-type, clinical data. METHODS: We simulate clinical data to represent problems in clinical trials, cohort studies, and EHR data, including single-type datasets (binary, continuous, categorical) and 4 data mixtures. We test 5 single distance metrics (Jaccard, Hamming, Gower, Manhattan, Euclidean) and 3 mixed distance metrics (DAISY, Supersom, and Mercator) with 3 clustering algorithms (hierarchical (HC), k-medoids, self-organizing maps (SOM)). We quantitatively and visually validate by Adjusted Rand Index (ARI) and silhouette width (SW). We applied our best methods to two real-world data sets: (1) 21 features collected on 247 patients with chronic lymphocytic leukemia, and (2) 40 features collected on 6000 patients admitted to an intensive care unit. RESULTS: HC outperformed k-medoids and SOM by ARI across data types. DAISY produced the highest mean ARI for mixed data types for all mixtures except unbalanced mixtures dominated by continuous data. Compared to other methods, DAISY with HC uncovered superior, separable clusters in both real-world data sets. DISCUSSION: Selecting an appropriate mixed-type metric allows the investigator to obtain optimal separation of patient clusters and get maximum use of their data. Superior metrics for mixed-type data handle multiple data types using multiple, type-focused distances. Better subclassification of disease opens avenues for targeted treatments, precision medicine, clinical decision support, and improved patient outcomes.
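The Adjusted Rand Index used for validation in this abstract compares a clustering against known group labels, correcting the raw pair-counting agreement for chance. A minimal pure-Python sketch of the standard formula follows; the function name is hypothetical, and returning 1.0 for the degenerate cases (fewer than two elements, or both partitions trivial) is a convention chosen for the sketch.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same elements."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # convention for fewer than two elements
    # Pairs placed together in both labelings (contingency table cells).
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together within each labeling separately.
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / total_pairs
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # both partitions are trivial and therefore identical
    return (together_both - expected) / (maximum - expected)


# Same grouping under different label names scores 1.
print(adjusted_rand_index([0, 0, 1, 1], ["b", "b", "a", "a"]))
```

Values near 0 indicate agreement no better than chance, and negative values indicate agreement worse than chance.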


Subjects
B-Cell Chronic Lymphocytic Leukemia , Algorithms , Cluster Analysis , Computer Simulation , Humans
9.
BMC Bioinformatics ; 22(1): 100, 2021 Mar 01.
Article in English | MEDLINE | ID: mdl-33648439

ABSTRACT

BACKGROUND: There have been many recent breakthroughs in processing and analyzing large-scale data sets in biomedical informatics. For example, the CytoGPS algorithm has enabled the use of text-based karyotypes by transforming them into a binary model. However, such advances are accompanied by new problems of data sparsity, heterogeneity, and noisiness that are magnified by the large-scale multidimensional nature of the data. To address these problems, we developed the Mercator R package, which processes and visualizes binary biomedical data. We use Mercator to address biomedical questions of cytogenetic patterns relating to lymphoid hematologic malignancies, which include a broad set of leukemias and lymphomas. Karyotype data are one of the most common forms of genetic data collected on lymphoid malignancies, because karyotyping is part of the standard of care in these cancers. RESULTS: In this paper we combine the analytic power of CytoGPS and Mercator to perform a large-scale multidimensional pattern recognition study on 22,741 karyotype samples in 47 different hematologic malignancies obtained from the public Mitelman database. CONCLUSION: Our findings indicate that Mercator was able to identify both known and novel cytogenetic patterns across different lymphoid malignancies, furthering our understanding of the genetics of these diseases.


Subjects
Hematologic Diseases , Karyotyping , Neoplasms , Chromosome Aberrations , Humans , Karyotype
10.
Genome Res ; 31(5): 747-761, 2021 05.
Article in English | MEDLINE | ID: mdl-33707228

ABSTRACT

Acute myeloid leukemia (AML) is a molecularly complex disease characterized by heterogeneous tumor genetic profiles and involving numerous pathogenic mechanisms and pathways. Integration of molecular data types across multiple patient cohorts may advance current genetic approaches for improved subclassification and understanding of the biology of the disease. Here, we analyzed genome-wide DNA methylation in 649 AML patients using Illumina arrays and identified a configuration of 13 subtypes (termed "epitypes") using unbiased clustering. Integration of genetic data revealed that most epitypes were associated with a certain recurrent mutation (or combination) in a majority of patients, yet other epitypes were largely independent. Epitypes showed developmental blockage at discrete stages of myeloid differentiation, revealing epitypes that retain arrested hematopoietic stem-cell-like phenotypes. Detailed analyses of DNA methylation patterns identified unique patterns of aberrant hyper- and hypomethylation among epitypes, with variable involvement of transcription factors influencing promoter, enhancer, and repressed regions. Patients in epitypes with stem-cell-like methylation features showed inferior overall survival along with up-regulated stem cell gene expression signatures. We further identified a DNA methylation signature involving STAT motifs associated with FLT3-ITD mutations. Finally, DNA methylation signatures were stable at relapse for the large majority of patients, and rare epitype switching accompanied loss of the dominant epitype mutations and reversion to stem-cell-like methylation patterns. These results show that DNA methylation-based classification integrates important molecular features of AML to reveal the diverse pathogenic and biological aspects of the disease.


Subjects
DNA Methylation , Acute Myeloid Leukemia , Humans , Acute Myeloid Leukemia/metabolism , Mutation , Genetic Promoter Regions
11.
Cancer Genet ; 248-249: 34-38, 2020 10.
Article in English | MEDLINE | ID: mdl-33059160

ABSTRACT

Karyotyping, the practice of visually examining and recording chromosomal abnormalities, is commonly used to diagnose diseases of genetic origin, including cancers. Karyotypes are recorded as text written in the International System for Human Cytogenetic Nomenclature (ISCN). Downstream analysis of karyotypes is conducted manually, due to the visual nature of analysis and the linguistic structure of the ISCN. The ISCN has not been computer-readable and, as such, prevents the full potential of these genomic data from being realized. In response, we developed CytoGPS, a platform to analyze large volumes of cytogenetic data using a Loss-Gain-Fusion model that converts the human-readable ISCN karyotypes into a machine-readable binary format. As proof of principle, we applied CytoGPS to cytogenetic data from the Mitelman Database of Chromosome Aberrations and Gene Fusions in Cancer, a National Cancer Institute hosted database of over 69,000 karyotypes of human cancers. Using the Jaccard coefficient to determine similarity between karyotypes structured as binary vectors, we were able to identify novel patterns from 4,968 Mitelman CML karyotypes, such as the co-occurrence of trisomy 19 and 21. The CytoGPS platform unlocks the potential for large-scale, comparative analysis of cytogenetic data. This methodological platform is freely available at CytoGPS.org.
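The Jaccard coefficient mentioned in this abstract measures the overlap between two binary vectors: the number of positions set in both, divided by the number of positions set in at least one. A minimal sketch follows; the function name and the toy vectors are hypothetical and do not reflect the actual Loss-Gain-Fusion encoding, and scoring two all-zero vectors as 1.0 is a convention chosen for the sketch.

```python
def jaccard_similarity(u, v):
    """Jaccard coefficient between two equal-length binary vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    both = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    # Two all-zero vectors share no events; treat them as identical here.
    return both / either if either else 1.0


# Toy binary vectors standing in for two encoded karyotypes.
sample_1 = [1, 1, 0, 0, 1]
sample_2 = [1, 0, 0, 0, 1]
print(jaccard_similarity(sample_1, sample_2))
```

Positions that are zero in both vectors do not affect the score, which suits sparse binary data where most possible events are absent from any one sample.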


Subjects
Algorithms , Chromosome Aberrations , Human Chromosomes , Factual Databases , Karyotyping/methods , BCR-ABL Positive Chronic Myelogenous Leukemia/genetics , BCR-ABL Positive Chronic Myelogenous Leukemia/pathology , Cytogenetic Analysis , Humans , Prognosis
12.
J Am Med Inform Assoc ; 27(7): 1019-1027, 2020 07 01.
Article in English | MEDLINE | ID: mdl-32483590

ABSTRACT

OBJECTIVE: Unsupervised machine learning approaches hold promise for large-scale clinical data. However, the heterogeneity of clinical data raises new methodological challenges in feature selection, choosing a distance metric that captures biological meaning, and visualization. We hypothesized that clustering could discover prognostic groups from patients with chronic lymphocytic leukemia, a disease that provides biological validation through well-understood outcomes. METHODS: To address this challenge, we applied k-medoids clustering with 10 distance metrics to 2 experiments ("A" and "B") with mixed clinical features collapsed to binary vectors and visualized with both multidimensional scaling and t-stochastic neighbor embedding. To assess prognostic utility, we performed survival analysis using a Cox proportional hazard model, log-rank test, and Kaplan-Meier curves. RESULTS: In both experiments, survival analysis revealed a statistically significant association between clusters and survival outcomes (A: overall survival, P = .0164; B: time from diagnosis to treatment, P = .0039). Multidimensional scaling separated clusters along a gradient mirroring the order of overall survival. Longer survival was associated with mutated immunoglobulin heavy-chain variable region gene (IGHV) status, absent ZAP70 expression, female sex, and younger age. CONCLUSIONS: This approach to mixed-type data handling and selection of distance metric captured well-understood, binary, prognostic markers in chronic lymphocytic leukemia (sex, IGHV mutation status, ZAP70 expression status) with high fidelity.


Subjects
Immunoglobulin Heavy Chains/genetics , B-Cell Chronic Lymphocytic Leukemia/mortality , Mutation , Unsupervised Machine Learning , ZAP-70 Protein-Tyrosine Kinase/metabolism , Adult , Aged , Aged 80 and over , Female , Humans , Kaplan-Meier Estimate , B-Cell Chronic Lymphocytic Leukemia/immunology , B-Cell Chronic Lymphocytic Leukemia/metabolism , Male , Middle Aged , Prognosis , Proportional Hazards Models
13.
JCI Insight ; 5(12)2020 06 18.
Article in English | MEDLINE | ID: mdl-32554930

ABSTRACT

Detecting, characterizing, and monitoring rare populations of cells can increase testing sensitivity, give insight into disease mechanism, and inform clinical decision making. One area that can benefit from increased resolution is management of cancers in clinical remission but with measurable residual disease (MRD) by multicolor FACS. Detecting and monitoring genomic clonal resistance to treatment in the setting of MRD is technically difficult and resource intensive due to the limited amounts of disease cells. Here, we describe limited-cell FACS sequencing (LC-FACSeq), a reproducible, highly sensitive method of characterizing clonal evolution in rare cells relevant to different types of acute and chronic leukemias. We demonstrate the utility of LC-FACSeq for broad multigene panels and its application for monitoring sequential acquisition of mutations conferring therapy resistance and clonal evolution in long-term ibrutinib treatment of patients with chronic lymphocytic leukemia. This technique is generalizable for monitoring of other blood and marrow infiltrating cancers.


Subjects
Adenine/analogs & derivatives , Clonal Evolution/immunology , B-Cell Chronic Lymphocytic Leukemia/drug therapy , Leukemia/drug therapy , Residual Neoplasm/drug therapy , Piperidines/therapeutic use , Adenine/therapeutic use , Clone Cells , Humans , Leukemia/immunology , Mutation/genetics , Residual Neoplasm/diagnosis
14.
J Comput Biol ; 27(7): 1157-1170, 2020 07.
Article in English | MEDLINE | ID: mdl-31794247

ABSTRACT

The transcriptome of a tumor contains detailed information about the disease. Although advances in sequencing technologies have generated larger data sets, there are still many questions about exactly how the transcriptome is regulated. One class of regulatory elements consists of microRNAs (or miRs), many of which are known to be associated with cancer. To better understand the relationships between miRs and cancers, we analyzed ∼9000 samples from 32 cancer types studied in The Cancer Genome Atlas. Our feature reduction algorithm found evidence for 21 biologically interpretable clusters of miRs, many of which were statistically associated with a specific type of cancer. Moreover, the clusters contain sufficient information to distinguish between most types of cancer. We then used linear models to measure, genome-wide, how much variation in gene expression could be explained by the 21 average expression values ("scores") of the clusters. Based on the ∼20,000 per-gene R² values, we found that (1) mean differences between tissues of origin explain about 36% of variation; (2) the 21 miR cluster scores explain about 30% of the variation; and (3) combining tissue type with the miR scores explained about 56% of the total genome-wide variation in gene expression. Our analysis of poorly explained genes shows that they are enriched for olfactory receptor processes, sensory perception, and nervous system processing, which are necessary to receive and interpret signals from outside the organism. Therefore, it is reasonable for those genes to be always active and not get downregulated by miRs. In contrast, highly explained genes are characterized by genes translating to proteins necessary for transport, plasma membrane, or metabolic processes that are heavily regulated processes inside the cell. Other genetic regulatory elements such as transcription factors and methylation might help explain some of the remaining variation in gene expression.


Subjects
Neoplastic Gene Expression Regulation , MicroRNAs/genetics , Neoplasms/genetics , Female , Humans , Machine Learning , Multigene Family
15.
Lancet Oncol ; 20(11): 1576-1586, 2019 11.
Article in English | MEDLINE | ID: mdl-31582354

ABSTRACT

BACKGROUND: Fludarabine, cyclophosphamide, and rituximab (FCR) has become a gold-standard chemoimmunotherapy regimen for patients with chronic lymphocytic leukaemia. However, the question remains of how to treat treatment-naive patients with IGHV-unmutated chronic lymphocytic leukaemia. We therefore aimed to develop and validate a gene expression signature to identify which of these patients are likely to achieve durable remissions with FCR chemoimmunotherapy. METHODS: We did a retrospective cohort study in two cohorts of treatment-naive patients (aged ≥18 years) with chronic lymphocytic leukaemia. The discovery and training cohort consisted of peripheral blood samples collected from patients treated at the University of Texas MD Anderson Cancer Center (Houston, TX, USA), who fulfilled the diagnostic criteria of the International Workshop on Chronic Lymphocytic Leukemia, had received at least three cycles of FCR chemoimmunotherapy, and had been treated between Oct 10, 2000, and Oct 26, 2006 (ie, the MDACC cohort). We did transcriptional profiling on samples obtained from the MDACC cohort to identify genes associated with time to progression. We did univariate Cox proportional hazards analyses and used significant genes to cluster IGHV-unmutated samples into two groups (intermediate prognosis and unfavourable prognosis). After using cross-validation to assess robustness, we applied the Lasso method to standardise the gene expression values to find a minimum gene signature. We validated this signature in an external cohort of treatment-naive patients with IGHV-unmutated chronic lymphocytic leukaemia enrolled on the CLL8 trial of the German Chronic Lymphocytic Leukaemia Study Group who were treated between July 21, 2003, and April 4, 2006 (ie, the CLL8 cohort). FINDINGS: The MDACC cohort consisted of 101 patients and the CLL8 cohort consisted of 109 patients. 
Using the MDACC cohort, we identified and developed a 17-gene expression signature that distinguished IGHV-unmutated patients who were likely to achieve a long-term remission following front-line FCR chemoimmunotherapy from those who might benefit from alternative front-line regimens (hazard ratio 3·83, 95% CI 1·94-7·59; p<0·0001). We validated this gene signature in the CLL8 cohort; patients with an unfavourable prognosis versus those with an intermediate prognosis had a cause-specific hazard ratio of 1·90 (95% CI 1·18-3·06; p=0·008). Median time to progression was 39 months (IQR 22-69) for those with an unfavourable prognosis compared with 59 months (28-84) for those with an intermediate prognosis. INTERPRETATION: We have developed a robust, reproducible 17-gene signature that identifies a subset of treatment-naive patients with IGHV-unmutated chronic lymphocytic leukaemia who might substantially benefit from treatment with FCR chemoimmunotherapy. We recommend testing the value of this gene signature in a prospective study that compares FCR treatment with newer alternative therapies as part of a randomised clinical trial. FUNDING: Chronic Lymphocytic Leukaemia Global Research Foundation and the National Institutes of Health/National Cancer Institute.


Subjects
Immunological Antineoplastic Agents/administration & dosage , Antineoplastic Combined Chemotherapy Protocols/administration & dosage , Cyclophosphamide/administration & dosage , Gene Expression Profiling , B-Cell Chronic Lymphocytic Leukemia/drug therapy , Rituximab/administration & dosage , Transcriptome , Vidarabine/analogs & derivatives , Aged , Immunological Antineoplastic Agents/adverse effects , Antineoplastic Combined Chemotherapy Protocols/adverse effects , Cyclophosphamide/adverse effects , Disease Progression , Female , Germany , Humans , B-Cell Chronic Lymphocytic Leukemia/genetics , B-Cell Chronic Lymphocytic Leukemia/immunology , B-Cell Chronic Lymphocytic Leukemia/pathology , Male , Middle Aged , Predictive Value of Tests , Remission Induction , Risk Assessment , Risk Factors , Rituximab/adverse effects , Texas , Time Factors , Treatment Outcome , Vidarabine/administration & dosage , Vidarabine/adverse effects
16.
Blood ; 134(8): 688-698, 2019 08 22.
Article in English | MEDLINE | ID: mdl-31292113

ABSTRACT

Alterations in global DNA methylation patterns are a major hallmark of cancer and represent attractive biomarkers for personalized risk stratification. Chronic lymphocytic leukemia (CLL) risk stratification studies typically focus on time to first treatment (TTFT), time to progression (TTP) after treatment, and overall survival (OS). Whereas TTFT risk stratification remains similar over time, TTP and OS have changed dramatically with the introduction of targeted therapies, such as the Bruton tyrosine kinase inhibitor ibrutinib. We have shown that genome-wide DNA methylation patterns in CLL are strongly associated with phenotypic differentiation and patient outcomes. Here, we developed a novel assay, termed methylation-iPLEX (Me-iPLEX), for high-throughput quantification of targeted panels of single cytosine guanine dinucleotides from multiple independent loci. Me-iPLEX was used to classify CLL samples into 1 of 3 known epigenetic subtypes (epitypes). We examined the impact of epitype in 1286 CLL patients from 4 independent cohorts representing a comprehensive view of CLL disease course and therapies. We found that epitype significantly predicted TTFT and OS among newly diagnosed CLL patients. Additionally, epitype predicted TTP and OS with 2 common CLL therapies: chemoimmunotherapy and ibrutinib. Epitype retained significance after stratifying by biologically related biomarkers, immunoglobulin heavy chain mutational status, and ZAP70 expression, as well as other common prognostic markers. Furthermore, among several biological traits enriched between epitypes, we found highly biased immunogenetic features, including IGLV3-21 usage in the poorly characterized intermediate-programmed CLL epitype. In summary, Me-iPLEX is an elegant method to assess epigenetic signatures, including robust classification of CLL epitypes that independently stratify patient risk at diagnosis and time of treatment.


Subjects
DNA Methylation , B-Cell Chronic Lymphocytic Leukemia/genetics , Tumor Biomarkers/genetics , Disease Progression , Genetic Epigenesis , Genetic Loci , Genetic Testing , Humans , B-Cell Chronic Lymphocytic Leukemia/diagnosis , Prognosis
17.
Bioinformatics ; 35(24): 5365-5366, 2019 12 15.
Article in English | MEDLINE | ID: mdl-31263896

ABSTRACT

SUMMARY: Karyotype data are the most common form of genetic data that is regularly used clinically. They are collected as part of the standard of care in many diseases, particularly in pediatric and cancer medicine contexts. Karyotypes are represented in a unique text-based format, with a syntax defined by the International System for human Cytogenetic Nomenclature (ISCN). While human-readable, ISCN is not intrinsically machine-readable. This limitation has prevented the full use of complex karyotype data in discovery science use cases. To enhance the utility and value of karyotype data, we developed a tool named CytoGPS. CytoGPS first parses ISCN karyotypes into a machine-readable format. It then converts the ISCN karyotype into a binary Loss-Gain-Fusion (LGF) model, which represents all cytogenetic abnormalities as combinations of loss, gain, or fusion events, in a format that is analyzable using modern computational methods. Such data is then made available for comprehensive 'downstream' analyses that previously were not feasible. AVAILABILITY AND IMPLEMENTATION: Freely available at http://cytogps.org.


Subjects
Chromosome Aberrations , Karyotype , Humans , Karyotyping , Neoplasms , Software
19.
EBioMedicine ; 44: 126-137, 2019 Jun.
Article in English | MEDLINE | ID: mdl-31105032

ABSTRACT

BACKGROUND: Galectin 3 (LGALS3) gene expression is associated with poor survival in acute myeloid leukemia (AML) but the prognostic impact of LGALS3 protein expression in AML is unknown. LGALS3 supports diverse survival pathways including RAS-mediated cascades, protein expression and stability of anti-apoptotic BCL2 family members, and activation of proliferative pathways including those mediated by beta-catenin. CD74 is a positive regulator of CD44 and CXCR4 signaling and this molecule may be critical for AML stem cell function. At present, the role of LGALS3 and CD74 in AML is unclear. In this study, we examine protein expression of LGALS3 and CD74 by reverse phase protein analysis (RPPA) and identify new protein networks associated with these molecules. In addition, we determine prognostic potential of LGALS3, CD74, and their protein networks for clinical correlates in AML patients. METHODS: RPPA was used to determine relative expression of LGALS3, CD74, and 229 other proteins in 231 fresh AML patient samples, of which 205 were from patients who were treated and evaluable for outcome. Pearson correlation analysis was performed to identify proteins associated with LGALS3 and CD74. Progeny clustering was performed to generate protein networks. String analysis was performed to determine protein:protein interactions in networks and to perform gene ontology analysis. The Kaplan-Meier method was used to generate survival curves. FINDINGS: LGALS3 is highest in monocytic AML patients and those with elevated LGALS3 had significantly shorter remission duration compared to patients with lower LGALS3 levels (median 21.9 vs 51.3 weeks, p = 0.016). Pearson correlation of LGALS3 with 230 other proteins identifies a distinct set of 37 proteins positively correlated with LGALS3 expression levels with a high representation of proteins involved in AKT and ERK signaling pathways. Thirty-one proteins were negatively correlated with LGALS3 including an AKT phosphatase. Pearson correlation of proteins associated with CD74 identified 12 proteins negatively correlated with CD74 and 16 proteins that are positively correlated with CD74. The CD74 network revealed strong association with CD44 signaling and a high representation of apoptosis regulators. Progeny clustering was used to build protein networks based on LGALS3 and CD74 associated proteins. A strong relationship of the LGALS3 network with the CD74 network was identified. For AML patients with both the LGALS3 and CD74 protein cluster active, median overall survival was only 24.3 weeks, median remission duration was 17.8 weeks, and no patient survived beyond one year. INTERPRETATION: The findings from this study identify for the first time protein networks associated with LGALS3 and CD74 in AML. Each network features unique pathway characteristics. The data also suggest that the LGALS3 network and the CD74 network each support AML cell survival and the two networks may cooperate in a novel high-risk AML population. FUND: Leukemia Lymphoma Society provided funds to SMK for RPPA study of AML patient population. Texas Leukemia provided funds to PPR and SMK to study CD74 and LGALS3 expression in AML patients using RPPA. No payment was involved in the production of this manuscript.


Subjects
Tumor Biomarkers , CD27 Ligand/metabolism , Galectin 3/metabolism , Acute Myeloid Leukemia/metabolism , Acute Myeloid Leukemia/mortality , Adult , Aged , Blood Proteins , CD27 Ligand/genetics , Tumor Cell Line , Computational Biology/methods , Female , Galectin 3/genetics , Galectins , Gene Regulatory Networks , Humans , Kaplan-Meier Estimate , Acute Myeloid Leukemia/genetics , Acute Myeloid Leukemia/pathology , Male , Middle Aged , Prognosis , Protein Interaction Mapping , Protein Interaction Maps , Signal Transduction
20.
Bioinformatics ; 35(17): 2924-2931, 2019 09 01.
Article in English | MEDLINE | ID: mdl-30689715

ABSTRACT

MOTIVATION: Clonal heterogeneity is common in many types of cancer, including chronic lymphocytic leukemia (CLL). Previous research suggests that the presence of multiple distinct cancer clones is associated with clinical outcome. Detection of clonal heterogeneity from high-throughput data, such as sequencing or single nucleotide polymorphism (SNP) array data, is important for gaining a better understanding of cancer and may improve prediction of clinical outcome or response to treatment. Here, we present a new method, CloneSeeker, for inferring clonal heterogeneity from sequencing data, SNP array data, or both. RESULTS: We generated simulated SNP array and sequencing data and applied CloneSeeker along with two other methods. We demonstrate that CloneSeeker is more accurate than existing algorithms at determining the number of clones, distribution of cancer cells among clones, and mutation and/or copy numbers belonging to each clone. Next, we applied CloneSeeker to SNP array data from samples of 258 previously untreated CLL patients to gain a better understanding of the characteristics of CLL tumors and to elucidate the relationship between clonal heterogeneity and clinical outcome. We found that a significant majority of CLL patients appear to have multiple clones distinguished by copy number alterations alone. We also found that the presence of multiple clones corresponded with significantly worse survival among CLL patients. These findings may prove useful for improving the accuracy of prognosis and design of treatment strategies. AVAILABILITY AND IMPLEMENTATION: Code available on R-Forge: https://r-forge.r-project.org/projects/CloneSeeker/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
B-Cell Chronic Lymphocytic Leukemia , Single Nucleotide Polymorphism , Whole Genome Sequencing , Algorithms , DNA Copy Number Variations , Female , High-Throughput Nucleotide Sequencing , Humans , Male