Results 1 - 20 of 116
1.
Nature ; 2024 Sep 04.
Article in English | MEDLINE | ID: mdl-39232164

ABSTRACT

Histopathology image evaluation is indispensable for cancer diagnoses and subtype classification. Standard artificial intelligence methods for histopathology image analyses have focused on optimizing specialized models for each diagnostic task [1,2]. Although such methods have achieved some success, they often have limited generalizability to images generated by different digitization protocols or samples collected from different populations [3]. Here, to address this challenge, we devised the Clinical Histopathology Imaging Evaluation Foundation (CHIEF) model, a general-purpose weakly supervised machine learning framework to extract pathology imaging features for systematic cancer evaluation. CHIEF leverages two complementary pretraining methods to extract diverse pathology representations: unsupervised pretraining for tile-level feature identification and weakly supervised pretraining for whole-slide pattern recognition. We developed CHIEF using 60,530 whole-slide images spanning 19 anatomical sites. Through pretraining on 44 terabytes of high-resolution pathology imaging datasets, CHIEF extracted microscopic representations useful for cancer cell detection, tumour origin identification, molecular profile characterization and prognostic prediction. We successfully validated CHIEF using 19,491 whole-slide images from 32 independent slide sets collected from 24 hospitals and cohorts internationally. Overall, CHIEF outperformed the state-of-the-art deep learning methods by up to 36.1%, showing its ability to address domain shifts observed in samples from diverse populations and processed by different slide preparation methods. CHIEF provides a generalizable foundation for efficient digital pathology evaluation for patients with cancer.

2.
Genome Res ; 34(7): 1027-1035, 2024 Aug 20.
Article in English | MEDLINE | ID: mdl-38951026

ABSTRACT

mRNA-based vaccines and therapeutics are gaining popularity and usage across a wide range of conditions. One of the critical issues when designing such mRNAs is sequence optimization. Even small proteins or peptides can be encoded by an enormously large number of mRNAs. The actual mRNA sequence can have a large impact on several properties, including expression, stability, immunogenicity, and more. To enable the selection of an optimal sequence, we developed CodonBERT, a large language model (LLM) for mRNAs. Unlike prior models, CodonBERT uses codons as inputs, which enables it to learn better representations. CodonBERT was trained using more than 10 million mRNA sequences from a diverse set of organisms. The resulting model captures important biological concepts. CodonBERT can also be extended to perform prediction tasks for various mRNA properties. CodonBERT outperforms previous mRNA prediction methods, including on a new flu vaccine data set.


Subject(s)
RNA, Messenger , mRNA Vaccines , Humans , RNA, Messenger/genetics , Codon , Algorithms
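
The codon-level input and the size of the mRNA design space mentioned in the CodonBERT abstract can be illustrated with a small sketch. This is not CodonBERT's tokenizer or code; `tokenize_codons` and `count_synonymous_mrnas` are hypothetical helper names, and the only domain knowledge assumed is the standard genetic code's codon degeneracy per amino acid.

```python
import math

# Number of codons encoding each amino acid in the standard genetic code.
CODON_DEGENERACY = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "F": 2, "Y": 2, "H": 2, "Q": 2, "N": 2,
    "K": 2, "D": 2, "E": 2, "C": 2,
    "M": 1, "W": 1,
}


def tokenize_codons(mrna: str) -> list[str]:
    """Split a coding sequence into non-overlapping 3-nucleotide codons."""
    # Accept DNA-style input (T) as a convenience; this is an assumption.
    seq = mrna.upper().replace("T", "U")
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    if set(seq) - set("ACGU"):
        raise ValueError("sequence contains characters other than A, C, G, U")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def count_synonymous_mrnas(protein: str) -> int:
    """Count coding sequences (stop codon excluded) that encode `protein`."""
    return math.prod(CODON_DEGENERACY[aa] for aa in protein.upper())
```

For example, `tokenize_codons("AUGGCCUAA")` yields three codon tokens, and `count_synonymous_mrnas("MKL")` multiplies 1, 2 and 6 possible codons. The count grows multiplicatively with peptide length, which is why sequence selection is treated as an optimization problem.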
3.
Nucleic Acids Res ; 51(D1): D1432-D1445, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36400569

ABSTRACT

The toxic effects of compounds on the environment, humans, and other organisms have been a major focus of many research areas, including drug discovery and ecological research. Identifying potential toxicity in the early stage of compound/drug discovery is critical. The rapid development of computational methods for evaluating various toxicity categories has increased the need for comprehensive and system-level collection of toxicological data, associated attributes, and benchmarks. To contribute toward this goal, we proposed TOXRIC (https://toxric.bioinforai.tech/), a database with comprehensive toxicological data, standardized attribute data, practical benchmarks, informative visualization of molecular representations, and an intuitive function interface. The data stored in TOXRIC comprise 113 372 compounds, 13 toxicity categories, 1474 toxicity endpoints covering in vivo/in vitro endpoints and 39 feature types, covering structural, target, transcriptome, metabolic data, and other descriptors. All the curated datasets of endpoints and features can be retrieved, downloaded and directly used as output or input to machine learning (ML)-based prediction models. In addition to serving as a data repository, TOXRIC also provides visualization of benchmarks and molecular representations for all endpoint datasets. Based on these results, researchers can better understand and select optimal feature types, molecular representations, and baseline algorithms for each endpoint prediction task. We believe that the rich information on compound toxicology, ML-ready datasets, benchmarks and molecular representation distribution can greatly facilitate toxicological investigations, interpretation of toxicological mechanisms, compound/drug discovery and the development of computational methods.


Subject(s)
Databases, Factual , Toxicology , Humans , Benchmarking , Toxicology/methods , Software
4.
Semin Cancer Biol ; 84: 310-328, 2022 09.
Article in English | MEDLINE | ID: mdl-33290844

ABSTRACT

Radiological imaging is an integral component of cancer care, including diagnosis, staging, and treatment response monitoring. It contains rich information about tumor phenotypes that are governed not only by cancer cell-intrinsic biological processes but also by the tumor microenvironment, such as the composition and function of tumor-infiltrating immune cells. By analyzing the radiological scans using a quantitative radiomics approach, robust relations between specific imaging and molecular phenotypes can be established. Indeed, a number of studies have demonstrated the feasibility of radiogenomics for predicting intrinsic molecular subtypes and gene expression signatures in breast cancer based on MRI. In parallel, promising results have been shown for inferring the amount of tumor-infiltrating lymphocytes, a key factor for the efficacy of cancer immunotherapy, from standard-of-care radiological images. Compared with the biopsy-based approach, radiogenomics offers a unique avenue to profile the molecular makeup of the tumor and immune microenvironment as well as its evolution in a noninvasive and holistic manner through longitudinal imaging scans. Here, we provide a systematic review of state-of-the-art radiogenomics studies in the era of immunotherapy and discuss emerging paradigms and opportunities in AI and deep learning approaches. These technical advances are expected to transform the radiogenomics field, leading to the discovery of reliable imaging biomarkers. This will pave the way for their clinical translation to guide precision cancer therapy.


Subject(s)
Breast Neoplasms , Tumor Microenvironment , Biomarkers, Tumor/genetics , Breast Neoplasms/drug therapy , Female , Genomics/methods , Humans , Immunotherapy , Lymphocytes, Tumor-Infiltrating , Tumor Microenvironment/genetics
5.
Dis Colon Rectum ; 66(12): e1195-e1206, 2023 12 01.
Article in English | MEDLINE | ID: mdl-37682775

ABSTRACT

BACKGROUND: Accurate prediction of response to neoadjuvant chemoradiotherapy is critical for subsequent treatment decisions for patients with locally advanced rectal cancer. OBJECTIVE: To develop and validate a deep learning model based on the comparison of paired MRI before and after neoadjuvant chemoradiotherapy to predict pathological complete response. DESIGN: By capturing the changes from MRI before and after neoadjuvant chemoradiotherapy in 638 patients, we trained a multitask deep learning model for response prediction (DeepRP-RC) that also allowed simultaneous segmentation. Its performance was independently tested in an internal and 3 external validation sets, and its prognostic value was also evaluated. SETTINGS: Multicenter study. PATIENTS: We retrospectively enrolled 1201 patients diagnosed with locally advanced rectal cancer who underwent neoadjuvant chemoradiotherapy before total mesorectal excision. Patients had been treated at 1 of 4 hospitals in China between January 2013 and December 2020. MAIN OUTCOME MEASURES: The main outcome was the accuracy of predicting pathological complete response, measured as the area under receiver operating curve for the training and validation data sets. RESULTS: DeepRP-RC achieved high performance in predicting pathological complete response after neoadjuvant chemoradiotherapy, with area under the curve values of 0.969 (0.942-0.996), 0.946 (0.915-0.977), 0.943 (0.888-0.998), and 0.919 (0.840-0.997) for the internal and 3 external validation sets, respectively. DeepRP-RC performed similarly well in the subgroups defined by receipt of radiotherapy, tumor location, T/N stages before and after neoadjuvant chemoradiotherapy, and age. Compared with experienced radiologists, the model showed substantially higher performance in pathological complete response prediction. The model was also highly accurate in identifying the patients with poor response. 
Furthermore, the model was significantly associated with disease-free survival independent of clinicopathological variables. LIMITATIONS: This study was limited by its retrospective design and absence of multiethnic data. CONCLUSIONS: DeepRP-RC could be an accurate preoperative tool for pathological complete response prediction in rectal cancer after neoadjuvant chemoradiotherapy. The record also carries a Spanish translation of this abstract (translated by Dr. Felipe Bellolio) under a title that reads in English: "An AI system based on longitudinal magnetic resonance imaging to predict pathological complete response after neoadjuvant therapy in rectal cancer: a multicenter validation study".


Subject(s)
Neoadjuvant Therapy , Rectal Neoplasms , Humans , Retrospective Studies , Artificial Intelligence , Chemoradiotherapy/adverse effects , Rectal Neoplasms/therapy , Rectal Neoplasms/drug therapy , Magnetic Resonance Imaging , Neoplasm Staging
6.
Br J Cancer ; 126(6): 899-906, 2022 04.
Article in English | MEDLINE | ID: mdl-34921229

ABSTRACT

BACKGROUND: B lymphocytes have multifaceted functions in the tumour microenvironment, and their prognostic role in human cancers is controversial. Here we aimed to identify tumour microenvironmental factors that influence the prognostic effects of B cells. METHODS: We conducted a gene expression analysis of 3585 patients for whom the clinical outcome information was available. We further investigated the clinical relevance for predicting immunotherapy response. RESULTS: We identified a novel B cell-related gene (BCR) signature consisting of nine cytokine signalling genes whose high expression could diminish the beneficial impact of B cells on patient prognosis. In triple-negative breast cancer, higher B cell abundance was associated with favourable survival only when the BCR signature was low (HR = 0.68, p = 0.0046). By contrast, B cell abundance had no impact on prognosis when the BCR signature was high (HR = 0.93, p = 0.80). This pattern was consistently observed across multiple cancer types including lung, colorectal, and melanoma. Further, the BCR signature predicted response to immune checkpoint blockade in metastatic melanoma and compared favourably with the established markers. CONCLUSIONS: The prognostic impact of tumour-infiltrating B cells depends on the status of cytokine signalling genes, which together could predict response to cancer immunotherapy.


Subject(s)
Immunotherapy , Melanoma , B-Lymphocytes , Humans , Melanoma/genetics , Prognosis , Tumor Microenvironment/genetics
7.
Brief Bioinform ; 21(4): 1397-1410, 2020 07 15.
Article in English | MEDLINE | ID: mdl-31504171

ABSTRACT

Essential genes are those whose loss of function compromises organism viability or results in profound loss of fitness. Recent gene-editing technologies have provided new opportunities to characterize essential genes. Here, we present an integrated analysis that comprehensively and systematically elucidates the genetic and regulatory characteristics of human essential genes. First, we found that essential genes act as 'hubs' in protein-protein interaction networks, chromatin structure and epigenetic modification. Second, essential genes represent conserved biological processes across species, although gene essentiality changes differently among species. Third, essential genes are important for cell development due to their discriminate transcription activity in embryo development and oncogenesis. In addition, we developed an interactive web server, the Human Essential Genes Interactive Analysis Platform (http://sysomics.com/HEGIAP/), which integrates abundant analytical tools to enable global, multidimensional interpretation of gene essentiality. Our study provides new insights that improve the understanding of human essential genes.


Subject(s)
Genes, Essential , Internet , Embryonic Development/genetics , Epigenesis, Genetic , Humans , Transcription, Genetic
8.
Ann Surg ; 274(6): e1153-e1161, 2021 12 01.
Article in English | MEDLINE | ID: mdl-31913871

ABSTRACT

OBJECTIVE: We aimed to develop a deep learning-based signature to predict prognosis and benefit from adjuvant chemotherapy using preoperative computed tomography (CT) images. BACKGROUND: Current staging methods do not accurately predict the risk of disease relapse for patients with gastric cancer. METHODS: We proposed a novel deep neural network (S-net) to construct a CT signature for predicting disease-free survival (DFS) and overall survival in a training cohort of 457 patients, and independently tested it in an external validation cohort of 1158 patients. An integrated nomogram was constructed to demonstrate the added value of the imaging signature to established clinicopathologic factors for individualized survival prediction. Prediction performance was assessed with respect to discrimination, calibration, and clinical usefulness. RESULTS: The DeLIS was associated with DFS and overall survival in the overall validation cohort and among subgroups defined by clinicopathologic variables, and remained an independent prognostic factor in multivariable analysis (P < 0.001). Integrating the imaging signature and clinicopathologic factors improved prediction performance, with C-indices of 0.792-0.802 versus 0.719-0.724 and a net reclassification improvement of 10.1%-28.3%. Adjuvant chemotherapy was associated with improved DFS in stage II patients with high-DeLIS [hazard ratio = 0.362 (95% confidence interval 0.149-0.882)] and stage III patients with high- and intermediate-DeLIS [hazard ratio = 0.611 (0.442-0.843); 0.633 (0.433-0.925)]. On the other hand, adjuvant chemotherapy did not affect survival for patients with low-DeLIS, suggesting a predictive effect (P for interaction = 0.048 and 0.016 for DFS in stage II and III disease, respectively). CONCLUSIONS: The proposed imaging signature improved prognostic prediction and could help identify patients most likely to benefit from adjuvant chemotherapy in gastric cancer.


Subject(s)
Deep Learning , Stomach Neoplasms/diagnostic imaging , Stomach Neoplasms/drug therapy , Tomography, X-Ray Computed , Aged , Chemotherapy, Adjuvant , Disease-Free Survival , Female , Humans , Male , Middle Aged , Neoplasm Staging , Nomograms , Predictive Value of Tests , Prognosis , Retrospective Studies , Stomach Neoplasms/pathology
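
The C-indices reported in the abstract above measure discrimination for survival predictions. A minimal sketch of Harrell's concordance index follows, assuming right-censored data and a risk score where higher means worse prognosis. It is a plain O(n²) illustration, not the study's analysis code, and the function and argument names are hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times  : follow-up time per patient
    events : 1/True if the event was observed, 0/False if censored
    risks  : predicted risk score (higher = earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            # Comparable: patient i had an observed event strictly before
            # patient j's event or censoring time.
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so a move from roughly 0.72 to roughly 0.80 means a larger share of comparable patient pairs is ranked correctly.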
9.
Brief Bioinform ; 20(4): 1524-1541, 2019 07 19.
Article in English | MEDLINE | ID: mdl-29617727

ABSTRACT

The Cancer Genome Atlas (TCGA) is a publicly funded project that aims to catalog and discover major cancer-causing genomic alterations with the goal of creating a comprehensive 'atlas' of cancer genomic profiles. The availability of this genome-wide information provides an unprecedented opportunity to expand our knowledge of tumourigenesis. Computational analytics and mining are frequently used as effective tools for exploring this byzantine series of biological and biomedical data. However, some of the more advanced computational tools are often difficult to understand or use, thereby limiting their application by scientists who do not have a strong computational background. Hence, it is of great importance to build user-friendly interfaces that allow both computational scientists and life scientists without a computational background to gain greater biological and medical insights. To that end, this survey was designed to systematically present available Web-based tools and facilitate the use of TCGA data for cancer research.


Subject(s)
Databases, Genetic/statistics & numerical data , Neoplasms/genetics , Biomarkers, Tumor/genetics , Computational Biology , Gene Expression Profiling/statistics & numerical data , Genomics/statistics & numerical data , Humans , Internet , Mutation , Neoplasms/classification , Software , Surveys and Questionnaires , Survival Analysis , User-Computer Interface
10.
PLoS Comput Biol ; 16(2): e1007287, 2020 02.
Article in English | MEDLINE | ID: mdl-32084131

ABSTRACT

Hi-C is commonly used to study three-dimensional genome organization. However, due to the high sequencing cost and technical constraints, the resolution of most Hi-C datasets is coarse, resulting in a loss of information and biological interpretability. Here we develop DeepHiC, a generative adversarial network, to predict high-resolution Hi-C contact maps from low-coverage sequencing data. We demonstrated that DeepHiC is capable of reproducing high-resolution Hi-C data from as few as 1% downsampled reads. Empowered by adversarial training, our method can restore fine-grained details similar to those in high-resolution Hi-C matrices, boosting accuracy in chromatin loop identification and TAD detection, and it outperforms state-of-the-art methods in prediction accuracy. Finally, application of DeepHiC to Hi-C data on mouse embryonic development can facilitate chromatin loop detection. We develop a web-based tool (DeepHiC, http://sysomics.com/deephic) that allows researchers to enhance their own Hi-C data with just a few clicks.


Subject(s)
Genome , Models, Biological , Chromatin/chemistry , Datasets as Topic , Sequence Analysis/methods
11.
Bioinformatics ; 35(20): 3931-3936, 2019 10 15.
Article in English | MEDLINE | ID: mdl-30860576

ABSTRACT

MOTIVATION: During development of the mammalian embryo, histone modification H3K4me3 plays an important role in regulating gene expression and exhibits extensive reprograming on the parental genomes. In addition to these dramatic epigenetic changes, certain unchanging regulatory elements are also essential for embryonic development. RESULTS: Using large-scale H3K4me3 chromatin immunoprecipitation sequencing data, we identified a form of H3K4me3 that was present during all eight stages of the mouse embryo before implantation. This 'stable H3K4me3' was highly accessible and much longer than normal H3K4me3. Moreover, most of the stable H3K4me3 was in the promoter region and was enriched in higher chromatin architecture. Using in-depth analysis, we demonstrated that stable H3K4me3 was related to higher gene expression levels and transcriptional initiation during embryonic development. Furthermore, stable H3K4me3 was much more active in blood tumor cells than in normal blood cells, suggesting a potential mechanism of cancer progression. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Embryonic Development , Animals , Chromatin Immunoprecipitation , Epigenesis, Genetic , Histones , Methylation , Mice
12.
RNA Biol ; 16(8): 1010-1021, 2019 08.
Article in English | MEDLINE | ID: mdl-31046554

ABSTRACT

The study of cancer prognosis serves as an important part of cancer research. Large-scale cancer studies have identified numerous genes and microRNAs (miRNAs) associated with prognosis. These informative genes and miRNAs represent potential biomarkers to predict survival and to elucidate the molecular mechanism of tumour progression. MiRNAs and transcription factors (TFs) can work cooperatively as essential mediators of gene expression, and their dysregulation affects cancer prognosis. A panoramic view of cancer prognosis at the system level, considering the co-regulation roles of miRNA and TF, remains elusive. Here, we establish 12 prognosis-related miRNA-TF co-regulatory networks. The characteristics of prognostic target genes and their regulators in the network are depicted. Although the target genes and co-regulatory patterns exhibit cancer-specific properties, some miRNAs and TFs are highly conserved across cancers. We illustrate and interpret the roles of these conserved regulators by building a model associated with cancer hallmarks, functional enrichment analysis, network community detection, and exhaustive literature research. The elaborated system-level prognostic miRNA-TF co-regulation landscape, including the highlighted roles of conserved regulators, provides novel and powerful insights into further biological and medical discoveries.


Subject(s)
Gene Regulatory Networks/genetics , MicroRNAs/genetics , Neoplasms/genetics , Transcription, Genetic , Gene Expression Regulation, Neoplastic/genetics , Humans , Neoplasms/pathology , Prognosis , Transcription Factors/genetics
13.
Breast Cancer Res ; 20(1): 101, 2018 09 03.
Article in English | MEDLINE | ID: mdl-30176944

ABSTRACT

BACKGROUND: We sought to investigate associations between dynamic contrast-enhanced (DCE) magnetic resonance imaging (MRI) features and tumor-infiltrating lymphocytes (TILs) in breast cancer, as well as to study if MRI features are complementary to molecular markers of TILs. METHODS: In this retrospective study, we extracted 17 computational DCE-MRI features to characterize tumor and parenchyma in The Cancer Genome Atlas cohort (n = 126). The percentage of stromal TILs was evaluated on H&E-stained histological whole-tumor sections. We first evaluated associations between individual imaging features and TILs. Multiple-hypothesis testing was corrected by the Benjamini-Hochberg method using false discovery rate (FDR). Second, we implemented LASSO (least absolute shrinkage and selection operator) and linear regression nested with tenfold cross-validation to develop an imaging signature for TILs. Next, we built a composite prediction model for TILs by combining imaging signature with molecular features. Finally, we tested the prognostic significance of the TIL model in an independent cohort (I-SPY 1; n = 106). RESULTS: Four imaging features were significantly associated with TILs (P < 0.05 and FDR < 0.2), including tumor volume, cluster shade of signal enhancement ratio (SER), mean SER of tumor-surrounding background parenchymal enhancement (BPE), and proportion of BPE. Among molecular and clinicopathological factors, only cytolytic score was correlated with TILs (ρ = 0.51; 95% CI, 0.36-0.63; P = 1.6E-9). An imaging signature that linearly combines five features showed correlation with TILs (ρ = 0.40; 95% CI, 0.24-0.54; P = 4.2E-6). A composite model combining the imaging signature and cytolytic score improved correlation with TILs (ρ = 0.62; 95% CI, 0.50-0.72; P = 9.7E-15). The composite model successfully distinguished low vs high, intermediate vs high, and low vs intermediate TIL groups, with AUCs of 0.94, 0.76, and 0.79, respectively. 
During validation (I-SPY 1), the predicted TILs from the imaging signature separated patients into two groups with distinct recurrence-free survival (RFS), with log-rank P = 0.042 among triple-negative breast cancer (TNBC). The composite model further improved stratification of patients with distinct RFS (log-rank P = 0.0008), where TNBC with no/minimal TILs had a worse prognosis. CONCLUSIONS: Specific MRI features of tumor and parenchyma are associated with TILs in breast cancer, and imaging may play an important role in the evaluation of TILs by providing key complementary information in equivocal cases or situations that are prone to sampling bias.


Subject(s)
Biomarkers, Tumor/metabolism , Breast Neoplasms/diagnostic imaging , Lymphocytes, Tumor-Infiltrating/metabolism , Magnetic Resonance Imaging/methods , Models, Biological , Adult , Aged , Aged, 80 and over , Biomarkers, Tumor/immunology , Breast/cytology , Breast/diagnostic imaging , Breast/immunology , Breast/pathology , Breast Neoplasms/immunology , Breast Neoplasms/mortality , Breast Neoplasms/pathology , Cohort Studies , Disease-Free Survival , Female , Humans , Image Processing, Computer-Assisted/methods , Kaplan-Meier Estimate , Linear Models , Lymphocytes, Tumor-Infiltrating/immunology , Mastectomy , Middle Aged , Predictive Value of Tests , Prognosis
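
The multiple-testing step named in the abstract above (Benjamini-Hochberg correction, with features retained at P < 0.05 and FDR < 0.2) can be sketched as follows. This is a generic textbook implementation of the step-up adjustment, not the authors' code; the function names are hypothetical and the p-values used to exercise it would be the caller's own.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_features(pvalues, p_cutoff=0.05, fdr_cutoff=0.2):
    """Indices passing both thresholds (defaults mirror the abstract)."""
    qvalues = benjamini_hochberg(pvalues)
    return [i for i, (p, q) in enumerate(zip(pvalues, qvalues))
            if p < p_cutoff and q < fdr_cutoff]
```

Applying both a nominal P threshold and an FDR threshold, as the abstract does, keeps only features that pass both, which makes the selection at least as strict as either threshold alone.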
14.
Radiology ; 288(1): 26-35, 2018 07.
Article in English | MEDLINE | ID: mdl-29714680

ABSTRACT

Purpose To characterize intratumoral spatial heterogeneity at perfusion magnetic resonance (MR) imaging and investigate intratumoral heterogeneity as a predictor of recurrence-free survival (RFS) in breast cancer. Materials and Methods In this retrospective study, a discovery cohort (n = 60) and a multicenter validation cohort (n = 186) were analyzed. Each tumor was divided into multiple spatially segregated, phenotypically consistent subregions on the basis of perfusion MR imaging parameters. The authors first defined a multiregional spatial interaction (MSI) matrix and then, based on this matrix, calculated 22 image features. A network strategy was used to integrate all image features and classify patients into different risk groups. The prognostic value of imaging-based stratification was evaluated in relation to clinical-pathologic factors with multivariable Cox regression. Results Three intratumoral subregions with high, intermediate, and low MR perfusion were identified and showed high consistency between the two cohorts. Patients in both cohorts were stratified according to network analysis of multiregional image features regarding RFS (log-rank test, P = .002 for both). Aggressive tumors were associated with a larger volume of the poorly perfused subregion as well as interaction between poorly and moderately perfused subregions and surrounding parenchyma. At multivariable analysis, the proposed MSI-based marker was independently associated with RFS (hazard ratio: 3.42; 95% confidence interval: 1.55, 7.57; P = .002) adjusting for age, estrogen receptor (ER) status, progesterone receptor status, human epidermal growth factor receptor type 2 (HER2) status, tumor volume, and pathologic complete response (pCR). Furthermore, imaging helped stratify patients for RFS within the ER-positive and HER2-positive subgroups (log-rank test, P = .007 and .004) and among patients without pCR after neoadjuvant chemotherapy (log-rank test, P = .003). 
Conclusion Breast cancer consists of multiple spatially distinct subregions. Imaging heterogeneity is an independent prognostic factor beyond traditional risk predictors.


Subject(s)
Breast Neoplasms/diagnostic imaging , Breast Neoplasms/drug therapy , Magnetic Resonance Angiography/methods , Neoadjuvant Therapy/methods , Adult , Aged , Breast/diagnostic imaging , Chemotherapy, Adjuvant , Disease-Free Survival , Female , Humans , Middle Aged , Reproducibility of Results , Retrospective Studies , Treatment Outcome
15.
Eur Radiol ; 28(2): 736-746, 2018 Feb.
Article in English | MEDLINE | ID: mdl-28786009

ABSTRACT

PURPOSE: To evaluate the prognostic value and molecular basis of a CT-derived pleural contact index (PCI) in early stage non-small cell lung cancer (NSCLC). EXPERIMENTAL DESIGN: We retrospectively analysed seven NSCLC cohorts. A quantitative PCI was defined on CT as the length of tumour-pleura interface normalised by tumour diameter. We evaluated the prognostic value of PCI in a discovery cohort (n = 117) and tested in an external cohort (n = 88) of stage I NSCLC. Additionally, we identified the molecular correlates and built a gene expression-based surrogate of PCI using another cohort of 89 patients. To further evaluate the prognostic relevance, we used four datasets totalling 775 stage I patients with publicly available gene expression data and linked survival information. RESULTS: At a cutoff of 0.8, PCI stratified patients for overall survival in both imaging cohorts (log-rank p = 0.0076, 0.0304). Extracellular matrix (ECM) remodelling was enriched among genes associated with PCI (p = 0.0003). The genomic surrogate of PCI remained an independent predictor of overall survival in the gene expression cohorts (hazard ratio: 1.46, p = 0.0007) adjusting for age, gender, and tumour stage. CONCLUSIONS: CT-derived pleural contact index is associated with ECM remodelling and may serve as a noninvasive prognostic marker in early stage NSCLC. KEY POINTS: • A quantitative pleural contact index (PCI) predicts survival in early stage NSCLC. • PCI is associated with extracellular matrix organisation and collagen catabolic process. • A multi-gene surrogate of PCI is an independent predictor of survival. • PCI can be used to noninvasively identify patients with poor prognosis.


Subject(s)
Carcinoma, Non-Small-Cell Lung/diagnostic imaging , Carcinoma, Non-Small-Cell Lung/pathology , Lung Neoplasms/diagnostic imaging , Lung Neoplasms/pathology , Pleura/diagnostic imaging , Pleura/pathology , Tomography, X-Ray Computed , Adult , Aged , Aged, 80 and over , Female , Humans , Male , Middle Aged , Neoplasm Staging , Prognosis , Proportional Hazards Models , Retrospective Studies
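
The pleural contact index defined in the abstract above (tumour-pleura interface length normalised by tumour diameter, dichotomised at 0.8) reduces to a simple ratio. The sketch below assumes both measurements are already available in the same unit. How they are measured on CT, and whether a value exactly at the cutoff counts as high, is not stated in the abstract, so the `>=` comparison and the function names are assumptions.

```python
def pleural_contact_index(interface_length_mm: float,
                          tumour_diameter_mm: float) -> float:
    """PCI = tumour-pleura interface length / tumour diameter (same units)."""
    if tumour_diameter_mm <= 0:
        raise ValueError("tumour diameter must be positive")
    if interface_length_mm < 0:
        raise ValueError("interface length cannot be negative")
    return interface_length_mm / tumour_diameter_mm


def is_high_pci(pci: float, cutoff: float = 0.8) -> bool:
    """Dichotomise PCI at the abstract's cutoff (inclusive bound assumed)."""
    return pci >= cutoff
```

Because the index is a ratio of two lengths, it is unitless and does not scale with tumour size on its own: a 30 mm tumour with 12 mm of pleural contact has the same PCI as a 15 mm tumour with 6 mm of contact.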
16.
BMC Bioinformatics ; 18(1): 99, 2017 Feb 10.
Article in English | MEDLINE | ID: mdl-28187708

ABSTRACT

BACKGROUND: Conventional differential gene expression analysis by methods such as Student's t-test, SAM, and Empirical Bayes often searches for statistically significant genes without considering the interactions among them. Network-based approaches provide a natural way to study these interactions and to investigate the rewiring interactions in disease versus control groups. In this paper, we apply the weighted graphical LASSO (wgLASSO) algorithm to integrate a data-driven network model with prior biological knowledge (i.e., protein-protein interactions) for biological network inference. We propose a novel differentially weighted graphical LASSO (dwgLASSO) algorithm that builds group-specific networks and performs network-based differential gene expression analysis to select biomarker candidates by considering their topological differences between the groups. RESULTS: Through simulation, we showed that wgLASSO can achieve better performance in building biologically relevant networks than purely data-driven models (e.g., neighbor selection, graphical LASSO), even when only a moderate level of information is available as prior biological knowledge. We evaluated the performance of dwgLASSO for survival time prediction using two microarray breast cancer datasets previously reported by Bild et al. and van de Vijver et al. Compared with the top 10 significant genes selected by a conventional differential gene expression analysis method, the top 10 significant genes selected by dwgLASSO in the dataset from Bild et al. led to a significantly improved survival time prediction in the independent dataset from van de Vijver et al. Among the 10 genes selected by dwgLASSO, UBE2S, SALL2, XBP1 and KIAA0922 have been confirmed by literature survey to be highly relevant in breast cancer biomarker discovery study. 
Additionally, we tested dwgLASSO on TCGA RNA-seq data acquired from patients with hepatocellular carcinoma (HCC) on tumor samples and their corresponding non-tumorous liver tissues. Improved sensitivity, specificity and area under curve (AUC) were observed when comparing dwgLASSO with a conventional differential gene expression analysis method. CONCLUSIONS: The proposed network-based differential gene expression analysis algorithm dwgLASSO can achieve better performance than conventional differential gene expression analysis methods by integrating information at both gene expression and network topology levels. The incorporation of prior biological knowledge can lead to the identification of biologically meaningful genes in cancer biomarker studies.
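
To make the idea of "topological differences between the groups" concrete, here is a minimal Python sketch that compares node connectivity across two group-specific networks. It is not the dwgLASSO scoring rule, which the abstract does not specify: the networks are assumed to be already estimated and given as edge sets, the degree-difference score is a simple stand-in, and the gene labels G1 to G4 are hypothetical.

```python
def degree(edges, node):
    """Number of edges in the network that touch the given node."""
    return sum(1 for a, b in edges if node in (a, b))


def connectivity_change(edges_group_a, edges_group_b):
    """Absolute difference in node degree between two group-specific networks."""
    nodes = {n for edge in (edges_group_a | edges_group_b) for n in edge}
    return {
        n: abs(degree(edges_group_a, n) - degree(edges_group_b, n))
        for n in nodes
    }


def rank_by_rewiring(edges_group_a, edges_group_b):
    """Nodes ordered from most to least rewired (ties broken by name)."""
    scores = connectivity_change(edges_group_a, edges_group_b)
    return sorted(scores, key=lambda n: (-scores[n], n))


if __name__ == "__main__":
    # Hypothetical networks; each edge is an unordered pair stored as a tuple.
    disease = {("G1", "G2"), ("G1", "G3"), ("G1", "G4")}
    control = {("G1", "G2"), ("G3", "G4")}
    print(rank_by_rewiring(disease, control))
```

In practice such a topology score would be combined with expression-level evidence, as the abstract describes; only the network comparison step is sketched.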


Subject(s)
Algorithms , Gene Expression Profiling/methods , Gene Regulatory Networks/genetics , Area Under Curve , Biomarkers/metabolism , Breast Neoplasms/diagnosis , Breast Neoplasms/genetics , Breast Neoplasms/pathology , Carcinoma, Hepatocellular/diagnosis , Carcinoma, Hepatocellular/genetics , Carcinoma, Hepatocellular/pathology , Female , Humans , Liver Neoplasms/diagnosis , Liver Neoplasms/genetics , Liver Neoplasms/pathology , RNA/chemistry , RNA/isolation & purification , RNA/metabolism , ROC Curve , Sequence Analysis, RNA
17.
Radiology ; 285(2): 401-413, 2017 11.
Article in English | MEDLINE | ID: mdl-28708462

ABSTRACT

Purpose To identify the molecular basis of quantitative imaging characteristics of tumor-adjacent parenchyma at dynamic contrast material-enhanced magnetic resonance (MR) imaging and to evaluate their prognostic value in breast cancer. Materials and Methods In this institutional review board-approved, HIPAA-compliant study, 10 quantitative imaging features depicting tumor-adjacent parenchymal enhancement patterns were extracted and screened for prognostic features in a discovery cohort of 60 patients. By using data from The Cancer Genome Atlas (TCGA), a radiogenomic map for the tumor-adjacent parenchymal tissue was created and molecular pathways associated with prognostic parenchymal imaging features were identified. Furthermore, a multigene signature of the parenchymal imaging feature was built in a training cohort (n = 126), and its prognostic relevance was evaluated in two independent cohorts (n = 879 and 159). Results One image feature measuring heterogeneity (ie, information measure of correlation) was significantly associated with prognosis (false-discovery rate < 0.1), and at a cutoff of 0.57 stratified patients into two groups with different recurrence-free survival rates (log-rank P = .024). The tumor necrosis factor signaling pathway was identified as the top enriched pathway (hypergeometric P < .0001) among genes associated with the image feature. A 73-gene signature based on the tumor profiles in TCGA achieved good association with the tumor-adjacent parenchymal image feature (R² = 0.873), which stratified patients into groups regarding recurrence-free survival (log-rank P = .029) and overall survival (log-rank P = .042) in an independent TCGA cohort. The prognostic value was confirmed in another independent cohort (Gene Expression Omnibus GSE1456), with log-rank P = .00058 for recurrence-free survival and log-rank P = .0026 for overall survival. 
Conclusion Heterogeneous enhancement patterns of tumor-adjacent parenchyma at MR imaging are associated with the tumor necrosis factor signaling pathway and poor survival in breast cancer. © RSNA, 2017 Online supplemental material is available for this article.
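
The pathway enrichment reported above uses a hypergeometric P value. As a minimal sketch of how such a one-sided test is commonly computed, the Python function below gives the probability of seeing at least the observed overlap between a selected gene list and a pathway when genes are drawn at random. The parameter names and the small counts in the example are hypothetical; the abstract does not report the gene universe or list sizes used in the study.

```python
from math import comb


def hypergeom_enrichment_p(population, pathway_size, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `population` genes, of which `pathway_size` belong to the pathway."""
    if not (0 <= overlap <= min(pathway_size, selected) <= population):
        raise ValueError("inconsistent counts")
    total = comb(population, selected)
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, selected - k)
        for k in range(overlap, min(pathway_size, selected) + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 10 genes overall, 4 in the pathway,
    # 3 selected genes of which 2 fall in the pathway.
    print(hypergeom_enrichment_p(10, 4, 3, 2))
```

When many pathways are tested, the resulting P values would normally be corrected for multiple testing before a "top enriched pathway" is reported.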


Subject(s)
Breast Neoplasms/diagnostic imaging , Breast Neoplasms/mortality , Image Interpretation, Computer-Assisted , Magnetic Resonance Imaging , Parenchymal Tissue/diagnostic imaging , Adult , Aged , Biomarkers, Tumor/analysis , Biomarkers, Tumor/genetics , Biomarkers, Tumor/metabolism , Breast Neoplasms/chemistry , Breast Neoplasms/genetics , Female , Gene Expression Profiling , Genomics , Humans , Kaplan-Meier Estimate , Middle Aged , Molecular Imaging , Parenchymal Tissue/chemistry , Prognosis , Signal Transduction
18.
J Magn Reson Imaging ; 46(4): 1017-1027, 2017 10.
Article in English | MEDLINE | ID: mdl-28177554

ABSTRACT

PURPOSE: To determine whether dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) characteristics of the breast tumor and background parenchyma can distinguish molecular subtypes (ie, luminal A/B or basal) of breast cancer. MATERIALS AND METHODS: In all, 84 patients from one institution and 126 patients from The Cancer Genome Atlas (TCGA) were used for discovery and external validation, respectively. Thirty-five quantitative image features were extracted from DCE-MRI (1.5 or 3T) including morphology, texture, and volumetric features, which capture both tumor and background parenchymal enhancement (BPE) characteristics. Multiple testing was corrected using the Benjamini-Hochberg method to control the false-discovery rate (FDR). Sparse logistic regression models were built using the discovery cohort to distinguish each of the three studied molecular subtypes versus the rest, and the models were evaluated in the validation cohort. RESULTS: On univariate analysis in discovery and validation cohorts, two features characterizing tumor and two characterizing BPE were statistically significant in separating luminal A versus nonluminal A cancers; two features characterizing tumor were statistically significant for separating luminal B; one feature characterizing tumor and one characterizing BPE reached statistical significance for distinguishing basal (Wilcoxon P < 0.05, FDR < 0.25). In discovery and validation cohorts, multivariate logistic regression models achieved an area under the receiver operating characteristic curve (AUC) of 0.71 and 0.73 for luminal A cancer, 0.67 and 0.69 for luminal B cancer, and 0.66 and 0.79 for basal cancer, respectively. CONCLUSION: DCE-MRI characteristics of breast cancer and BPE may potentially be used to distinguish among molecular subtypes of breast cancer. LEVEL OF EVIDENCE: 3 Technical Efficacy: Stage 3 J. Magn. Reson. Imaging 2017;46:1017-1027.
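
The abstract above controls the false-discovery rate with the Benjamini-Hochberg method and uses an FDR threshold of 0.25. A minimal Python sketch of the standard step-up adjustment follows; the P values in the example are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_features(p_values, fdr=0.25):
    """Indices of features whose adjusted P value is below the FDR threshold."""
    return [i for i, q in enumerate(benjamini_hochberg(p_values)) if q < fdr]


if __name__ == "__main__":
    # Hypothetical univariate P values for four image features.
    p = [0.01, 0.04, 0.03, 0.20]
    print(benjamini_hochberg(p))
    print(significant_features(p))
```

With 35 features tested per subtype comparison, an adjustment of this kind is what keeps the expected share of false positives among the reported features below the chosen threshold.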


Subject(s)
Breast Neoplasms/diagnostic imaging , Contrast Media , Image Enhancement/methods , Magnetic Resonance Imaging/methods , Adult , Aged , Aged, 80 and over , Breast/diagnostic imaging , Diagnosis, Differential , Female , Humans , Middle Aged , Phenotype , Reproducibility of Results
19.
Eur Radiol ; 27(9): 3583-3592, 2017 Sep.
Article in English | MEDLINE | ID: mdl-28168370

ABSTRACT

OBJECTIVE: To develop and validate a volume-based, quantitative imaging marker by integrating multi-parametric MR images for predicting glioblastoma survival, and to investigate its relationship and synergy with molecular characteristics. METHODS: We retrospectively analysed 108 patients with primary glioblastoma. The discovery cohort consisted of 62 patients from The Cancer Genome Atlas (TCGA). Another 46 patients, comprising 30 from TCGA and 16 from an internal cohort, were used for independent validation. Based on integrated analyses of T1-weighted contrast-enhanced (T1-c) and diffusion-weighted MR images, we identified an intratumoral subregion with both high T1-c and low ADC, and accordingly defined a high-risk volume (HRV). We evaluated its prognostic value and biological significance with genomic data. RESULTS: On both discovery and validation cohorts, HRV predicted overall survival (OS) (concordance index: 0.642 and 0.653, P < 0.001 and P = 0.038, respectively). HRV stratified patients within the proneural molecular subtype (log-rank P = 0.040, hazard ratio = 2.787). We observed different OS among patients depending on their MGMT methylation status and HRV (log-rank P = 0.011). Patients with unmethylated MGMT and high HRV had significantly shorter survival (median survival: 9.3 vs. 18.4 months, log-rank P = 0.002). CONCLUSION: Volume of the high-risk intratumoral subregion identified on multi-parametric MRI predicts glioblastoma survival, and may provide complementary value to genomic information. KEY POINTS: • High-risk volume (HRV) defined on multi-parametric MRI predicted GBM survival. • The proneural molecular subtype tended to harbour smaller HRV than other subtypes. • Patients with unmethylated MGMT and high HRV had significantly shorter survival. • HRV complements genomic information in predicting GBM survival.
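
The concordance index quoted above measures how often a marker ranks patients in the same order as their survival times. Below is a minimal Python sketch of Harrell's concordance index for right-censored data, assuming a higher score means higher risk; pairs with tied survival times are simply skipped, which is a simplification. The example values are hypothetical and unrelated to the study cohorts.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs in which the patient
    with the shorter observed survival has the higher risk score."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an event
            # strictly before patient j's last observed time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Hypothetical follow-up (months), event flags (1 = death, 0 = censored)
    # and a marker such as a volume-based risk score.
    times = [5, 10, 15, 20]
    events = [1, 1, 0, 1]
    scores = [0.9, 0.4, 0.6, 0.1]
    print(concordance_index(times, events, scores))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which puts reported values near 0.65 in context.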


Subject(s)
Brain Neoplasms/diagnostic imaging , Glioblastoma/diagnostic imaging , Adult , Aged , Brain Neoplasms/genetics , Brain Neoplasms/pathology , DNA Methylation , DNA Modification Methylases/genetics , DNA Repair Enzymes/genetics , DNA, Neoplasm/genetics , Female , Glioblastoma/genetics , Glioblastoma/pathology , Humans , Image Interpretation, Computer-Assisted/methods , Kaplan-Meier Estimate , Magnetic Resonance Imaging/methods , Male , Middle Aged , Prognosis , Proportional Hazards Models , Reproducibility of Results , Retrospective Studies , Tumor Suppressor Proteins/genetics
20.
Methods ; 111: 12-20, 2016 12 01.
Article in English | MEDLINE | ID: mdl-27592383

ABSTRACT

Differential expression (DE) analysis is commonly used to identify biomarker candidates that have significant changes in their expression levels between distinct biological groups. One drawback of DE analysis is that it only considers changes at the single-biomolecule level. Recently, differential network (DN) analysis has become popular due to its capability to measure changes at the biomolecular-pair level. In DN analysis, the network is typically built based on correlation and biomarker candidates are selected by investigating the network topology. However, correlation tends to generate over-complicated networks, and selecting biomarker candidates purely on the basis of network topology ignores changes at the single-biomolecule level. In this paper, we propose a novel approach, INDEED, that builds a sparse differential network based on partial correlation and integrates DE and DN analyses for biomarker discovery. We applied this approach to real proteomic and glycomic data generated by liquid chromatography coupled with mass spectrometry for a hepatocellular carcinoma (HCC) biomarker discovery study. For each omic data type, we used one dataset to select biomarker candidates, built a disease classifier and evaluated the performance of the classifier on an independent dataset. The biomarker candidates selected by INDEED were more reproducible across independent datasets, and led to a higher classification accuracy in predicting HCC cases and cirrhotic controls compared with those selected by separate DE and DN analyses. INDEED also identified some candidates previously reported to be relevant to HCC, such as intercellular adhesion molecule 2 (ICAM2) and c4b-binding protein alpha chain (C4BPA), which were missed by both DE and DN analyses. In addition, we applied INDEED for survival time prediction based on transcriptomic data acquired by analysis of samples from breast cancer patients. 
We selected biomarker candidates and built a regression model for survival time prediction based on a gene expression dataset and patients' survival records. We evaluated the performance of the regression model on an independent dataset. Compared with the biomarker candidates selected by DE and DN analyses, those selected through INDEED led to more accurate survival time prediction.
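
The abstract above contrasts plain correlation with partial correlation for building a differential network. As a minimal Python sketch, the functions below derive partial correlations from a precision (inverse covariance) matrix using the standard identity rho_ij = -p_ij / sqrt(p_ii * p_jj), then flag pairs whose partial correlation differs between two groups. The precision matrices are assumed to be already estimated, and the fixed threshold is a hypothetical simplification; the abstract does not describe how INDEED estimates sparsity or scores edges.

```python
import math


def partial_correlations(precision):
    """Partial correlation matrix from a precision (inverse covariance) matrix."""
    n = len(precision)
    out = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                scale = math.sqrt(precision[i][i] * precision[j][j])
                out[i][j] = -precision[i][j] / scale
    return out


def differential_edges(precision_a, precision_b, threshold):
    """Pairs (i, j), i < j, whose partial correlation changes by more than
    `threshold` between the two groups (threshold is a hypothetical choice)."""
    pc_a = partial_correlations(precision_a)
    pc_b = partial_correlations(precision_b)
    n = len(pc_a)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if abs(pc_a[i][j] - pc_b[i][j]) > threshold
    ]


if __name__ == "__main__":
    # Hypothetical 3-molecule precision matrices for two groups.
    cases = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
    controls = [[2.0, 0.0, 0.0], [0.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
    print(differential_edges(cases, controls, threshold=0.25))
```

Because a zero entry in the precision matrix means two molecules are conditionally independent given the rest, networks built this way tend to be sparser than those built from pairwise correlation.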


Subject(s)
Antigens, CD/genetics , Biomarkers, Tumor/genetics , Cell Adhesion Molecules/genetics , Complement C4b-Binding Protein/genetics , Proteomics/methods , Carcinoma, Hepatocellular/genetics , Carcinoma, Hepatocellular/metabolism , Chromatography, Liquid , Gene Expression Regulation, Neoplastic , Glycomics/methods , Humans , Liver Neoplasms/genetics , Liver Neoplasms/metabolism , Mass Spectrometry , Transcriptome/genetics