Results 1 - 20 of 57
1.
PLoS Comput Biol ; 20(1): e1011754, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38198519

ABSTRACT

Cancer models are instrumental as a substitute for human studies and to expedite basic, translational, and clinical cancer research. For a given cancer type, a wide selection of models, such as cell lines, patient-derived xenografts, organoids and genetically modified murine models, is often available to researchers. However, how to quantify their congruence to human tumors and how to select the most appropriate cancer model remain largely unsolved issues. Here, we present Congruence Analysis and Selection of CAncer Models (CASCAM), a statistical and machine learning framework for authenticating and selecting the most representative cancer models in a pathway-specific manner using transcriptomic data. CASCAM provides harmonization between human tumor and cancer model omics data, systematic congruence quantification, and pathway-based topological visualization to determine the most appropriate cancer model selection. The systems approach is presented using the invasive lobular breast carcinoma (ILC) subtype, where it suggests CAMA1 followed by UACC3133 as the most representative cell lines for ILC research. Two additional case studies, for triple negative breast cancer (TNBC) and for patient-derived xenografts/organoids (PDX/PDO), are further investigated. CASCAM is generalizable to any cancer subtype and will authenticate cancer models for faithful non-human preclinical research towards precision medicine.


Subject(s)
Precision Medicine , Triple Negative Breast Neoplasms , Humans , Animals , Mice , Xenograft Model Antitumor Assays , Triple Negative Breast Neoplasms/genetics , Triple Negative Breast Neoplasms/pathology , Gene Expression Profiling , Systems Analysis
2.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34929734

ABSTRACT

Since their selection as the method of the year in 2013, single-cell technologies have become mature enough to provide answers to complex research questions. With the growth of single-cell profiling technologies, there has also been a significant increase in the data collected from single-cell profiling, resulting in computational challenges in processing these massive and complicated datasets. To address these challenges, deep learning (DL) is positioned as a competitive alternative to traditional machine learning approaches for single-cell analyses. Here, we survey a total of 25 DL algorithms and their applicability to specific steps in the single-cell RNA-seq (scRNA-seq) processing pipeline. Specifically, we establish a unified mathematical representation of variational autoencoder, autoencoder, generative adversarial network and supervised DL models, compare the training strategies and loss functions for these models, and relate the loss functions of these models to specific objectives of the data processing step. Such a presentation will allow readers to choose suitable algorithms for their particular objective at each step in the pipeline. We envision that this survey will serve as an important information portal for learning the application of DL for scRNA-seq analysis and inspire innovative uses of DL to address a broader range of new challenges in emerging multi-omics and spatial single-cell sequencing.


Subject(s)
Deep Learning , RNA-Seq/methods , Single-Cell Analysis/methods , Algorithms , Cluster Analysis , Gene Expression Profiling/methods , Humans , Machine Learning , Sequence Analysis, RNA/methods , Transcriptome
3.
J Med Virol ; 95(8): e29009, 2023 08.
Article in English | MEDLINE | ID: mdl-37563850

ABSTRACT

Despite intensive studies during the last 3 years, the pathology and underlying molecular mechanism of coronavirus disease 2019 (COVID-19) remain poorly defined. In this study, we investigated the spatial single-cell molecular and cellular features of postmortem COVID-19 lung tissues using in situ sequencing (ISS). We detected 10 414 863 transcripts of 221 genes in whole-slide tissues and segmented them into 1 719 459 cells that were mapped to 18 major parenchymal and immune cell types, all of which were infected by SARS-CoV-2. Compared with the non-COVID-19 control, COVID-19 lungs exhibited reduced alveolar cells (ACs) and increased innate and adaptive immune cells. We also identified 19 differentially expressed genes in both infected and uninfected cells across the tissues, which reflected the altered cellular compositions. Spatial analysis of local infection rates revealed regions with high infection rates that were correlated with high cell densities (HIHD). The HIHD regions expressed high levels of SARS-CoV-2 entry-related factors including ACE2, FURIN, TMPRSS2 and NRP1, and co-localized with organizing pneumonia (OP) and lymphocytic and immune infiltration, which exhibited increased ACs and fibroblasts but decreased vascular endothelial cells and epithelial cells, mirroring the tissue damage and wound healing processes. Sparse nonnegative matrix factorization (SNMF) analysis of niche features identified seven signatures that captured structure and immune niches in COVID-19 tissues. Trajectory inference based on immune niche signatures defined two pathological routes. Trajectory A primarily progressed with increased NK cells and granulocytes, likely reflecting the complication of microbial infections. Trajectory B was marked by increased HIHD and OP, possibly accounting for the increased immune infiltration. The OP regions were marked by high numbers of fibroblasts expressing extremely high levels of COL1A1 and COL1A2. 
Examination of single-cell RNA-seq (scRNA-seq) data from COVID-19 lung tissues and idiopathic pulmonary fibrosis (IPF) identified similar cell populations consisting mainly of myofibroblasts. Immunofluorescence staining revealed the activation of IL6-STAT3 and TGF-β-SMAD2/3 pathways in these cells, likely mediating the upregulation of COL1A1 and COL1A2 and excessive fibrosis in the lung tissues. Together, this study provides a spatial single-cell atlas of cellular and molecular signatures of fatal COVID-19 lungs, which reveals the complex spatial cellular heterogeneity, organization, and interactions that characterized the COVID-19 lung pathology.


Subject(s)
COVID-19 , Humans , COVID-19/pathology , SARS-CoV-2/genetics , Endothelial Cells , Single-Cell Gene Expression Analysis , Angiotensin-Converting Enzyme 2/genetics , Angiotensin-Converting Enzyme 2/metabolism , Lung/pathology
4.
Am J Physiol Heart Circ Physiol ; 323(1): H130-H145, 2022 07 01.
Article in English | MEDLINE | ID: mdl-35657614

ABSTRACT

Childhood cancer survivors (CCSs) face lifelong side effects related to their treatment with chemotherapy. Anthracycline agents, such as doxorubicin (DOX), are important in the treatment of childhood cancers but are associated with cardiotoxicity. Cardiac toxicities represent a significant source of chronic disability that cancer survivors face; despite this, the chronic cardiotoxicity phenotype and how it relates to acute toxicity remain poorly defined. To address this critical knowledge gap, we studied the acute effect of DOX on murine cardiac nonmyocytes in vivo. Determination of the acute cellular effects of DOX on nonmyocytes, a cell pool with finite replicative capacity, provides a basis for understanding the pathogenesis of the chronic heart disease that CCSs face. To investigate the acute cellular effects of DOX, we present single-cell RNA sequencing (scRNAseq) data from homeostatic cardiac nonmyocytes and compare them with preexisting datasets, as well as a novel CyTOF dataset. SCANPY, a Python-based single-cell analysis toolkit, was used to assess the heterogeneity of cells detected in scRNAseq and CyTOF. To further assist in CyTOF data annotation, joint analyses of scRNAseq and CyTOF data using an artificial neural network known as sparse autoencoder for clustering, imputation, and embedding (SAUCIE) are performed. Lastly, the panel is tested on a mouse model of acute DOX exposure at two time points (24 and 72 h) after the last dose of doxorubicin and examined with joint clustering. In sum, we report the first ever CyTOF study of cardiac nonmyocytes and characterize the effect of acute DOX exposure with scRNAseq and CyTOF.

NEW & NOTEWORTHY: We describe the first mass cytometry studies of murine cardiac nonmyocytes. The mass cytometry panel is compared with single-cell RNA sequencing data. Homeostatic cardiac nonmyocytes are characterized by mass cytometry to identify and quantify four major cell populations: endothelial cells, fibroblasts, leukocytes, and pericytes. The single-cell acute nonmyocyte response to doxorubicin is studied at 24 and 72 h after doxorubicin exposure given daily for 5 days at a dose of 4 mg/kg/day.


Subject(s)
Cardiotoxicity , Endothelial Cells , Animals , Antibiotics, Antineoplastic/toxicity , Doxorubicin/toxicity , Endothelial Cells/pathology , Heart , Mice , Myocytes, Cardiac
5.
Brief Bioinform ; 21(6): 2066-2083, 2020 12 01.
Article in English | MEDLINE | ID: mdl-31813953

ABSTRACT

The recent accumulation of cancer genomic data provides an opportunity to understand how a tumor's genomic characteristics can affect its responses to drugs. This field, called pharmacogenomics, is a key area in the development of precision oncology. Deep learning (DL) methodology has emerged as a powerful technique to characterize and learn from rapidly accumulating pharmacogenomics data. We introduce the fundamentals and typical model architectures of DL. We review the use of DL in classification of cancers and cancer subtypes (diagnosis and treatment stratification of patients), prediction of drug response and drug synergy for individual tumors (treatment prioritization for a patient), drug repositioning and discovery and the study of mechanism/mode of action of treatments. For each topic, we summarize current genomics and pharmacogenomics data resources such as pan-cancer genomics data for cancer cell lines (CCLs) and tumors, and systematic pharmacologic screens of CCLs. By revisiting the published literature, including our in-house analyses, we demonstrate the unprecedented capability of DL enabled by rapid accumulation of data resources to decipher complex drug response patterns, thus potentially improving cancer medicine. Overall, this review provides an in-depth summary of state-of-the-art DL methods and up-to-date pharmacogenomics resources and future opportunities and challenges to realize the goal of precision oncology.


Subject(s)
Deep Learning , Neoplasms , Pharmacogenetics , Precision Medicine , Drug Repositioning , Genomics , Humans , Medical Oncology , Neoplasms/drug therapy , Neoplasms/genetics , Precision Medicine/methods
6.
Methods ; 192: 120-130, 2021 08.
Article in English | MEDLINE | ID: mdl-33484826

ABSTRACT

The survival rate of cancer has increased significantly during the past two decades for breast, prostate, testicular, and colon cancer, while the brain and pancreatic cancers have a much lower median survival rate that has not improved much over the last forty years. This has imposed the challenge of finding gene markers for early cancer detection and treatment strategies. Different methods including regression-based Cox-PH, artificial neural networks, and recently deep learning algorithms have been proposed to predict the survival rate for cancers. We established in this work a novel graph convolution neural network (GCNN) approach called Surv_GCNN to predict the survival rate for 13 different cancer types using the TCGA dataset. For each cancer type, 6 Surv_GCNN models with graphs generated by correlation analysis, GeneMania database, and correlation + GeneMania were trained with and without clinical data to predict the risk score (RS). The performance of the 6 Surv_GCNN models was compared with two other existing models, Cox-PH and Cox-nnet. The results showed that Cox-PH has the worst performance among 8 tested models across the 13 cancer types while Surv_GCNN models with clinical data reported the best overall performance, outperforming other competing models in 7 out of 13 cancer types including BLCA, BRCA, COAD, LUSC, SARC, STAD, and UCEC. A novel network-based interpretation of Surv_GCNN was also proposed to identify potential gene markers for breast cancer. The signatures learned by the nodes in the hidden layer of Surv_GCNN were identified and were linked to potential gene markers by network modularization. The identified gene markers for breast cancer have been compared to a total of 213 gene markers from three widely cited lists for breast cancer survival analysis. 
About 57% of gene markers obtained by Surv_GCNN with correlation + GeneMania graph either overlap or directly interact with the 213 genes, confirming the effectiveness of the identified markers by Surv_GCNN.


Subject(s)
Neural Networks, Computer , Algorithms , Breast Neoplasms/genetics , Humans , Male , Survival Rate
7.
BMC Bioinformatics ; 22(1): 244, 2021 May 12.
Article in English | MEDLINE | ID: mdl-33980137

ABSTRACT

BACKGROUND: State-of-the-art deep learning-based cancer type prediction can only predict cancer types whose samples are available during training, where the sample size is commonly large. In this paper, we consider how to utilize the existing training samples to predict cancer types unseen during training. We hypothesize the existence of a set of type-agnostic expression representations that define the similarity/dissimilarity between samples of the same/different types and propose a novel one-shot learning model called CancerSiamese to learn this common representation. CancerSiamese accepts a pair of query and support samples (gene expression profiles) and learns the representation of similar or dissimilar cancer types through two parallel convolutional neural networks joined by a similarity function. RESULTS: We trained CancerSiamese for cancer type prediction for primary and metastatic tumors using samples from the Cancer Genome Atlas (TCGA) and MET500. Network transfer learning was utilized to facilitate the training of the CancerSiamese models. CancerSiamese was tested for different N-way predictions and yielded an average accuracy improvement of 8% and 4% over the benchmark 1-Nearest Neighbor (1-NN) classifier for primary and metastatic tumors, respectively. Moreover, we applied the guided gradient saliency map and feature selection to CancerSiamese to examine 100 and 200 top marker-gene candidates for the prediction of primary and metastatic cancers, respectively. Functional analysis of these marker genes revealed several cancer-related functions between primary and metastatic tumors. CONCLUSION: This work demonstrated, for the first time, the feasibility of predicting unseen cancer types whose samples are limited. Thus, it could inspire new and ingenious applications of one-shot and few-shot learning solutions for improving cancer diagnosis, prognosis, and our understanding of cancer.
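The benchmark named in this abstract, a 1-Nearest Neighbor classifier in an N-way one-shot setting, can be sketched in a few lines. This is only an illustration of the baseline idea, not the CancerSiamese network: the Euclidean distance, the function names, and the toy expression values are all assumptions, since the abstract does not state which distance the benchmark used.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def one_shot_1nn(query, support):
    """Return the label of the support sample closest to the query.

    `support` maps each candidate cancer type to ONE expression profile,
    which is what makes this an N-way one-shot prediction.
    The distance metric is an assumption made for this sketch.
    """
    return min(support, key=lambda label: euclidean(query, support[label]))


# Toy 3-way task with made-up 4-gene profiles (illustrative values only).
support = {
    "type_A": [5.0, 1.0, 0.5, 2.0],
    "type_B": [0.5, 4.0, 3.0, 0.1],
    "type_C": [2.0, 2.0, 2.0, 2.0],
}
query = [4.6, 1.2, 0.4, 2.3]
predicted = one_shot_1nn(query, support)
print(predicted)
```

A Siamese model replaces the fixed distance above with a learned similarity between the query and each support sample, which is where the reported accuracy gain over 1-NN would come from.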


Subject(s)
Neoplasms , Neural Networks, Computer , Humans , Neoplasms/genetics , Transcriptome
8.
BMC Genomics ; 20(Suppl 1): 81, 2019 Feb 04.
Article in English | MEDLINE | ID: mdl-30712511

ABSTRACT

BACKGROUND: Cell lines form the cornerstone of cell-based experimentation studies into understanding the underlying mechanisms of normal and disease biology including cancer. However, it is commonly acknowledged that contamination of cell lines is a prevalent problem affecting biomedical science and available methods for cell line authentication suffer from limited access as well as being too daunting and time-consuming for many researchers. Therefore, a new and cost effective approach for authentication and quality control of cell lines is needed. RESULTS: We have developed a new RNA-seq based approach named CeL-ID for cell line authentication. CeL-ID uses RNA-seq data to identify variants and compare with variant profiles of other cell lines. RNA-seq data for 934 CCLE cell lines downloaded from NCI GDC were used to generate cell line specific variant profiles and pair-wise correlations were calculated using frequencies and depth of coverage values of all the variants. Comparative analysis of variant profiles revealed that variant profiles differ significantly from cell line to cell line whereas identical, synonymous and derivative cell lines share high variant identity and are highly correlated (ρ > 0.9). Our benchmarking studies revealed that CeL-ID method can identify a cell line with high accuracy and can be a valuable tool of cell line authentication in biomedical science. Finally, CeL-ID estimates the possible cross contamination using linear mixture model if no perfect match was detected. CONCLUSIONS: In this study, we show the utility of an RNA-seq based approach for cell line authentication. 
Our comparative analysis of variant profiles derived from RNA-seq data revealed that variant profiles of each cell line are distinct and overall share low variant identity with other cell lines whereas identical or synonymous cell lines show significantly high variant identity and hence variant profiles can be used as a discriminatory/identifying feature in cell authentication model.
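The matching step described in this abstract, correlating a query variant profile against reference profiles and accepting a match when ρ > 0.9, can be sketched as follows. This is a simplified illustration under stated assumptions, not the CeL-ID implementation: it uses Pearson correlation on variant allele frequencies only (the abstract also mentions depth of coverage and does not name the correlation type), and the names `pearson`, `best_match`, and the toy profiles are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def best_match(query, references, threshold=0.9):
    """Return (name, rho) for the best-correlated reference profile.

    The name is None when even the best correlation does not exceed the
    threshold, i.e. no reference cell line is accepted as a match.
    """
    scored = {name: pearson(query, prof) for name, prof in references.items()}
    name = max(scored, key=scored.get)
    rho = scored[name]
    return (name if rho > threshold else None), rho


# Toy allele-frequency profiles over the same six variants (made-up values).
references = {
    "line_1": [0.0, 0.5, 1.0, 0.0, 0.5, 1.0],
    "line_2": [1.0, 0.0, 0.5, 0.5, 0.0, 1.0],
}
query = [0.05, 0.45, 0.95, 0.0, 0.55, 1.0]
match_name, match_rho = best_match(query, references)
print(match_name, round(match_rho, 3))
```

When no reference passes the threshold, the abstract describes a further step that estimates cross contamination with a linear mixture model; that step is not shown here.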


Subject(s)
Cell Line , DNA Barcoding, Taxonomic , Sequence Analysis, RNA , Algorithms , Cell Line, Tumor , Databases, Factual , Gene Expression Profiling , High-Throughput Nucleotide Sequencing , Humans , INDEL Mutation , Models, Statistical , Mutation , Polymorphism, Single Nucleotide
9.
BMC Genomics ; 20(Suppl 12): 1007, 2019 Dec 30.
Article in English | MEDLINE | ID: mdl-31888480

ABSTRACT

BACKGROUND: Europeans and American Indians were the major ancestral populations of Hispanics in the U.S. These ancestral groups have markedly different incidence rates and outcomes in many types of cancers. Therefore, genetic admixture may bias genetic association studies of cancer susceptibility variants, specifically in Hispanics. For example, the incidence rate of liver cancer has been shown to differ substantially among Hispanic, Asian and non-Hispanic white populations. Currently, ancestry informative marker (AIM) panels with up to a few hundred ancestry-informative single nucleotide polymorphisms (SNPs) are widely used to infer ancestry admixture. Notably, currently available AIMs are predominantly located in intron and intergenic regions, while the whole exome sequencing (WES) protocols commonly used in translational research and clinical practice do not cover these markers. Thus, it remains challenging to accurately determine a patient's admixture proportion without additional DNA testing. RESULTS: In this study we designed a unique AIM panel that infers 3-way genetic admixture from three distinct and selective continental populations (African (AFR), European (EUR), and East Asian (EAS)) within evolutionarily conserved exonic regions. Initially, about 1 million exonic SNPs from the three selected populations in the 1000 Genomes Project were trimmed by their linkage disequilibrium (LD) and restricted to biallelic variants; finally, using their ancestral informativeness statistics, we optimized them into an AIM panel with 250 SNP markers, the UT-AIM250 panel. Compared with published AIM panels, UT-AIM250 achieved better accuracy when tested on the three ancestral populations (accuracy: 0.995 ± 0.012 for AFR, 0.997 ± 0.007 for EUR, and 0.994 ± 0.012 for EAS). 
We further demonstrated the performance of the UT-AIM250 panel on admixed American (AMR) samples of the 1000 Genomes Project and obtained results (AFR, 0.085 ± 0.098; EUR, 0.665 ± 0.182; and EAS, 0.250 ± 0.205) similar to those of previously published AIM panels (Phillips-AIM34: AFR, 0.096 ± 0.127, EUR, 0.575 ± 0.290, and EAS, 0.330 ± 0.315; Wei-AIM278: AFR, 0.070 ± 0.096, EUR, 0.537 ± 0.267, and EAS, 0.393 ± 0.300). Subsequently, we applied the UT-AIM250 panel to a clinical dataset of 26 self-reported Hispanic patients in South Texas with hepatocellular carcinoma (HCC). We estimated the admixture proportions using WES data of adjacent non-cancer liver tissues (AFR, 0.065 ± 0.043; EUR, 0.594 ± 0.150; and EAS, 0.341 ± 0.160). Similar admixture proportions were identified from corresponding tumor tissues. In addition, we estimated admixture proportions of The Cancer Genome Atlas (TCGA) collection of hepatocellular carcinoma (TCGA-LIHC) samples (376 patients) using the UT-AIM250 panel. The panel obtained consistent admixture proportions from tumor and matched normal tissues, identified 3 possibly misreported race/ethnicity entries, and provided race/ethnicity determination where necessary. CONCLUSIONS: Here we demonstrated the feasibility of using evolutionarily conserved exonic regions to infer admixture proportions and provided a robust and reliable control for sample collection or patient stratification for genetic analysis. R implementation of UT-AIM250 is available at https://github.com/chenlabgccri/UT-AIM250.
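To make the idea of inferring 3-way admixture proportions from a marker panel concrete, here is a minimal sketch. It is not the UT-AIM250 R implementation and does not claim to reproduce its estimator: it assumes a simple binomial likelihood in which the individual's allele frequency at each SNP is the proportion-weighted mix of the three ancestral frequencies, and it maximizes that likelihood by a coarse grid search. All names and frequency values are hypothetical.

```python
import math


def log_likelihood(genotypes, freqs, q):
    """Binomial log-likelihood of genotypes (0/1/2 alternate-allele counts).

    freqs[j] holds the alternate-allele frequency of SNP j in each of the
    three ancestral populations; q holds the three admixture proportions.
    """
    total = 0.0
    for g, p in zip(genotypes, freqs):
        f = sum(qk * pk for qk, pk in zip(q, p))
        f = min(max(f, 1e-6), 1 - 1e-6)  # keep log() finite
        total += g * math.log(f) + (2 - g) * math.log(1 - f)
    return total


def estimate_admixture(genotypes, freqs, steps=20):
    """Grid-search the 3-way simplex for the most likely proportions."""
    best_q, best_ll = None, -math.inf
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            q = (i / steps, j / steps, (steps - i - j) / steps)
            ll = log_likelihood(genotypes, freqs, q)
            if ll > best_ll:
                best_q, best_ll = q, ll
    return best_q


# Made-up frequencies for six SNPs in the order (AFR, EUR, EAS).
freqs = [
    (0.9, 0.1, 0.1), (0.1, 0.9, 0.1), (0.1, 0.1, 0.9),
    (0.9, 0.1, 0.1), (0.1, 0.9, 0.1), (0.1, 0.1, 0.9),
]
# A toy individual homozygous for the alleles that are common in EUR only.
genotypes = [0, 2, 0, 0, 2, 0]
proportions = estimate_admixture(genotypes, freqs)
print(proportions)
```

Markers whose frequencies differ sharply between populations, as in the toy table, are what make a small panel informative; with 250 such SNPs a real estimator has far more signal than this six-marker example.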


Subject(s)
Genome, Human/genetics , Genome-Wide Association Study/methods , Hispanic or Latino/genetics , Carcinoma, Hepatocellular/ethnology , Carcinoma, Hepatocellular/genetics , Ethnicity/genetics , Exons/genetics , Gene Frequency , Genetic Testing , Genetics, Population , Genotype , Humans , Liver Neoplasms/ethnology , Liver Neoplasms/genetics , Polymorphism, Single Nucleotide , Software
10.
Eur J Haematol ; 103(4): 417-425, 2019 Oct.
Article in English | MEDLINE | ID: mdl-31356696

ABSTRACT

OBJECTIVES: This study explored resistance functions and their interactions in de novo AML treated with the "7 + 3" induction regimen. METHODS: We analyzed RNA-sequencing profiles of whole bone marrow samples from 52 de novo AML patients who completed the "7 + 3" regimen and stratified patients into CR (n = 35) and non-CR (n = 17) groups. RESULTS: A systematic gene set analysis revealed significant associations between chemoresistance and mTOR (P < .001), myc (P < .001), mitochondrial oxidative phosphorylation (P < .001), and stemness (P = .002). These functions were independent with regard to gene contents and activity scores. An integration of these four functions showed a prediction of chemoresistance (area under the receiver operating characteristic curve = 0.815) superior to that of each function alone. Moreover, our proposed seven-gene scoring system significantly correlated with the four-function model (r = .97; P < .001) to predict chemoresistance to the "7 + 3" regimen. On multivariate analysis, a seven-gene score of ≥-0.027 (hazard ratio: 11.18; 95% confidence interval: 2.06-60.65; P = .005) was an independent risk factor for induction failure. CONCLUSIONS: Myc, OXPHOS, mTOR, and stemness were responsible for chemoresistance in AML. Treatments other than the "7 + 3" regimen need to be considered for de novo AML patients predicted to be refractory to the "7 + 3" regimen.
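The abstract summarizes predictive performance as an area under the receiver operating characteristic curve (0.815). As a reminder of what that number measures, the sketch below computes an AUC as the fraction of (non-CR, CR) patient pairs in which the non-CR patient has the higher score. The scores are invented for illustration and are unrelated to the study's data; the function name is hypothetical.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the probability that a random positive outscores a random negative.

    Ties contribute 0.5, which matches the area under the empirical ROC curve.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical resistance scores: higher should mean "more likely non-CR".
non_cr_scores = [0.9, 0.4, 0.7]
cr_scores = [0.1, 0.5, 0.3, 0.2]
auc = roc_auc(non_cr_scores, cr_scores)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to a score that ranks patients no better than chance, and 1.0 to a score that separates the two groups perfectly.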


Subject(s)
Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Drug Resistance, Neoplasm , Leukemia, Myeloid, Acute/diagnosis , Leukemia, Myeloid, Acute/drug therapy , Adult , Antineoplastic Combined Chemotherapy Protocols/adverse effects , Biomarkers , Biomarkers, Tumor , Bone Marrow Cells/metabolism , Drug Resistance, Neoplasm/genetics , Female , Gene Expression Profiling , Humans , Induction Chemotherapy , Leukemia, Myeloid, Acute/genetics , Male , Middle Aged , Models, Statistical , Prognosis , ROC Curve , Reproducibility of Results , Treatment Outcome
11.
BMC Bioinformatics ; 18(1): 132, 2017 Feb 28.
Article in English | MEDLINE | ID: mdl-28241741

ABSTRACT

BACKGROUND: Recent studies illuminated a novel role of microRNA (miRNA) in the competing endogenous RNA (ceRNA) interaction: two genes (ceRNAs) can achieve coexpression by competing for a pool of common targeting miRNAs. Individual biological investigations implied ceRNA interaction performs crucial oncogenic/tumor suppressive functions in glioblastoma multiforme (GBM). Yet, a systematic analysis has not been conducted to explore the functional landscape and prognostic significance of ceRNA interaction. RESULTS: Incorporating the knowledge that ceRNA interaction is highly condition-specific and modulated by the expressional abundance of miRNAs, we devised a ceRNA inference by differential correlation analysis to identify the miRNA-modulated ceRNA pairs. Analyzing sample-paired miRNA and gene expression profiles of GBM, our data showed that this alternative layer of gene interaction is essential in global information flow. Functional annotation analysis revealed its involvement in activated processes in brain, such as synaptic transmission, as well as critical tumor-associated functions. Notably, a systematic survival analysis suggested the strength of ceRNA-ceRNA interactions, rather than expressional abundance of individual ceRNAs, among three immune response genes (CCL22, IL2RB, and IRF4) is predictive of patient survival. The prognostic value was validated in two independent cohorts. CONCLUSIONS: This work addresses the lack of a comprehensive exploration into the functional and prognostic relevance of ceRNA interaction in GBM. The proposed efficient and reliable method revealed its significance in GBM-related functions and prognosis. The highlighted roles of ceRNA interaction provide a basis for further biological and clinical investigations.
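The differential correlation idea mentioned in this abstract can be illustrated with a small sketch: split samples by the expression of a shared miRNA, then compare the coexpression of a gene pair between the two groups. The abstract does not give the exact procedure, so the median-style split, the use of Pearson correlation, and every name and value below are assumptions made purely for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def differential_correlation(gene_a, gene_b, mirna):
    """Compare gene_a/gene_b coexpression between high- and low-miRNA samples.

    Returns (r_high - r_low, r_low, r_high). Samples are ranked by miRNA
    expression and split into a lower and an upper half (an assumed scheme).
    """
    order = sorted(range(len(mirna)), key=lambda i: mirna[i])
    half = len(order) // 2
    if half < 2:
        raise ValueError("need at least four samples")
    low, high = order[:half], order[-half:]
    r_low = pearson([gene_a[i] for i in low], [gene_b[i] for i in low])
    r_high = pearson([gene_a[i] for i in high], [gene_b[i] for i in high])
    return r_high - r_low, r_low, r_high


# Eight made-up samples: the gene pair is coexpressed only when miRNA is high.
mirna = [1, 2, 3, 4, 5, 6, 7, 8]
gene_a = [1, 2, 3, 4, 1, 2, 3, 4]
gene_b = [4, 1, 3, 2, 2, 4, 6, 8]
delta, r_low, r_high = differential_correlation(gene_a, gene_b, mirna)
print(round(delta, 2), round(r_low, 2), round(r_high, 2))
```

A large difference between the two correlations flags a gene pair whose coexpression depends on miRNA abundance, which is the condition-specific behavior the abstract attributes to ceRNA pairs.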


Subject(s)
Brain Neoplasms/mortality , Glioblastoma/mortality , RNA, Neoplasm/metabolism , Brain Neoplasms/genetics , Brain Neoplasms/metabolism , Chemokine CCL22/genetics , Epistasis, Genetic , Glioblastoma/genetics , Glioblastoma/metabolism , Humans , Interferon Regulatory Factors/genetics , Interleukin-2 Receptor beta Subunit/genetics , MicroRNAs/metabolism , Survival Analysis
12.
BMC Genomics ; 18(Suppl 6): 679, 2017 Oct 03.
Article in English | MEDLINE | ID: mdl-28984209

ABSTRACT

BACKGROUND: With the advances in high-throughput gene profiling technologies, a large volume of gene interaction maps has been constructed. A higher-level layer of gene-gene interaction, namely modulated gene interaction, is composed of gene pairs whose interaction strengths are modulated by (i.e., dependent on) the expression level of a key modulator gene. Systematic investigations into the modulation by estrogen receptor (ER), the best-known modulator gene, have revealed its functional and prognostic significance in breast cancer. However, a genome-wide identification of key modulator genes that may further unveil the landscape of modulated gene interaction is still lacking. RESULTS: We proposed a systematic workflow to screen for key modulators based on genome-wide gene expression profiles. We designed four modularity parameters to measure the ability of a putative modulator to perturb gene interaction networks. Applying the method to a dataset of 286 breast tumors, we comprehensively characterized the modularity parameters and identified a total of 973 key modulator genes. The modularity of these modulators was verified in three independent breast cancer datasets. ESR1, the encoding gene of ER, appeared in the list, and abundant novel modulators were illuminated. For instance, a prognostic predictor of breast cancer, SFRP1, was found to be the second modulator. Functional annotation analysis of the 973 modulators revealed involvements in ER-related cellular processes as well as immune- and tumor-associated functions. CONCLUSIONS: Here we present, as far as we know, the first comprehensive analysis of key modulator genes on a genome-wide scale. The validity of filtering parameters as well as the conservativity of modulators among cohorts were corroborated. Our data bring new insights into the modulated layer of gene-gene interaction and provide candidates for further biological investigations.


Subject(s)
Breast Neoplasms/genetics , Gene Regulatory Networks , Genomics , Gene Expression Profiling , Humans
13.
Development ; 141(12): 2402-13, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24850856

ABSTRACT

The ability of adult stem cells to reside in a quiescent state is crucial for preventing premature exhaustion of the stem cell pool. However, the intrinsic epigenetic factors that regulate spermatogonial stem cell quiescence are largely unknown. Here, we investigate in mice how DNA methyltransferase 3-like (DNMT3L), an epigenetic regulator important for interpreting chromatin context and facilitating de novo DNA methylation, sustains the long-term male germ cell pool. We demonstrated that stem cell-enriched THY1(+) spermatogonial stem/progenitor cells (SPCs) constituted a DNMT3L-expressing population in postnatal testes. DNMT3L influenced the stability of promyelocytic leukemia zinc finger (PLZF), potentially by downregulating Cdk2/CDK2 expression, which sequestered CDK2-mediated PLZF degradation. Reduced PLZF in Dnmt3l KO THY1(+) cells released its antagonist, Sal-like protein 4A (SALL4A), which is associated with overactivated ERK and AKT signaling cascades. Furthermore, DNMT3L was required to suppress the cell proliferation-promoting factor SALL4B in THY1(+) SPCs and to prevent premature stem cell exhaustion. Our results indicate that DNMT3L is required to delicately balance the cycling and quiescence of SPCs. These findings reveal a novel role for DNMT3L in modulating postnatal SPC cell fate decisions.


Subject(s)
Adult Stem Cells/metabolism , DNA (Cytosine-5-)-Methyltransferases/physiology , Gene Expression Regulation, Developmental , Spermatogonia/metabolism , Alleles , Animals , Cell Proliferation , DNA (Cytosine-5-)-Methyltransferases/genetics , DNA Methylation , DNA-Binding Proteins/metabolism , Epigenesis, Genetic , Extracellular Signal-Regulated MAP Kinases/metabolism , Heterozygote , Male , Mice , Mice, Knockout , Proto-Oncogene Proteins c-akt/metabolism , Testis/metabolism , Transcription Factors/metabolism , Zinc Fingers
14.
Haematologica ; 102(6): 1044-1053, 2017 06.
Article in English | MEDLINE | ID: mdl-28341738

ABSTRACT

Homeodomain-only protein homeobox (HOPX) is the smallest homeodomain protein. It was regarded as a stem cell marker in several non-hematopoietic systems. While the prototypic homeobox genes such as the HOX family have been well characterized in acute myeloid leukemia (AML), the clinical and biological implications of HOPX in the disease remain unknown. Thus we analyzed HOPX and global gene expression patterns in 347 newly diagnosed de novo AML patients in our institute. We found that higher HOPX expression was closely associated with older age, higher platelet counts, lower white blood cell counts, lower lactate dehydrogenase levels, and mutations in RUNX1, IDH2, ASXL1, and DNMT3A, but negatively associated with acute promyelocytic leukemia, favorable karyotypes, CEBPA double mutations and NPM1 mutation. Patients with higher HOPX expression had a lower complete remission rate and shorter survival. The finding was validated in two independent cohorts. Multivariate analysis revealed that higher HOPX expression was an independent unfavorable prognostic factor irrespective of other known prognostic parameters and gene signatures derived from multiple cohorts. Gene set enrichment analysis showed higher HOPX expression was associated with both hematopoietic and leukemia stem cell signatures. While HOPX and HOX family genes showed concordant expression patterns in normal hematopoietic stem/progenitor cells, their expression patterns and associated clinical and biological features were distinctive in AML settings, demonstrating HOPX to be a unique homeobox gene. Therefore, HOPX is a distinctive homeobox gene with characteristic clinical and biological implications and its expression is a powerful predictor of prognosis in AML patients.


Subject(s)
Homeodomain Proteins/metabolism , Leukemia, Myeloid, Acute/pathology , Tumor Suppressor Proteins/metabolism , Female , Gene Expression Profiling , Hematopoietic Stem Cells , Homeodomain Proteins/analysis , Humans , Leukemia, Myeloid, Acute/diagnosis , Neoplastic Stem Cells , Nucleophosmin , Prognosis , Transcriptome , Tumor Suppressor Proteins/analysis
15.
BMC Genomics ; 16 Suppl 4: S1, 2015.
Article in English | MEDLINE | ID: mdl-25917195

ABSTRACT

BACKGROUND: In addition to directly targeting and repressing mRNAs, recent studies reported that microRNAs (miRNAs) can form an alternative layer of post-transcriptional gene regulatory networks. The competing endogenous RNA (ceRNA) regulation depicts the scenario where pairs of genes (ceRNAs) sharing, fully or partially, common binding miRNAs (miRNA program) can establish coexpression through competition for a limited pool of the miRNA program. While the dynamics of ceRNA regulation among cellular conditions have been verified based on in silico and in vitro experiments, the strength of ceRNA regulation in human datasets remains largely unexplored. Furthermore, pan-cancer analysis of ceRNA regulation, to our knowledge, has not been systematically investigated. RESULTS: In the present study we explored optimal conditions for ceRNA regulation, investigated functions governed by ceRNA regulation, and evaluated pan-cancer effects. We started by investigating how essential factors, such as the size of miRNA programs, the number of miRNA program binding sites, and expression levels of miRNA programs and ceRNAs, affect the ceRNA regulation capacity in tumors derived from glioblastoma multiforme patients captured by The Cancer Genome Atlas (TCGA). We demonstrated that increased numbers of common targeting miRNAs as well as the abundance of binding sites enhance ceRNA regulation and strengthen coexpression of ceRNA pairs. Also, our investigation revealed that the strength of ceRNA regulation is dependent on expression levels of both miRNA programs and ceRNAs. Through functional annotation analysis, our results indicated that ceRNA regulation is highly associated with essential cellular functions and diseases including cancer. Furthermore, the highly intertwined ceRNA regulatory relationship enables constitutive and effective intra-function regulation of genes in diverse types of cancer.
CONCLUSIONS: Using gene and microRNA expression datasets from TCGA, we successfully quantified the optimal conditions for ceRNA regulation, which hinge on four essential parameters of ceRNAs. Our analysis suggests optimized ceRNA regulation is related to disease pathways and essential cellular functions. Furthermore, although the strength of ceRNA regulation is dynamic among cancers, its governing functions are stably maintained. The findings of this report contribute to better understanding of ceRNA dynamics and its crucial roles in cancers.


Subject(s)
Brain Neoplasms/genetics , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Glioblastoma/genetics , MicroRNAs/genetics , RNA, Messenger/chemistry , Computational Biology/methods , Humans , Models, Genetic , RNA, Messenger/genetics , RNA, Neoplasm/chemistry , RNA, Neoplasm/genetics
16.
BMC Genomics ; 16 Suppl 7: S19, 2015.
Article in English | MEDLINE | ID: mdl-26100352

ABSTRACT

BACKGROUND: Gene regulation is dynamic across cellular conditions and disease subtypes. From the aspect of regulation under modulation, regulation strength between a pair of genes can be modulated by (dependent on) expression abundance of another gene (modulator gene). Previous studies have demonstrated the involvement of genes modulated by single modulator genes in cancers, including breast cancer. However, analysis of multi-modulator co-modulation that can further delineate the landscape of complex gene regulation has, to our knowledge, not been explored previously. In the present study we aim to explore the joint effects of multiple modulator genes in modulating global gene regulation and dissect the biological functions in breast cancer. RESULTS: To carry out the analysis, we proposed the Covariability-based Multiple Regression (CoMRe) method. The method is mainly built on a multiple regression model that takes expression levels of multiple modulators as inputs and regulation strength between genes as output. Pairs of genes were divided into groups based on their co-modulation patterns. Analyzing gene expression profiles from 286 breast cancer patients, CoMRe investigated ten candidate modulator genes that interacted and jointly determined global gene regulation. Among the candidate modulators, ESR1, ERBB2, and ADAM12 were found to modulate the largest numbers of gene pairs. The largest group of gene pairs consisted of pairs that were modulated by ESR1 alone. Functional annotation revealed that the group was significantly related to tumorigenesis and estrogen signaling in breast cancer. ESR1-ERBB2 co-modulation was the largest group modulated by more than one modulator. Similarly, the group was functionally associated with hormone stimulus, suggesting that functions of the two modulators are performed, at least partially, through modulation. The findings were validated in the majority of patients (> 99%) of two independent breast cancer datasets.
CONCLUSIONS: We have shown that CoMRe is a robust method to discover critical modulators in gene regulatory networks, and it is capable of achieving reproducible and biologically meaningful results. Our data reveal that gene regulatory networks modulated by a single modulator or co-modulated by multiple modulators play important roles in breast cancer. Findings of this report illuminate complex and dynamic gene regulation under modulation and its involvement in breast cancer.
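
The regression idea described in this abstract (modulator expression as inputs, regulation strength between a gene pair as output) can be illustrated with a small sketch. It is not the published CoMRe implementation: the abstract does not define "regulation strength", so the sketch assumes a per-sample covariability equal to the product of the two genes' z-scores, and fits it by ordinary least squares. All function names are hypothetical.

```python
from statistics import fmean, pstdev


def zscores(values):
    """Standardize a list of expression values (population standard deviation).
    Fails with ZeroDivisionError if the gene is constant across samples."""
    m, s = fmean(values), pstdev(values)
    return [(v - m) / s for v in values]


def solve(a, b):
    """Solve the square linear system a x = b by Gaussian elimination
    with partial pivoting. Assumes a is non-singular."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_modulation(gene_a, gene_b, modulators):
    """Regress an ASSUMED per-sample covariability of (gene_a, gene_b)
    on the expression of several modulator genes.

    modulators: list of per-modulator expression lists, one value per sample.
    Returns [intercept, coef_modulator_1, coef_modulator_2, ...].
    """
    za, zb = zscores(gene_a), zscores(gene_b)
    y = [p * q for p, q in zip(za, zb)]  # assumption, not from the abstract
    rows = [[1.0] + [mod[i] for mod in modulators] for i in range(len(y))]
    k = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)
```

Under these assumptions, a coefficient far from zero for a given modulator would suggest that the pair's coexpression varies with that modulator's expression; grouping gene pairs by which coefficients are significant would correspond loosely to the co-modulation groups mentioned above.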


Subject(s)
ADAM Proteins/genetics , Adaptor Proteins, Signal Transducing/genetics , Breast Neoplasms/genetics , Estrogen Receptor alpha/genetics , Membrane Proteins/genetics , ADAM12 Protein , Algorithms , Databases, Genetic , Female , Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic , Humans , Models, Genetic
17.
Cancers (Basel) ; 16(9), 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38730604

ABSTRACT

Despite significant advances in tumor biology and clinical therapeutics, metastasis remains the primary cause of cancer-related deaths. While RNA-seq technology has been used extensively to study metastatic cancer characteristics, challenges persist in acquiring adequate transcriptomic data. To overcome this challenge, we propose MetGen, a generative contrastive learning tool based on a deep learning model. MetGen generates synthetic metastatic cancer expression profiles using primary cancer and normal tissue expression data. Our results demonstrate that MetGen generates comparable samples to actual metastatic cancer samples, and the cancer and tissue classification yields performance rates of 99.8 ± 0.2% and 95.0 ± 2.3%, respectively. A benchmark analysis suggests that the proposed model outperforms traditional generative models such as the variational autoencoder. In metastatic subtype classification, our generated samples show 97.6% predicting power compared to true metastatic samples. Additionally, we demonstrate MetGen's interpretability using metastatic prostate cancer and metastatic breast cancer. MetGen has learned highly relevant signatures in cancer, tissue, and tumor microenvironments, such as immune responses and the metastasis process, which can potentially foster a more comprehensive understanding of metastatic cancer biology. The development of MetGen represents a significant step toward the study of metastatic cancer biology by providing a generative model that identifies candidate therapeutic targets for the treatment of metastatic cancer.

18.
Patterns (N Y) ; 5(4): 100949, 2024 Apr 12.
Article in English | MEDLINE | ID: mdl-38645769

ABSTRACT

Large-scale cancer drug sensitivity data have become available for a collection of cancer cell lines, but only limited drug response data from patients are available. Bridging the gap in pharmacogenomics knowledge between in vitro and in vivo datasets remains challenging. In this study, we trained a deep learning model, Scaden-CA, for deconvoluting tumor data into proportions of cancer-type-specific cell lines. Then, we developed a drug response prediction method using the deconvoluted proportions and the drug sensitivity data from cell lines. The Scaden-CA model showed excellent performance in terms of concordance correlation coefficients (>0.9 for model testing) and the correctly deconvoluted rate (>70% across most cancers) for model validation using Cancer Cell Line Encyclopedia (CCLE) bulk RNA data. We applied the model to tumors in The Cancer Genome Atlas (TCGA) dataset and examined associations between predicted cell viability and mutation status or gene expression levels to understand underlying mechanisms of potential value for drug repurposing.
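
The concordance correlation coefficient cited in this abstract measures how closely predicted values agree with true values, penalizing both poor correlation and systematic shifts. Below is a minimal sketch of Lin's coefficient using population (1/n) moments; the function name is hypothetical and the code is not taken from Scaden-CA.

```python
from statistics import fmean


def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient for two equal-length
    sequences, e.g. true versus predicted cell-line proportions."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = fmean(x), fmean(y)
    var_x = fmean([(a - mx) ** 2 for a in x])
    var_y = fmean([(b - my) ** 2 for b in y])
    cov = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    denom = var_x + var_y + (mx - my) ** 2
    if denom == 0:
        raise ValueError("coefficient is undefined for identical constant inputs")
    return 2 * cov / denom
```

Unlike the Pearson correlation, the coefficient drops below 1 when predictions are perfectly correlated with the truth but offset from it: for `[1, 2, 3]` against `[2, 3, 4]` it equals 4/7.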

19.
bioRxiv ; 2024 Jan 30.
Article in English | MEDLINE | ID: mdl-38313267

ABSTRACT

Motivation: Molecular Regulatory Pathways (MRPs) are crucial for understanding biological functions. Knowledge Graphs (KGs) have become vital in organizing and analyzing MRPs, providing structured representations of complex biological interactions. Current tools for mining KGs from biomedical literature are inadequate in capturing complex, hierarchical relationships and contextual information about MRPs. Large Language Models (LLMs) like GPT-4 offer a promising solution, with advanced capabilities to decipher the intricate nuances of language. However, their potential for end-to-end KG construction, particularly for MRPs, remains largely unexplored. Results: We present reguloGPT, a novel GPT-4 based in-context learning prompt, designed for end-to-end joint named entity recognition, N-ary relationship extraction, and context prediction from a sentence that describes regulatory interactions with MRPs. Our reguloGPT approach introduces a context-aware relational graph that effectively embodies the hierarchical structure of MRPs and resolves semantic inconsistencies by embedding context directly within relational edges. We created a benchmark dataset including 400 annotated PubMed titles on N6-methyladenosine (m6A) regulations. Rigorous evaluation of reguloGPT on the benchmark dataset demonstrated marked improvement over existing algorithms. We further developed a novel G-Eval scheme, leveraging GPT-4 for annotation-free performance evaluation, and demonstrated its agreement with traditional annotation-based evaluations. Utilizing reguloGPT predictions on m6A-related titles, we constructed the m6A-KG and demonstrated its utility in elucidating m6A's regulatory mechanisms in cancer phenotypes across various cancers. These results underscore reguloGPT's transformative potential for extracting biological knowledge from the literature.
Availability and implementation: The source code of reguloGPT, the m6A title and benchmark datasets, and m6A-KG are available at: https://github.com/Huang-AI4Medicine-Lab/reguloGPT.

20.
Patterns (N Y) ; 5(2): 100894, 2024 Feb 09.
Article in English | MEDLINE | ID: mdl-38370127

ABSTRACT

Advancing precision oncology requires accurate prediction of treatment response and accessible prediction models. To this end, we present shinyDeepDR, a user-friendly implementation of our innovative deep learning model, DeepDR, for predicting anti-cancer drug sensitivity. The web tool makes DeepDR more accessible to researchers without extensive programming experience. Using shinyDeepDR, users can upload mutation and/or gene expression data from a cancer sample (cell line or tumor) and perform two main functions: "Find Drug," which predicts the sample's response to 265 approved and investigational anti-cancer compounds, and "Find Sample," which searches for cell lines in the Cancer Cell Line Encyclopedia (CCLE) and tumors in The Cancer Genome Atlas (TCGA) with genomics profiles similar to those of the query sample to study potential effective treatments. shinyDeepDR provides an interactive interface to interpret prediction results and to investigate individual compounds. In conclusion, shinyDeepDR is an intuitive and free-to-use web tool for in silico anti-cancer drug screening.
