ABSTRACT
The nature of activation signals is essential in determining T cell subset differentiation; however, the features that determine T cell subset preference acquired during intrathymic development remain elusive. Here we show that naive CD4+ T cells generated in the mouse thymic microenvironment lacking Scd1, encoding the enzyme catalyzing oleic acid (OA) production, exhibit enhanced regulatory T (Treg) cell differentiation and attenuated development of experimental autoimmune encephalomyelitis. Scd1 deletion in K14+ thymic epithelia recapitulated the enhanced Treg cell differentiation phenotype of Scd1-deficient mice. The dearth of OA permitted DOT1L to increase H3K79me2 levels at the Atp2a2 locus of thymocytes at the DN2-DN3 transition stage. Such epigenetic modification persisted in naive CD4+ T cells and facilitated Atp2a2 expression. Upon T cell receptor activation, ATP2A2 enhanced the activity of the calcium-NFAT1-Foxp3 axis to promote naive CD4+ T cells to differentiate into Treg cells. Therefore, OA availability is critical for preprogramming thymocytes with Treg cell differentiation propensities in the periphery.
Subject(s)
Oleic Acid , Thymocytes , Animals , Mice , Oleic Acid/metabolism , Thymus Gland , Regulatory T-Lymphocytes , Cell Differentiation , Forkhead Transcription Factors/genetics
ABSTRACT
Lipid droplets (LDs) have increasingly been recognized as an essential organelle for eukaryotes. Although the biochemistry of lipid synthesis and degradation is well characterized, the regulation of LD dynamics, including their formation, maintenance, and secretion, is poorly understood. Here, we report that mice lacking Occludin (Ocln) show defective lipid metabolism. We show that LDs were larger than normal along their biogenesis and secretion pathway in Ocln null mammary cells. This defect in LD size control did not result from abnormal lipid synthesis or degradation; rather, it was because of secretion failure during the lactation stage. We found that OCLN was located on the LD membrane and was bound to essential regulators of lipid secretion, including BTN1a1 and XOR, in a C-terminus-dependent manner. Finally, OCLN was a phosphorylation target of Src kinase, whose loss causes lactation failure. Together, we demonstrate that Ocln is a downstream target of Src kinase and promotes LD secretion by binding to BTN1a1 and XOR.
Subject(s)
Lipid Droplets/physiology , Lipid Metabolism , Animal Mammary Glands/metabolism , Occludin/metabolism , Animals , Butyrophilins/metabolism , Female , Lactation/metabolism , Mice , Milk/metabolism , Occludin/genetics , src-Family Kinases/antagonists & inhibitors , src-Family Kinases/metabolism
ABSTRACT
Preadipocytes can give rise to either white adipocytes or beige adipocytes. Owing to their distinct abilities in nutrient storage and energy expenditure, strategies that specifically promote "beiging" of adipocytes hold great promise for counterbalancing obesity and metabolic diseases. Yet, factors dictating the differentiation fate of adipocyte progenitors remain to be elucidated. We found that stearoyl-coenzyme A desaturase 1 (Scd1)-deficient mice, which resist metabolic stress, possess augmentation in beige adipocytes under basal conditions. Deletion of Scd1 in mature adipocytes expressing Fabp4 or Ucp1 did not affect thermogenesis in mice. Rather, Scd1 deficiency shifted the differentiation fate of preadipocytes from white adipogenesis to beige adipogenesis. Such effects are dependent on succinate accumulation in adipocyte progenitors, which fuels mitochondrial complex II activity. Suppression of mitochondrial complex II by Atpenin A5 or oxaloacetic acid reverted the differentiation potential of Scd1-deficient preadipocytes to white adipocytes. Furthermore, supplementation of succinate was found to increase beige adipocyte differentiation both in vitro and in vivo. Our data reveal an unappreciated role of Scd1 in determining the cell fate of adipocyte progenitors through succinate-dependent regulation of mitochondrial complex II.
Subject(s)
Electron Transport Complex II/metabolism , Fats/metabolism , Obesity/enzymology , Stearoyl-CoA Desaturase/genetics , Succinic Acid/metabolism , Beige Adipocytes/cytology , Beige Adipocytes/metabolism , Adipogenesis , Animals , Energy Metabolism , Fatty Acid-Binding Proteins/genetics , Fatty Acid-Binding Proteins/metabolism , Female , Humans , Male , Mice , Inbred BALB C Mice , Knockout Mice , Obesity/genetics , Obesity/metabolism , Obesity/physiopathology , Stearoyl-CoA Desaturase/metabolism , Thermogenesis
ABSTRACT
BACKGROUND: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a highly pathogenic and contagious coronavirus that caused a global pandemic with 5.2 million fatalities to date. Questions concerning serologic features of long-term immunity, especially dominant epitopes mediating durable antibody responses after SARS-CoV-2 infection, remain to be elucidated. OBJECTIVE: We aimed to dissect the kinetics and longevity of immune responses in coronavirus disease 2019 (COVID-19) patients, as well as the epitopes responsible for sustained long-term humoral immunity against SARS-CoV-2. METHODS: We assessed SARS-CoV-2 immune dynamics up to 180 to 220 days after disease onset in 31 individuals who predominantly experienced moderate symptoms of COVID-19, then performed a proteome-wide profiling of dominant epitopes responsible for persistent humoral immune responses. RESULTS: Longitudinal analysis revealed sustained SARS-CoV-2 spike protein-specific antibodies and neutralizing antibodies in COVID-19 patients, along with activation of cytokine production at early stages after SARS-CoV-2 infection. Highly reactive epitopes that were capable of mediating long-term antibody responses were shown to be located at the spike and ORF1ab proteins. Key epitopes of the SARS-CoV-2 spike protein were mapped to the N-terminal domain of the S1 subunit and the S2 subunit, with varying degrees of sequence homology among endemic human coronaviruses and high sequence identity between the early SARS-CoV-2 (Wuhan-Hu-1) and current circulating variants. CONCLUSION: SARS-CoV-2 infection induces persistent humoral immunity in COVID-19-convalescent individuals by targeting dominant epitopes located at the spike and ORF1ab proteins that mediate long-term immune responses. Our findings provide a path to aid rational vaccine design and diagnostic development.
Subject(s)
COVID-19 , Antiviral Antibodies , Epitopes , Humans , Humoral Immunity , SARS-CoV-2 , Coronavirus Spike Glycoprotein
ABSTRACT
Mechanisms through which tissues are formed and maintained remain unknown but are fundamental aspects in biology. Tissue-specific gene expression is a valuable tool to study such mechanisms. But in many biomedical studies, cell lines, rather than human body tissues, are used to investigate biological mechanisms. Whether or not cell lines maintain their tissue-specific characteristics after they are isolated and cultured outside the human body remains to be explored. In this study, we applied a novel computational method to identify core genes that contribute to the differentiation of cell lines from various tissues. Several advanced computational techniques, such as the Monte Carlo feature selection method, the incremental feature selection method, and the support vector machine (SVM) algorithm, were incorporated in the proposed method, which extensively analyzed the gene expression profiles of cell lines from different tissues. As a result, we extracted a group of functional genes that can indicate the differences of cell lines in different tissues and built an optimal SVM classifier for identifying cell lines in different tissues. In addition, a set of rules for classifying cell lines was also reported, which can give a clearer picture of cell lines in different tissues, although its performance was not better than that of the optimal SVM classifier. Finally, we compared such genes with the tissue-specific genes identified by the Genotype-Tissue Expression project. Results showed that most expression patterns between tissues remained in the derived cell lines despite some uniqueness that some genes show tissue specificity.
ABSTRACT
Small nucleolar RNAs (snoRNAs) are a new type of functional small RNAs involved in the chemical modifications of rRNAs, tRNAs, and small nuclear RNAs. It is reported that they play important roles in tumorigenesis via various regulatory modes. snoRNAs can both participate in the regulation of methylation and pseudouridylation and regulate the expression pattern of their host genes. This research investigated the expression pattern of snoRNAs in eight major cancer types in TCGA via several machine learning algorithms. The expression levels of snoRNAs were first analyzed by a powerful feature selection method, Monte Carlo feature selection (MCFS). A feature list and some informative features were obtained. Then, incremental feature selection (IFS) was applied to the feature list to extract the optimal features/snoRNAs, with which the support vector machine (SVM) yields the best performance. The discriminative snoRNAs included HBII-52-14, HBII-336, SNORD123, HBII-85-29, HBII-420, U3, HBI-43, SNORD116, SNORA73B, SCARNA4, HBII-85-20, etc., on which the SVM can provide a Matthews correlation coefficient (MCC) of 0.881 for predicting these eight cancer types. On the other hand, the informative features were fed into the Johnson reducer and repeated incremental pruning to produce error reduction (RIPPER) algorithms to generate classification rules, which can clearly show different snoRNA expression patterns in different cancer types. The analysis results indicated that the extracted discriminative snoRNAs can be important for identifying cancer samples of different types and that the expression pattern of snoRNAs in different cancer types can be partly uncovered by quantitative recognition rules.
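The abstract reports a multi-class MCC but does not state how it was computed. As a minimal sketch, assuming the commonly used multi-class generalization of the MCC (Gorodkin's formulation), the coefficient can be calculated from true and predicted labels as follows. The function name `multiclass_mcc` and the toy labels are illustrative only and do not come from the study.

```python
from collections import Counter
from math import sqrt


def multiclass_mcc(y_true, y_pred):
    """Multi-class Matthews correlation coefficient (Gorodkin's formulation).

    Assumption: this is the generalization used for multi-class problems;
    the abstract itself does not specify the formula.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    s = len(y_true)                                      # total samples
    c = sum(1 for t, p in zip(y_true, y_pred) if t == p)  # correct predictions
    t_counts = Counter(y_true)                           # true occurrences per class
    p_counts = Counter(y_pred)                           # predictions per class
    labels = set(t_counts) | set(p_counts)
    sum_pt = sum(p_counts[k] * t_counts[k] for k in labels)
    sum_pp = sum(v * v for v in p_counts.values())
    sum_tt = sum(v * v for v in t_counts.values())
    denom = sqrt((s * s - sum_pp) * (s * s - sum_tt))
    if denom == 0:
        return 0.0  # convention when all samples fall in a single class
    return (c * s - sum_pt) / denom


if __name__ == "__main__":
    # Hypothetical labels for three cancer types, not data from the study.
    truth = ["BRCA", "BRCA", "LUAD", "LUAD", "KIRC", "KIRC"]
    guess = ["BRCA", "LUAD", "LUAD", "LUAD", "KIRC", "KIRC"]
    print(round(multiclass_mcc(truth, guess), 3))
```

For two classes this expression reduces to the familiar binary MCC, so the same function covers both cases.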
Subject(s)
Neoplastic Gene Expression Regulation , Machine Learning , Neoplasms/genetics , Small Nucleolar RNA/genetics , Algorithms , Humans , Monte Carlo Method , Support Vector Machine
ABSTRACT
Adult neural stem cells (NSCs) are a group of multi-potent, self-renewing progenitor cells that contribute to the generation of new neurons and oligodendrocytes. Three subtypes of NSCs can be isolated based on the stages of the NSC lineage, including quiescent neural stem cells (qNSCs), activated neural stem cells (aNSCs) and neural progenitor cells (NPCs). Although it is widely accepted that these three groups of NSCs play different roles in the development of the nervous system, their molecular signatures are poorly understood. In this study, we applied the Monte-Carlo Feature Selection (MCFS) method to identify the gene expression signatures, which can yield a Matthews correlation coefficient (MCC) value of 0.918 with a support vector machine evaluated by ten-fold cross-validation. In addition, some classification rules yielded by the MCFS program for distinguishing the above three subtypes were reported. Our results not only demonstrate a high classification capacity and subtype-specific gene expression patterns but also quantitatively reflect the pattern of the gene expression levels across the NSC lineage, providing insight into deciphering the molecular basis of NSC differentiation.
Subject(s)
Astrocytes/cytology , Gene Expression Profiling/methods , Gene Regulatory Networks , Neural Stem Cells/classification , Algorithms , Cell Lineage , Cultured Cells , Humans , Monte Carlo Method , Support Vector Machine
ABSTRACT
Hematopoiesis is a complicated process involving a series of biological sub-processes that lead to the formation of various blood components. A widely accepted model of early hematopoiesis proceeds from long-term hematopoietic stem cells (LT-HSCs) to multipotent progenitors (MPPs) and then to lineage-committed progenitors. However, the molecular mechanisms of early hematopoiesis have not been fully characterized. In this study, we applied a computational strategy to identify the gene expression signatures distinguishing three types of closely related hematopoietic cells collected in recent studies: (1) hematopoietic stem cell/multipotent progenitor cells; (2) LT-HSCs; and (3) hematopoietic progenitor cells. Each cell in these cell types was represented by its gene expression profile among a total number of 20,475 genes. The expression features were analyzed by a Monte-Carlo Feature Selection (MCFS) method, resulting in a feature list. Then, incremental feature selection (IFS) and a support vector machine (SVM) optimized with the sequential minimal optimization (SMO) algorithm were employed to obtain the optimal classifier, which had the highest Matthews correlation coefficient (MCC) value of 0.889 and used 6698 features to represent cells. In addition, through an updated program of the MCFS method, seventeen decision rules were obtained, which can classify the three cell types with an overall accuracy of 0.812. Using a literature review, both the rules and the top features used for building the optimal classifier were confirmed to be commonly used or potential biological markers for distinguishing the three cell types of HSPCs. This article is part of a Special Issue entitled: Accelerating Precision Medicine through Genetic and Genomic Big Data Analysis edited by Yudong Cai & Tao Huang.
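The incremental feature selection step described above can be sketched in a few lines: given a ranked feature list, a classifier is evaluated on the top 1, 2, ..., N features and the best-scoring prefix is kept. This sketch makes two loud substitutions to stay self-contained: a nearest-centroid classifier stands in for the SVM, and leave-one-out accuracy stands in for the MCC. All function names and the toy data are hypothetical.

```python
def nearest_centroid_loo_accuracy(X, y, feature_idx):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    Stand-in for the SVM and MCC used in the study (an assumption made
    to keep the sketch dependency-free).
    """
    correct = 0
    for i in range(len(X)):
        sums, counts = {}, {}
        # Build class centroids from every sample except the held-out one.
        for j, (row, label) in enumerate(zip(X, y)):
            if j == i:
                continue
            acc = sums.setdefault(label, [0.0] * len(feature_idx))
            for d, f in enumerate(feature_idx):
                acc[d] += row[f]
            counts[label] = counts.get(label, 0) + 1
        held_out = [X[i][f] for f in feature_idx]

        def squared_distance(label):
            return sum((s / counts[label] - v) ** 2
                       for s, v in zip(sums[label], held_out))

        predicted = min(sums, key=squared_distance)
        correct += predicted == y[i]
    return correct / len(X)


def incremental_feature_selection(X, y, ranked_features):
    """Return the best-scoring prefix of the ranked feature list and its score."""
    best_k, best_score = 0, -1.0
    for k in range(1, len(ranked_features) + 1):
        score = nearest_centroid_loo_accuracy(X, y, ranked_features[:k])
        if score > best_score:  # ties keep the smaller feature set
            best_k, best_score = k, score
    return ranked_features[:best_k], best_score


if __name__ == "__main__":
    # Toy expression matrix: feature 0 separates the classes, feature 1 is noise.
    X = [[0.1, 5.0], [0.2, 1.0], [0.0, 3.0], [1.0, 4.0], [0.9, 0.5], [1.1, 2.5]]
    y = ["A", "A", "A", "B", "B", "B"]
    print(incremental_feature_selection(X, y, [0, 1]))
```

Keeping the smaller prefix on ties is a design choice of this sketch, not something stated in the abstract.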
Subject(s)
Genetic Databases , Gene Expression Profiling/methods , Gene Expression Regulation/physiology , Hematopoietic Stem Cells/metabolism , Support Vector Machine , Transcriptome/physiology , Humans
ABSTRACT
Epigenetic regulation has long been recognized as a significant factor in various biological processes, such as development, transcriptional regulation, spermatogenesis, and chromosome stabilization. Epigenetic alterations lead to many human diseases, including cancer, depression, autism, and immune system defects. Although efforts have been made to identify epigenetic regulators, it remains a challenge to systematically uncover all the components of epigenetic regulation at the genome level using experimental approaches. Advances in constructing protein-protein interaction (PPI) networks provide an excellent opportunity to identify novel epigenetic factors computationally at the genome level. In this study, we identified potential epigenetic factors by using a computational method that applied the random walk with restart (RWR) algorithm to a PPI network using reported epigenetic factors as seed nodes. False positives were identified by their specific roles in the PPI network or by a low-confidence interaction and a weak functional relationship with epigenetic regulators. After filtering out the false positives, 26 candidate epigenetic factors were finally obtained. According to previous studies, 22 of these are thought to be involved in epigenetic regulation, suggesting the robustness of our method. Our study provides a novel computational approach that successfully identified 26 potential epigenetic factors, paving the way for a deeper understanding of epigenetic mechanisms.
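The random walk with restart idea can be illustrated with a short, dependency-free sketch. It iterates p(t+1) = (1 - r) W p(t) + r p(0), where W spreads each node's probability evenly over its neighbours and p(0) puts equal weight on the seed nodes. The restart probability of 0.8, the unweighted toy network, and the node names are assumptions for illustration; the abstract does not give the parameters or edge weights used.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.8, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a random walk with restart.

    adjacency: dict mapping each node to a list of neighbours (undirected,
    unweighted; every node is assumed to have at least one neighbour).
    seeds: set of seed nodes that receive the restart probability.
    """
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: jump back to the seeds with probability `restart`.
        nxt = {n: restart * p0[n] for n in nodes}
        # Walk term: each node passes its remaining mass evenly to neighbours.
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                continue
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


if __name__ == "__main__":
    # Hypothetical four-protein path network with one known epigenetic factor.
    ppi = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
    scores = random_walk_with_restart(ppi, seeds={"A"})
    for gene, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(gene, round(score, 4))
```

Non-seed nodes with the highest steady-state probability would then be the candidates passed on to the false-positive filtering described in the abstract.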
Subject(s)
Genetic Epigenesis , Gene Expression Regulation/genetics , Protein Interaction Maps/genetics , Algorithms , Computational Biology , Humans
ABSTRACT
OBJECTIVE: We performed closure of the patent ductus arteriosus (PDA) using a hybrid approach with an Amplatzer Duct Occluder. METHODS: Six patients (two males and four females) underwent PDA closure at a mean age of 7.8 months (range 2-24 months) and a mean weight of 6.6 kg (range 4.5-13 kg). The main pulmonary artery (MPA) was exposed via a minimally invasive left parasternal second intercostal space incision. Under transesophageal echocardiography guidance, the PDA occluder was implanted via direct puncture of the MPA. RESULTS: The procedure was successful in all patients with no residual shunt. There were no hospital deaths, and the postoperative course was uneventful. All patients were discharged on the 3rd to 4th day. There was no residual shunt in any patient on midterm follow-up. CONCLUSIONS: The novel hybrid approach is a safe, minimally invasive procedure. Further experience and longer follow-up of these patients are necessary to conclude whether this technique is applicable to all patients with a PDA.
Subject(s)
Cardiac Surgical Procedures/methods , Patent Ductus Arteriosus/surgery , Minimally Invasive Surgical Procedures/methods , Septal Occluder Device , Computer-Assisted Surgery/methods , Preschool Child , Transesophageal Echocardiography , Female , Humans , Infant , Male , Treatment Outcome
ABSTRACT
BACKGROUND: Oncogenes are genes that have the potential to cause cancer. Most normal cells undergo programmed cell death, namely apoptosis, but activated oncogenes can help cells avoid apoptosis and survive. Thus, studying oncogenes is helpful for obtaining a good understanding of the formation and development of various types of cancers. METHODS: In this study, we proposed a computational method, called OPM, for investigating oncogenes from the view of Gene Ontology (GO) and biological pathways. All investigated genes, including validated oncogenes retrieved from some public databases and other genes that have not been reported to be oncogenes thus far, were encoded into numeric vectors according to the enrichment theory of GO terms and KEGG pathways. Some popular feature selection methods, minimum redundancy maximum relevance and incremental feature selection, and an advanced machine learning algorithm, random forest, were adopted to analyze the numeric vectors to extract key GO terms and KEGG pathways. RESULTS: Along with the oncogenes, GO terms and KEGG pathways were discussed in terms of their relevance in this study. Some important GO terms and KEGG pathways were extracted using feature selection methods and were confirmed to be highly related to oncogenes. Additionally, the importance of these terms and pathways in predicting oncogenes was further demonstrated by finding new putative oncogenes based on them. CONCLUSIONS: This study investigated oncogenes based on GO terms and KEGG pathways. Some important GO terms and KEGG pathways were confirmed to be highly related to oncogenes. We hope that these GO terms and KEGG pathways can provide new insight for the study of oncogenes, particularly for building more effective prediction models to identify novel oncogenes. The program is available upon request. GENERAL SIGNIFICANCE: We hope that the new findings listed in this study may provide new insight for the investigation of oncogenes.
This article is part of a Special Issue entitled "System Genetics" Guest Editor: Dr. Yudong Cai and Dr. Tao Huang.
Subject(s)
Neoplasms/genetics , Oncogenes/genetics , Signal Transduction/genetics , Algorithms , Computational Biology/methods , Genetic Databases , Gene Ontology , Humans
ABSTRACT
BACKGROUND: Hepatitis is a type of infectious disease that induces inflammation of the liver without pinpointing a particular pathogen or pathogenesis. Type C hepatitis, as a type of hepatitis, has been reported to induce cirrhosis and hepatocellular carcinoma within a very short amount of time. It is a great threat to human health. Some studies have revealed that trace elements are associated with infection with and immune rejection against hepatitis C virus (HCV). However, the mechanism underlying this phenomenon is still unclear. METHODS: In this study, we aimed to expand our knowledge of this phenomenon by designing a computational method to identify genes that may be related to both HCV and trace element metabolic processes. The searching procedure included three stages. First, a shortest path algorithm was applied to a large network, constructed from protein-protein interactions, to identify potential genes of interest. Second, a permutation test was executed to exclude false discoveries. Finally, some rules based on the betweenness and associations between candidate genes and HCV and trace elements were built to select core genes among the remaining genes. RESULTS: Twelve lists of genes, corresponding to 12 types of trace elements, were obtained. These genes are deemed to be associated with HCV infection and trace element metabolism. CONCLUSIONS: The analyses indicate that some genes may be related to both HCV and trace element metabolic processes, further confirming the associations between HCV and trace elements. The method was further tested on another set of HCV genes; the results indicate that this method is quite robust. GENERAL SIGNIFICANCE: The newly found genes may partially reveal unknown mechanisms between HCV infection and trace element metabolism. This article is part of a Special Issue entitled "System Genetics" Guest Editor: Dr. Yudong Cai and Dr. Tao Huang.
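The first stage of the search procedure, finding genes that lie on shortest paths between two gene sets in an interaction network, can be sketched as below. This is a simplified illustration: it uses breadth-first search on an unweighted graph and keeps only one shortest path per gene pair, whereas the study may have used weighted edges and a different path algorithm. The gene names are placeholders, and the per-gene path count is only a rough betweenness-like tally.

```python
from collections import deque


def shortest_path(graph, source, target):
    """One shortest path between two nodes of an unweighted graph (BFS)."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:   # walk back through the parent links
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in graph.get(node, ()):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None  # the two nodes are not connected


def candidate_genes(graph, set_a, set_b):
    """Count how often each unknown gene is an inner node of a shortest path."""
    known = set(set_a) | set(set_b)
    counts = {}
    for a in set_a:
        for b in set_b:
            path = shortest_path(graph, a, b)
            if path is None:
                continue
            for gene in path[1:-1]:   # skip the two endpoints
                if gene not in known:
                    counts[gene] = counts.get(gene, 0) + 1
    return counts


if __name__ == "__main__":
    # Hypothetical network: H1, H2 are HCV-related; T1 is trace-element-related.
    ppi = {"H1": ["X"], "H2": ["X"], "X": ["H1", "H2", "T1"], "T1": ["X"]}
    print(candidate_genes(ppi, ["H1", "H2"], ["T1"]))
```

In the study, candidates of this kind were then screened by a permutation test and by rules on betweenness and known associations, which this sketch does not attempt to reproduce.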
Subject(s)
Hepacivirus/pathogenicity , Hepatitis C/genetics , Hepatitis C/metabolism , Host-Pathogen Interactions/genetics , Protein Interaction Maps/genetics , Trace Elements/adverse effects , Trace Elements/metabolism , Algorithms , Hepatocellular Carcinoma/etiology , Hepatocellular Carcinoma/genetics , Hepatocellular Carcinoma/metabolism , Hepatitis C/complications , Humans , Liver/metabolism , Liver Cirrhosis/etiology , Liver Cirrhosis/genetics , Liver Cirrhosis/metabolism , Liver Neoplasms/etiology , Liver Neoplasms/genetics , Liver Neoplasms/metabolism
ABSTRACT
Numerous eukaryotic genes are alternatively spliced. Recently, deep transcriptome sequencing has sharply increased the estimated proportion of alternatively spliced genes; over 95% of human multi-exon genes are alternatively spliced. One fundamental question is: are all these alternative splicing (AS) events functional? To look into this issue, we studied the most common form of alternative 5' splice sites, GYNNGYs (Y = C/T), where both GYs can function as splice sites. Global analyses suggest that splicing noise (due to the stochasticity of the splicing process) can cause AS at GYNNGYs, as evidenced by higher AS frequency in non-coding than in coding regions, in non-conserved than in conserved genes, and in lowly expressed than in highly expressed genes. However, ~20% of AS GYNNGYs in humans and ~3% in mice exhibit tissue-dependent regulation. Consistent with being functional, regulated GYNNGYs are more conserved than unregulated ones. Regulated GYNNGYs also have distinctive sequence features that may confer regulation. In particular, each regulated GYNNGY comprises two splice sites that resemble each other more closely than those of unregulated GYNNGYs, and has a more conserved downstream flanking intron. Intriguingly, most regulated GYNNGYs may tune gene expression through coupling with nonsense-mediated mRNA decay, rather than encode different proteins. In summary, AS at GYNNGY 5' splice sites is primarily splicing noise, and secondarily a way of regulation.
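To make the GYNNGY motif concrete, the following sketch scans a nucleotide sequence for the six-base pattern G-Y-N-N-G-Y (Y = C or T, N = any base), in which both GY dinucleotides are potential 5' splice sites four bases apart. It only locates the motif in a string; it does not decide whether a site is actually used in splicing. The function name and example sequence are illustrative.

```python
import re

# Lookahead so that overlapping motif occurrences are all reported.
GYNNGY = re.compile(r"(?=(G[CT][ACGT]{2}G[CT]))")


def find_gynngy(sequence):
    """Return (0-based start, motif) for every GYNNGY occurrence.

    RNA input is accepted: U is converted to T before matching.
    """
    dna = sequence.upper().replace("U", "T")
    return [(m.start(), m.group(1)) for m in GYNNGY.finditer(dna)]


if __name__ == "__main__":
    # Hypothetical exon-intron boundary fragment, not taken from the study.
    print(find_gynngy("AAGTAAGTCC"))
```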
Subject(s)
Alternative Splicing , RNA Splice Sites , Animals , Base Sequence , Conserved Sequence , Humans , Macaca mulatta , Mice , Organ Specificity
ABSTRACT
Cell-penetrating peptides, a group of short peptides, can traverse cell membranes to enter cells and thus facilitate the uptake of various molecular cargoes. Thus, they have the potential to become powerful drug delivery systems. The correct identification of peptides as cell-penetrating or non-cell-penetrating would accelerate this application. In this study, we determined which features were important for a peptide to be cell-penetrating or non-cell-penetrating and built a predictive model based on the key features extracted from this analysis. The investigated peptides were retrieved from a previous study, and each was encoded as a numeric vector according to six properties of amino acids (amino acid frequency, codon diversity, electrostatic charge, molecular volume, polarity, and secondary structure) by the pseudo-amino acid composition method. Methods of minimum redundancy maximum relevance and incremental feature selection were then employed to analyze these features, and some were found to be key determinants of cell penetration. In parallel, an optimal random forest prediction model was built. We hope that our findings will provide new resources for the study of cell-penetrating peptides.
Subject(s)
Cell-Penetrating Peptides/chemistry , Amino Acid Sequence , Codon , Decision Trees , Chemical Models , Molecular Models , Secondary Protein Structure
ABSTRACT
In female mammals most X-linked genes are subject to X-inactivation. However, in humans some X-linked genes escape silencing, these escapees being candidates for the phenotypic aberrations seen in polyX karyotypes. These escape genes have been reported to be under stronger purifying selection than other X-linked genes. Although it is known that escape from X-inactivation is much more common in humans than in mice, systematic assays of escape in humans have to date employed only interspecies somatic cell hybrids. Here we provide the first systematic next-generation sequencing analysis of escape in a human cell line. We analyzed RNA and genotype sequencing data obtained from B lymphocyte cell lines derived from Europeans (CEU) and Yorubans (YRI). By replicated detection of heterozygosis in the transcriptome, we identified 114 escaping genes, including 76 not previously known to be escapees. The newly described escape genes cluster on the X chromosome in the same chromosomal regions as the previously known escapees. There is an excess of escaping genes associated with mental retardation, consistent with this being a common phenotype of polyX phenotypes. We find both differences between populations and between individuals in the propensity to escape. Indeed, we provide the first evidence for there being both hyper- and hypo-escapee females in the human population, consistent with the highly variable phenotypic presentation of polyX karyotypes. Considering also prior data, we reclassify genes as being always, never, and sometimes escape genes. We fail to replicate the prior claim that genes that escape X-inactivation are under stronger purifying selection than others.
Subject(s)
Gene Expression , X-Linked Genes , Intellectual Disability/genetics , X Chromosome Inactivation , Animals , Asian People/genetics , Cell Line , Molecular Evolution , Female , Genetic Variation , Humans , Male , Mice , Mutation Rate , Phenotype , Single Nucleotide Polymorphism , RNA Sequence Analysis , White People/genetics , X Chromosome
ABSTRACT
We report on a patient with a 47,XXY karyotype who presents a normal female phenotype, which is an extremely rare observation worldwide. The patient is infertile. Type B ultrasound scans and other tests suggested that her ovaries had completely failed. Microsatellite DNA marker analysis revealed that the 2 X chromosomes were derived from her mother and that this abnormality was caused by non-disjunction of the maternal X chromosomes during meiosis II. Copy number variation analysis identified 2 large de novo deletions in her Y chromosome. Remarkably, one of the deleted regions includes the SRY gene locus, which might explain her female phenotype. However, the genetic mechanism of her ovarian failure remains unclear. This paper is the first report of a 47,XXY female with ovarian failure.
Subject(s)
Klinefelter Syndrome/diagnosis , Primary Ovarian Insufficiency/diagnosis , Adult , Human Y Chromosomes/genetics , Female , Gene Deletion , sry Genes , Humans , Klinefelter Syndrome/genetics , Molecular Diagnostic Techniques , Pedigree , Single Nucleotide Polymorphism , Primary Ovarian Insufficiency/genetics
ABSTRACT
Genome-wide interaction-based association (GWIBA) analysis has the potential to identify novel susceptibility loci. These interaction effects could be missed with the prevailing approaches in genome-wide association studies (GWAS). However, no convincing loci have been discovered exclusively from GWIBA methods, and the intensive computation involved is a major barrier for application. Here, we developed a fast, multi-thread/parallel program named "pair-wise interaction-based association mapping" (PIAM) for exhaustive two-locus searches. With this program, we performed a complete GWIBA analysis on seven diseases with stringent control for false positives, and we validated the results for three of these diseases. We identified one pair-wise interaction between a previously identified locus, C1orf106, and one new locus, TEC, that was specific for Crohn's disease, with a Bonferroni corrected P < 0.05 (P = 0.039). This interaction was replicated with a pair of proxy linked loci (P = 0.013) on an independent dataset. Five other interactions had corrected P < 0.5. We identified the allelic effect of a locus close to SLC7A13 for coronary artery disease. This was replicated with a linked locus on an independent dataset (P = 1.09 × 10⁻⁷). Through a local validation analysis that evaluated association signals, rather than locus-based associations, we found that several other regions showed association/interaction signals with nominal P < 0.05. In conclusion, this study demonstrated that the GWIBA approach was successful for identifying novel loci, and the results provide new insights into the genetic architecture of common diseases. In addition, our PIAM program was capable of handling very large GWAS datasets that are likely to be produced in the future.
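The scale of an exhaustive two-locus search, and why Bonferroni control is so stringent in this setting, can be shown with simple arithmetic. The sketch below counts the pair-wise tests for a given number of loci and applies the standard Bonferroni adjustment. The locus count in the example is hypothetical; the abstract does not report how many loci or tests were involved.

```python
def n_pairwise_tests(n_loci):
    """Number of distinct locus pairs in an exhaustive two-locus search."""
    return n_loci * (n_loci - 1) // 2


def bonferroni_adjust(p_value, n_tests):
    """Bonferroni-corrected P value, capped at 1."""
    return min(1.0, p_value * n_tests)


def bonferroni_threshold(alpha, n_tests):
    """Nominal P value a single test must beat to reach corrected significance."""
    return alpha / n_tests


if __name__ == "__main__":
    n_loci = 1000  # hypothetical, for illustration only
    tests = n_pairwise_tests(n_loci)
    print(tests)
    print(bonferroni_threshold(0.05, tests))
    print(bonferroni_adjust(1e-7, tests))
```

Because the number of pairs grows quadratically with the number of loci, the nominal threshold shrinks quickly, which is consistent with the abstract's remark that few interactions survive correction.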
Subject(s)
Disease/genetics , Genetic Loci/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Coronary Artery Disease/genetics , Crohn Disease/genetics , Type 1 Diabetes Mellitus/genetics , Type 2 Diabetes Mellitus/genetics , Female , Humans , Male , Single Nucleotide Polymorphism , Reproducibility of Results
ABSTRACT
Probabilistic latent variable models (PLVMs), such as probabilistic principal component analysis (PPCA), are widely employed in process monitoring and fault detection of industrial processes. This article proposes a novel deep PPCA (DePPCA) model, which has the advantages of both probabilistic modeling and deep learning. The construction of DePPCA includes a greedy layer-wise pretraining phase and a unified end-to-end fine-tuning phase. The former establishes a hierarchical deep structure based on cascading multiple layers of the PPCA module to extract high-level features. The latter builds an end-to-end connection between the raw inputs and the final outputs to further improve the representation of the model to high-level features. After constructing the model structure of DePPCA, we first present the detailed training processes of the pretraining and fine-tuning stages, then clarify the theoretical merits of the proposed model from the perspective of variational inference. For process monitoring purposes, we develop two statistics based on the established DePPCA. The monitoring performance of these two statistics can remain superior even if the features extracted by DePPCA are significantly compressed to univariate. This makes the feature extraction process and online monitoring procedure of DePPCA quite fast. In other words, the proposed DePPCA can achieve accurate and efficient process monitoring by only extracting one feature for each sample. Finally, the effectiveness of DePPCA is evaluated on the Tennessee Eastman (TE) process and the multiphase flow (MPF) facility.
ABSTRACT
Tumor heterogeneity, the presence of multiple distinct subpopulations of cancer cells between patients or among the same tumors, poses a major challenge to current targeted therapies. The way these different subpopulations interact among themselves and the stromal niche environment, and how such interactions affect cancer stem cell behavior has remained largely unknown. Here, it is shown that an FGF-BMP7-INHBA signaling positive feedback loop integrates interactions among different cell populations, including mammary gland stem cells, luminal epithelial and stromal fibroblast niche components not only in organ regeneration but also, with certain modifications, in cancer progression. The reciprocal dependence of basal stem cells and luminal epithelium is based on basal-derived BMP7 and luminal-derived INHBA, which promote their respective expansion, and is regulated by stromal-epithelial FGF signaling. Targeting this interaction loop, for example, by reducing the function of one or more of its components, inhibits organ regeneration and breast cancer progression. The results have profound implications for overcoming drug resistance because of tumor heterogeneity in future targeted therapies.
Subject(s)
Breast Neoplasms , Stem Cell Niche , Humans , Breast Neoplasms/metabolism , Breast Neoplasms/pathology , Breast Neoplasms/genetics , Animals , Female , Stem Cell Niche/physiology , Neoplastic Stem Cells/metabolism , Signal Transduction , Mice , Epithelial Cells/metabolism , Bone Morphogenetic Protein 7/metabolism , Bone Morphogenetic Protein 7/genetics , Tumor Microenvironment
ABSTRACT
Owing to its high heterogeneity, ovarian cancer (OC) is a leading cause of cancer-related death among women. As the most aggressive and frequent subtype of OC, high-grade serous cancer (HGSC) represents around 70% of all patients. With the rapid progress of single-cell RNA sequencing (scRNA-seq), unique and subtle changes among different cell states have been identified, including novel risk genes and pathways. Here, we aimed to identify differentially correlated core genes between normal and tumor status through HGSC scRNA-seq data analysis. The R package high-dimension Weighted Gene Co-expression Network Analysis (hdWGCNA) was used to build gene interaction networks based on HGSC scRNA-seq data. DiffCorr was integrated to identify differentially correlated genes between tumors and their adjacent normal counterparts. Cytoscape was used to construct and visualize biological networks. Real-time qPCR (RT-qPCR) was used to confirm the expression pattern of new genes. We introduced ScHGSC-IGDC (Identifying Genes with Differential Correlations of HGSC based on scRNA-seq analysis), an in silico framework for identifying core genes in the development of HGSC. We detected thirty-four modules in the network. Scores of new genes with opposite correlations with others, such as NDUFS5, TMSB4X, SERPINE2 and ITPR2, were identified. Further survival and literature validation emphasized their value in HGSC management. Meanwhile, RT-qPCR verified the expression pattern of NDUFS5, TMSB4X, SERPINE2 and ITPR2 in human OC cell lines and tissues. Our research offers novel perspectives on gene modulatory mechanisms at single-cell resolution, guiding network-based algorithms in the field of cancer etiology.
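The notion of a "differential correlation" between tumor and normal samples can be illustrated with Fisher's z-transformation, a standard test for whether two correlation coefficients differ. This is a minimal sketch in Python of that general statistical idea, under the assumption that a test of this kind underlies the analysis; it is not the DiffCorr R package itself, and the correlation values and sample sizes below are invented for illustration.

```python
from math import atanh, erfc, sqrt


def differential_correlation(r1, n1, r2, n2):
    """Compare two Pearson correlations using Fisher's z-transformation.

    r1, n1: correlation of a gene pair and sample count in condition 1.
    r2, n2: the same in condition 2.
    Returns (z statistic, two-sided P value under a normal approximation).
    """
    if n1 <= 3 or n2 <= 3:
        raise ValueError("each condition needs more than three samples")
    z1, z2 = atanh(r1), atanh(r2)                    # Fisher transform
    se = sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))       # standard error of z1 - z2
    z = (z1 - z2) / se
    p_value = erfc(abs(z) / sqrt(2.0))               # two-sided normal tail
    return z, p_value


if __name__ == "__main__":
    # Hypothetical gene pair: positively correlated in normal cells,
    # negatively correlated in tumor cells.
    z, p = differential_correlation(0.8, 50, -0.5, 50)
    print(round(z, 2), p)
```

A gene pair whose correlation flips sign between conditions, as in this toy example, is the kind of "opposite correlation" the abstract highlights.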