1.
Biol Sex Differ ; 15(1): 62, 2024 Aug 06.
Article in English | MEDLINE | ID: mdl-39107837

ABSTRACT

BACKGROUND: Lung adenocarcinoma (LUAD) has been observed to have significant sex differences in incidence, prognosis, and response to therapy. However, the molecular mechanisms responsible for these disparities have not been investigated extensively. METHODS: Sample-specific gene regulatory network methods were used to analyze RNA sequencing data from non-cancerous human lung samples from The Genotype Tissue Expression Project (GTEx) and lung adenocarcinoma primary tumor samples from The Cancer Genome Atlas (TCGA); results were validated on independent data. RESULTS: We found that genes associated with key biological pathways including cell proliferation, immune response and drug metabolism are differentially regulated between males and females in both healthy lung tissue and tumor, and that these regulatory differences are further perturbed by tobacco smoking. We also discovered significant sex bias in transcription factor targeting patterns of clinically actionable oncogenes and tumor suppressor genes, including AKT2 and KRAS. Using differentially regulated genes between healthy and tumor samples in conjunction with a drug repurposing tool, we identified several small-molecule drugs that might have sex-biased efficacy as cancer therapeutics and further validated this observation using an independent cell line database. CONCLUSIONS: These findings underscore the importance of including sex as a biological variable and considering gene regulatory processes in developing strategies for disease prevention and management.


Lung adenocarcinoma (LUAD) is a disease that affects males and females differently. Biological sex influences not only the chances of developing the disease, but also how the disease progresses and how effective various therapies may be. We analyzed sex-specific gene regulatory networks consisting of transcription factors and the genes they regulate in both healthy lung tissue and in LUAD and identified sex-biased differences. We found that genes associated with cell proliferation, immune response, and drug metabolism are differentially targeted by transcription factors between males and females. We also found that several genes that are drug targets in LUAD are regulated differently between males and females. Importantly, these differences are also influenced by an individual's smoking history. Extending our analysis using a drug repurposing tool, we found candidate drugs with evidence that they might work better for one sex or the other. These results demonstrate that considering the differences in gene regulation between males and females will be essential if we are to develop precision medicine strategies for preventing and treating LUAD.


Subject(s)
Adenocarcinoma of Lung , Gene Regulatory Networks , Adenocarcinoma of Lung/diagnosis , Adenocarcinoma of Lung/genetics , Adenocarcinoma of Lung/therapy , Sex Factors , Gene Expression Regulation, Neoplastic/genetics , Lung/metabolism , Tobacco Smoking/adverse effects , Prognosis , Immunotherapy , Molecular Targeted Therapy , Cell Line, Tumor , Humans , Male , Female , Drug Discovery
2.
Front Genet ; 15: 1403587, 2024.
Article in English | MEDLINE | ID: mdl-39192888

ABSTRACT

Introduction: The tumor microenvironment and immune-related genes (IRGs) are highly correlated with tumor occurrence, progression, and prognosis. However, their roles in grade II and III gliomas, termed LGGs in this study, remain to be fully elucidated. Our research aims to develop immune-related features for risk stratification and prognosis prediction in LGG. Methods: Using the ssGSEA method, we assessed the immune characteristics of the LGG population. We conducted differential analysis using LGG samples from the TCGA database and normal samples from GTEx, identifying 412 differentially expressed immune-related genes (DEIRGs). Subsequently, we utilized univariate Cox, LASSO, and multivariate Cox regression analyses to establish both a gene predictive model and a nomogram predictive model. Results: We found that the ESTIMATE score, immune score and stromal score of high-immunity, high-grade and isocitrate dehydrogenase (IDH) wild-type gliomas were higher than those of their respective counterparts, and the tumor purity was lower. Higher ESTIMATE scores, stromal scores and immune scores indicated a poor prognosis in patients with LGG. Our four-gene prognostic model demonstrated superior accuracy compared to other molecular features. Validation using the CGGA as a testing set and the combined TCGA and CGGA cohort confirmed its robust prognostic value. Additionally, a nomogram integrating the prognostic model and clinical variables showed enhanced predictive capability. Discussion: Our study highlights the prognostic significance of the four identified DEIRGs (KLRC3, MR1, PDIA2, and RFXAP) in LGG patients. The predictive model and nomogram developed herein offer valuable tools for personalized treatment strategies in LGG. Future research should focus on further validating these findings and exploring the functional roles of these DEIRGs within the LGG tumor microenvironment.

3.
Cell Genom ; 4(8): 100605, 2024 Aug 14.
Article in English | MEDLINE | ID: mdl-38981476

ABSTRACT

Crosstalk between N6-methyladenosine (m6A) and epigenomes is crucial for gene regulation, but its regulatory directionality and disease significance remain unclear. Here, we utilize quantitative trait loci (QTLs) as genetic instruments to delineate directional maps of crosstalk between m6A and two epigenomic traits, DNA methylation (DNAme) and H3K27ac. We identify 47 m6A-to-H3K27ac and 4,733 m6A-to-DNAme and, in the reverse direction, 106 H3K27ac-to-m6A and 61,775 DNAme-to-m6A regulatory loci, with differential genomic location preference observed for different regulatory directions. Integrating these maps with complex diseases, we prioritize 20 genome-wide association study (GWAS) loci for neuroticism, depression, and narcolepsy in brain; 1,767 variants for asthma and expiratory flow traits in lung; and 249 for coronary artery disease, blood pressure, and pulse rate in muscle. This study establishes disease regulatory paths, such as rs3768410-DNAme-m6A-asthma and rs56104944-m6A-DNAme-hypertension, uncovering locus-specific crosstalk between m6A and epigenomic layers and offering insights into regulatory circuits underlying human diseases.


Subject(s)
Adenosine , DNA Methylation , Epigenomics , Genome-Wide Association Study , Quantitative Trait Loci , Humans , Adenosine/analogs & derivatives , Adenosine/metabolism , Adenosine/genetics , Epigenomics/methods , Epigenesis, Genetic , Epigenome/genetics , Transcriptome , Histones/metabolism , Histones/genetics
4.
Mol Cell Probes ; 76: 101971, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38977039

ABSTRACT

OBJECTIVE: This study aimed to determine the effect and mechanism of ZIC2 on immune infiltration in lung adenocarcinoma (LUAD). METHODS: Expression of ZIC2 in several kinds of normal tissues in TCGA data was analyzed, and its correlation with the baseline characteristics of LUAD patients was analyzed. The immune infiltration analysis of LUAD patients was performed with the CIBERSORT algorithm. The correlation analysis between ZIC2 and immune cell composition was performed. Additionally, the potential upstream regulatory mechanisms of ZIC2 were predicted to identify the possible miRNAs and lncRNAs that regulate ZIC2 in LUAD. In vitro and in vivo experiments were also conducted to confirm the potential effect of ZIC2 on the proliferation and invasion ability of LUAD cells. RESULTS: ZIC2 expression was decreased in various normal tissues, but increased in multiple tumors, including LUAD, and correlated with the prognosis of LUAD patients. Enrichment by GO and KEGG suggested the possible association of ZIC2 with the cell cycle and the p53 signaling pathway. ZIC2 expression was significantly correlated with resting CD4 memory T cells, M1 macrophages, and plasma cells, indicating that dysregulated ZIC2 expression in LUAD may directly influence immune infiltration. ZIC2 might be regulated by several different lncRNA-mediated ceRNA mechanisms. In vitro experiments validated the promotive effect of ZIC2 on the viability and invasion ability of LUAD cells. In vivo experiments validated that ZIC2 can accelerate tumor growth in nude mice. CONCLUSION: ZIC2, regulated by different lncRNA-mediated ceRNA mechanisms, may play a critical regulatory role in LUAD by mediating the composition of immune cells in the tumor microenvironment.


Subject(s)
Adenocarcinoma of Lung , Cell Proliferation , Computational Biology , Gene Expression Regulation, Neoplastic , Lung Neoplasms , MicroRNAs , RNA, Long Noncoding , Transcription Factors , Humans , Transcription Factors/genetics , Transcription Factors/metabolism , Adenocarcinoma of Lung/genetics , Adenocarcinoma of Lung/immunology , Adenocarcinoma of Lung/pathology , Lung Neoplasms/genetics , Lung Neoplasms/immunology , Lung Neoplasms/pathology , Animals , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , MicroRNAs/genetics , MicroRNAs/metabolism , Cell Proliferation/genetics , Cell Line, Tumor , Mice , Mice, Nude , Nuclear Proteins/genetics , Nuclear Proteins/metabolism , RNA, Competitive Endogenous
5.
Sci Rep ; 14(1): 12454, 2024 May 30.
Article in English | MEDLINE | ID: mdl-38816574

ABSTRACT

Housekeeping protein-coding genes are stably expressed genes in cells and tissues that are thought to be engaged in fundamental cellular biological functions. They are often utilized as normalization references in molecular biology research and are especially important in integrated bioinformatic investigations. Prior studies have examined human housekeeping protein-coding genes by analyzing various gene expression datasets. The inclusion of different tissue types significantly impacted the discovery of housekeeping genes. In this report, we specifically investigated expression differences among individual human subjects in protein-coding genes across different tissue types. We used GTEx V8 gene expression datasets obtained from more than 16,000 human normal tissue samples. The Gini index was used to investigate the expression variation of protein-coding genes across tissues and individual donors. Housekeeping protein-coding genes found using Gini index profiles may vary depending on the tissue subtypes investigated, particularly given the widely differing sample sizes across the GTEx tissue subtypes. We subsequently selected major tissues and identified subsets of housekeeping genes with stable expression levels among human donors within those tissues. In this work, we provide alternative sets of housekeeping protein-coding genes that show more consistent expression patterns in human subjects across major solid organs. Weblink: https://hpsv.ibms.sinica.edu.tw
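
The abstract above names the Gini index as its measure of expression variation but does not give a formula or implementation. The following is a minimal sketch of the standard rank-based Gini index, assuming non-negative expression values (for example, TPM) for one gene across samples. The function name `gini_index` is hypothetical and is not taken from the paper or its website.

```python
from typing import Sequence


def gini_index(values: Sequence[float]) -> float:
    """Gini index of non-negative values.

    0 means the values are perfectly even (a housekeeping-like profile);
    values approaching 1 mean expression is concentrated in a few samples.
    This is the textbook rank-based formula, assumed here for illustration.
    """
    xs = sorted(values)          # ascending order is required by the formula
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0               # no expression at all: treat as perfectly even
    # Sum of rank * value, with ranks starting at 1 for the smallest value.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n
```

Under this formula a gene expressed at the same level in every sample scores 0, while a gene expressed in only one of four samples scores 0.75. A low score across both tissues and donors is the kind of profile the abstract describes as housekeeping-like.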


Subject(s)
Genes, Essential , Humans , Gene Expression Profiling/methods , Computational Biology/methods , Organ Specificity/genetics , Databases, Genetic
6.
Genomics ; 116(3): 110852, 2024 May.
Article in English | MEDLINE | ID: mdl-38703969

ABSTRACT

Autophagy, a highly conserved process of protein and organelle degradation, has emerged as a critical regulator in various diseases, including cancer progression. In the context of liver cancer, the predictive value of autophagy-related genes remains ambiguous. Leveraging chip datasets from the TCGA and GTEx databases, we identified 23 differentially expressed autophagy-related genes in liver cancer. Notably, five key autophagy genes, PRKAA2, BIRC5, MAPT, IGF1, and SPNS1, were highlighted as potential prognostic markers, with MAPT showing significant overexpression in clinical samples. In vitro cellular assays further demonstrated that MAPT promotes liver cancer cell proliferation, migration, and invasion by inhibiting autophagy and suppressing apoptosis. Subsequent in vivo studies further corroborated the pro-tumorigenic role of MAPT by suppressing autophagy. Collectively, our model based on the five key genes provides a promising tool for predicting liver cancer prognosis, with MAPT emerging as a pivotal factor in tumor progression through autophagy modulation.


Subject(s)
Autophagy , Liver Neoplasms , tau Proteins , Humans , Liver Neoplasms/genetics , Liver Neoplasms/pathology , Liver Neoplasms/metabolism , Autophagy/genetics , tau Proteins/genetics , tau Proteins/metabolism , Prognosis , Cell Line, Tumor , Survivin/genetics , Survivin/metabolism , Cell Proliferation , Animals , Insulin-Like Growth Factor I/genetics , Insulin-Like Growth Factor I/metabolism , Biomarkers, Tumor/genetics , Cell Movement , Mice , Apoptosis , Gene Expression Regulation, Neoplastic , Carcinoma, Hepatocellular/genetics , Carcinoma, Hepatocellular/pathology , Carcinoma, Hepatocellular/metabolism
7.
Traffic ; 25(4): e12933, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38600522

ABSTRACT

Macroautophagy/autophagy is an essential catabolic process that targets a wide variety of cellular components including proteins, organelles, and pathogens. ATG7, a protein involved in the autophagy process, plays a crucial role in maintaining cellular homeostasis and can contribute to the development of diseases such as cancer. ATG7 initiates autophagy by facilitating the lipidation of the ATG8 proteins in the growing autophagosome membrane. The noncanonical isoform ATG7(2) is unable to perform ATG8 lipidation; however, its cellular regulation and function are unknown. Here, we uncovered a distinct regulation and function of ATG7(2) in contrast with ATG7(1), the canonical isoform. First, affinity-purification mass spectrometry analysis revealed that ATG7(2) establishes direct protein-protein interactions (PPIs) with metabolic proteins, whereas ATG7(1) primarily interacts with autophagy machinery proteins. Furthermore, we identified that ATG7(2) mediates a decrease in metabolic activity, highlighting a novel splice-dependent function of this important autophagy protein. We then found a divergent expression pattern of ATG7(1) and ATG7(2) across human tissues. In conclusion, our work uncovers the divergent patterns of expression, protein interactions, and function of ATG7(2) in contrast to ATG7(1). These findings suggest a molecular switch between main catabolic processes through isoform-dependent expression of a key autophagy gene.


Subject(s)
Autophagy , Energy Metabolism , Humans , Autophagosomes/metabolism , Autophagy-Related Proteins/metabolism , Microtubule-Associated Proteins/metabolism , Protein Isoforms/metabolism
8.
Lab Invest ; 104(6): 102069, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38670317

ABSTRACT

Tissue gene expression studies are impacted by biological and technical sources of variation, which can be broadly classified into wanted and unwanted variation. The latter, if not addressed, results in misleading biological conclusions. Methods have been proposed to reduce unwanted variation, such as normalization and batch correction. A more accurate understanding of all causes of variation could significantly improve the ability of these methods to remove unwanted variation while retaining variation corresponding to the biological question of interest. We used 17,282 samples from 49 human tissues in the Genotype-Tissue Expression data set (v8) to investigate patterns and causes of expression variation. Transcript expression was transformed to z-scores, and only the most variable 2% of transcripts were evaluated and clustered based on coexpression patterns. Clustered gene sets were assigned to different biological or technical causes based on histologic appearances and metadata elements. We identified 522 variable transcript clusters (median: 11 per tissue) among the samples. Of these, 63% were confidently explained, 16% were likely explained, 7% were low confidence explanations, and 14% had no clear cause. Histologic analysis annotated 46 clusters. Other common causes of variability included sex, sequencing contamination, immunoglobulin diversity, and compositional tissue differences. Less common biological causes included death interval (Hardy score), disease status, and age. Technical causes included blood draw timing and harvesting differences. Many of the causes of variation in bulk tissue expression were identifiable in the Tabula Sapiens data set of single-cell expression. This is among the largest explorations of the underlying sources of tissue expression variation. It uncovered expected and unexpected causes of variable gene expression and demonstrated the utility of matched histologic specimens. It further demonstrated the value of acquiring meaningful tissue harvesting metadata elements to use for improved normalization, batch correction, and analysis of both bulk and single-cell RNA-seq data.
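
The abstract above describes transforming transcript expression to z-scores and keeping only the most variable 2% of transcripts, without stating how variability was ranked. The sketch below shows one way those two steps could look. It assumes that variability is ranked by the population variance of the untransformed values, since every transcript has unit variance after z-scoring. The function names `zscores` and `most_variable` are hypothetical.

```python
from statistics import mean, pstdev, pvariance
from typing import Dict, List


def zscores(values: List[float]) -> List[float]:
    """Center one transcript's values on their mean and scale by the population SD."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0] * len(values)   # a constant transcript carries no variation
    return [(v - mu) / sd for v in values]


def most_variable(expr: Dict[str, List[float]], fraction: float = 0.02) -> List[str]:
    """Return the IDs of the top `fraction` of transcripts by variance.

    Ranking by raw variance is an assumption made for this illustration;
    the abstract does not say which variability measure was used.
    """
    ranked = sorted(expr, key=lambda t: pvariance(expr[t]), reverse=True)
    keep = max(1, round(len(ranked) * fraction))   # always keep at least one
    return ranked[:keep]
```

A typical use would call `most_variable` on the expression table for one tissue and then apply `zscores` to each retained transcript before clustering on coexpression.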


Subject(s)
Gene Expression Profiling , Humans , Organ Specificity , Cluster Analysis
9.
Physiol Rep ; 12(7): e15995, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38561245

ABSTRACT

Exercise has different effects on different tissues in the body, the sum of which may determine the response to exercise and the health benefits. In the present study, we aimed to investigate whether physical training regulates transcriptional network communities common to both skeletal muscle (SM) and subcutaneous adipose tissue (SAT). Eight such shared transcriptional communities were found in both tissues. Eighteen young overweight adults voluntarily participated in 7 weeks of combined strength and endurance training (five training sessions per week). Biopsies were taken from SM and SAT before and after training. Five of the network communities were regulated by training in SM but showed no change in SAT. One community involved in insulin-AMPK signaling and glucose utilization was upregulated in SM but downregulated in SAT. This diverging exercise regulation was confirmed in two independent studies and was also associated with BMI and diabetes in an independent cohort. Thus, the current finding is consistent with the differential responses of different tissues and suggests that body composition may influence the observed individual whole-body metabolic response to exercise training and help explain the observed attenuated whole-body insulin sensitivity after exercise training, even if it has significant effects on the exercising muscle.


Subject(s)
Insulin Resistance , Obesity , Adult , Humans , Obesity/metabolism , Muscle, Skeletal/metabolism , Exercise/physiology , Subcutaneous Fat/metabolism , Insulin/metabolism , Insulin Resistance/physiology , Gene Expression , Adipose Tissue/metabolism
10.
Front Neurosci ; 18: 1358998, 2024.
Article in English | MEDLINE | ID: mdl-38445255

ABSTRACT

Alzheimer's disease (AD) is a progressive neurodegenerative disorder that affects over 50 million elderly individuals worldwide. Although the pathogenesis of AD is not fully understood, current research has enabled the identification of potential biomarker genes and proteins that may serve as effective targets against AD. This article aims to present a comprehensive overview of recent advances in AD biomarker identification, with highlights on the use of various algorithms, the exploration of relevant biological processes, and the investigation of shared biomarkers with co-occurring diseases. Additionally, this article includes a statistical analysis of key genes reported in the research literature, and identifies the intersection with AD-related gene sets from databases such as AlzGen, GeneCard, and DisGeNet. For these gene sets, besides enrichment analysis, protein-protein interaction (PPI) networks were utilized to identify central genes among the overlapping genes. Enrichment analysis, protein interaction network analysis, and tissue-specific connectedness analysis based on the GTEx database were performed on multiple groups of overlapping genes. Our work has laid the foundation for a better understanding of the molecular mechanisms of AD and more accurate identification of key AD markers.

11.
bioRxiv ; 2024 Jan 23.
Article in English | MEDLINE | ID: mdl-38328080

ABSTRACT

Background: Gene co-expression networks (GCNs) describe relationships among expressed genes key to maintaining cellular identity and homeostasis. However, the sample size of a typical RNA-seq experiment, which is several orders of magnitude smaller than the number of genes, is too low to infer GCNs reliably. recount3, a publicly available dataset comprising 316,443 uniformly processed human RNA-seq samples, provides an opportunity to improve power for accurate network reconstruction and obtain biological insight from the resulting networks. Results: We compared alternate aggregation strategies to identify an optimal workflow for GCN inference by data aggregation and inferred three consensus networks: a universal network, a non-cancer network, and a cancer network, in addition to 27 tissue context-specific networks. Central network genes from our consensus networks were enriched for evolutionarily constrained genes and ubiquitous biological pathways, whereas central context-specific network genes included tissue-specific transcription factors, and factorization based on the hubs led to clustering of related tissue contexts. We discovered that annotations corresponding to context-specific networks inferred from aggregated data were enriched for trait heritability beyond known functional genomic annotations and were significantly more enriched when we aggregated over a larger number of samples. Conclusion: This study outlines best practices for GCN inference and evaluation by data aggregation. We recommend estimating and regressing confounders in each data set before aggregation and prioritizing large sample size studies for GCN reconstruction. Increased statistical power in inferring context-specific networks enabled the derivation of variant annotations that were enriched for concordant trait heritability independent of functional genomic annotations that are context-agnostic. While we observed strictly increasing held-out log-likelihood with data aggregation, we noted diminishing marginal improvements. Future directions aimed at alternate methods for estimating confounders and integrating orthogonal information from modalities such as Hi-C and ChIP-seq can further improve GCN inference.

12.
Mol Cell Proteomics ; 23(2): 100719, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38242438

ABSTRACT

Although the human gene annotation has been continuously improved over the past 2 decades, numerous studies demonstrated the existence of a "dark proteome", consisting of proteins that were critical for biological processes but not included in widely used gene catalogs. The Genotype-Tissue Expression project generated more than 15,000 RNA-seq datasets from multiple tissues, which modeled 30 million transcripts in the human genome. To provide a resource of high-confidence novel proteins from the dark proteome, we screened 50,000 mass spectrometry runs from over 900 projects to identify proteins translated from the Genotype-Tissue Expression transcript model with proteomic support. We also integrated 3.8 million common genetic variants from the gnomAD database to improve peptide identification. As a result, we identified 170,529 novel peptides with proteomic evidence, of which 6048 passed the strictest standard we defined and were supported by PepQuery. We provided a user-friendly website (https://ncorf.genes.fun/) for researchers to check the evidence of novel peptides from their studies. The findings will improve our understanding of coding genes and facilitate genomic data interpretation in biomedical research.


Subject(s)
Proteogenomics , Humans , Proteogenomics/methods , Proteome/metabolism , Proteomics/methods , Peptides/genetics , Genome, Human
13.
Heliyon ; 9(11): e21283, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37920490

ABSTRACT

Human endogenous retroviruses (HERVs) are remnants of ancient retroviral infections in the human genome. RNA expression of individual HERVs has frequently been observed in various pathologic conditions, but some activity can also be seen in healthy individuals, e.g. in the blood. To quantitate the basal expression levels of HERVs in the brain, we used high-throughput sequencing-based metagenomic analysis to characterize the expression profiles of the HERV-K (HML-2) family proviruses in different brain regions of healthy brain tissue. To this end, RNA-seq data from the Genotype-Tissue Expression (GTEx) project was used. The GTEx project is a public resource to study tissue-specific gene expression and regulation, consisting of a large selection of sequenced samples from different tissues. The GTEx data used in this study consisted of 378 samples taken from 13 brain regions from 55 individuals. The data demonstrated that of the 99 intact proviruses in the family, 58 were expressed, but the expression profiles were highly divergent and there were no significant differences in the expression profiles between the various anatomic regions of the brain. It is known that the brain contains a variety of infiltrating immune cells, which are probably of great importance both in normal defense mechanisms and in various pathogenic processes. Digital cytometry (CIBERSORTx) was used to quantify the proportions of the infiltrating immune cells in the same brain samples. The six most abundant cell types (>5 % of the total population) were observed to be CD4 memory resting T cells, M0 macrophages, plasma cells, CD8 T cells, CD4 memory activated T cells, and monocytes. Analysis of the correlations between the individual HERVs and infiltrating cell types indicated that a cluster of 6 HERVs had a notable correlation signature between T cell type infiltrating cell proportions and HERV RNA expression intensity. The correlations between inflammatory type infiltrating cells were negative or weak. Taken together, these data indicate that the expression of HERVs is associated with a T cell type immunity.

14.
bioRxiv ; 2023 Sep 24.
Article in English | MEDLINE | ID: mdl-37790409

ABSTRACT

Lung adenocarcinoma (LUAD) has been observed to have significant sex differences in incidence, prognosis, and response to therapy. However, the molecular mechanisms responsible for these disparities have not been investigated extensively. Sample-specific gene regulatory network methods were used to analyze RNA sequencing data from non-cancerous human lung samples from The Genotype Tissue Expression Project (GTEx) and lung adenocarcinoma primary tumor samples from The Cancer Genome Atlas (TCGA); results were validated on independent data. We observe that genes associated with key biological pathways including cell proliferation, immune response and drug metabolism are differentially regulated between males and females in both healthy lung tissue and tumor, and that these regulatory differences are further perturbed by tobacco smoking. We also uncovered significant sex bias in transcription factor targeting patterns of clinically actionable oncogenes and tumor suppressor genes, including AKT2 and KRAS. Using differentially regulated genes between healthy and tumor samples in conjunction with a drug repurposing tool, we identified several small-molecule drugs that might have sex-biased efficacy as cancer therapeutics and further validated this observation using an independent cell line database. These findings underscore the importance of including sex as a biological variable and considering gene regulatory processes in developing strategies for disease prevention and management.

15.
HGG Adv ; 4(4): 100223, 2023 Oct 12.
Article in English | MEDLINE | ID: mdl-37576186

ABSTRACT

Accurate imputation of tissue-specific gene expression can be a powerful tool for understanding the biological mechanisms underlying human complex traits. Existing imputation methods can be grouped into two categories according to the types of predictors used. The first category uses genotype data, while the second category uses whole-blood expression data. Both data types can be easily collected from blood, avoiding invasive tissue biopsies. In this study, we attempted to build an optimal predictive model for imputing tissue-specific gene expression by combining the genotype and whole-blood expression data. We first evaluated the imputation performance of each standalone model (using genotype data [GEN model] and using whole-blood expression data [WBE model]) using their respective data types across 47 human tissues. The WBE model outperformed the GEN model in most tissues by a large margin. Then, we developed several combined models that leverage both types of predictors to further improve imputation performance. We tried various strategies, including utilizing a merged dataset of the two data types (MERGED models) and integrating the imputation outcomes of the two standalone models (inverse variance-weighted [IVW] models). We found that one of the MERGED models noticeably outperformed the standalone models. This model involved a fixed ratio between the two regularization penalty factors for the two predictor types so that the contribution of the whole-blood transcriptome is upweighted compared with the genotype. Our study suggests that one can improve the imputation of tissue-specific gene expression by combining the genotype and whole-blood expression, but the improvement can be largely dependent on the combination strategy chosen.
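
The abstract above mentions inverse variance-weighted (IVW) models that integrate the outputs of the two standalone models, but it does not say how the variances were estimated or give the exact formula. The sketch below shows generic inverse-variance weighting of two predictions, assuming each model supplies a prediction and an error variance for the same gene and sample. The function name `ivw_combine` and its parameters are hypothetical.

```python
def ivw_combine(pred_gen: float, var_gen: float,
                pred_wbe: float, var_wbe: float) -> float:
    """Combine two predictions, weighting each by the inverse of its error variance.

    pred_gen / var_gen: prediction and error variance from a genotype-based model.
    pred_wbe / var_wbe: prediction and error variance from a whole-blood model.
    Both variances must be positive. This is the generic IVW estimator,
    assumed here for illustration rather than taken from the paper.
    """
    if var_gen <= 0 or var_wbe <= 0:
        raise ValueError("variances must be positive")
    w_gen = 1.0 / var_gen
    w_wbe = 1.0 / var_wbe
    return (w_gen * pred_gen + w_wbe * pred_wbe) / (w_gen + w_wbe)
```

With equal variances the result is the plain average of the two predictions. When one model is much noisier, the combined value moves toward the more precise model, which is why the weighting scheme matters when the WBE model is substantially more accurate than the GEN model.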


Subject(s)
Genome-Wide Association Study , Transcriptome , Humans , Transcriptome/genetics , Phenotype , Genome-Wide Association Study/methods , Quantitative Trait Loci , Polymorphism, Single Nucleotide , Genotype
16.
Front Cell Dev Biol ; 11: 1208315, 2023.
Article in English | MEDLINE | ID: mdl-37457300

ABSTRACT

Objectives: RNA-binding proteins (RBPs) have diverse and essential biological functions, but their role in cartilage health and disease is largely unknown. The objectives of this study were to (i) map the global landscape of RBPs expressed and enriched in healthy cartilage and dysregulated in osteoarthritis (OA); and (ii) prioritize RBPs for their potential role in cartilage and in OA pathogenesis and as therapeutic targets. Methods: Our published bulk RNA-sequencing (RNA-seq) data of healthy and OA human cartilage, and a census of 1,542 RBPs, were utilized to identify RBPs that are expressed in healthy cartilage and differentially expressed (DE) in OA. Next, our comparison of healthy cartilage RNA-seq data to 37 transcriptomes in the Genotype-Tissue Expression (GTEx) database was used to determine RBPs that are enriched in cartilage. Finally, expression of RBPs was analyzed in our single cell RNA-sequencing (scRNA-seq) data from healthy and OA human cartilage. Results: Expression of RBPs was higher than that of non-RBPs in healthy cartilage. In OA cartilage, 188 RBPs were differentially expressed, with a greater proportion downregulated. Ribosome biogenesis was enriched in the upregulated RBPs, while splicing and transport were enriched in the downregulated RBPs. To further prioritize RBPs, we selected the top 10% expressed RBPs in healthy cartilage and those that were cartilage-enriched according to GTEx. Intersecting these criteria, we identified Tetrachlorodibenzodioxin (TCDD) Inducible Poly (ADP-Ribose) Polymerase (TIPARP) as a candidate RBP. TIPARP was downregulated in OA. scRNA-seq data revealed TIPARP was most significantly downregulated in the "pathogenic cluster". Conclusion: Our global analyses reveal expression patterns of RBPs in healthy and OA cartilage. We also identified TIPARP and other RBPs as novel mediators in OA pathogenesis and as potential therapeutic targets.

17.
Stat Med ; 42(18): 3145-3163, 2023 Aug 15.
Article in English | MEDLINE | ID: mdl-37458069

ABSTRACT

Expression quantitative trait loci (eQTL) studies use regression models to explain the variance of gene expression with genetic loci or single nucleotide polymorphisms (SNPs). However, regression models for eQTL are challenged by the presence of high-dimensional, non-sparse, and correlated SNPs with small effects, and by nonlinear relationships between responses and SNPs. Principal component analyses are commonly conducted for dimension reduction without considering responses. As a result, this unsupervised learning method often does not work well when the focus is on discovering the response-covariate relationship. We propose a new supervised structural dimensional reduction method for semiparametric regression models with high-dimensional and correlated covariates; we extract low-dimensional latent features from a vast number of correlated SNPs while accounting for their relationships, possibly nonlinear, with gene expression. Our model identifies important SNPs associated with gene expression and estimates the association parameters via a likelihood-based algorithm. A GTEx data application on a cancer-related gene is presented, with 18 novel eQTLs detected by our method. In addition, extensive simulations show that our method outperforms the competing methods in bias, efficiency, and computational cost.
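
For orientation, the baseline regression framing mentioned in this abstract can be sketched as a single-SNP ordinary least-squares fit of expression on genotype dosage. This is only an illustrative sketch of the conventional model, not the supervised dimension-reduction method the abstract proposes; the function name `fit_single_snp_eqtl` and the toy values are hypothetical and are not taken from GTEx.

```python
from statistics import mean


def fit_single_snp_eqtl(dosages, expression):
    """Ordinary least-squares fit of expression on genotype dosage (0, 1 or 2).

    Returns (slope, intercept, r_squared). Hypothetical helper for illustration.
    """
    if len(dosages) != len(expression):
        raise ValueError("dosages and expression must have the same length")

    x_bar = mean(dosages)
    y_bar = mean(expression)

    # Sum of squares of the genotype; zero means every sample has the same dosage.
    sxx = sum((x - x_bar) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")

    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(dosages, expression))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Fraction of expression variance explained by the SNP.
    ss_tot = sum((y - y_bar) ** 2 for y in expression)
    ss_res = sum((y - (intercept + slope * x)) ** 2
                 for x, y in zip(dosages, expression))
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return slope, intercept, r_squared


# Toy data invented for illustration only.
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
slope, intercept, r_squared = fit_single_snp_eqtl(dosages, expression)
print(f"slope={slope:.3f} intercept={intercept:.3f} r2={r_squared:.3f}")
```

A per-SNP fit like this treats each variant independently, which is the limitation the abstract points to when SNPs are numerous, correlated, and individually weak.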


Subject(s)
Polymorphism, Single Nucleotide , Quantitative Trait Loci , Humans , Quantitative Trait Loci/genetics , Likelihood Functions , Genome-Wide Association Study/methods
18.
Genome Biol ; 24(1): 164, 2023 Jul 11.
Article in English | MEDLINE | ID: mdl-37434206

ABSTRACT

BACKGROUND: Nonsense-mediated mRNA decay (NMD) was originally conceived as an mRNA surveillance mechanism to prevent the production of potentially deleterious truncated proteins. Research also shows NMD is an important post-transcriptional gene regulation mechanism selectively targeting many non-aberrant mRNAs. However, how natural genetic variants affect NMD and modulate gene expression remains elusive. RESULTS: Here we elucidate NMD regulation of individual genes across human tissues through genetical genomics. Genetic variants corresponding to NMD regulation are identified based on GTEx data through unique and robust transcript expression modeling. We identify genetic variants that influence the percentage of NMD-targeted transcripts (pNMD-QTLs), as well as genetic variants regulating the decay efficiency of NMD-targeted transcripts (dNMD-QTLs). Many such variants are missed in traditional expression quantitative trait locus (eQTL) mapping. NMD-QTLs show strong tissue specificity especially in the brain. They are more likely to overlap with disease single-nucleotide polymorphisms (SNPs). Compared to eQTLs, NMD-QTLs are more likely to be located within gene bodies and exons, especially the penultimate exons from the 3' end. Furthermore, NMD-QTLs are more likely to be found in the binding sites of miRNAs and RNA binding proteins. CONCLUSIONS: We reveal the genome-wide landscape of genetic variants associated with NMD regulation across human tissues. Our analysis results indicate important roles of NMD in the brain. The preferential genomic positions of NMD-QTLs suggest key attributes for NMD regulation. Furthermore, the overlap with disease-associated SNPs and post-transcriptional regulatory elements implicates regulatory roles of NMD-QTLs in disease manifestation and their interactions with other post-transcriptional regulators.


Subject(s)
MicroRNAs , Nonsense Mediated mRNA Decay , Humans , Quantitative Trait Loci , Binding Sites , Brain
19.
Genetics ; 224(4), 2023 Aug 09.
Article in English | MEDLINE | ID: mdl-37348055

ABSTRACT

Exonic variants present some of the strongest links between genotype and phenotype. However, these variants can have significant inter-individual pathogenicity differences, known as variable penetrance. In this study, we propose a model where genetically controlled mRNA splicing modulates the pathogenicity of exonic variants. By first cataloging exonic inclusion from RNA-sequencing data in GTEx V8, we find that pathogenic alleles are depleted on highly included exons. Using large-scale phased whole-genome sequencing data from the TOPMed consortium, we observe that this effect may be driven by common splice-regulatory genetic variants, and that natural selection acts on haplotype configurations that reduce the transcript inclusion of putatively pathogenic variants, especially when limiting to haploinsufficient genes. Finally, we test if this effect may be relevant for autism risk using families from the Simons Simplex Collection, but find that splicing of pathogenic alleles has a penetrance-reducing effect here as well. Overall, our results indicate that common splice-regulatory variants may play a role in reducing the damaging effects of rare exonic variants.


Subject(s)
RNA Splice Sites , RNA Splicing , Penetrance , Exons , Genotype , RNA, Messenger/genetics , Alternative Splicing
20.
Cell ; 186(7): 1493-1511.e40, 2023 Mar 30.
Article in English | MEDLINE | ID: mdl-37001506

ABSTRACT

Understanding how genetic variants impact molecular phenotypes is a key goal of functional genomics, currently hindered by reliance on a single haploid reference genome. Here, we present the EN-TEx resource of 1,635 open-access datasets from four donors (∼30 tissues × ∼15 assays). The datasets are mapped to matched, diploid genomes with long-read phasing and structural variants, instantiating a catalog of >1 million allele-specific loci. These loci exhibit coordinated activity along haplotypes and are less conserved than corresponding, non-allele-specific ones. Surprisingly, a deep-learning transformer model can predict the allele-specific activity based only on local nucleotide-sequence context, highlighting the importance of transcription-factor-binding motifs particularly sensitive to variants. Furthermore, combining EN-TEx with existing genome annotations reveals strong associations between allele-specific and GWAS loci. It also enables models for transferring known eQTLs to difficult-to-profile tissues (e.g., from skin to heart). Overall, EN-TEx provides rich data and generalizable models for more accurate personal functional genomics.


Subject(s)
Epigenome , Quantitative Trait Loci , Genome-Wide Association Study , Genomics , Phenotype , Polymorphism, Single Nucleotide