ABSTRACT
Venous thromboembolism (VTE) is a significant contributor to morbidity and mortality, with large disparities in incidence rates between Black and White Americans. Polygenic risk scores (PRSs) limited to variants discovered in genome-wide association studies in European-ancestry samples can identify European-ancestry individuals at high risk of VTE. However, there is limited evidence on whether high-dimensional PRSs constructed using more sophisticated methods and more diverse training data can enhance predictive ability and utility across diverse populations. We developed PRSs for VTE using summary statistics from the International Network against Venous Thrombosis (INVENT) consortium genome-wide association study meta-analyses of European- (71 771 cases and 1 059 740 controls) and African-ancestry samples (7482 cases and 129 975 controls). We used LDpred2 and PRS-CSx to construct ancestry-specific and multi-ancestry PRSs and evaluated their performance in independent European- (6781 cases and 103 016 controls) and African-ancestry samples (1385 cases and 12 569 controls). Multi-ancestry PRSs with weights tuned in European-ancestry samples slightly outperformed ancestry-specific PRSs in European-ancestry test samples (e.g. the area under the receiver operating characteristic curve [AUC] was 0.609 for PRS-CSx_combinedEUR and 0.608 for PRS-CSxEUR [P = 0.00029]). Multi-ancestry PRSs with weights tuned in African-ancestry samples also outperformed ancestry-specific PRSs in African-ancestry test samples (PRS-CSxAFR: AUC = 0.58, PRS-CSx_combinedAFR: AUC = 0.59), although this difference was not statistically significant (P = 0.34). The highest fifth percentile of the best-performing PRS was associated with 1.9-fold and 1.68-fold increased risk for VTE among European- and African-ancestry subjects, respectively, relative to those in the middle stratum.
These findings suggest that the multi-ancestry PRS might be used to improve performance across diverse populations to identify individuals at highest risk for VTE.
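As a rough illustration of the two quantities this abstract relies on, the sketch below computes a polygenic score as a weighted sum of allele dosages and an AUC as the probability that a randomly chosen case outscores a randomly chosen control. The weights, genotypes, and function names are hypothetical toy values; this is not the LDpred2 or PRS-CSx procedure, which estimate the weights themselves.

```python
from typing import Sequence


def polygenic_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of effect-allele dosages (0 to 2) across variants."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as P(case score > control score), counting ties as one half.

    Pairwise comparison is quadratic in sample size; fine for a toy example.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Hypothetical per-variant weights and genotypes (assumptions, not study data).
weights = [0.12, -0.05, 0.30]
genotypes = [[2, 0, 1], [0, 1, 0], [1, 2, 2], [0, 0, 0]]
labels = [1, 0, 1, 0]  # 1 = VTE case, 0 = control
scores = [polygenic_score(g, weights) for g in genotypes]
toy_auc = auc(scores, labels)
```

An AUC of 0.5 corresponds to a score that ranks cases and controls no better than chance, which gives context for reported values near 0.6.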
Subject(s)
Genetic Risk Score, Venous Thromboembolism, Female, Humans, Male, Black or African American/genetics, Case-Control Studies, Genetic Predisposition to Disease, Genome-Wide Association Study, Single Nucleotide Polymorphism, Venous Thromboembolism/genetics, Venous Thromboembolism/epidemiology, White/genetics
ABSTRACT
We describe the Mitochondrial and Nuclear rRNA fragment database (MINRbase), a knowledge repository aimed at facilitating the study of ribosomal RNA-derived fragments (rRFs). MINRbase provides interactive access to the profiles of 130 238 expressed rRFs arising from the four human nuclear rRNAs (18S, 5.8S, 28S, 5S), two mitochondrial rRNAs (12S, 16S) or four spacers of 45S pre-rRNA. We compiled these profiles by analyzing 11 632 datasets, including the GEUVADIS and The Cancer Genome Atlas (TCGA) repositories. MINRbase offers a user-friendly interface that lets researchers issue complex queries based on one or more criteria, such as parental rRNA identity, nucleotide sequence, rRF minimum abundance and metadata keywords (e.g. tissue type, disease). A 'summary' page for each rRF provides a granular breakdown of its expression by tissue type, disease, sex, ancestry and other variables; it also allows users to create publication-ready plots at the click of a button. MINRbase has already allowed us to generate support for three novel observations: the internal spacers of 45S are prolific producers of abundant rRFs; many abundant rRFs straddle the known boundaries of rRNAs; rRF production is regimented and depends on 'personal attributes' (sex, ancestry) and 'context' (tissue type, tissue state, disease). MINRbase is available at https://cm.jefferson.edu/MINRbase/.
Subject(s)
Nucleic Acid Databases, Mitochondrial RNA, Ribosomal RNA, Humans, Base Sequence, Mitochondria/genetics, Ribosomes, Mitochondrial RNA/genetics, Ribosomal RNA/genetics
ABSTRACT
BACKGROUND: MicroRNA isoforms (isomiRs), tRNA-derived fragments (tRFs), and rRNA-derived fragments (rRFs) represent most of the small non-coding RNAs (sncRNAs) found in cells. Members of these three classes modulate messenger RNA (mRNA) and protein abundance and are dysregulated in diseases. Experimental studies to date have assumed that the subcellular distribution of these molecules is well-understood, independent of cell type, and the same for all isoforms of a sncRNA. RESULTS: We tested these assumptions by investigating the subcellular distribution of isomiRs, tRFs, and rRFs in biological replicates from three cell lines from the same tissue and same-sex donors that model the same cancer subtype. In each cell line, we profiled the isomiRs, tRFs, and rRFs in the nucleus, cytoplasm, whole mitochondrion (MT), mitoplast (MP), and whole cell. Using a rigorous mathematical model we developed, we accounted for cross-fraction contamination and technical errors and adjusted the measured abundances accordingly. Analyses of the adjusted abundances show that isomiRs, tRFs, and rRFs exhibit complex patterns of subcellular distributions. These patterns depend on each sncRNA's exact sequence and the cell type. Even in the same cell line, isoforms of the same sncRNA whose sequences differ by a few nucleotides (nts) can have different subcellular distributions. CONCLUSIONS: SncRNAs with similar sequences have different subcellular distributions within and across cell lines, suggesting that each isoform could have a different function. Future computational and experimental studies of isomiRs, tRFs, and rRFs will need to distinguish among each molecule's various isoforms and account for differences in each isoform's subcellular distribution in the cell line at hand. 
While the findings add to a growing body of evidence that isomiRs, tRFs, rRFs, tRNAs, and rRNAs follow complex intracellular trafficking rules, further investigation is needed to exclude alternative explanations for the observed subcellular distribution of sncRNAs.
Subject(s)
MicroRNAs, Ribosomal RNA, Transfer RNA, MicroRNAs/genetics, MicroRNAs/metabolism, Transfer RNA/genetics, Transfer RNA/metabolism, Humans, Ribosomal RNA/genetics, Ribosomal RNA/metabolism, Base Sequence, RNA Isoforms/genetics, Tumor Cell Line, Cell Line
ABSTRACT
BACKGROUND: The advent of next generation sequencing (NGS) has allowed the discovery of short and long non-coding RNAs (ncRNAs) in an unbiased manner using reverse genetics approaches, enabling the discovery of multiple categories of ncRNAs and characterization of the way their expression is regulated. We previously showed that the identities and abundances of microRNA isoforms (isomiRs) and transfer RNA-derived fragments (tRFs) are tightly regulated, and that they depend on a person's sex and population origin, as well as on tissue type, tissue state, and disease type. Here, we characterize the regulation and distribution of fragments derived from ribosomal RNAs (rRNAs). rRNAs form a group that includes four (5S, 5.8S, 18S, 28S) rRNAs encoded by the human nuclear genome and two (12S, 16S) by the mitochondrial genome. rRNAs constitute the most abundant RNA type in eukaryotic cells. RESULTS: We analyzed rRNA-derived fragments (rRFs) across 434 transcriptomic datasets obtained from lymphoblastoid cell lines (LCLs) derived from healthy participants of the 1000 Genomes Project. The 434 datasets represent five human populations and both sexes. We examined each of the six rRNAs and their respective rRFs, and did so separately for each population and sex. Our analysis shows that all six rRNAs produce rRFs with unique identities, normalized abundances, and lengths. The rRFs arise from the 5'-end (5'-rRFs), the interior (i-rRFs), and the 3'-end (3'-rRFs) or straddle the 5' or 3' terminus of the parental rRNA (x-rRFs). Notably, a large number of rRFs are produced in a population-specific or sex-specific manner. Preliminary evidence suggests that rRF production is also tissue-dependent. Of note, we find that rRF production is not affected by the identity of the processing laboratory or the library preparation kit. 
CONCLUSIONS: Our findings suggest that rRFs are produced in a regimented manner by currently unknown processes that are influenced by both ubiquitous as well as population-specific and sex-specific factors. The properties of rRFs mirror the previously reported properties of isomiRs and tRFs and have implications for the study of homeostasis and disease.
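The four rRF categories named in the Results can be expressed as a simple coordinate rule. The sketch below is one possible reading, assuming 1-based coordinates relative to the parental rRNA, that a 5'-rRF starts exactly at position 1, and that a 3'-rRF ends exactly at the last position; the function name and the example length are hypothetical.

```python
def classify_rrf(start: int, end: int, parent_length: int) -> str:
    """Classify a fragment by its position on the parental rRNA.

    Coordinates are 1-based and inclusive; start < 1 or end > parent_length
    means the fragment extends past a terminus of the parental rRNA.
    """
    if start > end:
        raise ValueError("start must not exceed end")
    if end < 1 or start > parent_length:
        raise ValueError("fragment does not overlap the parental rRNA")
    if start < 1 or end > parent_length:
        return "x-rRF"   # straddles the 5' or 3' terminus
    if start == 1:
        return "5'-rRF"  # a full-length match also lands here
    if end == parent_length:
        return "3'-rRF"
    return "i-rRF"       # strictly interior


# Illustrative parent length only (an assumption for this example).
PARENT_LENGTH = 121
labels = [
    classify_rrf(1, 22, PARENT_LENGTH),
    classify_rrf(40, 61, PARENT_LENGTH),
    classify_rrf(100, 121, PARENT_LENGTH),
    classify_rrf(-3, 18, PARENT_LENGTH),
]
```

Real pipelines may allow a tolerance around each terminus; the exact-match rule here is the simplest choice and is not taken from the abstract.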
Subject(s)
MicroRNAs/genetics, Ribosomal RNA/genetics, Aged, Cell Line, Female, Humans, Male, MicroRNAs/metabolism, Middle Aged, Ribosomal RNA/metabolism, Sex Factors, Transcriptome
ABSTRACT
CONTEXT: Patients with PCOS are at high risk of depression, anxiety, and metabolic syndrome (MetSyn), a key predictor of cardiovascular disease. The impact of depression and/or anxiety on MetSyn is unknown in this population. OBJECTIVE: To compare the risk of developing MetSyn in patients with PCOS with and without a history of depression and/or anxiety. DESIGN: Retrospective longitudinal cohort study (2008-2022) with median follow-up of 7 years. SETTING: Tertiary care ambulatory practice. PATIENTS OR OTHER PARTICIPANTS: Patients with hyperandrogenic PCOS and at least 2 evaluations for MetSyn ≥3 years apart (n=321). INTERVENTION(S): N/A. MAIN OUTCOME MEASURE(S): The primary outcome was risk of developing MetSyn. We hypothesized that this risk would be higher with a history of depression and/or anxiety. RESULTS: At the first visit, 33.0% had a history of depression and/or anxiety, with a third prescribed antidepressants or anxiolytics. Depression and/or anxiety increased risk of developing MetSyn during the study period (adjusted hazard ratio [aHR] 1.45, 95% CI 1.02-2.06, p=0.04) with an incidence of MetSyn of 75.3 compared to 47.6 cases per 100 person-years among those without (p=0.002). This was primarily driven by depression (aHR 1.56, 95% CI 1.10-2.20, p=0.01). CONCLUSIONS: Patients with PCOS and depression and/or anxiety have a high risk of developing MetSyn, with a stronger association between depression and MetSyn. Our findings highlight the urgent need for guideline-directed screening for depression and anxiety at time of diagnosis of PCOS as well as screening at subsequent visits to facilitate risk stratification for metabolic monitoring and early intervention in this high-risk group.
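For readers less familiar with person-time measures, the following sketch shows how an incidence rate per 100 person-years is computed and how two such rates compare as a crude ratio. The event count and follow-up time in the first call are hypothetical; only the two rates passed to the ratio come from the abstract.

```python
def incidence_rate_per_100py(events: int, person_years: float) -> float:
    """Events per 100 person-years of follow-up."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return 100.0 * events / person_years


def rate_ratio(rate_exposed: float, rate_unexposed: float) -> float:
    """Crude (unadjusted) ratio of two incidence rates."""
    if rate_unexposed <= 0:
        raise ValueError("rate_unexposed must be positive")
    return rate_exposed / rate_unexposed


# Hypothetical counts: 30 events over 40 person-years.
toy_rate = incidence_rate_per_100py(30, 40.0)

# Crude ratio of the two rates reported in the abstract.
crude_ratio = rate_ratio(75.3, 47.6)
```

The crude ratio is not expected to match the adjusted hazard ratio, because the latter accounts for covariates and time to event.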
ABSTRACT
Metabolic dysfunction-associated fatty liver disease (MAFLD) has emerged as one of the leading cardiometabolic diseases. Friend of GATA2 (FOG2) is a transcriptional co-regulator that has been shown to regulate hepatic lipid metabolism and accumulation. Using a meta-analysis of several biobank datasets, we identified a coding variant of FOG2 (rs28374544, A1969G, S657G) predominantly found in individuals of African ancestry (minor allele frequency ~20%), which is associated with a liver failure/cirrhosis phenotype and liver injury. To gain insight into potential pathways associated with this variant, we interrogated a previously published genomics dataset of 38 human induced pluripotent stem cell (iPSC) lines differentiated into hepatocytes (iHeps). Using differential gene expression analysis and gene set enrichment analysis, we identified the mTORC1 pathway as differentially regulated between iHeps from individuals with and without the variant. Transient lipid-based transfections were performed on the human hepatoma cell line Huh7 using wild-type FOG2 and FOG2S657G and demonstrated that FOG2S657G increased mTORC1 signaling, de novo lipogenesis, and cellular triglyceride synthesis and mass. In addition, we observed a significant downregulation of oxidative phosphorylation in fatty acid-loaded FOG2S657G cells but not in untreated cells, suggesting that FOG2S657G may also reduce fatty acid oxidation to promote lipid accumulation. Taken together, our multi-pronged approach suggests a model whereby FOG2S657G may promote MAFLD through mTORC1 activation, increased de novo lipogenesis, and lipid accumulation. Our results provide insights into the molecular mechanisms by which FOG2S657G may affect the complex molecular landscape underlying MAFLD.
Subject(s)
DNA-Binding Proteins, Mechanistic Target of Rapamycin Complex 1, Signal Transduction, Transcription Factors, Humans, Mechanistic Target of Rapamycin Complex 1/genetics, Mechanistic Target of Rapamycin Complex 1/metabolism, Signal Transduction/genetics, Transcription Factors/genetics, Transcription Factors/metabolism, DNA-Binding Proteins/genetics, DNA-Binding Proteins/metabolism, Hepatocytes/metabolism, Single Nucleotide Polymorphism, Induced Pluripotent Stem Cells/metabolism, Lipid Metabolism/genetics, Tumor Cell Line, Genotype, Liver Diseases/genetics, Liver Diseases/metabolism, Liver Diseases/pathology
ABSTRACT
Venous thromboembolism (VTE) is a significant contributor to morbidity and mortality, with large disparities in incidence rates between Black and White Americans. Polygenic risk scores (PRSs) limited to variants discovered in genome-wide association studies in European-ancestry samples can identify European-ancestry individuals at high risk of VTE. However, there is limited evidence on whether high-dimensional PRSs constructed using more sophisticated methods and more diverse training data can enhance predictive ability and utility across diverse populations. We developed PRSs for VTE using summary statistics from the International Network against Venous Thrombosis (INVENT) consortium GWAS meta-analyses of European- (71,771 cases and 1,059,740 controls) and African-ancestry samples (7,482 cases and 129,975 controls). We used LDpred2 and PRS-CSx to construct ancestry-specific and multi-ancestry PRSs and evaluated their performance in independent European- (6,261 cases and 88,238 controls) and African-ancestry samples (1,385 cases and 12,569 controls). Multi-ancestry PRSs with weights tuned in European- and African-ancestry samples, respectively, outperformed ancestry-specific PRSs in European- (PRSCSXEUR: AUC=0.61 (0.60, 0.61), PRSCSX_combinedEUR: AUC=0.61 (0.60, 0.62)) and African-ancestry test samples (PRSCSXAFR: AUC=0.58 (0.57, 0.60), PRSCSX_combinedAFR: AUC=0.59 (0.57, 0.60)). The highest fifth percentile of the best-performing PRS was associated with 1.9-fold and 1.68-fold increased risk for VTE among European- and African-ancestry subjects, respectively, relative to those in the middle stratum. These findings suggest that the multi-ancestry PRS may be used to identify individuals at highest risk for VTE and provide guidance for the most effective treatment strategy across diverse populations.
ABSTRACT
Transfer RNA-derived fragments (tRFs) are noncoding RNAs that arise from either mature transfer RNAs (tRNAs) or their precursors. One important category of tRFs comprises the tRNA halves, which are generated through cleavage at the anticodon. A given tRNA typically gives rise to several co-expressed 5'-tRNA halves (5'-tRHs) that differ in the location of their 3' ends. These 5'-tRHs, even though distinct, have traditionally been treated as indistinguishable from one another due to their near-identical sequences and lengths. We focused on co-expressed 5'-tRHs that arise from the same tRNA and systematically examined their exact sequences and abundances across 10 different human tissues. To this end, we manually curated and analyzed several hundred human RNA-seq datasets from NCBI's Sequence Read Archive (SRA). We grouped datasets from the same tissue into their own collection and examined each group separately. We found that a given tRNA produces different groups of co-expressed 5'-tRHs in different tissues, different cell lines, and different diseases. Importantly, the co-expressed 5'-tRHs differ in their sequences, absolute abundances, and relative abundances, even among tRNAs with near-identical sequences from the same isodecoder or isoacceptor group. The findings suggest that co-expressed 5'-tRHs that are produced from the same tRNA or closely related tRNAs have distinct, context-dependent roles. Moreover, our analyses show that cell lines modeling the same tissue type and disease may not be interchangeable when it comes to experimenting with tRFs.
ABSTRACT
Genome-wide association studies (GWAS) have yielded significant insights into the genetic architecture of myocardial infarction (MI), although studies in non-European populations are still lacking. Saudi Arabian cohorts offer an opportunity to discover novel genetic variants impacting disease risk due to a high rate of consanguinity. Genome-wide genotyping (GWG), imputation and GWAS followed by meta-analysis were performed based on two independent Saudi Arabian studies comprising 3950 MI patients and 2324 non-MI controls. Meta-analyses were then performed with these two Saudi MI studies and the CardioGRAMplusC4D and UK Biobank GWAS as controls. Meta-analyses of the two Saudi MI studies resulted in 17 SNPs with genome-wide significance. Meta-analyses of all 4 studies revealed 66 loci with genome-wide significance levels of p < 5 × 10⁻⁸. All of these variants, except rs2764203, have previously been reported as MI-associated loci or to have high linkage disequilibrium with known loci. One SNP association in Shisa family member 5 (SHISA5) (rs11707229) was evident at a much higher frequency in the Saudi MI populations (MAF > 12%). In conclusion, our results replicated many MI associations, whereas in Saudi-only GWAS (meta-analyses), several new loci were implicated that require future validation and functional analyses.
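To make the meta-analysis step concrete, here is a minimal sketch of a fixed-effect, inverse-variance-weighted combination of per-study effect estimates for one SNP, followed by a check against the genome-wide significance threshold. The abstract does not state which meta-analysis model was used, so the fixed-effect choice is an assumption, and the effect sizes and standard errors are hypothetical.

```python
import math
from typing import Sequence, Tuple

GENOME_WIDE_ALPHA = 5e-8  # conventional genome-wide significance threshold


def fixed_effect_meta(
    betas: Sequence[float], ses: Sequence[float]
) -> Tuple[float, float, float]:
    """Inverse-variance-weighted pooled beta, its SE, and a two-sided p-value."""
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and equal in length")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")
    weights = [1.0 / (se * se) for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided normal p-value: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p


# Hypothetical log-odds-ratio estimates for one SNP in two studies.
pooled_beta, pooled_se, pooled_p = fixed_effect_meta([0.2, 0.2], [0.05, 0.05])
is_genome_wide_significant = pooled_p < GENOME_WIDE_ALPHA
```

In this toy case neither study alone reaches the threshold (z = 4 in each), while the pooled estimate does, which is the usual motivation for combining cohorts.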
Subject(s)
Genome-Wide Association Study, Myocardial Infarction, Humans, Genome-Wide Association Study/methods, Saudi Arabia, Genotype, Myocardial Infarction/genetics, Single Nucleotide Polymorphism, Genetic Predisposition to Disease
ABSTRACT
Genome-wide association studies (GWAS) are being conducted at an unprecedented rate in population-based cohorts and have increased our understanding of the pathophysiology of many complex diseases. Regardless of the context, the practical utility of this information ultimately depends upon the quality of the data used for statistical analyses. Quality control (QC) procedures for GWAS are constantly evolving. Here, we enumerate some of the challenges in QC of genotyped GWAS data and describe the approaches involving genotype imputation of a sample dataset along with post-imputation quality assurance, thereby minimizing potential bias and error in GWAS results. We discuss common issues associated with QC of the GWAS data (genotyped and imputed), including data file formats, software packages for data manipulation and analysis, sex chromosome anomalies, sample identity, sample relatedness, population substructure, batch effects, and marker quality. We provide detailed guidelines along with a sample dataset to suggest current best practices and discuss areas of ongoing and future research. © 2022 Wiley Periodicals LLC.
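The marker-quality checks listed above are often implemented as per-variant filters on call rate, minor allele frequency (MAF), and deviation from Hardy-Weinberg equilibrium (HWE). The sketch below shows one way to compute them from genotype counts. The thresholds are illustrative assumptions rather than the article's recommendations, and the HWE test is a 1-degree-of-freedom chi-square approximation rather than the exact test many tools use.

```python
import math
from dataclasses import dataclass


@dataclass
class MarkerCounts:
    n_aa: int       # homozygous for allele A
    n_ab: int       # heterozygous
    n_bb: int       # homozygous for allele B
    n_missing: int  # samples with no genotype call


def marker_qc(m: MarkerCounts, min_call_rate: float = 0.98,
              min_maf: float = 0.01, min_hwe_p: float = 1e-6) -> dict:
    """Return call rate, MAF, HWE p-value, and an overall pass flag."""
    called = m.n_aa + m.n_ab + m.n_bb
    if called == 0:
        raise ValueError("no called genotypes for this marker")
    call_rate = called / (called + m.n_missing)
    freq_a = (2 * m.n_aa + m.n_ab) / (2 * called)
    maf = min(freq_a, 1.0 - freq_a)
    # Expected genotype counts under HWE, then a chi-square statistic.
    expected = (called * freq_a ** 2,
                2 * called * freq_a * (1 - freq_a),
                called * (1 - freq_a) ** 2)
    observed = (m.n_aa, m.n_ab, m.n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    hwe_p = math.erfc(math.sqrt(chi2 / 2.0))
    passed = call_rate >= min_call_rate and maf >= min_maf and hwe_p >= min_hwe_p
    return {"call_rate": call_rate, "maf": maf, "hwe_p": hwe_p, "passed": passed}


good = marker_qc(MarkerCounts(490, 420, 90, 0))      # close to HWE
no_hets = marker_qc(MarkerCounts(500, 0, 500, 0))    # extreme HWE departure
low_call = marker_qc(MarkerCounts(90, 0, 0, 10))     # 90% call rate, monomorphic
```

In practice HWE is usually assessed in controls only, and thresholds are tuned to the study design.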
Subject(s)
Genome-Wide Association Study, Research Design, Humans, Quality Control, Genotype, Sex Chromosome Aberrations
ABSTRACT
Nonalcoholic fatty liver disease is common and highly heritable. Genetic studies of hepatic fat have not sufficiently addressed non-European and rare variants. In a medical biobank, we quantitate hepatic fat from clinical computed tomography (CT) scans via deep learning in 10,283 participants with whole-exome sequences available. We conduct exome-wide associations of single variants and rare predicted loss-of-function (pLOF) variants with CT-based hepatic fat and perform cross-modality replication in the UK Biobank (UKB) by linking whole-exome sequences to MRI-based hepatic fat. We confirm single variants previously associated with hepatic fat and identify several additional variants, including two (FGD5 H600Y and CITED2 S198_G199del) that replicated in UKB. A burden of rare pLOF variants in LMF2 is associated with increased hepatic fat and replicates in UKB. Quantitative phenotypes generated from clinical imaging studies and intersected with genomic data in medical biobanks have the potential to identify molecular pathways associated with human traits and disease.
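A gene-level burden analysis typically collapses several rare variants into a single carrier indicator per participant before testing for association. The sketch below illustrates that collapsing step and a crude comparison of mean phenotype between carriers and non-carriers. The variant names, genotypes, and phenotype values are hypothetical, and the study's actual association model is not described in the abstract.

```python
from statistics import mean
from typing import Dict, List, Sequence


def burden_carriers(genotypes_by_variant: Dict[str, List[int]]) -> List[int]:
    """1 if a participant carries at least one alternate allele at any variant."""
    columns = list(genotypes_by_variant.values())
    if not columns:
        raise ValueError("at least one variant is required")
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all variants must cover the same participants")
    return [int(any(col[i] > 0 for col in columns)) for i in range(n)]


def mean_difference(carriers: Sequence[int], phenotype: Sequence[float]) -> float:
    """Mean phenotype in carriers minus mean phenotype in non-carriers."""
    in_carriers = [y for c, y in zip(carriers, phenotype) if c == 1]
    in_others = [y for c, y in zip(carriers, phenotype) if c == 0]
    if not in_carriers or not in_others:
        raise ValueError("need both carriers and non-carriers")
    return mean(in_carriers) - mean(in_others)


# Hypothetical rare pLOF genotypes (alternate-allele counts) in one gene.
plof = {"var1": [0, 1, 0, 0, 0], "var2": [0, 0, 0, 1, 0]}
hepatic_fat = [5.0, 12.0, 6.0, 10.0, 4.0]  # toy phenotype values
carriers = burden_carriers(plof)
difference = mean_difference(carriers, hepatic_fat)
```

A real analysis would typically regress the phenotype on the carrier indicator with covariates such as age, sex, and ancestry components instead of comparing raw means.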
Subject(s)
Exome, Non-alcoholic Fatty Liver Disease, Humans, Exome/genetics, Biological Specimen Banks, Phenotype, X-Ray Computed Tomography, Non-alcoholic Fatty Liver Disease/diagnostic imaging, Non-alcoholic Fatty Liver Disease/genetics, Repressor Proteins/genetics, Trans-Activators/genetics
ABSTRACT
We sought to determine whether commercial quantitative polymerase chain reaction (qPCR) methods are capable of distinguishing isomiRs: variants of mature microRNAs (miRNAs) with sequence endpoint differences. We used two commercially available miRNA qPCR methods to quantify miR-21-5p in both synthetic and real cell contexts. We find that although these miRNA qPCR methods possess high sensitivity for specific sequences, they also pick up background signals from closely related isomiRs, which influences the reliable quantification of individual isomiRs. We conclude that these methods do not possess the requisite specificity for reliable isomiR quantification.