ABSTRACT
Biobank projects are generating genomic data for many thousands of individuals. Computational methods are needed to handle these massive data sets, including genetic ancestry (GA) inference tools. Current methods for GA inference do not scale to biobank-size genomic datasets. We present Rye, a new algorithm for GA inference at biobank scale. We compared the accuracy and runtime performance of Rye to the widely used RFMix, ADMIXTURE and iAdmix programs and applied it to a dataset of 488221 genome-wide variant samples from the UK Biobank. Rye infers GA based on principal component analysis of genomic variant samples from ancestral reference populations and query individuals. The algorithm's accuracy is powered by Metropolis-Hastings optimization and its speed is provided by non-negative least squares regression. Rye produces highly accurate GA estimates for three-way admixed populations (African, European and Native American) compared to RFMix and ADMIXTURE ($R^2$ = 0.998 to 1.00), and shows a 50× runtime improvement compared to ADMIXTURE on the UK Biobank dataset. Rye analysis of UK Biobank samples demonstrates how it can be used to infer GA at both continental and subcontinental levels. We discuss user considerations and options for the use of Rye; the program and its documentation are distributed on the GitHub repository: https://github.com/healthdisparities/rye.
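The non-negative least squares (NNLS) step mentioned in the abstract can be illustrated with a small sketch. It is not Rye's implementation: it assumes each reference population is summarized by a single centroid in principal component space (an assumption not stated in the abstract), it omits the Metropolis-Hastings optimization entirely, and it solves the NNLS problem with plain projected gradient descent rather than a dedicated solver. The function names and the toy coordinates are hypothetical.

```python
def nnls_projected_gradient(A, b, steps=3000):
    """Minimise ||A x - b||^2 subject to x >= 0 by projected gradient descent.

    A is a list of rows (one row per principal component, one column per
    reference population); b is the query individual's PC coordinates.
    """
    m, n = len(A), len(A[0])
    # Safe step size: 1 / squared Frobenius norm bounds 1 / largest eigenvalue of A^T A.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [1.0 / n] * n
    for _ in range(steps):
        resid = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
        grad = [sum(A[i][j] * resid[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by projection onto the non-negative orthant.
        x = [max(0.0, x[j] - step * grad[j]) for j in range(n)]
    return x


def ancestry_fractions(ref_centroids, query_pcs):
    """Rescale the non-negative weights so they sum to one."""
    weights = nnls_projected_gradient(ref_centroids, query_pcs)
    total = sum(weights)
    return [w / total for w in weights] if total > 0 else weights


# Hypothetical centroids for three reference populations (columns) in three PCs (rows).
REF = [[2.0, 0.0, -1.0],
       [0.0, 2.0, -1.0],
       [1.0, 1.0, 1.0]]
# Toy query built as 0.5 * pop1 + 0.3 * pop2 + 0.2 * pop3.
QUERY = [0.8, 0.4, 1.0]
fractions = ancestry_fractions(REF, QUERY)
```

On this toy input the estimated fractions should approach the mixing proportions used to build the query. In practice a library routine such as SciPy's `scipy.optimize.nnls` would be the more natural choice for the solve.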
Subject(s)
Population Genetics, Secale, Humans, Secale/genetics, Biological Specimen Banks, Algorithms, Genomics, Single Nucleotide Polymorphism
ABSTRACT
SUMMARY: The quantification of RNA sequencing (RNA-seq) abundance using a normalization method that calculates transcripts per million (TPM) is a key step in comparing multiple samples from different experiments. TPMCalculator is a one-step software tool that processes RNA-seq alignments in BAM format and reports TPM values, raw read counts and feature lengths for genes, transcripts, exons and introns. The program describes the genomic features through a model generated from the gene transfer format (GTF) file used during alignment, and reports the TPM values and raw read counts for each feature. In this paper, we show the correlation for 1256 samples from the TCGA-BRCA project between TPM and FPKM reported by TPMCalculator and RSeQC. We also show the correlation for raw read counts reported by TPMCalculator, HTSeq and featureCounts. AVAILABILITY AND IMPLEMENTATION: TPMCalculator is freely available at https://github.com/ncbi/TPMCalculator. It is implemented in C++14 and supported on Mac OS X, Linux and MS Windows. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
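For readers unfamiliar with the normalization, the standard TPM definition can be sketched in a few lines. This is only the textbook formula applied to per-feature read counts and lengths; it is not TPMCalculator's code, and it ignores details such as overlapping features or how reads are assigned. The function name and sample numbers are hypothetical.

```python
def tpm(counts, lengths_bp):
    """Transcripts per million from raw read counts and feature lengths (bp)."""
    # Step 1: length-normalise each count to reads per kilobase.
    rates = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]
    total = sum(rates)
    if total == 0:
        return [0.0 for _ in rates]
    # Step 2: scale so that the values of one sample sum to one million.
    return [r / total * 1e6 for r in rates]


# Hypothetical counts for three features whose lengths differ threefold.
example = tpm([10, 20, 30], [1000, 2000, 3000])
```

Because each count here is proportional to its feature length, all three features receive the same TPM value, which is the point of normalizing by length before scaling.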
Subject(s)
Genomics, Software, Exons, Messenger RNA, RNA Sequence Analysis
ABSTRACT
BACKGROUND: Banana is one of the most important crops in tropical and sub-tropical regions. To meet the demands of international markets, banana plantations require large amounts of chemical fertilizers, which translate into high farming costs and are hazardous to the environment when used excessively. Beneficial free-living soil bacteria that colonize the rhizosphere are known as plant growth-promoting rhizobacteria (PGPR). PGPR affect plant growth in direct or indirect ways and hold great promise for sustainable agriculture. RESULTS: PGPR of the genera Bacillus and Pseudomonas in banana cv. Williams were evaluated. These plants were produced through in vitro culture and inoculated individually with two rhizobacteria, Bacillus amyloliquefaciens strain Bs006 and Pseudomonas fluorescens strain Ps006. Control plants without microbial inoculum were also evaluated. These plants were kept in a controlled climate growth room with conditions required to favor plant-microorganism interactions. These interactions were evaluated by transcriptome sequencing at 1, 48 and 96 h after inoculation to establish differentially expressed genes (DEGs) in plants elicited by the interaction with the two rhizobacteria. Additionally, droplet digital PCR was performed at 1, 48 and 96 h, and also at 15 and 30 days, to validate the expression patterns of selected DEGs. The banana cv. Williams transcriptome showed differential expression in a large number of genes, of which 22 were experimentally validated. The experimentally validated genes correspond to growth promotion and regulation of specific functions (flowering, photosynthesis, glucose catabolism and root growth) as well as plant defense genes. This study focused on the analysis of 18 genes involved in growth promotion, defense and response to biotic or abiotic stress.
CONCLUSIONS: Differences in banana gene expression profiles in response to the rhizobacteria evaluated here (Bacillus amyloliquefaciens Bs006 and Pseudomonas fluorescens Ps006) are influenced by separate bacterial colonization processes and levels that stimulate distinct groups of genes at various points in time.
Subject(s)
Bacillus amyloliquefaciens/physiology, Gene Expression Profiling/methods, Musa/growth & development, Plant Proteins/genetics, Pseudomonas fluorescens/physiology, Plant Gene Expression Regulation, Gene Ontology, Musa/genetics, Musa/microbiology, RNA Sequence Analysis, Soil Microbiology, Physiological Stress
ABSTRACT
Transposable elements (TEs) are an important source of human genetic variation with demonstrable effects on phenotype. Recently, a number of computational methods for the detection of polymorphic TE (polyTE) insertion sites from next-generation sequence data have been developed. The use of such tools will become increasingly important as the pace of human genome sequencing accelerates. For this report, we performed a comparative benchmarking and validation analysis of polyTE detection tools in an effort to inform their selection and use by the TE research community. We analyzed a core set of seven tools with respect to ease of use and accessibility, polyTE detection performance and runtime parameters. An experimentally validated set of 893 human polyTE insertions was used for this purpose, along with a series of simulated data sets that allowed us to assess the impact of sequence coverage on tool performance. The recently developed tool MELT showed the best overall performance followed by Mobster and then RetroSeq. PolyTE detection tools can best detect Alu insertion events in the human genome with reduced reliability for L1 insertions and substantially lowered performance for SVA insertions. We also show evidence that different polyTE detection tools are complementary with respect to their ability to detect a complete set of insertion events. Accordingly, a combined approach, coupled with manual inspection of individual results, may yield the best overall performance. In addition to the benchmarking results, we also provide notes on tool installation and usage as well as suggestions for future polyTE detection algorithm development.
Subject(s)
Benchmarking/methods, DNA Transposable Elements, Single Nucleotide Polymorphism, DNA Sequence Analysis/methods, Software, Algorithms, Human Genome, Humans
ABSTRACT
Transposable element (TE) derived sequences are known to contribute to the regulation of the human genome. The majority of known TE-derived regulatory sequences correspond to relatively ancient insertions, which are fixed across human populations. The extent to which human genetic variation caused by recent TE activity leads to regulatory polymorphisms among populations has yet to be thoroughly explored. In this study, we searched for associations between polymorphic TE (polyTE) loci and human gene expression levels using an expression quantitative trait loci (eQTL) approach. We compared locus-specific polyTE insertion genotypes to B cell gene expression levels among 445 individuals from 5 human populations. Numerous human polyTE loci correspond to both cis and trans eQTL, and their regulatory effects are directly related to cell type-specific function in the immune system. PolyTE loci are associated with differences in expression between European and African population groups, and a single polyTE locus is indirectly associated with the expression of numerous genes via the regulation of the B cell-specific transcription factor PAX5. The polyTE-gene expression associations we found indicate that human TE genetic variation can have important phenotypic consequences. Our results reveal that TE-eQTL are involved in population-specific gene regulation as well as transcriptional network modification.
Subject(s)
B-Lymphocytes/metabolism, DNA Transposable Elements/immunology, Gene Regulatory Networks, Human Genome, Quantitative Trait Loci, B-Lymphocytes/immunology, Black People, Cytokines/genetics, Cytokines/immunology, Gene Expression Regulation, Genetic Loci, Humans, Innate Immunity, PAX5 Transcription Factor/genetics, PAX5 Transcription Factor/immunology, Single Nucleotide Polymorphism, T-Cell Antigen Receptors/genetics, T-Cell Antigen Receptors/immunology, White People
ABSTRACT
BACKGROUND: Modern Latin American populations were formed via genetic admixture among ancestral source populations from Africa, the Americas and Europe. We are interested in studying how combinations of genetic ancestry in admixed Latin American populations may impact genomic determinants of health and disease. For this study, we characterized the impact of ancestry and admixture on genetic variants that underlie health- and disease-related phenotypes in population genomic samples from Colombia, Mexico, Peru, and Puerto Rico. RESULTS: We analyzed a total of 347 admixed Latin American genomes along with 1102 putative ancestral source genomes from Africans, Europeans, and Native Americans. We characterized the genetic ancestry, relatedness, and admixture patterns for each of the admixed Latin American genomes, finding a spectrum of ancestry proportions within and between populations. We then identified single nucleotide polymorphisms (SNPs) with anomalous ancestry-enrichment patterns, i.e. SNPs that exist in any given Latin American population at a higher frequency than expected based on the population's genetic ancestry profile. For this set of ancestry-enriched SNPs, we inspected their phenotypic impact on disease, metabolism, and the immune system. All four of the Latin American populations show ancestry-enrichment for a number of shared pathways, yielding evidence of similar selection pressures on these populations during their evolution. For example, all four populations show ancestry-enriched SNPs in multiple genes from immune system pathways, such as the cytokine receptor interaction, T cell receptor signaling, and antigen presentation pathways. We also found SNPs with excess African or European ancestry that are associated with ancestry-specific gene expression patterns and play crucial roles in the immune system and infectious disease responses. 
Genes from both the innate and adaptive immune system were found to be regulated by ancestry-enriched SNPs with population-specific regulatory effects. CONCLUSIONS: Ancestry-enriched SNPs in Latin American populations have a substantial effect on health- and disease-related phenotypes. The concordant impact observed for the same phenotypes across populations points to a process of adaptive introgression, whereby ancestry-enriched SNPs with specific functional utility appear to have been retained in modern populations by virtue of their effects on health and fitness.
Subject(s)
Disease/ethnology, Disease/genetics, Population Genetics, Human Genome, Genomics/methods, Single Nucleotide Polymorphism, Black People, Ethnicity/genetics, Health Status, Humans, Latin America, White People
ABSTRACT
This study assesses racial and ethnic differences in overall burden of firearm-related mortality and in change in firearm-related mortality among youths from 1999 to 2020.
Subject(s)
Firearms, Gunshot Wounds, Adolescent, Child, Humans, Ethnicity/statistics & numerical data, Firearms/statistics & numerical data, Homicide/ethnology, Homicide/statistics & numerical data, Suicide/ethnology, Suicide/statistics & numerical data, United States/epidemiology, Gunshot Wounds/epidemiology, Gunshot Wounds/ethnology, Gunshot Wounds/mortality, Racial Groups/statistics & numerical data
ABSTRACT
BACKGROUND: Transcription factors (TFs) form complexes that bind regulatory modules (RMs) within DNA, to control specific sets of genes. Some transcription factor binding sites (TFBSs) near the transcription start site (TSS) display tight positional preferences relative to the TSS. Furthermore, near the TSS, RMs can co-localize TFBSs with each other and the TSS. The proportion of TFBS positional preferences due to TFBS co-localization within RMs is unknown, however. ChIP experiments confirm co-localization of some TFBSs genome-wide, including near the TSS, but they typically examine only a few TFs at a time, using non-physiological conditions that can vary from lab to lab. In contrast, sequence analysis can examine many TFs uniformly and methodically, broadly surveying the co-localization of TFBSs with tight positional preferences relative to the TSS. RESULTS: Our statistics found 43 significant sets of human motifs in the JASPAR TF Database with positional preferences relative to the TSS, 38 of them tight (±5 bp). Each set of motifs corresponded to a gene group of 135 to 3304 genes, with 42/43 (98%) gene groups independently validated by DAVID, a gene ontology database, with FDR < 0.05. Motifs corresponding to two TFBSs in an RM should co-occur more than by chance alone, enriching the intersection of the gene groups corresponding to the two TFs. Thus, a gene-group intersection systematically enriched beyond chance alone provides evidence that the two TFs participate in an RM. Of the 903 = 43*42/2 intersections of the 43 significant gene groups, we found 768/903 (85%) pairs of gene groups with significantly enriched intersections, with 564/768 (73%) intersections independently validated by DAVID with FDR < 0.05. A user-friendly web site at http://go.usa.gov/3kjsH permits biologists to explore the interaction network of our TFBSs to identify candidate subunit RMs.
CONCLUSIONS: Gene duplication and convergent evolution within a genome provide obvious biological mechanisms for replicating an RM near the TSS that binds a particular TF subunit. Of all intersections of our 43 significant gene groups, 85% were significantly enriched, with 73% of the significant enrichments independently validated by gene ontology. The co-localization of TFBSs within RMs therefore likely explains much of the tight TFBS positional preferences near the TSS.
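The pair count and the notion of an intersection "enriched beyond chance alone" can be made concrete with a short sketch. The abstract does not name the enrichment statistic the authors used, so the hypergeometric upper tail below is an assumption chosen because it is a common test for overlap between two gene sets; the function names and the example universe size are hypothetical.

```python
from math import comb


def n_pairs(k):
    """Number of unordered pairs among k gene groups: k * (k - 1) / 2."""
    return comb(k, 2)


def hypergeom_upper_tail(overlap, universe, size_a, size_b):
    """P(X >= overlap) when size_b genes are drawn at random from a universe
    in which size_a genes belong to the first group."""
    upper = min(size_a, size_b)
    favourable = sum(
        comb(size_a, x) * comb(universe - size_a, size_b - x)
        for x in range(overlap, upper + 1)
    )
    return favourable / comb(universe, size_b)


pairs = n_pairs(43)  # the 43 significant gene groups from the abstract
# Hypothetical toy case: two groups of 3 genes in a universe of 10, sharing all 3.
p_full_overlap = hypergeom_upper_tail(3, 10, 3, 3)
```

A small tail probability means the two gene groups share more genes than random draws of the same sizes would typically produce. With 903 pairs tested, a multiple-testing correction would also be needed.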
Subject(s)
DNA/metabolism, Gene Expression Regulation, Transcriptional Regulatory Elements/genetics, DNA Sequence Analysis/methods, Transcription Factors/metabolism, Transcription Initiation Site, Binding Sites, DNA/chemistry, DNA/genetics, Gene Ontology, Humans, Protein Binding
ABSTRACT
The All of Us Research Program ("All of Us") is an initiative led by the National Institutes of Health whose goal is to advance research on personalized medicine and health equity through the collection of genetic, environmental, demographic, and health data from volunteer participants who reside in the USA. The program's emphasis on recruiting a diverse participant cohort makes "All of Us" an effective platform for investigating health disparities. In this work, we analyzed participant electronic health record (EHR) data to identify the diseases and disease categories in the "All of Us" cohort for which racial and ethnic prevalence disparities can be observed. In conjunction with these analyses, we developed the US Health Disparities Browser as an interactive web application that enables users to visualize differences in race- and ethnic-group-specific prevalence estimates for 1755 different diseases: https://usdisparities.biosci.gatech.edu/. The web application features a catalog of all diseases represented in the browser, which can be sorted by overall prevalence as well as the variance in prevalence across racial and ethnic groups. The analyses outlined here provide details on the nature and extent of racial and ethnic health disparities in the "All of Us" participant cohort, and the accompanying browser can serve as a resource through which researchers can explore these disparities. Database URL: https://usdisparities.biosci.gatech.edu.
Subject(s)
Ethnicity, Health Status Disparities, Racial Groups, Female, Humans, Male, Electronic Health Records, Ethnicity/genetics, Racial Groups/genetics, United States
ABSTRACT
AIDS remains a significant global health challenge since its emergence in 1981, with millions of deaths and new cases every year. The CCR5 ∆32 genetic deletion confers immunity to HIV infection by altering a cell membrane protein crucial for viral entry. Stem cell transplants from homozygous carriers of this mutation to HIV-infected individuals have resulted in viral load reduction and disease remission, suggesting a potential therapeutic avenue. This study aims to investigate the relationship between genetic ancestry and the frequency of the CCR5 ∆32 mutation in Colombian populations, exploring the feasibility of targeted donor searches based on ancestry composition. Utilizing genomic data from the CÓDIGO-Colombia consortium, comprising 532 individuals, the study assessed the presence of the CCR5 ∆32 mutation and examined whether the population was in Hardy-Weinberg equilibrium. Individuals were stratified into clusters based on African, American, and European ancestry percentages, with logistic regression analysis performed to evaluate the association between ancestry and mutation frequency. Additionally, global genomic databases were utilized to visualize the worldwide distribution of the mutation. The findings revealed a significant positive association between European ancestry and the CCR5 ∆32 mutation frequency, underscoring its relevance in donor selection. African and American ancestry showed negative but non-significant associations with CCR5 ∆32 frequency, which may be attributed to the study's limitations. These results emphasize the potential importance of considering ancestry in donor selection strategies, reveal the scarcity of potential donors in Colombia, and underscore the need to consider donors from other populations with mainly European ancestry if the CCR5 ∆32 stem cell transplant becomes a routine treatment for HIV/AIDS in Colombia.
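A Hardy-Weinberg equilibrium check of the kind mentioned above can be sketched as a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts. The abstract does not say which test the study used, so this choice is an assumption, and the genotype counts below are invented for illustration rather than taken from the CÓDIGO-Colombia data.

```python
from math import erfc, sqrt


def hwe_chi_square(n_wt, n_het, n_del):
    """Chi-square test of Hardy-Weinberg proportions for a biallelic locus.

    Arguments are counts of wild-type homozygotes, heterozygotes and
    deletion homozygotes. Returns (deletion allele frequency, chi2, p-value).
    """
    n = n_wt + n_het + n_del
    q = (2 * n_del + n_het) / (2 * n)  # frequency of the deletion allele
    p = 1.0 - q
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    observed = [n_wt, n_het, n_del]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2.0))
    return q, chi2, p_value


# Hypothetical counts that sit exactly at Hardy-Weinberg proportions for q = 0.1.
q_hat, chi2_stat, p_val = hwe_chi_square(81, 18, 1)
```

A large p-value gives no evidence of departure from equilibrium. With few expected deletion homozygotes, as is likely for a low-frequency allele, an exact test would usually be preferred over the chi-square approximation.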
ABSTRACT
Splicing factor 3b subunit 1 (SF3B1) is the largest subunit and core component of the spliceosome. Inhibition of SF3B1 was associated with an increase in broad intron retention (IR) on most transcripts, suggesting that IR can be used as a marker of spliceosome inhibition in chronic lymphocytic leukemia (CLL) cells. Furthermore, we separately analyzed exonic and intronic mapped reads on annotated RNA-sequencing transcripts obtained from B cells (n = 98 CLL patients) and healthy volunteers (n = 9). We measured intron/exon ratio to use that as a surrogate for alternative RNA splicing (ARS) and found that 66% of CLL-B cell transcripts had significant IR elevation compared with normal B cells (NBCs) and that correlated with mRNA downregulation and low expression levels. Transcripts with the highest IR levels belonged to biological pathways associated with gene expression and RNA splicing. A >2-fold increase of active pSF3B1 was observed in CLL-B cells compared with NBCs. Additionally, when the CLL-B cells were treated with macrolides (pladienolide-B), a significant decrease in pSF3B1, but not total SF3B1 protein, was observed. These findings suggest that IR/ARS is increased in CLL, which is associated with SF3B1 phosphorylation and susceptibility to SF3B1 inhibitors. These data provide additional support to the relevance of ARS in carcinogenesis and evidence of pSF3B1 participation in this process.
ABSTRACT
The genus Mycobacterium comprises more than 150 species, including important pathogens for humans which cause major public health problems. The vast majority of efforts to understand the genus have been addressed in studies with Mycobacterium tuberculosis. The biological differentiation between M. tuberculosis and nontuberculous mycobacteria (NTM) is important because there are distinctions in the sources of infection, treatments, and the course of disease. Likewise, the importance of studying NTM is not only due to its clinical significance but also due to the mechanisms by which some species are pathogenic while others are not. Mycobacterium avium complex (MAC) is the most important group of NTM opportunistic pathogens, since it is the second largest medical complex in the genus after the M. tuberculosis complex. Here, we evaluated the virulence and immune response of M. avium subsp. avium and Mycobacterium colombiense, using experimental models of progressive pulmonary tuberculosis and subcutaneous infection in BALB/c mice. Mice infected intratracheally with a high dose of MAC strains showed high expression of tumor necrosis factor alpha (TNF-α) and inducible nitric oxide synthase with rapid bacillus elimination and numerous granulomas, but without lung consolidation during late infection in coexistence with high expression of anti-inflammatory cytokines. In contrast, subcutaneous infection showed high production of the proinflammatory cytokines TNF-α and gamma interferon with relatively low production of anti-inflammatory cytokines such as interleukin-10 (IL-10) or IL-4, which efficiently eliminate the bacilli but maintain extensive inflammation and fibrosis. Thus, MAC infection evokes different immune and inflammatory responses depending on the MAC species and affected tissue.
Subject(s)
Mycobacterium Infections/immunology, Mycobacterium avium Complex/immunology, Mycobacterium avium Complex/pathogenicity, Cutaneous Tuberculosis/immunology, Pulmonary Tuberculosis/immunology, Animals, Cytokines/metabolism, Animal Disease Models, Humans, Lung/pathology, Male, Mice, Inbred BALB C Mice, Mycobacterium Infections/microbiology, Nitric Oxide Synthase Type II/biosynthesis, Skin/pathology, Cutaneous Tuberculosis/microbiology, Pulmonary Tuberculosis/microbiology
ABSTRACT
The UK Biobank (UKB), a large-scale biomedical database that includes demographic and electronic health record data for more than half a million ethnically diverse participants, is a potentially valuable resource for the study of health disparities. However, publicly accessible databases that catalog health disparities in the UKB do not exist. We developed the UKB Health Disparities Browser with the aims of (i) facilitating the exploration of the landscape of health disparities in the UK and (ii) directing attention to areas of disparities research that might have the greatest public health impact. Health disparities were characterized for UKB participant groups defined by age, country of residence, ethnic group, sex and socioeconomic deprivation. We defined disease cohorts for UKB participants by mapping participant International Classification of Diseases, Tenth Revision (ICD-10) diagnosis codes to phenotype codes (phecodes). For each of the population attributes used to define population groups, disease percent prevalence values were computed for all groups from phecode case-control cohorts, and the magnitude of the disparities was calculated by both the difference and ratio of the range of disease prevalence values among groups to identify high- and low-prevalence disparities. We identified numerous diseases and health conditions with disparate prevalence values across population attributes, and we deployed an interactive web browser to visualize the results of our analysis: https://ukbatlas.health-disparities.org. The interactive browser includes overall and group-specific prevalence data for 1513 diseases based on a cohort of >500 000 participants from the UKB. Researchers can browse and sort by disease prevalence and prevalence differences to visualize health disparities for each of the five population attributes, and users can search for diseases of interest by disease names or codes. Database URL: https://ukbatlas.health-disparities.org/.
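The two disparity measures described above can be read as the difference and the ratio between the highest and lowest group prevalence for a disease. The sketch below follows that reading, which is an interpretation of the abstract rather than the browser's documented formula; the function name, group labels and counts are hypothetical.

```python
def prevalence_disparity(cases_by_group, totals_by_group):
    """Percent prevalence per group plus range-based disparity measures.

    Returns (prevalence dict, max - min difference, max / min ratio).
    """
    prevalence = {
        group: 100.0 * cases_by_group[group] / totals_by_group[group]
        for group in totals_by_group
    }
    highest = max(prevalence.values())
    lowest = min(prevalence.values())
    difference = highest - lowest
    # The ratio is undefined when the lowest prevalence is zero.
    ratio = highest / lowest if lowest > 0 else float("inf")
    return prevalence, difference, ratio


# Hypothetical case counts for one phecode across two groups of 1000 participants.
prev, diff, ratio = prevalence_disparity(
    {"group_a": 10, "group_b": 40},
    {"group_a": 1000, "group_b": 1000},
)
```

The two measures are complementary: the difference highlights common diseases with large absolute gaps, while the ratio highlights rare diseases whose prevalence differs by a large factor between groups.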
Subject(s)
Biological Specimen Banks, Humans, United Kingdom/epidemiology
ABSTRACT
The relevance of race and ethnicity to genetics and medicine has long been a matter of debate. An emerging consensus holds that race and ethnicity are social constructs and thus poor proxies for genetic diversity. The goal of this study was to evaluate the relationship between race, ethnicity, and clinically relevant pharmacogenomic variation in cosmopolitan populations. We studied racially and ethnically diverse cohorts of 65,120 participants from the United States All of Us Research Program (All of Us) and 31,396 participants from the United Kingdom Biobank (UKB). Genome-wide patterns of pharmacogenomic variation (6311 drug response-associated variants for All of Us and 5966 variants for UKB) were analyzed with machine learning classifiers to predict participants' self-identified race and ethnicity. Pharmacogenomic variation predicts race/ethnicity with averages of 92.1% accuracy for All of Us and 94.3% accuracy for UKB. Group-specific prediction accuracies range from 99.0% for the White group in UKB to 92.9% for the Hispanic group in All of Us. Prediction accuracies are substantially lower for individuals who identified with more than one group in All of Us (16.7%) or as Mixed in UKB (70.7%). There are numerous individual pharmacogenomic variants with large allele frequency differences between race/ethnicity groups in both cohorts. Frequency differences for toxicity-associated variants predict hundreds of adverse drug reactions per 1000 treated participants for minority groups in All of Us. Our results indicate that race and ethnicity can be used to stratify pharmacogenomic risk in the US and UK populations and should not be discounted when making treatment decisions.
We resolve the contradiction between the results reported here and the orthodoxy of race and ethnicity as non-genetic, social constructs by emphasizing the distinction between global and local patterns of human genetic diversity, and we stress the current and future limitations of race and ethnicity as proxies for pharmacogenomic variation.
ABSTRACT
Health equity means the opportunity for all people and populations to attain optimal health, and it requires intentional efforts to promote fairness in patient treatments and outcomes. Pharmacogenomic variants are genetic differences associated with how patients respond to medications, and their presence can inform treatment decisions. In this perspective, we contend that the study of pharmacogenomic variation within and between human populations, known as population pharmacogenomics, can and should be leveraged in support of health equity. The key observation in support of this contention is that racial and ethnic groups exhibit pronounced differences in the frequencies of numerous pharmacogenomic variants, with direct implications for clinical practice. The use of race and ethnicity to stratify pharmacogenomic risk provides a means to avoid potential harm caused by biases introduced when treatment regimens do not consider genetic differences between population groups, particularly when majority group genetic profiles are assumed to hold for minority groups. We focus on the mitigation of adverse drug reactions as an area where population pharmacogenomics can have a direct and immediate impact on public health.
Subject(s)
Health Equity, Pharmacogenetics, Humans, Ethnicity/genetics, Pharmacogenomic Variants, Minority Groups
ABSTRACT
Introduction: The Rose hypothesis predicts that since genetic variation is greater within than between populations, genetic risk factors will be associated with individuals' risk of disease but not population disparities, and since socioenvironmental variation is greater between than within populations, socioenvironmental risk factors will be associated with population disparities but not individuals' disease risk. Methods: We used the UK Biobank to test the Rose hypothesis for type 2 diabetes (T2D) ethnic disparities in the UK. Our cohort consists of 26 912 participants from Asian, black and white ethnic groups. Participants were characterised as T2D cases or controls based on the presence or absence of T2D diagnosis codes in electronic health records. T2D genetic risk was measured using a polygenic risk score (PRS), and socioeconomic deprivation was measured with the Townsend Index (TI). The variation of genetic (PRS) and socioeconomic (TI) risk factors within and between ethnic groups was calculated using analysis of variance. Multivariable logistic regression was used to associate PRS and TI with T2D cases, and mediation analysis was used to analyse the effect of PRS and TI on T2D ethnic group disparities. Results: T2D prevalence differs for Asian 23.34% (OR=5.14, CI=4.68 to 5.65), black 16.64% (OR=3.81, CI=3.44 to 4.22) and white 7.35% (reference) ethnic groups in the UK. Both genetic and socioenvironmental T2D risk factors show greater within (w) than between (b) ethnic group variation: PRS w=64.60%, b=35.40%; TI w=71.18%, b=28.19%. Nevertheless, both genetic risk (PRS OR=1.96, CI=1.87 to 2.07) and socioeconomic deprivation (TI OR=1.09, CI=1.08 to 1.10) are associated with T2D individual risk and mediate T2D ethnic disparities (Asian PRS=22.5%, TI=9.8%; black PRS=32.0%, TI=25.3%). Conclusion: A relative excess of within-group versus between-group variation does not preclude T2D risk factors from contributing to T2D ethnic disparities. 
Our results support an integrative approach to health disparities research that includes both genetic and socioenvironmental risk factors.
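The within-group and between-group percentages reported above come from an analysis of variance. One way to obtain such percentages is the classical sum-of-squares partition sketched below; whether the authors computed them exactly this way is an assumption, and the function name and toy scores are hypothetical.

```python
def variance_partition(groups):
    """Split total variation of a risk factor into within- and between-group parts.

    `groups` maps a group label to a list of individual values.
    Returns (percent within groups, percent between groups).
    """
    all_values = [v for values in groups.values() for v in values]
    grand_mean = sum(all_values) / len(all_values)
    ss_between = 0.0
    ss_within = 0.0
    for values in groups.values():
        group_mean = sum(values) / len(values)
        # Between-group part: group size times squared distance of the group mean.
        ss_between += len(values) * (group_mean - grand_mean) ** 2
        # Within-group part: squared distance of each individual from the group mean.
        ss_within += sum((v - group_mean) ** 2 for v in values)
    ss_total = ss_within + ss_between
    return 100.0 * ss_within / ss_total, 100.0 * ss_between / ss_total


# Hypothetical risk scores for two small groups.
within_pct, between_pct = variance_partition({"a": [1.0, 3.0], "b": [5.0, 7.0]})
```

Under this partition the two percentages always sum to 100, and a factor can show mostly within-group variation while its group means still differ, which is the situation the abstract describes.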
ABSTRACT
Background: Diabetes is a common disease with a major burden on morbidity, mortality, and productivity. Type 2 diabetes (T2D) accounts for roughly 90% of all diabetes cases in the United States and has greater observed prevalence among those who identify as Black or Hispanic. Methods: The aims of this study were to determine whether T2D racial and ethnic disparities can be observed in data from the All of Us Research Program and to measure associations of genetic ancestry (GA) and socioeconomic deprivation with T2D. The All of Us Researcher Workbench was used to calculate T2D prevalence and to model T2D associations with GA, individual-level (iSDI) and zip code-based (zSDI) socioeconomic deprivation indices within and between participant self-identified race and ethnicity (SIRE) groups. Results: The study cohort consists of 86,488 participants from the four largest SIRE groups in All of Us: Asian (n=2,311), Black (n=16,282), Hispanic (n=16,966), and White (n=50,292). SIRE groups show characteristic genetic ancestry patterns, consistent with their diverse origins, together with a continuum of ancestry fractions within and between groups. The Black and Hispanic groups show the highest median SDI values, followed by the Asian and White groups. Black participants show the highest age- and sex-adjusted T2D prevalence (21.9%), followed by the Hispanic (19.9%), Asian (15.1%), and White (14.8%) groups. Minority SIRE groups and socioeconomic deprivation are positively associated with T2D when the entire cohort is analyzed together. However, SIRE and GA both show negative interaction effects with SDI on T2D. Higher levels of SDI are negatively associated with T2D in the Black and Hispanic groups, and higher levels of SDI are negatively associated with T2D at high levels of African and Native American ancestry.
Conclusion: Socioeconomic deprivation is positively associated with the SIRE group T2D disparities observed here but negatively associated with T2D within the Black and Hispanic groups that show the highest T2D prevalence. These results are paradoxical and have not been reported elsewhere. We discuss possible explanations for this paradox related to the nature of the All of Us data along with SIRE group differences in access to healthcare, diet, and lifestyle.
ABSTRACT
Background: Diabetes is a common disease with a major burden on morbidity, mortality, and productivity. Type 2 diabetes (T2D) accounts for roughly 90% of all diabetes cases in the USA and has a greater observed prevalence among those who identify as Black or Hispanic. Methods: This study aimed to assess T2D racial and ethnic disparities using the All of Us Research Program data and to measure associations between genetic ancestry (GA), socioeconomic deprivation, and T2D. We used the All of Us Researcher Workbench to analyze T2D prevalence and model its associations with GA, individual-level (iSDI), and zip code-based (zSDI) socioeconomic deprivation indices among participant self-identified race and ethnicity (SIRE) groups. Results: The study cohort consists of 86,488 participants from the four largest SIRE groups in All of Us: Asian (n = 2,311), Black (n = 16,282), Hispanic (n = 16,966), and White (n = 50,292). SIRE groups show characteristic genetic ancestry patterns, consistent with their diverse origins, together with a continuum of ancestry fractions within and between groups. The Black and Hispanic groups show the highest levels of socioeconomic deprivation, followed by the Asian and White groups. Black participants show the highest age- and sex-adjusted T2D prevalence (21.9%), followed by the Hispanic (19.9%), Asian (15.1%), and White (14.8%) groups. Minority SIRE groups and socioeconomic deprivation, both iSDI and zSDI, are positively associated with T2D when the entire cohort is analyzed together. However, SIRE and GA both show negative interaction effects with iSDI and zSDI on T2D. Higher levels of iSDI and zSDI are negatively associated with T2D in the Black and Hispanic groups, and higher levels of iSDI and zSDI are negatively associated with T2D at high levels of African and Native American ancestry. Conclusions: Socioeconomic deprivation is associated with a higher prevalence of T2D in Black and Hispanic minority groups, compared to the majority White group.
Nonetheless, socioeconomic deprivation is associated with reduced T2D risk within the Black and Hispanic groups. These results are paradoxical and have not been reported elsewhere, with possible explanations related to the nature of the All of Us data along with SIRE group differences in access to healthcare, diet, and lifestyle.
ABSTRACT
Despite a substantial overall decrease in mortality, disparities among ethnic minorities in developed countries persist. This study investigated mortality disparities and their associated risk factors for the three largest ethnic groups in the United Kingdom: Asian, Black, and White. Study participants were sampled from the UK Biobank (UKB), a prospective cohort enrolled between 2006 and 2010. Genetics, biological samples, and health information and outcomes data of UKB participants were downloaded and data-fields were prioritized based on participants with death registry records. The Kaplan-Meier method was used to evaluate survival differences among ethnic groups; survival random forest feature selection followed by Cox proportional-hazards modeling was used to identify and estimate the effects of shared and ethnic group-specific mortality risk factors. The White ethnic group showed significantly worse survival probability than the Asian and Black groups. In all three ethnic groups, endoscopy and colonoscopy procedures showed significant protective effects on overall mortality. Asian and Black women showed a lower relative risk of mortality than men, whereas no significant effect of sex was seen for the White group. The strongest ethnic group-specific mortality associations were ischemic heart disease for Asians, COVID-19 for Blacks, and cancers of respiratory/intrathoracic organs for Whites. Mental health-related diagnoses, including substance abuse, anxiety, and depression, were a major risk factor for overall mortality in the Asian group. The effect of mental health on Asian mortality, particularly for digestive cancers, was exacerbated by an observed hesitance to answer mental health questions, possibly related to cultural stigma.
C-reactive protein (CRP) serum levels were associated with both overall and cause-specific mortality due to COVID-19 and digestive cancers in the Black group, where elevated CRP has previously been linked to psychosocial stress due to discrimination. Our results point to mortality risk factors that are group-specific and modifiable, supporting targeted interventions towards greater health equity.
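The Kaplan-Meier method named in this abstract estimates survival probability as a running product over observed death times, with censored participants leaving the risk set without counting as deaths. The sketch below is a generic product-limit estimator, not the study's analysis code; the function name `kaplan_meier` and the toy follow-up data are hypothetical.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct death time.

    durations: follow-up time for each participant
    events:    1 if death was observed at that time, 0 if censored
    """
    pairs = sorted(zip(durations, events))
    at_risk = len(pairs)  # participants still under observation
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(pairs) and pairs[i][0] == t:
            deaths += pairs[i][1]
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored participants both leave the risk set.
        at_risk -= removed
    return curve


# Toy follow-up data (illustration only, not UK Biobank records).
toy_durations = [1, 2, 2, 3, 4]
toy_events = [1, 1, 0, 1, 0]
curve = kaplan_meier(toy_durations, toy_events)
for time, prob in curve:
    print(time, round(prob, 3))
```

For the toy data the estimated survival drops to 0.8, 0.6, and 0.3 at times 1, 2, and 3. Comparing such curves across groups is what the survival-difference evaluation in the abstract refers to; a formal comparison would additionally need a test such as the log-rank test, which is not shown here.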