Results 1 - 20 of 6,948
1.
Methods Mol Biol ; 2852: 223-253, 2025.
Article in English | MEDLINE | ID: mdl-39235748

ABSTRACT

One of the main challenges in food microbiology is to prevent the risk of outbreaks by avoiding the distribution of food contaminated by bacteria. This requires constant monitoring of the circulating strains throughout the food production chain. Bacterial genomes contain signatures of natural evolution and adaptive markers that can be exploited to better understand the behavior of pathogens in the food industry. The monitoring of foodborne strains can therefore be facilitated by the use of these genomic markers, which can rapidly provide essential information on isolated strains, such as the source of contamination, risk of illness, potential for biofilm formation, and tolerance or resistance to biocides. The increasing availability of large genome datasets is enhancing the understanding of the genetic basis of complex traits such as host adaptation, virulence, and persistence. Genome-wide association studies have shown very promising results in the discovery of genomic markers that can be integrated into rapid detection tools. In addition, machine learning has successfully predicted phenotypes and classified important traits. Genome-wide association and machine learning tools therefore have the potential to support decision-making processes aimed at reducing the burden of foodborne diseases. The aim of this chapter review is to provide knowledge on the use of these two methods in food microbiology and to recommend their use in the field.


Subjects
Bacteria , Food Microbiology , Foodborne Diseases , Genome-Wide Association Study , Machine Learning , Humans , Bacteria/genetics , Foodborne Diseases/microbiology , Foodborne Diseases/genetics , Genetic Variation , Genome, Bacterial , Genome-Wide Association Study/methods , Phenotype
2.
Methods Mol Biol ; 2856: 3-9, 2025.
Article in English | MEDLINE | ID: mdl-39283443

ABSTRACT

Recent analyses have revealed the essential function of chromatin structure in maintaining and regulating genomic information. Advancements in microscopy, nuclear structure observation techniques, and the development of methods utilizing next-generation sequencing (NGS) have significantly advanced these discoveries. Methods utilizing NGS enable genome-wide analysis, which is challenging with microscopy, and have elucidated concepts of important chromatin structures such as loop structures, domain structures called topologically associating domains (TADs), and compartments. In this chapter, I introduce chromatin interaction techniques using NGS and outline the principles and features of each method.


Subjects
Chromatin , High-Throughput Nucleotide Sequencing , Chromatin/genetics , Chromatin/metabolism , Chromatin/chemistry , Humans , High-Throughput Nucleotide Sequencing/methods , Genomics/methods , Genome-Wide Association Study/methods , Animals
3.
Front Endocrinol (Lausanne) ; 15: 1449668, 2024.
Article in English | MEDLINE | ID: mdl-39351539

ABSTRACT

Background: The proteome is a crucial reservoir of targets for cancer treatment. While some targeted therapies have been developed, there are still significant challenges in early diagnosis and treatment, highlighting the need to identify new biomarkers and therapeutic targets for breast cancer. Therefore, we conducted a comprehensive proteome-wide Mendelian randomization (MR) study to identify novel biomarkers and potential therapeutic targets for breast cancer. Methods: Protein quantitative trait locus (pQTL) data were extracted from two published plasma proteome-wide association studies. Genetic variants associated with breast cancer were obtained from the Breast Cancer Association Consortium, which included 133,384 cases and 113,789 controls, and the Finnish cohort study, comprising 18,786 cases and 182,927 controls. We employed summary-based MR and colocalization methods to identify potential drug targets for breast cancer, which were subsequently validated using a two-sample MR approach. Finally, a protein-protein interaction (PPI) network was constructed to detect interactions between the identified proteins and existing cancer drug targets. Results: Gene-predicted levels of ten proteins were associated with breast cancer risk. Decreased levels of CASP8, DDX58, CPNE1, ULK3, PARK7, and BTN2A1, as well as increased levels of TNFRSF9, TNXB, DNPH1, and TLR1, were linked to an elevated risk of breast cancer. Among these, CASP8 and DDX58 were supported by tier-one evidence, while CPNE1, ULK3, PARK7, and TNFRSF9 received tier-two evidence support. The remaining proteins, TNXB, BTN2A1, DNPH1, and TLR1, were supported by tier-three evidence. CASP8, DDX58, CPNE1, ULK3, PARK7, and TNFRSF9 have already been identified as targets in drug development and potential therapeutic targets for breast cancer treatment. Additionally, ULK3 showed promise as a prognostic biomarker for breast cancer. 
Conclusions: The present study identified several novel potential drug targets and biomarkers for breast cancer, providing new insights into its diagnosis and treatment. The integration of PPI and druggability evaluations enhances the prioritization of these therapeutic targets, paving the way for future drug development efforts.


Subjects
Biomarkers, Tumor , Breast Neoplasms , Mendelian Randomization Analysis , Proteomics , Quantitative Trait Loci , Humans , Breast Neoplasms/genetics , Breast Neoplasms/drug therapy , Breast Neoplasms/blood , Breast Neoplasms/metabolism , Female , Biomarkers, Tumor/genetics , Biomarkers, Tumor/blood , Biomarkers, Tumor/metabolism , Proteomics/methods , Proteome/metabolism , Protein Interaction Maps , Genome-Wide Association Study/methods , Polymorphism, Single Nucleotide
4.
Article in English | MEDLINE | ID: mdl-39353864

ABSTRACT

Epigenome-wide association studies (EWAS) are susceptible to widespread confounding caused by population structure and genetic relatedness. Nevertheless, kinship estimation is challenging in EWAS without genotyping data. Here, we propose MethylGenotyper, a method that for the first time enables accurate genotyping at thousands of single nucleotide polymorphisms (SNPs) directly from commercial DNA methylation microarrays. We modeled the intensities of methylation probes near SNPs with a mixture of three beta distributions corresponding to different genotypes and estimated parameters with an expectation-maximization algorithm. We conducted extensive simulations to demonstrate the performance of the method. When applying MethylGenotyper to the Infinium EPIC array data of 4662 Chinese samples, we obtained genotypes at 4319 SNPs with a concordance rate of 98.26%, enabling the identification of 255 closely related pairs. Furthermore, we showed that MethylGenotyper allows for the estimation of both population structure and cryptic relatedness among 702 Australians of diverse ancestry. We also implemented MethylGenotyper in a publicly available R package (https://github.com/Yi-Jiang/MethylGenotyper) to facilitate future large-scale EWAS.


Subjects
DNA Methylation , Genotype , Polymorphism, Single Nucleotide , Polymorphism, Single Nucleotide/genetics , DNA Methylation/genetics , Humans , Software , Genome-Wide Association Study/methods , Algorithms , Asian People/genetics
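
The model described in the abstract above (probe intensities near SNPs treated as a mixture of three beta distributions, fitted by expectation-maximization) might be sketched as follows. This is an illustration only and not the MethylGenotyper implementation: the function names, the starting parameters, the simulated data, and the weighted method-of-moments M-step (used here in place of a full likelihood maximization) are all assumptions.

```python
import math
import random


def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x, evaluated through logs for stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def fit_beta_mixture(xs, n_iter=50):
    """EM for a three-component beta mixture (hypothetical helper)."""
    # Assumed starting points: one component near 0, one near 0.5, one near 1.
    params = [(2.0, 20.0), (20.0, 20.0), (20.0, 2.0)]
    weights = [1 / 3, 1 / 3, 1 / 3]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each intensity.
        resp = []
        for x in xs:
            dens = [w * beta_pdf(x, a, b) for w, (a, b) in zip(weights, params)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: weighted method of moments (an approximation, see lead-in).
        for k in range(3):
            rk = [r[k] for r in resp]
            nk = sum(rk)
            if nk < 1e-8:
                continue
            mean = sum(r * x for r, x in zip(rk, xs)) / nk
            var = max(sum(r * (x - mean) ** 2 for r, x in zip(rk, xs)) / nk, 1e-6)
            common = mean * (1 - mean) / var - 1
            if common > 0:
                params[k] = (mean * common, (1 - mean) * common)
            weights[k] = nk / len(xs)
    return weights, params, resp


def call_genotypes(resp):
    """Assign each sample to the component with the highest responsibility."""
    return [max(range(3), key=lambda k: r[k]) for r in resp]


# Simulated intensities for three genotype classes (made-up parameters).
random.seed(0)
shapes = [(2, 30), (30, 30), (30, 2)]
truth = [random.randrange(3) for _ in range(600)]
xs = [min(max(random.betavariate(*shapes[g]), 1e-6), 1 - 1e-6) for g in truth]

weights, params, resp = fit_beta_mixture(xs)
calls = call_genotypes(resp)
accuracy = sum(c == g for c, g in zip(calls, truth)) / len(truth)
```

The component index (0, 1, 2) plays the role of the genotype class; mapping it to actual allele counts would depend on probe design details that the abstract does not give.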
5.
Genes Brain Behav ; 23(5): e70003, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39377282

ABSTRACT

Grip strength (GS) is a proxy measure for muscular strength and a predictor for bone fracture risk among other diseases. Previous genome-wide association studies (GWASs) have been conducted in large cohorts of adults focusing on scores collected for the dominant hand, therefore increasing the likelihood of confounding effects by environmental factors. Here, we perform the first GWAS meta-analyses on maximal GS with the dominant (GSD) and non-dominant (GSND) hand in two cohorts of children (ALSPAC, N = 5450, age range: 10.65-13.61 years; Raine Study, N = 1162, age range: 9.42-12.38 years). We identified a novel significant association for GSND (rs9546244, LINC02465, p = 3.43e-08) and replicated associations previously reported in adults, including with a HOXB3 gene marker that shows an expression quantitative trait locus (eQTL) effect. Despite a much smaller sample (~3%) compared with the UK Biobank, we replicated correlation analyses previously reported in this much larger adult cohort, such as a negative correlation with coronary artery disease. Although the results from the polygenic risk score (PRS) analyses did not survive multiple testing correction, we observed nominally significant associations between GS and risk of overall fracture, as previously reported, as well as with ADHD, which will require further investigation. Finally, we observed a higher SNP-heritability (24%-41%) compared with previous studies in adults (4%-24%). Overall, our results suggest that cohorts of children might be better suited for genetic studies of grip strength, possibly due to the shorter exposure to confounding environmental factors compared with adults.


Subjects
Genome-Wide Association Study , Hand Strength , Humans , Genome-Wide Association Study/methods , Male , Female , Child , Adolescent , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Cohort Studies
6.
Cell Genom ; 4(10): 100669, 2024 Oct 09.
Article in English | MEDLINE | ID: mdl-39389018

ABSTRACT

Non-invasive prenatal testing (NIPT) employs ultra-low-pass sequencing of maternal plasma cell-free DNA to detect fetal trisomy. Its global adoption has established NIPT as a large human genetic resource for exploring genetic variations and their associations with phenotypes. Here, we present methods for analyzing large-scale, low-depth NIPT data, including customized algorithms and software for genetic variant detection, genotype imputation, family relatedness, population structure inference, and genome-wide association analysis of maternal genomes. Our results demonstrate accurate allele frequency estimation and high genotype imputation accuracy (R² > 0.84) for NIPT sequencing depths from 0.1× to 0.3×. We also achieve effective classification of duplicates and first-degree relatives, along with robust principal-component analysis. Additionally, we obtain an R² > 0.81 for estimating genetic effect sizes across genotyping and sequencing platforms with adequate sample sizes. These methods offer a robust theoretical and practical foundation for utilizing NIPT data in medical genetic research.


Subjects
Genome-Wide Association Study , Humans , Female , Pregnancy , Genome-Wide Association Study/methods , Noninvasive Prenatal Testing/methods , Prenatal Diagnosis/methods , Gene Frequency , Algorithms , Genotype , Sequence Analysis, DNA/methods , Polymorphism, Single Nucleotide , Software
7.
Genome Biol ; 25(1): 260, 2024 Oct 08.
Article in English | MEDLINE | ID: mdl-39379999

ABSTRACT

BACKGROUND: Polygenic risk score (PRS) is a major research topic in human genetics. However, a significant gap exists between PRS methodology and applications in practice due to often unavailable individual-level data for various PRS tasks including model fine-tuning, benchmarking, and ensemble learning. RESULTS: We introduce an innovative statistical framework to optimize and benchmark PRS models using summary statistics of genome-wide association studies. This framework builds upon our previous work and can fine-tune virtually all existing PRS models while accounting for linkage disequilibrium. In addition, we provide an ensemble learning strategy named PUMAS-ensemble to combine multiple PRS models into an ensemble score without requiring external data for model fitting. Through extensive simulations and analysis of many complex traits in the UK Biobank, we demonstrate that this approach closely approximates gold-standard analytical strategies based on external validation, and substantially outperforms state-of-the-art PRS methods. CONCLUSIONS: Our method is a powerful and general modeling technique that can continue to combine the best-performing PRS methods out there through ensemble learning and could become an integral component for all future PRS applications.


Subjects
Benchmarking , Genome-Wide Association Study , Multifactorial Inheritance , Genome-Wide Association Study/methods , Humans , Models, Genetic , Genetic Predisposition to Disease , Linkage Disequilibrium , Genetic Risk Score
8.
Am J Hum Genet ; 111(10): 2139-2149, 2024 Oct 03.
Article in English | MEDLINE | ID: mdl-39366334

ABSTRACT

Gene-based burden tests are a popular and powerful approach for analysis of exome-wide association studies. These approaches combine sets of variants within a gene into a single burden score that is then tested for association. Typically, a range of burden scores is calculated and tested across a range of annotation classes and frequency bins. Correlation between these tests can complicate the multiple testing correction and hamper interpretation of the results. We introduce a method called the sparse burden association test (SBAT) that tests the joint set of burden scores under the assumption that causal burden scores act in the same effect direction. The method simultaneously assesses the significance of the model fit and selects the set of burden scores that best explains the association. Using simulated data, we show that the method is well calibrated and highlight scenarios where the test outperforms existing gene-based tests. We apply the method to 73 quantitative traits from the UK Biobank, showing that SBAT is a valuable additional gene-based test when combined with other existing approaches. This test is implemented in the REGENIE software.


Subjects
Genome-Wide Association Study , Humans , Genome-Wide Association Study/methods , Least-Squares Analysis , Software , Models, Genetic , Exome/genetics , Genetic Variation , Computer Simulation
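
To make the idea of a burden score in the abstract above concrete, the sketch below collapses the variants of one gene into a per-individual score for a chosen annotation class and frequency bin. It only illustrates the generic collapsing step, not SBAT or the REGENIE implementation; the function name, the dictionary keys, and the toy data are assumptions.

```python
def burden_score(genotypes, variants, annotations, max_freq):
    """Sum allele counts over variants that pass an annotation and frequency mask.

    genotypes: one list of allele counts (0, 1 or 2) per individual.
    variants:  one dict per variant with hypothetical keys "annotation" and "freq".
    """
    selected = [
        i for i, v in enumerate(variants)
        if v["annotation"] in annotations and v["freq"] <= max_freq
    ]
    return [sum(person[i] for i in selected) for person in genotypes]


# Toy gene with three variants and three individuals (made-up values).
variants = [
    {"annotation": "pLoF", "freq": 0.0005},
    {"annotation": "missense", "freq": 0.004},
    {"annotation": "missense", "freq": 0.02},
]
genotypes = [
    [1, 0, 0],
    [0, 1, 1],
    [0, 0, 2],
]

# Two masks give two correlated burden scores for the same gene.
strict = burden_score(genotypes, variants, {"pLoF"}, max_freq=0.001)
broad = burden_score(genotypes, variants, {"pLoF", "missense"}, max_freq=0.01)
```

Because every mask reuses some of the same variants, the resulting scores are correlated, which is the multiple-testing complication the abstract refers to.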
9.
Respir Res ; 25(1): 374, 2024 Oct 16.
Article in English | MEDLINE | ID: mdl-39415140

ABSTRACT

BACKGROUND: Community-acquired pneumonia (CAP) is associated with high morbidity and hospitalization rates. In infectious diseases, host genetics plays a critical role in susceptibility and immune response, and the immune pathways involved are highly dependent on the microorganism and its route of infection. Here we aimed to identify genetic risk loci for CAP using a case-control genome-wide association study (GWAS). METHODS: We performed a GWAS on 3,765 Spanish individuals, including 257 adult patients hospitalized with CAP and 3,508 population controls. Pneumococcal CAP was documented in 30% of patients; the remaining 70% were selected among patients with unidentified microbiological etiology. We tested 7.6 million imputed genotypes using logistic regressions. UK Biobank GWAS of bacterial pneumonia were used for validation of the results. Subsequently, we prioritized genes and likely causal variants based on Bayesian fine mapping and functional evidence. Imputation and association of classical HLA alleles and amino acids were also conducted. RESULTS: Six independent sentinel variants reached genome-wide significance (p < 5 × 10⁻⁸), three on chromosome 6p21.32, and one on each of the chromosomes 4q28.2, 11p12, and 20q11.22. Only one variant at 6p21.32 was validated in independent GWAS of bacterial and pneumococcal pneumonia. Our analyses prioritized C4orf33 on 4q28.2, TAPBP on 6p21.32, and ZNF341 on 20q11.22. Interestingly, genetic defects of TAPBP and ZNF341 are previously known inborn errors of immunity predisposing to bacterial pneumonia, including pneumococcus and Haemophilus influenzae. Associations were all non-significant for the classical HLA alleles. CONCLUSIONS: We completed a GWAS of CAP and identified four novel risk loci involved in CAP susceptibility.


Subjects
Community-Acquired Infections , Genome-Wide Association Study , Humans , Community-Acquired Infections/genetics , Community-Acquired Infections/epidemiology , Genome-Wide Association Study/methods , Male , Female , Middle Aged , Aged , Case-Control Studies , Genetic Predisposition to Disease/genetics , Pneumonia/genetics , Pneumonia/epidemiology , Pneumonia/diagnosis , Pneumonia/immunology , Adult , Polymorphism, Single Nucleotide/genetics , Spain/epidemiology
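
The genome-wide significance threshold quoted in the abstract above (p < 5 × 10⁻⁸) is conventionally read as a Bonferroni correction of 0.05 for roughly one million independent common-variant tests. A minimal sketch of applying it to association results is shown below; the record layout and the variant identifiers are hypothetical and are not taken from the study.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # conventional threshold, about 0.05 / 1,000,000


def genome_wide_hits(results, alpha=GENOME_WIDE_ALPHA):
    """Return the association records whose p-value is below the threshold."""
    return [r for r in results if r["p"] < alpha]


# Hypothetical summary statistics (identifiers and values are made up).
results = [
    {"variant": "var_A", "p": 3.2e-9},
    {"variant": "var_B", "p": 4.9e-8},
    {"variant": "var_C", "p": 6.0e-8},
    {"variant": "var_D", "p": 0.01},
]

hits = genome_wide_hits(results)
hit_names = [r["variant"] for r in hits]
bonferroni = 0.05 / 1_000_000
```

Note that the threshold stays fixed even when, as here, several million imputed genotypes are tested, because imputed variants are strongly correlated through linkage disequilibrium.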
10.
Clin Epigenetics ; 16(1): 134, 2024 Sep 27.
Article in English | MEDLINE | ID: mdl-39334501

ABSTRACT

BACKGROUND: This work delves into the relationship between cardiovascular health (CVH) and aging. Previous studies have shown an association of ideal CVH with a slower aging rate, measured by epigenetic age acceleration (EAA). However, the causal relationship between CVH and EAA has remained unexplored. METHODS AND RESULTS: We performed genome-wide association studies (GWAS) on the (12-point) CVH score and its components using the Taiwan Biobank data, in which weighted genetic risk scores were treated as instrumental variables. Subsequently, we conducted a one-sample Mendelian Randomization (MR) analysis with the two-stage least-squares method on 2383 participants to examine the causal relationship between the (12-point) CVH score and EAA. As a result, we observed a significant causal effect of the CVH score on GrimAge acceleration (GrimEAA) (β [SE]: -0.993 [0.363] years; p = 0.0063) and DNA methylation-based plasminogen activator inhibitor-1 (DNAmPAI-1) (β [SE]: -0.294 [0.099] standard deviation (sd) of DNAmPAI-1; p = 0.0030). Among the individual CVH components, the ideal total cholesterol score (0 [poor], 1 [intermediate], or 2 [ideal]) was causally associated with DNAmPAI-1 (β [SE]: -0.452 [0.150] sd of DNAmPAI-1; false discovery rate [FDR] q = 0.0102). The ideal body mass index (BMI) score was causally associated with GrimEAA (β [SE]: -2.382 [0.952] years; FDR q = 0.0498) and DunedinPACE (β [SE]: -0.097 [0.030]; FDR q = 0.0044). We also performed a two-sample MR analysis using the summary statistics from European GWAS. We observed that the (12-point) CVH score exhibits a significant causal effect on Horvath's intrinsic epigenetic age acceleration (β [SE]: -0.389 [0.186] years; p = 0.036) and GrimEAA (β [SE]: -0.526 [0.244] years; p = 0.031). Furthermore, we detected causal effects of BMI (β [SE]: 0.599 [0.081] years; q = 2.91E-12), never smoking (β [SE]: -2.981 [0.524] years; q = 1.63E-7), walking (β [SE]: -4.313 [1.236] years; q = 0.004), and dried fruit intake (β [SE]: -1.523 [0.504] years; q = 0.013) on GrimEAA in the European population. CONCLUSIONS: Our research confirms the causal link between maintaining an ideal CVH and epigenetic age. It provides a tangible pathway for individuals to improve their health and potentially slow aging.


Subjects
Cardiovascular Diseases , DNA Methylation , Epigenesis, Genetic , Genome-Wide Association Study , Mendelian Randomization Analysis , Humans , Epigenesis, Genetic/genetics , Genome-Wide Association Study/methods , Female , Male , Mendelian Randomization Analysis/methods , Cardiovascular Diseases/genetics , DNA Methylation/genetics , Middle Aged , Aged , Taiwan , Plasminogen Activator Inhibitor 1/genetics , Aging/genetics , Adult
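
The one-sample Mendelian randomization design in the abstract above uses a genetic risk score as an instrument and fits two-stage least squares. The sketch below shows the mechanics with a single instrument and no covariates; it is a simplified illustration under those assumptions, not the study's analysis, and the simulated effect sizes are made up.

```python
import random


def ols(x, y):
    """Simple linear regression of y on x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(instrument, exposure, outcome):
    """2SLS with one instrument: regress the outcome on the predicted exposure."""
    a0, a1 = ols(instrument, exposure)            # stage 1
    predicted = [a0 + a1 * z for z in instrument]
    _, beta = ols(predicted, outcome)             # stage 2
    return beta


# Simulation with an unmeasured confounder u (all parameters are assumptions).
random.seed(1)
n = 20000
z = [random.gauss(0, 1) for _ in range(n)]        # genetic risk score
u = [random.gauss(0, 1) for _ in range(n)]        # confounder
x = [0.5 * zi + ui + random.gauss(0, 1) for zi, ui in zip(z, u)]
y = [-1.0 * xi + 2.0 * ui + random.gauss(0, 1) for xi, ui in zip(x, u)]

naive_beta = ols(x, y)[1]                         # biased by the confounder
iv_beta = two_stage_least_squares(z, x, y)        # close to the true effect of -1
```

In this toy setting the naive regression is pulled toward zero by the confounder, while the instrumented estimate recovers the simulated effect. Standard errors from a hand-rolled second stage are not valid as-is and are omitted here.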
11.
Sci Rep ; 14(1): 22124, 2024 Sep 27.
Article in English | MEDLINE | ID: mdl-39333190

ABSTRACT

Polygenic risk scores (PRSs) hold promise in their potential translation into clinical settings to improve disease risk prediction. An important consideration in integrating PRSs into clinical settings is to gain an understanding of how to identify which subpopulations of individuals most benefit from PRSs for risk prediction. In this study, using the UK Biobank dataset, we trained logistic regression models to predict the 10-year incident risk of myocardial infarction, breast cancer, and schizophrenia using either just clinical features or clinical features combined with PRSs. For each disease, we identified the top 10% subgroup with the greatest magnitude of improvement in risk prediction accuracy attributed to PRSs in the multi-modal model. Using up to ~3.6k demographic, lifestyle, diagnostic, lab, and physical measurement features from the UK Biobank dataset of ~500k individuals, we characterized these subgroups based on various clinical, lifestyle, and demographic characteristics. The incident cases in the top 10% subgroup for each disease represent distinct phenotypes that differ from other cases and that are strongly correlated with genetic predisposition. Our findings provide insights into disease subtypes and can encourage future studies aimed at classifying these individuals to enhance the targeting of polygenic risk scoring in practice.


Subjects
Genetic Predisposition to Disease , Multifactorial Inheritance , Humans , Multifactorial Inheritance/genetics , Female , Male , Schizophrenia/genetics , Schizophrenia/epidemiology , Risk Assessment/methods , Risk Factors , Breast Neoplasms/genetics , Breast Neoplasms/epidemiology , Myocardial Infarction/genetics , Myocardial Infarction/epidemiology , Middle Aged , Aged , Genome-Wide Association Study/methods , Genetic Risk Score
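
As background for the abstract above, a polygenic risk score is commonly computed as a weighted sum of allele dosages, and a logistic model can then combine it with clinical features. The sketch below assumes that standard form; the function names, coefficients, and feature values are hypothetical and are not the models trained in the study.

```python
import math


def polygenic_risk_score(dosages, effect_sizes):
    """Weighted sum of allele dosages (0 to 2) using per-variant effect sizes."""
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


def predicted_risk(intercept, coefficients, features):
    """Logistic model: probability from a linear combination of features."""
    linear = intercept + sum(c * f for c, f in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-linear))


# Made-up example: three variants, then one clinical feature plus the PRS.
prs = polygenic_risk_score([0, 1, 2], [0.1, -0.2, 0.3])
clinical_only = predicted_risk(-2.0, [0.04], [10.0])
clinical_plus_prs = predicted_risk(-2.0, [0.04, 0.8], [10.0, prs])
```

Comparing the two predicted risks for the same person is a toy version of asking how much the PRS changes an individual's prediction, which is the quantity the study uses to define its top 10% subgroup.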
12.
BMC Bioinformatics ; 25(1): 298, 2024 Sep 11.
Article in English | MEDLINE | ID: mdl-39261754

ABSTRACT

One of the aims of population genetics is to identify genetic differences/similarities among individuals of multiple ancestries. Many approaches including principal component analysis, clustering, and maximum likelihood techniques can be used to assign individuals to a given ancestry based on their genetic makeup. Although there are several tools that implement such algorithms, there is a lack of interactive visual platforms to run a variety of algorithms in one place. Therefore, we developed PopMLvis, a platform that offers an interactive environment to visualize genetic similarity data using several algorithms, and generate figures that can be easily integrated into scientific articles.


Subjects
Algorithms , Genetics, Population , Genome-Wide Association Study , Genotype , Software , Genome-Wide Association Study/methods , Genetics, Population/methods , Humans , Principal Component Analysis
13.
Genet Sel Evol ; 56(1): 62, 2024 Sep 12.
Article in English | MEDLINE | ID: mdl-39266998

ABSTRACT

BACKGROUND: Mitochondrial genomes differ from the nuclear genome and in humans it is known that mitochondrial variants contribute to genetic disorders. Prior to genomics, some livestock studies assessed the role of the mitochondrial genome but these were limited and inconclusive. Modern genome sequencing provides an opportunity to re-evaluate the potential impact of mitochondrial variation on livestock traits. This study first evaluated the empirical accuracy of mitochondrial sequence imputation and then used real and imputed mitochondrial sequence genotypes to study the role of mitochondrial variants on milk production traits of dairy cattle. RESULTS: The empirical accuracy of imputation from Single Nucleotide Polymorphism (SNP) panels to mitochondrial sequence genotypes was assessed in 516 test animals of Holstein, Jersey and Red breeds using Beagle software and a sequence reference of 1883 animals. The overall accuracy estimated as the Pearson's correlation squared (R2) between all imputed and real genotypes across all animals was 0.454. The low accuracy was attributed partly to the majority of variants having low minor allele frequency (MAF < 0.005) but also due to variants in the hypervariable D-loop region showing poor imputation accuracy. Beagle software provides an internal estimate of imputation accuracy (DR2), and 10 percent of the total 1927 imputed positions showed DR2 greater than 0.9 (N = 201). There were 151 sites with empirical R2 > 0.9 (of 954 variants segregating in the test animals) and 138 of these overlapped the sites with DR2 > 0.9. This suggests that the DR2 statistic is a reasonable proxy to select sites that are imputed with higher accuracy for downstream analyses. Accordingly, in the second part of the study mitochondrial sequence variants were imputed from real mitochondrial SNP panel genotypes of 9515 Australian Holstein, Jersey and Red dairy cattle. 
Then, using only sites with DR2 > 0.900 and real genotypes, we undertook a genome-wide association study (GWAS) for milk, fat and protein yields. The GWAS mitochondrial SNP effects were not significant. CONCLUSION: The accuracy of imputation of mitochondrial genotypes from the SNP panel to sequence was generally low. The Beagle DR2 statistic enabled selection of sites imputed with higher empirical accuracy. We recommend building larger reference populations with mitochondrial sequence to improve the accuracy of imputing less common variants and ensuring that SNP panels include common variants in the D-loop region.


Subjects
Milk , Polymorphism, Single Nucleotide , Animals , Cattle/genetics , Milk/metabolism , Genotype , Genome, Mitochondrial , Gene Frequency , Female , DNA, Mitochondrial/genetics , Genome-Wide Association Study/methods , Software
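
The abstract above measures empirical imputation accuracy as the squared Pearson correlation between imputed and real genotypes, and then keeps sites whose Beagle DR2 exceeds 0.9. A minimal sketch of both steps is given below; the function names, site identifiers, and values are assumptions for illustration.

```python
def squared_correlation(imputed, real):
    """Squared Pearson correlation between imputed and real genotypes."""
    n = len(real)
    mean_i = sum(imputed) / n
    mean_r = sum(real) / n
    cov = sum((a - mean_i) * (b - mean_r) for a, b in zip(imputed, real))
    var_i = sum((a - mean_i) ** 2 for a in imputed)
    var_r = sum((b - mean_r) ** 2 for b in real)
    if var_i == 0 or var_r == 0:
        return float("nan")  # undefined for a monomorphic site
    return cov * cov / (var_i * var_r)


def select_sites(dr2_by_site, threshold=0.9):
    """Keep sites whose reported DR2 is above the threshold."""
    return sorted(site for site, dr2 in dr2_by_site.items() if dr2 > threshold)


# Made-up genotypes at one site for five animals.
real = [0, 1, 2, 1, 0]
imputed = [0.1, 0.9, 1.8, 1.2, 0.0]
r2 = squared_correlation(imputed, real)

# Made-up DR2 values for three hypothetical sites.
kept = select_sites({"site_1": 0.95, "site_2": 0.40, "site_3": 0.91})
```

The guard for zero variance matters in practice because, as the abstract notes, many mitochondrial variants have very low minor allele frequency and may not segregate in the test animals.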
14.
PLoS Comput Biol ; 20(9): e1012301, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39226325

ABSTRACT

Clustering is widely used in bioinformatics and many other fields, with applications from exploratory analysis to prediction. Many types of data have associated uncertainty or measurement error, but this is rarely used to inform the clustering. We present Dirichlet Process Mixtures with Uncertainty (DPMUnc), an extension of a Bayesian nonparametric clustering algorithm which makes use of the uncertainty associated with data points. We show that DPMUnc out-performs existing methods on simulated data. We cluster immune-mediated diseases (IMD) using GWAS summary statistics, which have uncertainty linked with the sample size of the study. DPMUnc separates autoimmune from autoinflammatory diseases and isolates other subgroups such as adult-onset arthritis. We additionally consider how DPMUnc can be used to cluster gene expression datasets that have been summarised using gene signatures. We first introduce a novel procedure for generating a summary of a gene signature on a dataset different to the one where it was discovered, which incorporates a measure of the variability in expression across signature genes within each individual. We summarise three public gene expression datasets containing patients with a range of IMD, using three relevant gene signatures. We find association between disease and the clusters returned by DPMUnc, with clustering structure replicated across the datasets. The significance of this work is two-fold. Firstly, we demonstrate that when data has associated uncertainty, this uncertainty should be used to inform clustering and we present a method which does this, DPMUnc. Secondly, we present a procedure for using gene signatures in datasets other than where they were originally defined. We show the value of this procedure by summarising gene expression data from patients with immune-mediated diseases using relevant gene signatures, and clustering these patients using DPMUnc.


Subjects
Algorithms , Bayes Theorem , Computational Biology , Humans , Cluster Analysis , Uncertainty , Computational Biology/methods , Genome-Wide Association Study/methods , Genome-Wide Association Study/statistics & numerical data , Gene Expression Profiling/statistics & numerical data , Gene Expression Profiling/methods , Databases, Genetic/statistics & numerical data , Computer Simulation
15.
Int J Mol Sci ; 25(17), 2024 Sep 09.
Article in English | MEDLINE | ID: mdl-39273703

ABSTRACT

Caviar yield, caviar color, and body weight are crucial economic traits in sturgeon breeding. Understanding the molecular mechanisms behind these traits is essential for their genetic improvement. In this study, we performed whole-genome sequencing on 673 Russian sturgeons, renowned for their high-quality caviar. With an average sequencing depth of 13.69×, we obtained approximately 10.41 million high-quality single nucleotide polymorphisms (SNPs). Using a genome-wide association study (GWAS) with a single-marker regression model, we identified SNPs and genes associated with these traits. Our findings revealed several candidate genes for each trait: caviar yield: TFAP2A, RPS6KA3, CRB3, TUBB, H2AFX, morc3, BAG1, RANBP2, PLA2G1B, and NYAP1; caviar color: NFX1, OTULIN, SRFBP1, PLEK, INHBA, and NARS; body weight: ACVR1, HTR4, fmnl2, INSIG2, GPD2, ACVR1C, TANC1, KCNH7, SLC16A13, XKR4, GALR2, RPL39, ACVR2A, ADCY10, and ZEB2. Additionally, using the genomic feature BLUP (GFBLUP) method, which combines linkage disequilibrium (LD) pruning markers with GWAS prior information, we improved genomic prediction accuracy by 2%, 1.9%, and 3.1% for caviar yield, caviar color, and body weight traits, respectively, compared to the GBLUP method. In conclusion, this study enhances our understanding of the genetic mechanisms underlying caviar yield, caviar color, and body weight traits in sturgeons, providing opportunities for genetic improvement of these traits through genomic selection.


Subjects
Body Weight , Fishes , Genome-Wide Association Study , Polymorphism, Single Nucleotide , Whole Genome Sequencing , Genome-Wide Association Study/methods , Animals , Body Weight/genetics , Fishes/genetics , Whole Genome Sequencing/methods , Quantitative Trait Loci , Genomics/methods , Phenotype , Quantitative Trait, Heritable
16.
Commun Biol ; 7(1): 1171, 2024 Sep 18.
Article in English | MEDLINE | ID: mdl-39294434

ABSTRACT

Staphylococcus aureus (S. aureus) can cause various infections in humans and animals, contributing to high morbidity and mortality. To prevent and control cross-species transmission of S. aureus, it is necessary to understand the host-associated genetic variants. We performed a two-stage genome-wide association study (GWAS), consisting of initial screening and further validation, to compare genomic differences between human and pig S. aureus, aiming to identify host-associated determinants. Our multiple GWAS analyses found six consensus significant k-mers associated with host species, providing novel genetic evidence for distinguishing human from pig S. aureus. The best k-mer predictor achieved a high classification accuracy of 98.12% on its own and had extremely high resolution similar to the SNP-based phylogeny, offering a very simple target for predicting the cross-species transmission risk of S. aureus. The final k-mer model revealed that 90% of S. aureus isolates from farm workers were predicted as livestock origin, suggesting a high risk of cross-species transmission. Bayesian inference revealed different cross-species transmission directions, with human-to-pig transmission for ST5 and pig-to-human transmission for ST398. Our findings provide a simple and accurate k-mer model for identifying and predicting the cross-species transmission risk of S. aureus.


Subjects
Genome-Wide Association Study , Staphylococcal Infections , Staphylococcus aureus , Swine Diseases , Animals , Humans , Swine , Staphylococcus aureus/genetics , Staphylococcus aureus/classification , Staphylococcus aureus/isolation & purification , Staphylococcal Infections/microbiology , Staphylococcal Infections/transmission , Staphylococcal Infections/veterinary , Genome-Wide Association Study/methods , Swine Diseases/microbiology , Genomics/methods , Polymorphism, Single Nucleotide , Phylogeny , Bayes Theorem , Genome, Bacterial
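
The abstract above reports that a single k-mer can serve as a host predictor. The sketch below shows what a presence/absence k-mer classifier could look like in its simplest form; the marker sequence, the toy genomes, the labels, and the rule that presence indicates pig origin are all invented for illustration and do not come from the study.

```python
def kmers(sequence, k):
    """Set of all substrings of length k in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def predict_host(sequence, marker, present_label="pig", absent_label="human"):
    """Predict the host from presence or absence of one marker k-mer."""
    return present_label if marker in kmers(sequence, len(marker)) else absent_label


def accuracy(isolates, marker):
    """Fraction of (sequence, host) pairs classified correctly."""
    correct = sum(predict_host(seq, marker) == host for seq, host in isolates)
    return correct / len(isolates)


# Hypothetical marker and toy isolates.
MARKER = "GATTACA"
isolates = [
    ("ACGTGATTACAGT", "pig"),
    ("TTGATTACAAAC", "pig"),
    ("ACGTACGTACGT", "human"),
    ("GGGCCCTTTAAA", "human"),
]

toy_accuracy = accuracy(isolates, MARKER)
```

A real analysis would count k-mers across assembled genomes and test each for association with host while correcting for population structure; this sketch covers only the final prediction step.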
17.
Clin Epigenetics ; 16(1): 133, 2024 Sep 19.
Article in English | MEDLINE | ID: mdl-39300457

ABSTRACT

BACKGROUND: Epigenetic age accelerations (EAAs) are a promising new avenue of research, yet their investigation in subacute thyroiditis (SAT) remains scarce. Our study endeavors to fill this void by exploring the potential causal association between EAAs and SAT. METHODS: Our study used publicly available genome-wide association study (GWAS) data from individuals of European ancestry to conduct a bidirectional Mendelian randomization (MR) study. Five MR methods were employed to measure the causal association between EAAs and SAT, and multiple analyses were used for quality control. RESULTS: Our study evaluated the causal association between SAT and four EAAs: GrimAge acceleration (GrimAA), Hannum age acceleration (HannumAA), PhenoAge acceleration (PhenoAA), and intrinsic epigenetic age acceleration (IEAA). Results showed a significant causal association between PhenoAA and SAT (OR 1.109, 95% CI 1.000-1.228, p = 0.049, by IVW method). In the reverse direction, SAT was associated with IEAA (OR 0.933, 95% CI 0.884-0.984, p = 0.011, by IVW method; OR 0.938, 95% CI 0.881-0.998, p = 0.043, by weighted median method). Leave-one-out sensitivity analysis, heterogeneity test, pleiotropy test, and MR-PRESSO analysis provided good quality control. CONCLUSION: The bidirectional MR analysis concluded that an increase in PhenoAA was correlated with a higher risk of SAT, indicating a potential causal relationship between PhenoAA and risk of SAT. Conversely, SAT was found to be closely associated with IEAA, suggesting that SAT may accelerate the aging process. Slowing down biological aging has emerged as a new research direction in curbing SAT.
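
For readers unfamiliar with the IVW method named in the results, the following is a minimal sketch of the standard fixed-effect inverse-variance weighted estimator and its conversion to an odds ratio with a 95% confidence interval. It is not the authors' pipeline: the function names are hypothetical, and any per-SNP effect sizes passed in would be the reader's own data, since the abstract reports only the pooled estimates.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate of the causal effect (log-odds scale).

    Each instrument contributes a ratio estimate by/bx, weighted by
    bx^2 / se_y^2, which simplifies to the sums below.
    """
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    total_weight = sum(weights)
    beta = numerator / total_weight
    se = math.sqrt(1.0 / total_weight)
    return beta, se


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a log-odds estimate to (OR, lower, upper) for a 95% CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)
```

A confidence interval whose lower bound sits at 1.000, as reported for PhenoAA, corresponds to a p-value very close to the 0.05 threshold, which is consistent with the reported p = 0.049.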


Subjects
Genetic Epigenesis , Genome-Wide Association Study , Mendelian Randomization Analysis , Subacute Thyroiditis , Humans , Mendelian Randomization Analysis/methods , Genome-Wide Association Study/methods , Genetic Epigenesis/genetics , Subacute Thyroiditis/genetics , Genetic Predisposition to Disease , Single Nucleotide Polymorphism , Female , DNA Methylation/genetics , Male , Risk Factors , Aging/genetics
18.
BMC Genomics ; 25(1): 878, 2024 Sep 18.
Article in English | MEDLINE | ID: mdl-39294559

ABSTRACT

BACKGROUND: As precision medicine advances, polygenic scores (PGS) have become increasingly important for clinical risk assessment. Many methods have been developed to create polygenic models with increased accuracy for risk prediction. Our select and shrink with summary statistics (S4) PGS method has previously been shown to accurately predict the polygenic risk of epithelial ovarian cancer. Here, we applied S4 PGS to 12 phenotypes for UK Biobank participants, and compared it with LDpred2 and a combined S4 + LDpred2 method. RESULTS: The S4 + LDpred2 method provided overall improved PGS accuracy across a variety of phenotypes for UK Biobank participants. Additionally, the S4 + LDpred2 method had the best estimated PGS accuracy in Finnish and Japanese populations. We also addressed the challenge of limited genotype-level data by developing the PGS models using only GWAS summary statistics. CONCLUSIONS: Taken together, the S4 + LDpred2 method represents an improvement in overall PGS accuracy across multiple phenotypes and populations.
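
Whatever method produces the per-variant weights (S4, LDpred2, or a combination), applying a polygenic score to an individual is a weighted sum of allele dosages. The sketch below shows only that final scoring step; it does not implement S4 or LDpred2, and the function name and input layout are assumptions made for illustration.

```python
def polygenic_scores(dosages, weights):
    """Compute one polygenic score per individual.

    dosages: list of rows, one per individual, each holding the
             effect-allele count (0, 1, or 2) for every variant.
    weights: per-variant effect sizes from some PGS model
             (hypothetical inputs; not produced by this code).
    """
    scores = []
    for row in dosages:
        if len(row) != len(weights):
            raise ValueError("each dosage row must match the number of weights")
        scores.append(sum(d * w for d, w in zip(row, weights)))
    return scores
```

Methods such as those compared in the abstract differ in how they select variants and shrink the weights, not in this scoring step.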


Subjects
Genome-Wide Association Study , Multifactorial Inheritance , Humans , Genome-Wide Association Study/methods , Phenotype , Single Nucleotide Polymorphism , Genetic Models , Female
19.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39288231

ABSTRACT

Set-based association analysis is a valuable tool in studying the etiology of complex diseases in genome-wide association studies, as it allows for the joint testing of variants in a region or group. Two common types of single nucleotide polymorphism (SNP)-disease functional models are recognized when evaluating the joint function of a set of SNPs: the cumulative weak signal model, in which multiple functional variants with small effects contribute to disease risk, and the dominating strong signal model, in which a few functional variants with large effects contribute to disease risk. However, existing methods have two main limitations that reduce their power. First, they typically consider only one disease-SNP association model, which can result in significant power loss if the model is misspecified. Second, they do not account for the high-dimensional nature of SNPs, leading to low power or high false-positive rates. In this study, we propose a solution to these challenges by using a high-dimensional inference procedure that involves simultaneously fitting many SNPs in a regression model. We also propose an omnibus testing procedure that employs a robust and powerful P-value combination method to enhance the power of SNP-set association. Our results from extensive simulation studies and a real data analysis demonstrate that our set-based high-dimensional inference strategy is both flexible and computationally efficient and can substantially improve the power of SNP-set association analysis. Application to a real dataset further demonstrates the utility of the testing strategy.
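
The abstract does not name the P-value combination method used in the omnibus test. As an illustration of how such an omnibus step can work, the sketch below uses the Cauchy combination test, a widely used method that is swapped in here as an assumption and is not necessarily the authors' choice. The function name and the equal default weights are likewise hypothetical.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine P-values with the Cauchy combination test.

    Each P-value is mapped to a standard Cauchy variate via
    tan((0.5 - p) * pi); the weighted sum is then mapped back
    to a single combined P-value.
    """
    if not p_values:
        raise ValueError("at least one P-value is required")
    if any(not 0.0 < p < 1.0 for p in p_values):
        raise ValueError("P-values must lie strictly between 0 and 1")
    if weights is None:
        weights = [1.0 / len(p_values)] * len(p_values)
    total = sum(weights)
    # Normalise weights so they sum to one.
    weights = [w / total for w in weights]
    statistic = sum(w * math.tan((0.5 - p) * math.pi)
                    for w, p in zip(weights, p_values))
    return 0.5 - math.atan(statistic) / math.pi
```

In an omnibus setting, the inputs would be the P-values from tests tuned to different signal models (for example, one suited to many weak signals and one suited to a few strong signals), so that a strong result from either test carries through to the combined P-value.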


Subjects
Genome-Wide Association Study , Single Nucleotide Polymorphism , Genome-Wide Association Study/methods , Humans , Genetic Predisposition to Disease , Genetic Models , Algorithms , Computer Simulation
20.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39222061

ABSTRACT

Harnessing the power of single-cell genomics technologies, single-cell Hi-C (scHi-C) and its derived technologies provide powerful tools to measure spatial proximity between regulatory elements and their target genes in individual cells. Using a global background model, we propose SnapHiC-G, a computational method, to identify long-range enhancer-promoter interactions from scHi-C data. We applied SnapHiC-G to scHi-C datasets generated from mouse embryonic stem cells and human brain cortical cells. SnapHiC-G achieved high sensitivity in identifying long-range enhancer-promoter interactions. Moreover, SnapHiC-G can identify putative target genes for noncoding genome-wide association study (GWAS) variants, and the genetic heritability of neuropsychiatric diseases is enriched for single-nucleotide polymorphisms (SNPs) within SnapHiC-G-identified interactions in a cell-type-specific manner. In sum, SnapHiC-G is a powerful tool for characterizing cell-type-specific enhancer-promoter interactions from complex tissues and can facilitate the discovery of chromatin interactions important for gene regulation in biologically relevant cell types.


Subjects
Genetic Enhancer Elements , Genome-Wide Association Study , Single Nucleotide Polymorphism , Genetic Promoter Regions , Single-Cell Analysis , Animals , Humans , Mice , Single-Cell Analysis/methods , Genome-Wide Association Study/methods , Genomics/methods , Computational Biology/methods
...