1.
Am J Hum Genet ; 2024 Jun 20.
Article in English | MEDLINE | ID: mdl-38908374

ABSTRACT

Methods of estimating polygenic scores (PGSs) from genome-wide association studies are increasingly utilized. However, independent method evaluation is lacking, and method comparisons are often limited. Here, we evaluate polygenic scores derived via seven methods in five biobank studies (totaling about 1.2 million participants) across 16 diseases and quantitative traits, building on a reference-standardized framework. We conducted meta-analyses to quantify the effects of method choice, hyperparameter tuning, method ensembling, and the target biobank on PGS performance. We found that no single method consistently outperformed all others. PGS effect sizes were more variable between biobanks than between methods within biobanks when methods were well tuned. Differences between methods were largest for the two investigated autoimmune diseases, seropositive rheumatoid arthritis and type 1 diabetes. For most methods, cross-validation was more reliable for tuning hyperparameters than automatic tuning (without the use of target data). For a given target phenotype, elastic net models combining PGS across methods (ensemble PGS) tuned in the UK Biobank provided consistent, high, and cross-biobank transferable performance, increasing PGS effect sizes (β coefficients) by a median of 5.0% relative to LDpred2 and MegaPRS (the two best-performing single methods when tuned with cross-validation). Our interactively browsable online results and open-source workflow prspipe provide a rich resource and reference for the analysis of polygenic scoring methods across biobanks.

2.
Nat Commun ; 15(1): 5007, 2024 Jun 12.
Article in English | MEDLINE | ID: mdl-38866767

ABSTRACT

Polygenic scores (PGSs) offer the ability to predict genetic risk for complex diseases across the life course, a key benefit over short-term prediction models. To produce risk estimates relevant to clinical and public health decision-making, it is important to account for varying effects due to age and sex. Here, we develop a novel framework to derive country-, age-, and sex-specific estimates of cumulative incidence stratified by PGS for 18 high-burden diseases. We integrate PGS associations from seven studies in four countries (N = 1,197,129) with disease incidences from the Global Burden of Disease. PGS has a significant sex-specific effect for asthma, hip osteoarthritis, gout, coronary heart disease and type 2 diabetes (T2D), with all but T2D exhibiting a larger effect in men. PGS has a larger effect in younger individuals for 13 diseases, with effects decreasing linearly with age. We show for breast cancer that, relative to individuals in the bottom 20% of polygenic risk, the top 5% attain an absolute risk for screening eligibility 16.3 years earlier. Our framework increases the generalizability of results from biobank studies and the accuracy of absolute risk estimates by appropriately accounting for age- and sex-specific PGS effects. Our results highlight the potential of PGS as a screening tool which may assist in the early prevention of common diseases.


Subject(s)
Genetic Predisposition to Disease; Multifactorial Inheritance; Humans; Male; Female; Multifactorial Inheritance/genetics; Incidence; Middle Aged; Adult; Aged; Diabetes Mellitus, Type 2/genetics; Diabetes Mellitus, Type 2/epidemiology; Risk Factors; Risk Assessment/methods; Global Burden of Disease; Sex Factors; Age Factors
3.
Nat Aging ; 2024 Jun 24.
Article in English | MEDLINE | ID: mdl-38914859

ABSTRACT

Short-term mortality risk, which is indicative of individual frailty, serves as a marker for aging. Previous age clocks focused on predicting either chronological age or longer-term mortality. Aging clocks predicting short-term mortality are lacking and their algorithmic fairness remains unexamined. We developed a deep learning model to predict 1-year mortality using nationwide longitudinal data from the Finnish population (FinRegistry; n = 5.4 million), incorporating more than 8,000 features spanning up to 50 years. We achieved an area under the curve (AUC) of 0.944, outperforming a baseline model that included only age and sex (AUC = 0.897). The model generalized well to different causes of death (AUC > 0.800 for 45 of 50 causes), including coronavirus disease 2019, which was absent in the training data. Performance varied among demographics, with young females exhibiting the best and older males the worst results. Extensive prediction fairness analyses highlighted disparities among disadvantaged groups, posing challenges to equitable integration into public health interventions. Our model accurately identified short-term mortality risk, potentially serving as a population-wide aging marker.

4.
Lancet Digit Health ; 5(11): e821-e830, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37890904

ABSTRACT

BACKGROUND: Novel immunisation methods against respiratory syncytial virus (RSV) are emerging, but knowledge of risk factors for severe RSV disease is insufficient for optimal targeting of interventions against them. Our aims were to identify predictors for RSV hospital admission from registry-based data and to develop and validate a clinical prediction model to guide RSV immunoprophylaxis for infants younger than 1 year. METHODS: In this model development and validation study, we studied all infants born in Finland between June 1, 1997, and May 31, 2020, and in Sweden between June 1, 2006, and May 31, 2020, along with the data for their parents and siblings. Infants were excluded if they died or were admitted to hospital for RSV within the first 7 days of life. The outcome was hospital admission due to RSV bronchiolitis during the first year of life. The Finnish study population was divided into a development dataset (born between June 1, 1997, and May 31, 2017) and a temporal hold-out validation dataset (born between June 1, 2017, and May 31, 2020). The development dataset was used for predictor discovery and selection in which we screened 1511 candidate predictors from the infants', parents', and siblings' data, and developed a logistic regression model with the 16 most important predictors. This model was then validated using the Finnish hold-out validation dataset and the Swedish dataset. FINDINGS: In total, there were 1 124 561 infants in the Finnish development dataset, 130 352 infants in the Finnish hold-out validation dataset, and 1 459 472 infants in the Swedish dataset. In addition to known predictors such as severe congenital heart defects (adjusted odds ratio 2·89, 95% CI 2·28-3·65), we confirmed some less established predictors for RSV hospital admission, most notably oesophageal malformations (3·11, 1·86-5·19) and lower complexity congenital heart defects (1·43, 1·25-1·63). The prediction model's C-statistic was 0·766 (95% CI 0·742-0·789) in Finnish data and 0·737 (0·710-0·762) in Swedish validation data. The infants in the highest decile of predicted RSV hospital admission probability had 4·5 times higher observed risk compared with others. Calibration varied according to epidemic intensity. The model's performance was similar to a machine learning (XGBoost) model using all 1511 candidate predictors (C-statistic in Finland 0·771, 95% CI 0·754-0·788). The prediction model showed clinical utility in decision curve analysis and in hypothetical number needed to treat calculations for immunisation, and its C-statistic was similar across different strata of parental income. INTERPRETATION: The identified predictors and the prediction model can be used in guiding RSV immunoprophylaxis in infants, or as a basis for further immunoprophylaxis targeting tools. FUNDING: Sigrid Jusélius Foundation, European Research Council, Pediatric Research Foundation, and Academy of Finland.


Subject(s)
Heart Defects, Congenital; Respiratory Syncytial Virus Infections; Infant; Child; Humans; Respiratory Syncytial Virus Infections/epidemiology; Respiratory Syncytial Virus Infections/prevention & control; Models, Statistical; Prognosis; Respiratory Syncytial Viruses; Risk Factors
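
The C-statistic reported in the RSV abstract above measures discrimination: the probability that a randomly chosen infant who was admitted receives a higher predicted risk than a randomly chosen infant who was not. The following is a minimal sketch of that definition only, not the authors' code; the function name and the toy numbers are illustrative assumptions.

```python
def c_statistic(scores, outcomes):
    """Concordance probability (equivalent to ROC AUC) for binary outcomes.

    scores   : predicted risks, one per individual
    outcomes : 1 if the event occurred, 0 otherwise
    Ties between a case and a non-case count as half-concordant.
    """
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Toy data (hypothetical, not from the study): 3 of 4 case/non-case pairs
# are ranked correctly.
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_outcomes = [0, 0, 1, 1]
toy_c = c_statistic(toy_scores, toy_outcomes)
```

The pairwise loop is quadratic in sample size, which is fine for illustration; for registry-scale data a rank-based computation would be the usual choice.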
6.
Nat Hum Behav ; 7(7): 1069-1083, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37081098

ABSTRACT

Understanding factors associated with COVID-19 vaccination can highlight issues in public health systems. Using machine learning, we considered the effects of 2,890 health, socio-economic and demographic factors in the entire Finnish population aged 30-80 and genome-wide information from 273,765 individuals. The strongest predictors of vaccination status were labour income and medication purchase history. Mental health conditions and having unvaccinated first-degree relatives were associated with reduced vaccination. A prediction model combining all predictors achieved good discrimination (area under the receiver operating characteristic curve, 0.801; 95% confidence interval, 0.799-0.803). The 1% of individuals with the highest predicted risk of not vaccinating had an observed vaccination rate of 18.8%, compared with 90.3% in the study population. We identified eight genetic loci associated with vaccination uptake and derived a polygenic score, which was a weak predictor in an independent subset. Our results suggest that individuals at higher risk of suffering the worst consequences of COVID-19 are also less likely to vaccinate.


Subject(s)
COVID-19; Humans; Finland; COVID-19 Vaccines; Income; Vaccination
7.
Nat Genet ; 54(3): 283-294, 2022 Mar.
Article in English | MEDLINE | ID: mdl-35190730

ABSTRACT

DNA can determine where and when genes are expressed, but the full set of sequence determinants that control gene expression is unknown. Here, we measured the transcriptional activity of DNA sequences that represent an ~100 times larger sequence space than the human genome using massively parallel reporter assays (MPRAs). Machine learning models revealed that transcription factors (TFs) generally act in an additive manner with weak grammar and that most enhancers increase expression from a promoter by a mechanism that does not appear to involve specific TF-TF interactions. The enhancers themselves can be classified into three types: classical, closed chromatin and chromatin dependent. We also show that few TFs are strongly active in a cell, with most activities being similar between cell types. Individual TFs can have multiple gene regulatory activities, including chromatin opening and enhancing, promoting and determining transcription start site (TSS) activity, consistent with the view that the TF binding motif is the key atomic unit of gene expression.


Subject(s)
Regulatory Sequences, Nucleic Acid; Transcription Factors; Binding Sites/genetics; Genome, Human/genetics; Humans; Protein Binding; Regulatory Sequences, Nucleic Acid/genetics; Transcription Factors/genetics; Transcription Factors/metabolism
8.
Nat Commun ; 9(1): 3664, 2018 Sep 10.
Article in English | MEDLINE | ID: mdl-30202008

ABSTRACT

Point mutations in cancer have been extensively studied but chromosomal gains and losses have been more challenging to interpret due to their unspecific nature. Here we examine the high-resolution allelic imbalance (AI) landscape in 1699 colorectal cancers, 256 of which have been whole-genome sequenced (WGSed). The imbalances pinpoint 38 genes as plausible AI targets based on previous knowledge. Unbiased CRISPR-Cas9 knockout and activation screens identified in total 79 genes within AI peaks regulating cell growth. Genetic and functional data implicate loss of TP53 as a sufficient driver of AI. The WGS highlights an influence of copy number aberrations on the rate of detected somatic point mutations. Importantly, the data reveal several associations between AI target genes, suggesting a role for a network of lineage-determining transcription factors in colorectal tumorigenesis. Overall, the results unravel the contribution of AI in colorectal cancer and provide a plausible explanation for why so few genes are commonly affected by point mutations in cancers.


Subject(s)
Allelic Imbalance; Colorectal Neoplasms/genetics; Genetic Predisposition to Disease; CRISPR-Cas Systems; Chromosome Aberrations; Chromosomes, Human, Pair 8; Colorectal Neoplasms/pathology; DNA Copy Number Variations; Denmark; Gene Expression Profiling; Genomics; Genotype; Humans; Loss of Heterozygosity; Microsatellite Repeats; Phenotype; Point Mutation; Proto-Oncogene Proteins p21(ras)/genetics; RNA, Small Interfering/genetics; Transcription Factors/genetics; Tumor Suppressor Protein p53/genetics; Whole Genome Sequencing
9.
Bioinformatics ; 32(17): i629-i638, 2016 Sep 01.
Article in English | MEDLINE | ID: mdl-27587683

ABSTRACT

MOTIVATION: Transcription factor (TF) binding can be studied accurately in vivo with ChIP-exo and ChIP-Nexus experiments. Only a fraction of TF binding mechanisms are yet fully understood, and accurate knowledge of binding locations and patterns of TFs is key to understanding binding that is not explained by simple positional weight matrix models. ChIP-exo/Nexus experiments can also offer insight into the effect of single nucleotide polymorphisms (SNPs) at TF binding sites on expression of the target genes. This is an important mechanism of action for disease-causing SNPs at non-coding genomic regions. RESULTS: We describe a peak caller PeakXus that is specifically designed to leverage the increased resolution of ChIP-exo/Nexus and developed with the aim of making as few assumptions of the data as possible to allow discoveries of novel binding patterns. We apply PeakXus to ChIP-Nexus and ChIP-exo experiments performed both in Homo sapiens and in Drosophila melanogaster cell lines. We show that PeakXus consistently finds more peaks overlapping with a TF-specific recognition sequence than published methods. As an application example we demonstrate how PeakXus can be coupled with unique molecular identifiers (UMIs) to measure the effect of a SNP overlapping with a TF binding site on the in vivo binding of the TF. AVAILABILITY AND IMPLEMENTATION: Source code of PeakXus is available at https://github.com/hartonen/PeakXus CONTACT: tuomo.hartonen@helsinki.fi or jussi.taipale@ki.se.


Subject(s)
Binding Sites; Transcription Factors; Animals; Chromatin Immunoprecipitation; Computational Biology; Computer Simulation; Drosophila melanogaster; Gene Expression Profiling; Genetic Loci; Humans; Protein Binding; Protein Interaction Mapping; Sequence Analysis, DNA
10.
PLoS One ; 7(12): e51353, 2012.
Article in English | MEDLINE | ID: mdl-23300542

ABSTRACT

Cellular phones now offer a ubiquitous means for scientists to observe life: how people act, move and respond to external influences. They can be utilized as measurement devices of individual persons and for groups of people of the social context and the related interactions. The picture of human life that emerges shows complexity, which is manifested in such data in properties of the spatiotemporal tracks of individuals. We extract from smartphone-based data for a set of persons important locations such as "home", "work" and so forth over fixed length time-slots covering the days in the data-set (see also [1], [2]). This set of typical places is heavy-tailed, a power-law distribution with an exponent close to -1.7. To analyze the regularities and stochastic features present, the days are classified for each person into regular, personal patterns. To this are superimposed fluctuations for each day. This randomness is measured by "life" entropy, computed both before and after finding the clustering so as to subtract the contribution of a number of patterns. The main issue that we then address is how predictable individuals are in their mobility. The patterns and entropy are reflected in the predictability of the mobility of the life both individually and on average. We explore the simple approaches to guess the location from the typical behavior, and of exploiting the transition probabilities with time from location or activity A to B. The patterns allow an enhanced predictability, at least up to a few hours into the future from the current location. Such fixed habits are most clearly visible in the working-day length.


Subject(s)
Cell Phone/statistics & numerical data; Entropy; Life Style; Motor Activity/physiology; Predictive Value of Tests; Cluster Analysis; Environment; Humans
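
The mobility abstract above describes two ingredients that a short example can make concrete: an entropy computed over a person's time-slotted locations, and next-location guesses based on transition probabilities from location A to B. The sketch below assumes plain Shannon entropy and a first-order transition model estimated from one sequence; the paper's "life" entropy and clustering step are not specified in the abstract and are not reproduced here. All function names and the sample day are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def shannon_entropy(labels):
    """Shannon entropy (in bits) of the empirical distribution of labels."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def transition_probabilities(sequence):
    """Estimate P(next = B | current = A) from consecutive time slots."""
    pair_counts = defaultdict(Counter)
    for current, following in zip(sequence, sequence[1:]):
        pair_counts[current][following] += 1
    return {
        current: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
        for current, nexts in pair_counts.items()
    }


def predict_next(current, probabilities):
    """Return the most probable next location, or None if current is unseen."""
    options = probabilities.get(current)
    if not options:
        return None
    return max(options, key=options.get)


# Hypothetical day split into eight fixed-length time slots.
day = ["home", "home", "work", "work", "work", "home", "other", "home"]
day_entropy = shannon_entropy(day)
day_transitions = transition_probabilities(day)
guess_after_work = predict_next("work", day_transitions)
```

A lower entropy means the slots concentrate on fewer places, which is the sense in which regular habits make the next location easier to guess.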