1.
Am J Hum Genet ; 2024 Jun 20.
Article in English | MEDLINE | ID: mdl-38908374

ABSTRACT

Methods of estimating polygenic scores (PGSs) from genome-wide association studies are increasingly utilized. However, independent method evaluation is lacking, and method comparisons are often limited. Here, we evaluate polygenic scores derived via seven methods in five biobank studies (totaling about 1.2 million participants) across 16 diseases and quantitative traits, building on a reference-standardized framework. We conducted meta-analyses to quantify the effects of method choice, hyperparameter tuning, method ensembling, and the target biobank on PGS performance. We found that no single method consistently outperformed all others. PGS effect sizes were more variable between biobanks than between methods within biobanks when methods were well tuned. Differences between methods were largest for the two investigated autoimmune diseases, seropositive rheumatoid arthritis and type 1 diabetes. For most methods, cross-validation was more reliable for tuning hyperparameters than automatic tuning (without the use of target data). For a given target phenotype, elastic net models combining PGS across methods (ensemble PGS) tuned in the UK Biobank provided consistent, high, and cross-biobank transferable performance, increasing PGS effect sizes (β coefficients) by a median of 5.0% relative to LDpred2 and MegaPRS (the two best-performing single methods when tuned with cross-validation). Our interactively browsable online results and open-source workflow prspipe provide a rich resource and reference for the analysis of polygenic scoring methods across biobanks.

2.
Nat Commun ; 15(1): 5007, 2024 Jun 12.
Article in English | MEDLINE | ID: mdl-38866767

ABSTRACT

Polygenic scores (PGSs) offer the ability to predict genetic risk for complex diseases across the life course; a key benefit over short-term prediction models. To produce risk estimates relevant to clinical and public health decision-making, it is important to account for varying effects due to age and sex. Here, we develop a novel framework to estimate country-, age-, and sex-specific estimates of cumulative incidence stratified by PGS for 18 high-burden diseases. We integrate PGS associations from seven studies in four countries (N = 1,197,129) with disease incidences from the Global Burden of Disease. PGS has a significant sex-specific effect for asthma, hip osteoarthritis, gout, coronary heart disease and type 2 diabetes (T2D), with all but T2D exhibiting a larger effect in men. PGS has a larger effect in younger individuals for 13 diseases, with effects decreasing linearly with age. We show for breast cancer that, relative to individuals in the bottom 20% of polygenic risk, the top 5% attain an absolute risk for screening eligibility 16.3 years earlier. Our framework increases the generalizability of results from biobank studies and the accuracy of absolute risk estimates by appropriately accounting for age- and sex-specific PGS effects. Our results highlight the potential of PGS as a screening tool which may assist in the early prevention of common diseases.


Subjects
Genetic Predisposition to Disease; Multifactorial Inheritance; Humans; Male; Female; Multifactorial Inheritance/genetics; Incidence; Middle Aged; Adult; Aged; Diabetes Mellitus, Type 2/genetics; Diabetes Mellitus, Type 2/epidemiology; Risk Factors; Risk Assessment/methods; Global Burden of Disease; Sex Factors; Age Factors
3.
Nat Aging ; 2024 Jun 24.
Article in English | MEDLINE | ID: mdl-38914859

ABSTRACT

Short-term mortality risk, which is indicative of individual frailty, serves as a marker for aging. Previous age clocks focused on predicting either chronological age or longer-term mortality. Aging clocks predicting short-term mortality are lacking and their algorithmic fairness remains unexamined. We developed a deep learning model to predict 1-year mortality using nationwide longitudinal data from the Finnish population (FinRegistry; n = 5.4 million), incorporating more than 8,000 features spanning up to 50 years. We achieved an area under the curve (AUC) of 0.944, outperforming a baseline model that included only age and sex (AUC = 0.897). The model generalized well to different causes of death (AUC > 0.800 for 45 of 50 causes), including coronavirus disease 2019, which was absent in the training data. Performance varied among demographics, with young females exhibiting the best and older males the worst results. Extensive prediction fairness analyses highlighted disparities among disadvantaged groups, posing challenges to equitable integration into public health interventions. Our model accurately identified short-term mortality risk, potentially serving as a population-wide aging marker.
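The AUC values quoted in this abstract can be read as the probability that a randomly chosen person who died receives a higher predicted risk than a randomly chosen survivor. The following is a minimal sketch of that pairwise definition in plain Python. The function name `auc` and the toy scores are illustrative assumptions, not the authors' code or data, and the quadratic pairwise loop is only meant for small examples.

```python
def auc(scores, labels):
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition.

    scores: predicted risks, higher means more likely to have the event.
    labels: 1 for an observed event, 0 otherwise.
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy (hypothetical) predictions: three of the four positive/negative
# pairs are ranked correctly.
toy_scores = [0.9, 0.8, 0.3, 0.2]
toy_labels = [1, 0, 1, 0]
print(auc(toy_scores, toy_labels))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.944 and 0.897 are compared.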

4.
Lancet Digit Health ; 5(11): e821-e830, 2023 11.
Article in English | MEDLINE | ID: mdl-37890904

ABSTRACT

BACKGROUND: Novel immunisation methods against respiratory syncytial virus (RSV) are emerging, but knowledge of risk factors for severe RSV disease is insufficient for optimal targeting of interventions against them. Our aims were to identify predictors for RSV hospital admission from registry-based data and to develop and validate a clinical prediction model to guide RSV immunoprophylaxis for infants younger than 1 year. METHODS: In this model development and validation study, we studied all infants born in Finland between June 1, 1997, and May 31, 2020, and in Sweden between June 1, 2006, and May 31, 2020, along with the data for their parents and siblings. Infants were excluded if they died or were admitted to hospital for RSV within the first 7 days of life. The outcome was hospital admission due to RSV bronchiolitis during the first year of life. The Finnish study population was divided into a development dataset (born between June 1, 1997, and May 31, 2017) and a temporal hold-out validation dataset (born between June 1, 2017, and May 31, 2020). The development dataset was used for predictor discovery and selection in which we screened 1511 candidate predictors from the infants', parents', and siblings' data, and developed a logistic regression model with the 16 most important predictors. This model was then validated using the Finnish hold-out validation dataset and the Swedish dataset. FINDINGS: In total, there were 1 124 561 infants in the Finnish development dataset, 130 352 infants in the Finnish hold-out validation dataset, and 1 459 472 infants in the Swedish dataset. In addition to known predictors such as severe congenital heart defects (adjusted odds ratio 2·89, 95% CI 2·28-3·65), we confirmed some less established predictors for RSV hospital admission, most notably oesophageal malformations (3·11, 1·86-5·19) and lower complexity congenital heart defects (1·43, 1·25-1·63). The prediction model's C-statistic was 0·766 (95% CI 0·742-0·789) in Finnish data and 0·737 (0·710-0·762) in Swedish validation data. The infants in the highest decile of predicted RSV hospital admission probability had 4·5 times higher observed risk compared with others. Calibration varied according to epidemic intensity. The model's performance was similar to a machine learning (XGboost) model using all 1511 candidate predictors (C-statistic in Finland 0·771, 95% CI 0·754-0·788). The prediction model showed clinical utility in decision curve analysis and in hypothetical number needed to treat calculations for immunisation, and its C-statistic was similar across different strata of parental income. INTERPRETATION: The identified predictors and the prediction model can be used in guiding RSV immunoprophylaxis in infants, or as a basis for further immunoprophylaxis targeting tools. FUNDING: Sigrid Jusélius Foundation, European Research Council, Pediatric Research Foundation, and Academy of Finland.
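The comparison between the highest decile of predicted probability and the remaining infants can be illustrated with a short calculation. The sketch below assumes that "others" means the remaining nine deciles pooled together. The function name `top_decile_risk_ratio` and the toy inputs are hypothetical and are not taken from the study.

```python
def top_decile_risk_ratio(predicted, outcomes):
    """Observed risk in the top 10% of predicted probability divided by
    the observed risk among everyone else (assumed meaning of "others").

    predicted: model probabilities, one per individual.
    outcomes:  1 if the event was observed, 0 otherwise.
    """
    if len(predicted) != len(outcomes):
        raise ValueError("predicted and outcomes must have the same length")
    if len(predicted) < 10:
        raise ValueError("need at least 10 individuals to form a decile")

    # Rank individuals from highest to lowest predicted probability.
    order = sorted(range(len(predicted)), key=lambda i: predicted[i], reverse=True)
    cut = len(order) // 10
    top, rest = order[:cut], order[cut:]

    risk_top = sum(outcomes[i] for i in top) / len(top)
    risk_rest = sum(outcomes[i] for i in rest) / len(rest)
    if risk_rest == 0:
        raise ValueError("no events outside the top decile; ratio is undefined")
    return risk_top / risk_rest


# Toy (hypothetical) cohort of 20: the top decile holds 2 individuals with
# 1 event; the other 18 individuals have 3 events between them.
toy_predicted = [i / 20 for i in range(20)]
toy_outcomes = [0] * 20
for idx in (19, 3, 10, 15):
    toy_outcomes[idx] = 1
print(top_decile_risk_ratio(toy_predicted, toy_outcomes))
```

A ratio well above 1 indicates that the model concentrates observed admissions among the infants it ranks highest, which is the property the abstract summarises with its 4·5-fold figure.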


Subjects
Heart Defects, Congenital; Respiratory Syncytial Virus Infections; Infant; Child; Humans; Respiratory Syncytial Virus Infections/epidemiology; Respiratory Syncytial Virus Infections/prevention & control; Models, Statistical; Prognosis; Respiratory Syncytial Viruses; Risk Factors
6.
Nat Hum Behav ; 7(7): 1069-1083, 2023 07.
Article in English | MEDLINE | ID: mdl-37081098

ABSTRACT

Understanding factors associated with COVID-19 vaccination can highlight issues in public health systems. Using machine learning, we considered the effects of 2,890 health, socio-economic and demographic factors in the entire Finnish population aged 30-80 and genome-wide information from 273,765 individuals. The strongest predictors of vaccination status were labour income and medication purchase history. Mental health conditions and having unvaccinated first-degree relatives were associated with reduced vaccination. A prediction model combining all predictors achieved good discrimination (area under the receiver operating characteristic curve, 0.801; 95% confidence interval, 0.799-0.803). The 1% of individuals with the highest predicted risk of not vaccinating had an observed vaccination rate of 18.8%, compared with 90.3% in the study population. We identified eight genetic loci associated with vaccination uptake and derived a polygenic score, which was a weak predictor in an independent subset. Our results suggest that individuals at higher risk of suffering the worst consequences of COVID-19 are also less likely to vaccinate.


Subjects
COVID-19; Humans; Finland; COVID-19 Vaccines; Income; Vaccination
7.
Nat Genet ; 54(3): 283-294, 2022 03.
Article in English | MEDLINE | ID: mdl-35190730

ABSTRACT

DNA can determine where and when genes are expressed, but the full set of sequence determinants that control gene expression is unknown. Here, we measured the transcriptional activity of DNA sequences that represent an ~100 times larger sequence space than the human genome using massively parallel reporter assays (MPRAs). Machine learning models revealed that transcription factors (TFs) generally act in an additive manner with weak grammar and that most enhancers increase expression from a promoter by a mechanism that does not appear to involve specific TF-TF interactions. The enhancers themselves can be classified into three types: classical, closed chromatin and chromatin dependent. We also show that few TFs are strongly active in a cell, with most activities being similar between cell types. Individual TFs can have multiple gene regulatory activities, including chromatin opening and enhancing, promoting and determining transcription start site (TSS) activity, consistent with the view that the TF binding motif is the key atomic unit of gene expression.


Subjects
Regulatory Sequences, Nucleic Acid; Transcription Factors; Binding Sites/genetics; Genome, Human/genetics; Humans; Protein Binding; Regulatory Sequences, Nucleic Acid/genetics; Transcription Factors/genetics; Transcription Factors/metabolism
8.
Nat Commun ; 9(1): 3664, 2018 09 10.
Article in English | MEDLINE | ID: mdl-30202008

ABSTRACT

Point mutations in cancer have been extensively studied but chromosomal gains and losses have been more challenging to interpret due to their unspecific nature. Here we examine the high-resolution allelic imbalance (AI) landscape in 1699 colorectal cancers, 256 of which have been whole-genome sequenced (WGSed). The imbalances pinpoint 38 genes as plausible AI targets based on previous knowledge. Unbiased CRISPR-Cas9 knockout and activation screens identified in total 79 genes within AI peaks regulating cell growth. Genetic and functional data implicate loss of TP53 as a sufficient driver of AI. The WGS highlights an influence of copy number aberrations on the rate of detected somatic point mutations. Importantly, the data reveal several associations between AI target genes, suggesting a role for a network of lineage-determining transcription factors in colorectal tumorigenesis. Overall, the results unravel the contribution of AI in colorectal cancer and provide a plausible explanation of why so few genes are commonly affected by point mutations in cancers.


Subjects
Allelic Imbalance; Colorectal Neoplasms/genetics; Genetic Predisposition to Disease; CRISPR-Cas Systems; Chromosome Aberrations; Chromosomes, Human, Pair 8; Colorectal Neoplasms/pathology; DNA Copy Number Variations; Denmark; Gene Expression Profiling; Genomics; Genotype; Humans; Loss of Heterozygosity; Microsatellite Repeats; Phenotype; Point Mutation; Proto-Oncogene Proteins p21(ras)/genetics; RNA, Small Interfering/genetics; Transcription Factors/genetics; Tumor Suppressor Protein p53/genetics; Whole Genome Sequencing
9.
Bioinformatics ; 32(17): i629-i638, 2016 09 01.
Article in English | MEDLINE | ID: mdl-27587683

ABSTRACT

MOTIVATION: Transcription factor (TF) binding can be studied accurately in vivo with ChIP-exo and ChIP-Nexus experiments. Only a fraction of TF binding mechanisms are yet fully understood, and accurate knowledge of binding locations and patterns of TFs is key to understanding binding that is not explained by simple positional weight matrix models. ChIP-exo/Nexus experiments can also offer insight into the effect of single nucleotide polymorphisms (SNPs) at TF binding sites on expression of the target genes. This is an important mechanism of action for disease-causing SNPs in non-coding genomic regions. RESULTS: We describe a peak caller, PeakXus, that is specifically designed to leverage the increased resolution of ChIP-exo/Nexus and developed with the aim of making as few assumptions about the data as possible to allow the discovery of novel binding patterns. We apply PeakXus to ChIP-Nexus and ChIP-exo experiments performed both in Homo sapiens and in Drosophila melanogaster cell lines. We show that PeakXus consistently finds more peaks overlapping with a TF-specific recognition sequence than published methods. As an application example we demonstrate how PeakXus can be coupled with unique molecular identifiers (UMIs) to measure the effect of a SNP overlapping with a TF binding site on the in vivo binding of the TF. AVAILABILITY AND IMPLEMENTATION: Source code of PeakXus is available at https://github.com/hartonen/PeakXus CONTACT: tuomo.hartonen@helsinki.fi or jussi.taipale@ki.se.


Subjects
Binding Sites; Transcription Factors; Animals; Chromatin Immunoprecipitation; Computational Biology; Computer Simulation; Drosophila melanogaster; Gene Expression Profiling; Genetic Loci; Humans; Protein Binding; Protein Interaction Mapping; Sequence Analysis, DNA
10.
PLoS One ; 7(12): e51353, 2012.
Article in English | MEDLINE | ID: mdl-23300542

ABSTRACT

Cellular phones now offer a ubiquitous means for scientists to observe life: how people act, move and respond to external influences. They can be used as measurement devices for individual persons and, for groups of people, for the social context and the related interactions. The picture of human life that emerges shows complexity, which is manifested in such data in the properties of the spatiotemporal tracks of individuals. We extract from smartphone-based data for a set of persons important locations such as "home", "work" and so forth over fixed-length time slots covering the days in the data set (see also [1], [2]). This set of typical places is heavy-tailed, a power-law distribution with an exponent close to -1.7. To analyze the regularities and stochastic features present, the days are classified for each person into regular, personal patterns. On these patterns are superimposed fluctuations for each day. This randomness is measured by "life" entropy, computed both before and after finding the clustering so as to subtract the contribution of a number of patterns. The main issue that we then address is how predictable individuals are in their mobility. The patterns and entropy are reflected in the predictability of mobility, both individually and on average. We explore simple approaches to guessing the location from the typical behavior, and to exploiting the transition probabilities with time from location or activity A to B. The patterns allow an enhanced predictability, at least up to a few hours into the future from the current location. Such fixed habits are most clearly visible in the working-day length.
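The two ingredients named in this abstract, an entropy measure over visited locations and transition probabilities from location A to B, can be sketched as follows. This is a minimal illustration only: the abstract does not define its "life" entropy, so plain Shannon entropy is used here as a stand-in, the transitions are first-order and ignore time of day, and all function names and the toy day are hypothetical.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(locations):
    """Shannon entropy (in bits) of the empirical location distribution."""
    counts = Counter(locations)
    total = len(locations)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def transition_probabilities(locations):
    """Estimate P(next location | current location) from consecutive slots."""
    counts = defaultdict(Counter)
    for current, nxt in zip(locations, locations[1:]):
        counts[current][nxt] += 1
    return {
        current: {nxt: c / sum(following.values()) for nxt, c in following.items()}
        for current, following in counts.items()
    }


def predict_next(probabilities, current):
    """Guess the most probable next location given the current one."""
    options = probabilities[current]
    return max(options, key=options.get)


# Toy (hypothetical) sequence of fixed-length time slots for one person.
day = ["home", "home", "work", "work", "work", "home", "home", "home"]
probs = transition_probabilities(day)
print(shannon_entropy(day))
print(predict_next(probs, "work"))
```

Lower entropy means the person's slots are concentrated in fewer places, and a sharply peaked row in the transition table means the next location is easier to guess from the current one.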


Subjects
Cell Phone/statistics & numerical data; Entropy; Life Style; Motor Activity/physiology; Predictive Value of Tests; Cluster Analysis; Environment; Humans