Results 1 - 20 of 3,986
1.
Annu Rev Neurosci ; 39: 237-56, 2016 07 08.
Article in English | MEDLINE | ID: mdl-27145916

ABSTRACT

Brain function involves the activity of neuronal populations. Much recent effort has been devoted to measuring the activity of neuronal populations in different parts of the brain under various experimental conditions. Population activity patterns contain rich structure, yet many studies have focused on measuring pairwise relationships between members of a larger population, termed noise correlations. Here we review recent progress in understanding how these correlations affect population information, how information should be quantified, and what mechanisms may give rise to correlations. As population coding theory has improved, it has made clear that some forms of correlation are more important for information than others. We argue that this is a critical lesson for those interested in neuronal population responses more generally: Descriptions of population responses should be motivated by and linked to well-specified function. Within this context, we offer suggestions of where current theoretical frameworks fall short.


Subjects
Action Potentials/physiology , Artificial Intelligence , Brain/physiology , Models, Neurological , Neurons/physiology , Animals , Humans , Statistics as Topic/methods
2.
PLoS Biol ; 18(1): e3000586, 2020 01.
Article in English | MEDLINE | ID: mdl-31951611

ABSTRACT

The origin and fate of new mutations within species is the fundamental process underlying evolution. However, while much attention has been focused on characterizing the presence, frequency, and phenotypic impact of genetic variation, the evolutionary histories of most variants are largely unexplored. We have developed a nonparametric approach for estimating the date of origin of genetic variants in large-scale sequencing data sets. The accuracy and robustness of the approach are demonstrated through simulation. Using data from two publicly available human genomic diversity resources, we estimated the age of more than 45 million single-nucleotide polymorphisms (SNPs) in the human genome and release the Atlas of Variant Age as a public online database. We characterize the relationship between variant age and frequency in different geographical regions and demonstrate the value of age information in interpreting variants of functional and selective importance. Finally, we use allele age estimates to power a rapid approach for inferring the ancestry shared between individual genomes and to quantify genealogical relationships at different points in the past, as well as to describe and explore the evolutionary history of modern human populations.


Subjects
Genetic Speciation , Genetics, Population/methods , Polymorphism, Single Nucleotide , Racial Groups/genetics , Age Factors , Alleles , Computer Simulation , Datasets as Topic , Evolution, Molecular , Gene Frequency , Genetic Variation , Genome, Human , High-Throughput Nucleotide Sequencing , Humans , Pedigree , Phylogeny , Sequence Analysis, DNA , Statistics as Topic/methods , Time Factors
3.
PLoS Biol ; 17(1): e3000127, 2019 01.
Article in English | MEDLINE | ID: mdl-30682013

ABSTRACT

There is increased concern about poor scientific practices arising from an excessive focus on P-values. Two particularly worrisome practices are selective reporting of significant results and 'P-hacking'. The latter is the manipulation of data collection, usage, or analyses to obtain statistically significant outcomes. Here, we introduce the novel, to our knowledge, concepts of selective reporting of nonsignificant results and 'reverse P-hacking' whereby researchers ensure that tests produce a nonsignificant result. We test whether these practices occur in experiments in which researchers randomly assign subjects to treatment and control groups to minimise differences in confounding variables that might affect the focal outcome. By chance alone, 5% of tests for a group difference in confounding variables should yield a significant result (P < 0.05). If researchers less often report significant findings and/or reverse P-hack to avoid significant outcomes that undermine the ethos that experimental and control groups only differ with respect to actively manipulated variables, we expect significant results from tests for group differences to be under-represented in the literature. We surveyed the behavioural ecology literature and found significantly more nonsignificant P-values reported for tests of group differences in potentially confounding variables than the expected 95% (P = 0.005; N = 250 studies). This novel, to our knowledge, publication bias could result from selective reporting of nonsignificant results and/or from reverse P-hacking. We encourage others to test for a bias toward publishing nonsignificant results in the equivalent context in their own research discipline.
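The comparison against the expected 95% can be illustrated with an exact one-sided binomial tail probability. The sketch below is not the authors' analysis: the counts are hypothetical, and the helper name `binom_upper_tail` is invented here for illustration. It only shows how one might ask whether a share of nonsignificant results is larger than chance would predict.

```python
from math import comb


def binom_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical counts (NOT taken from the paper):
# 99 of 100 reported tests for group differences were nonsignificant.
n_tests = 100
n_nonsignificant = 99

# With alpha = 0.05 and honest reporting, each test is nonsignificant
# with probability 0.95.
p_value = binom_upper_tail(n_nonsignificant, n_tests, 0.95)
print(f"P(at least {n_nonsignificant} of {n_tests} nonsignificant) = {p_value:.4f}")
```

A small tail probability would suggest that nonsignificant results are over-represented relative to the 95% expected by chance.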


Subjects
Data Interpretation, Statistical , Publication Bias/statistics & numerical data , Statistics as Topic/methods , Bias , Data Analysis , Humans , Knowledge , Probability , Publication Bias/trends , Publishing , Research Personnel , Surveys and Questionnaires
4.
Diabetologia ; 64(7): 1583-1594, 2021 07.
Article in English | MEDLINE | ID: mdl-33715025

ABSTRACT

AIMS/HYPOTHESIS: Type 2 diabetes is a heterogeneous disease process with variable trajectories of CVD risk. We aimed to evaluate four phenomapping strategies and their ability to stratify CVD risk in individuals with type 2 diabetes and to identify subgroups who may benefit from specific therapies. METHODS: Participants with type 2 diabetes and free of baseline CVD in the Action to Control Cardiovascular Risk in Diabetes (ACCORD) trial were included in this study (N = 6466). Clustering using Gaussian mixture models, latent class analysis, finite mixture models (FMMs) and principal component analysis was compared. Clustering variables included demographics, medical and social history, laboratory values and diabetes complications. The interaction between the phenogroup and intensive glycaemic, combination lipid and intensive BP therapy for the risk of the primary outcome (composite of fatal myocardial infarction, non-fatal myocardial infarction or unstable angina) was evaluated using adjusted Cox models. The phenomapping strategies were independently assessed in an external validation cohort (Look Action for Health in Diabetes [Look AHEAD] trial: n = 4211; and Bypass Angioplasty Revascularisation Investigation 2 Diabetes [BARI 2D] trial: n = 1495). RESULTS: Over 9.1 years of follow-up, 789 (12.2%) participants had a primary outcome event. FMM phenomapping with three phenogroups was the best-performing clustering strategy in both the derivation and validation cohorts as determined by Bayesian information criterion, Dunn index and improvement in model discrimination. Phenogroup 1 (n = 663, 10.3%) had the highest burden of comorbidities and diabetes complications, phenogroup 2 (n = 2388, 36.9%) had an intermediate comorbidity burden and lowest diabetes complications, and phenogroup 3 (n = 3415, 52.8%) had the fewest comorbidities and intermediate burden of diabetes complications. 
Significant interactions were observed between phenogroups and treatment interventions including intensive glycaemic control (p-interaction = 0.042) and combination lipid therapy (p-interaction < 0.001) in the ACCORD, intensive lifestyle intervention (p-interaction = 0.002) in the Look AHEAD and early coronary revascularisation (p-interaction = 0.003) in the BARI 2D trial cohorts for the risk of the primary composite outcome. Favourable reduction in the risk of the primary composite outcome with these interventions was noted in low-risk participants of phenogroup 3 but not in other phenogroups. Compared with phenogroup 3, phenogroup 1 participants were more likely to have severe/symptomatic hypoglycaemic events and medication non-adherence on follow-up in the ACCORD and Look AHEAD trial cohorts. CONCLUSIONS/INTERPRETATION: Clustering using FMMs was the optimal phenomapping strategy to identify replicable subgroups of patients with type 2 diabetes with distinct clinical characteristics, CVD risk and response to therapies.


Subjects
Atherosclerosis/diagnosis , Atherosclerosis/etiology , Diabetes Mellitus, Type 2/diagnosis , Aged , Atherosclerosis/epidemiology , Biological Variation, Population , Cardiometabolic Risk Factors , Cardiovascular Diseases/diagnosis , Cardiovascular Diseases/epidemiology , Cardiovascular Diseases/etiology , Cluster Analysis , Cohort Studies , Diabetes Mellitus, Type 2/complications , Diabetes Mellitus, Type 2/epidemiology , Diabetes Mellitus, Type 2/therapy , Diabetic Angiopathies/diagnosis , Diabetic Angiopathies/epidemiology , Diabetic Angiopathies/etiology , Female , Follow-Up Studies , Humans , Male , Middle Aged , Phenotype , Prognosis , Risk Assessment/methods , Risk Factors , Statistics as Topic/methods , Treatment Outcome , United States/epidemiology
5.
Am J Epidemiol ; 190(7): 1386-1395, 2021 07 01.
Article in English | MEDLINE | ID: mdl-33534904

ABSTRACT

Ambitious World Health Organization targets for disease elimination require monitoring of epidemics using routine health data in settings of decreasing and low incidence. We evaluated 2 methods commonly applied to routine testing results to estimate incidence rates that assume a uniform probability of infection between consecutive negative and positive tests based on 1) the midpoint of this interval and 2) a randomly selected point in this interval. We compared these with an approximation of the Poisson binomial distribution, which assigns partial incidence to time periods based on the uniform probability of occurrence in these intervals. We assessed bias, variance, and convergence of estimates using simulations of Weibull-distributed failure times with systematically varied baseline incidence and varying trend. We considered results for quarterly, half-yearly, and yearly incidence estimation frequencies. We applied the methods to assess human immunodeficiency virus (HIV) incidence in HIV-negative patients from the Treatment With Antiretrovirals and Their Impact on Positive and Negative Men (TAIPAN) Study, an Australian study of HIV incidence in men who have sex with men, between 2012 and 2018. The Poisson binomial method had reduced bias and variance at low levels of incidence and for increased estimation frequency, with increased consistency of estimation. Application of methods to real-world assessment of HIV incidence found decreased variance in Poisson binomial model estimates, with observed incidence declining to levels where simulation results had indicated bias in midpoint and random-point methods.
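The three ways of handling the interval between a last negative and a first positive test can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' estimator: times are plain day counts, the function names are invented here, and `period_weights` shows only the uniform allocation of partial incidence to reporting periods, not the Poisson binomial approximation built on top of it.

```python
import random


def midpoint(last_negative, first_positive):
    """Midpoint method: place the infection halfway through the interval."""
    return (last_negative + first_positive) / 2.0


def random_point(last_negative, first_positive, rng):
    """Random-point method: draw the infection time uniformly from the interval."""
    return rng.uniform(last_negative, first_positive)


def period_weights(last_negative, first_positive, period_edges):
    """Share of the interval that falls in each period, i.e. the probability
    that the infection occurred there under the uniform assumption."""
    width = first_positive - last_negative
    weights = []
    for start, end in zip(period_edges[:-1], period_edges[1:]):
        overlap = max(0.0, min(end, first_positive) - max(start, last_negative))
        weights.append(overlap / width)
    return weights


# Hypothetical patient: last negative test on day 60, first positive on day 240.
quarter_edges = [0, 90, 180, 270, 360]  # four 90-day reporting periods
rng = random.Random(0)

print("midpoint day:", midpoint(60, 240))
print("random-point day:", round(random_point(60, 240, rng), 1))
print("partial incidence per quarter:", period_weights(60, 240, quarter_edges))
```

The midpoint and random-point methods each assign the whole event to a single period, whereas the weights spread one event across every period the interval touches.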


Subjects
Epidemiologic Research Design , HIV Infections/epidemiology , Population Surveillance/methods , Sexual and Gender Minorities/statistics & numerical data , Statistics as Topic/methods , Australia/epidemiology , Bias , Computer Simulation , Epidemics , Humans , Incidence , Male , Models, Statistical , Poisson Distribution , Probability
6.
Am J Epidemiol ; 190(8): 1681-1688, 2021 08 01.
Article in English | MEDLINE | ID: mdl-33831172

ABSTRACT

We evaluated whether randomly sampling and testing a set number of individuals for coronavirus disease 2019 (COVID-19) while adjusting for misclassification error captures the true prevalence. We also quantified the impact of misclassification error bias on publicly reported case data in Maryland. Using a stratified random sampling approach, 50,000 individuals were selected from a simulated Maryland population to estimate the prevalence of COVID-19. We examined the situation when the true prevalence is low (0.07%-2%), medium (2%-5%), and high (6%-10%). Bayesian models informed by published validity estimates were used to account for misclassification error when estimating COVID-19 prevalence. Adjustment for misclassification error captured the true prevalence 100% of the time, irrespective of the true prevalence level. When adjustment for misclassification error was not done, the results highly varied depending on the population's underlying true prevalence and the type of diagnostic test used. Generally, the prevalence estimates without adjustment for misclassification error worsened as the true prevalence level increased. Adjustment for misclassification error for publicly reported Maryland data led to a minimal but not significant increase in the estimated average daily cases. Random sampling and testing of COVID-19 are needed with adjustment for misclassification error to improve COVID-19 prevalence estimates.
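To make the idea of adjusting for misclassification concrete, here is a minimal sketch using the Rogan-Gladen correction, a simple frequentist formula. It is a swapped-in technique, not the Bayesian models the study used, and the sensitivity, specificity, and prevalence values below are hypothetical.

```python
def rogan_gladen(apparent_prevalence, sensitivity, specificity):
    """Correct an observed (apparent) prevalence for imperfect test accuracy.

    adjusted = (apparent + specificity - 1) / (sensitivity + specificity - 1)
    The result is clamped to the valid range [0, 1].
    """
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    adjusted = (apparent_prevalence + specificity - 1.0) / denominator
    return min(1.0, max(0.0, adjusted))


# Hypothetical values (not from the study).
true_prevalence = 0.05
sensitivity = 0.80
specificity = 0.98

# Apparent prevalence = true positives + false positives per person tested.
apparent = true_prevalence * sensitivity + (1 - true_prevalence) * (1 - specificity)
recovered = rogan_gladen(apparent, sensitivity, specificity)

print(f"apparent prevalence: {apparent:.4f}")
print(f"adjusted prevalence: {recovered:.4f}")
```

Unlike a Bayesian model, this point correction does not propagate uncertainty in sensitivity and specificity, which is one reason a study might prefer the Bayesian route.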


Subjects
COVID-19 Testing/statistics & numerical data , COVID-19/epidemiology , Decision Support Techniques , Statistics as Topic/methods , Bayes Theorem , COVID-19/classification , Humans , Maryland/epidemiology , Prevalence , SARS-CoV-2 , Selection Bias
7.
Am J Epidemiol ; 190(10): 1993-1999, 2021 10 01.
Article in English | MEDLINE | ID: mdl-33831173

ABSTRACT

Test-negative studies are commonly used to estimate influenza vaccine effectiveness (VE). In a typical study, an "overall VE" estimate based on data from the entire sample may be reported. However, there may be heterogeneity in VE, particularly by age. Therefore, in this article we discuss the potential for a weighted average of age-specific VE estimates to provide a more meaningful measure of overall VE. We illustrate this perspective first using simulations to evaluate how overall VE would be biased when certain age groups are overrepresented. We found that unweighted overall VE estimates tended to be higher than weighted VE estimates when children were overrepresented and lower when elderly persons were overrepresented. Then we extracted published estimates from the US Flu VE network, in which children are overrepresented, and some discrepancy between unweighted and weighted overall VE was observed. Differences in weighted versus unweighted overall VE estimates could translate to substantial differences in the interpretation of individual risk reduction among vaccinated persons and in the total averted disease burden at the population level. Weighting of overall estimates should be considered in VE studies in the future.
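The effect of weighting can be sketched with a small calculation. All numbers below are hypothetical, including the assumption that VE is highest in children. The sample-share average is only a rough stand-in for a pooled "overall VE", which in practice is estimated from the full data rather than by averaging.

```python
def weighted_average(values, weights):
    """Weighted mean of group-specific values; weights are renormalized."""
    total = sum(weights[group] for group in values)
    return sum(values[group] * weights[group] for group in values) / total


# Hypothetical age-specific VE estimates (not from the paper).
ve_by_age = {"children": 0.60, "adults": 0.40, "elderly": 0.20}

# Hypothetical study sample in which children are overrepresented...
sample_share = {"children": 0.50, "adults": 0.30, "elderly": 0.20}
# ...versus hypothetical shares in the target population.
population_share = {"children": 0.20, "adults": 0.60, "elderly": 0.20}

ve_sample = weighted_average(ve_by_age, sample_share)
ve_population = weighted_average(ve_by_age, population_share)

print(f"VE weighted by sample shares:     {ve_sample:.2f}")
print(f"VE weighted by population shares: {ve_population:.2f}")
```

Under these assumed values, overrepresenting the group with the highest VE pulls the sample-based figure above the population-weighted one, which is the direction of bias the abstract reports for children.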


Subjects
Influenza Vaccines/therapeutic use , Influenza, Human/epidemiology , Seroepidemiologic Studies , Statistics as Topic/methods , Vaccination/statistics & numerical data , Adolescent , Adult , Aged , Child , Computer Simulation , Data Interpretation, Statistical , Female , Humans , Influenza A virus , Influenza, Human/prevention & control , Male , Middle Aged , Treatment Outcome , United States/epidemiology , Young Adult
8.
Am J Epidemiol ; 190(6): 1088-1100, 2021 06 01.
Article in English | MEDLINE | ID: mdl-33083822

ABSTRACT

Here we describe methods for assessing heterogeneity of treatment effects over prespecified subgroups in observational studies, using outcome-model-based (g-formula), inverse probability weighting, doubly robust, and matching estimators of subgroup-specific potential outcome means, conditional average treatment effects, and measures of heterogeneity of treatment effects. We compare the finite-sample performance of different estimators in simulation studies where we vary the total sample size, the relative frequency of each subgroup, the magnitude of treatment effect in each subgroup, and the distribution of baseline covariates, for both continuous and binary outcomes. We find that the estimators' bias and variance vary substantially in finite samples, even when there is no unobserved confounding and no model misspecification. As an illustration, we apply the methods to data from the Coronary Artery Surgery Study (August 1975-December 1996) to compare the effect of surgery plus medical therapy with that of medical therapy alone for chronic coronary artery disease in subgroups defined by previous myocardial infarction or left ventricular ejection fraction.


Subjects
Data Interpretation, Statistical , Models, Statistical , Observational Studies as Topic/statistics & numerical data , Outcome Assessment, Health Care/statistics & numerical data , Statistics as Topic/methods , Cardiac Surgical Procedures , Cardiovascular Agents/therapeutic use , Combined Modality Therapy , Computer Simulation , Coronary Artery Disease/therapy , Humans , Observational Studies as Topic/methods , Outcome Assessment, Health Care/methods , Probability , Sample Size , Treatment Outcome
9.
Behav Genet ; 51(3): 215-222, 2021 05.
Article in English | MEDLINE | ID: mdl-33630212

ABSTRACT

Genetic effects on the liability scale are informative for describing the genetic architecture of binary traits, typically diseases. However, most genetic association analyses on binary traits are performed by logistic regression, and there is no straightforward method that transforms both effect size estimate and standard error from the logit scale to the liability scale. Here, we derive a simple linear transformation of the log odds ratio and its standard error for a single nucleotide polymorphism (SNP) to an effect size and standard error on the liability scale. We show by analytic calculations and simulations that this approximation is accurate when the disease is common and the SNP effect is small. We also apply this method to estimate the contribution of a SNP near the RET gene to the variance of Hirschsprung disease liability, and the age-specific contributions of APOE4 on the variance of Alzheimer's disease liability. We discuss the approximate linear inter-relationships between genotype and effect sizes on the observed binary, logit, and liability scales, and the potential applications of the linear approximation to statistical power calculation for binary traits.


Subjects
Statistics as Topic/methods , Genetic Testing , Genotype , Humans , Logistic Models , Models, Genetic , Odds Ratio , Phenotype , Polymorphism, Single Nucleotide/genetics , Sample Size
10.
Behav Genet ; 51(3): 358-373, 2021 05.
Article in English | MEDLINE | ID: mdl-33899139

ABSTRACT

Gene-environment interactions (GxE) play a central role in the theoretical relationship between genetic factors and complex traits. While genome-wide GxE studies of human behaviors remain underutilized, in part due to methodological limitations, existing GxE research in model organisms emphasizes the importance of interpreting genetic associations within environmental contexts. In this paper, we present a framework for conducting an analysis of GxE using raw data from genome-wide association studies (GWAS) and applying the techniques to analyze gene-by-age interactions for alcohol use frequency. To illustrate the effectiveness of this procedure, we calculate genetic marginal effects from a GxE GWAS analysis for an ordinal measure of alcohol use frequency from the UK Biobank dataset, treating the respondent's age as the continuous moderating environment. The genetic marginal effects clarify the interpretation of the GxE associations and provide a direct and clear understanding of how the genetic associations vary across age (the environment). To highlight the advantages of our proposed methods for presenting GxE GWAS results, we compare the interpretation of marginal genetic effects with an interpretation that focuses narrowly on the significance of the interaction coefficients. The results imply that the genetic associations with alcohol use frequency vary considerably across ages, a conclusion that may not be obvious from the raw regression or interaction coefficients. GxE GWAS is less powerful than the standard "main effect" GWAS approach, and therefore requires larger samples to detect significant moderated associations. Fortunately, the necessary sample sizes for a successful application of GxE GWAS can rely on the existing and on-going development of consortia and large-scale population-based studies.


Subjects
Genome-Wide Association Study/methods , Statistics as Topic/methods , Data Analysis , Environment , Gene-Environment Interaction , Genotype , Humans , Models, Genetic , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Quantitative Trait, Heritable
11.
Behav Genet ; 51(3): 331-342, 2021 05.
Article in English | MEDLINE | ID: mdl-33439421

ABSTRACT

There is a long history of fitting biometrical structural-equation models (SEMs) in the pregenomic behavioral-genetics literature of twin, family, and adoption studies. Recently, a method has emerged for estimating biometrical variance-covariance components based not upon the expected degree of genetic resemblance among relatives, but upon the observed degree of genetic resemblance among unrelated individuals for whom genome-wide genotypes are available: genomic-relatedness-matrix restricted maximum-likelihood (GREML). However, most existing GREML software is concerned with quickly and efficiently estimating heritability coefficients, genetic correlations, and so on, rather than with allowing the user to fit SEMs to multitrait samples of genotyped participants. We therefore introduce a feature in the OpenMx package, "mxGREML", designed to fit the biometrical SEMs from the pregenomic era in present-day genomic study designs. We explain the additional functionality this new feature has brought to OpenMx, and how the new functionality works. We provide an illustrative example of its use. We discuss the feature's current limitations, and our plans for its further development.


Subjects
Statistics as Topic/methods , Twins/genetics , Analysis of Variance , Biometry/methods , Genome-Wide Association Study/methods , Genomics , Genotype , Likelihood Functions , Models, Genetic , Models, Theoretical , Phenotype , Polymorphism, Single Nucleotide/genetics , Software
12.
Behav Genet ; 51(3): 343-357, 2021 05.
Article in English | MEDLINE | ID: mdl-33604756

ABSTRACT

Most genome-wide association study (GWAS) analyses test the association between single-nucleotide polymorphisms (SNPs) and a single trait or outcome. While valuable second-step analyses of these associations (e.g., calculating genetic correlations between traits) are common, single-step multivariate analyses of GWAS data are rarely performed. This is unfortunate because multivariate analyses can reveal information which is irrevocably obscured in multi-step analysis. One simple example is the distinction between variance common to a set of measures, and variance specific to each. Neither GWAS of sum- or factor-scores, nor GWAS of the individual measures will deliver a clean picture of loci associated with each measure's specific variance. While multivariate GWAS opens up a broad new landscape of feasible and informative analyses, its adoption has been slow, likely due to the heavy computational demands and difficulties specifying models it requires. Here we describe GW-SEM 2.0, which is designed to simplify model specification and overcome the inherent computational challenges associated with multivariate GWAS. In addition, GW-SEM 2.0 allows users to accurately model ordinal items, which are common in behavioral and psychological research, within a GWAS context. This new release enhances computational efficiency, allows users to select the fit function that is appropriate for their analyses, expands compatibility with standard genomic data formats, and outputs results for seamless reading into other standard post-GWAS processing software. To demonstrate GW-SEM's utility, we conducted (1) a series of GWAS using three substance use frequency items from data in the UK Biobank, (2) a timing study for several predefined GWAS functions, and (3) a Type I Error rate study. 
Our multivariate GWAS analyses emphasize the utility of GW-SEM for identifying novel patterns of associations that vary considerably between genomic loci for specific substances, highlighting the importance of differentiating between substance-specific use behaviors and polysubstance use. The timing studies demonstrate that the analyses take a reasonable amount of time and show the cost of including additional items. The Type I Error rate study demonstrates that hypothesis tests for genetic associations with latent variable models follow the hypothesized uniform distribution. Taken together, we suggest that GW-SEM may provide substantially deeper insights into the underlying genomic architecture for multivariate behavioral and psychological systems than is currently possible with standard GWAS methods. The current release of GW-SEM 2.0 is available on CRAN (stable release) and GitHub (beta release), and tutorials are available on our github wiki ( https://jpritikin.github.io/gwsem/ ).


Subjects
Analysis of Variance , Genome-Wide Association Study/methods , Statistics as Topic/methods , Genomics/methods , Humans , Models, Genetic , Models, Theoretical , Multivariate Analysis , Phenotype , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Software
13.
Behav Genet ; 51(3): 319-330, 2021 05.
Article in English | MEDLINE | ID: mdl-33638732

ABSTRACT

The classical twin model can be reparametrized as an equivalent multilevel model. The multilevel parameterization has underexplored advantages, such as the possibility to include higher-level clustering variables in which lower levels are nested. When this higher-level clustering is not modeled, its variance is captured by the common environmental variance component. In this paper we illustrate the application of a 3-level multilevel model to twin data by analyzing the regional clustering of 7-year-old children's height in the Netherlands. Our findings show that 1.8% of the phenotypic variance in children's height is attributable to regional clustering, which is 7% of the variance explained by between-family or common environmental components. Since regional clustering may represent ancestry, we also investigate the effect of region after correcting for genetic principal components, in a subsample of participants with genome-wide SNP data. After correction, region no longer explained variation in height. Our results suggest that the phenotypic variance explained by region might represent ancestry effects on height.
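The two percentages in the abstract together imply a rough size for the common environmental component. This is a back-of-the-envelope check rather than a figure reported in the abstract. It assumes both percentages refer to the same total phenotypic variance, and $c^2$ is notation introduced here for the share explained by between-family or common environmental components:

```latex
\[
0.07 \times c^2 \approx 0.018
\quad\Longrightarrow\quad
c^2 \approx \frac{0.018}{0.07} \approx 0.26
\]
```

On that reading, roughly a quarter of the phenotypic variance in height would sit in the between-family or common environmental component, of which the regional share is a small part.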


Subjects
Body Height/genetics , Multilevel Analysis/methods , Statistics as Topic/methods , Child , Cluster Analysis , Female , Genetics, Behavioral/methods , Genetics, Behavioral/trends , Genome-Wide Association Study/methods , Genotype , Humans , Male , Models, Genetic , Netherlands , Phenotype , Polymorphism, Single Nucleotide/genetics , Twins/genetics
14.
Behav Genet ; 51(3): 301-318, 2021 05.
Article in English | MEDLINE | ID: mdl-33609197

ABSTRACT

For more than a decade, it has been known that many common behavior genetics models for a single phenotype can be estimated as multilevel models (e.g., van den Oord 2001; Guo and Wang 2002; McArdle and Prescott 2005; Rabe-Hesketh et al. 2007). This paper extends the current knowledge to (1) multiple phenotypes such that the method is completely general to the variance structure hypothesized, and (2) both higher and lower levels of nesting. The multi-phenotype method also allows extended relationships to be considered (see also, Bard et al. 2012; Hadfield and Nakagawa 2010). The extended relationship model can then be continuously expanded to merge with the case typically seen in the molecular genetics analyses of unrelated individuals (e.g., Yang et al. 2011). We use the multilevel form of behavior genetics models to fit a multivariate three level model that allows for (1) child level variation from unique environments and additive genetics, (2) family level variation from additive genetics and common environments, and (3) neighborhood level variation from broader geographic contexts. Finally, we provide R (R Development Core Team 2020) functions and code for multilevel specification of several common behavior genetics models using OpenMx (Neale et al. 2016).


Subjects
Genetics, Behavioral/methods , Multilevel Analysis/methods , Statistics as Topic/methods , Environment , Gene-Environment Interaction , Genetics, Behavioral/trends , Genotype , Humans , Models, Genetic , Models, Theoretical , Phenotype , Software , Twins/genetics
15.
Behav Genet ; 51(3): 223-236, 2021 05.
Article in English | MEDLINE | ID: mdl-33582897

ABSTRACT

The Classical Twin Method (CTM) compares the similarity of monozygotic (MZ) twins with that of dizygotic (DZ) twins to make inferences about the relative importance of genes and environment in the etiology of individual differences. The design has been applied to thousands of traits across the biomedical, behavioral and social sciences and is arguably the most widely used natural experiment known to science. The fundamental assumption of the CTM is that trait-relevant environmental covariation within MZ pairs is the same as that found within DZ pairs, so that zygosity differences in within-pair variance must be due to genetic factors uncontaminated by the environment. This equal environments assumption (EEA) has been, and still is, hotly contested, and has been mentioned as a possible contributing factor to the missing heritability conundrum. In this manuscript, we introduce a new model for testing the EEA, which we call the Augmented Classical Twin Design, which uses identity by descent (IBD) sharing between DZ twin pairs to estimate separate environmental variance components for MZ and DZ twin pairs, and provides a test of whether these are equal. We show through simulation that given large samples of DZ twin pairs, the model provides unbiased estimates of variance components and valid tests of the EEA under strong assumptions (e.g., no epistatic variance, IBD sharing in DZ twins estimated accurately, etc.) which may not hold in reality. Sample sizes in excess of 50,000 DZ twin pairs with genome-wide genetic data are likely to be required in order to detect substantial violations of the EEA with moderate power. Consequently, we recommend that the Augmented Classical Twin Design only be applied to datasets with very large numbers of DZ twin pairs (> 50,000 DZ twin pairs), and given the strong assumptions relating to the absence of epistatic variance, appropriate caution be exercised regarding interpretation of the results.
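For orientation, the basic MZ-versus-DZ comparison behind the CTM is often summarized with Falconer's textbook approximations. The sketch below shows only those approximations; it is not the Augmented Classical Twin Design introduced in this paper. It assumes purely additive genetic effects and the EEA, and the twin correlations are hypothetical.

```python
def falconer_estimates(r_mz, r_dz):
    """Falconer's approximations from MZ and DZ twin correlations.

    A: additive genetic share          = 2 * (r_mz - r_dz)
    C: common (shared) environment     = 2 * r_dz - r_mz
    E: unique environment plus error   = 1 - r_mz
    """
    return {
        "A": 2.0 * (r_mz - r_dz),
        "C": 2.0 * r_dz - r_mz,
        "E": 1.0 - r_mz,
    }


# Hypothetical twin correlations (not from the paper).
estimates = falconer_estimates(r_mz=0.80, r_dz=0.50)
for component, share in estimates.items():
    print(f"{component}: {share:.2f}")
```

If the EEA fails, for example because MZ pairs share more trait-relevant environment than DZ pairs, the excess MZ similarity is counted as genetic and A is inflated, which is why a direct test of the assumption is of interest.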


Subjects
Diseases in Twins/genetics , Genome-Wide Association Study/methods , Statistics as Topic/methods , Computer Simulation , Environment , Gene-Environment Interaction , Genotype , Humans , Models, Genetic , Models, Theoretical , Phenotype , Risk Factors , Social Environment , Twins/genetics , Twins, Dizygotic/genetics , Twins, Monozygotic/genetics
16.
Behav Genet ; 51(3): 204-214, 2021 05.
Article in English | MEDLINE | ID: mdl-33400061

ABSTRACT

The measurement of many human traits, states, and disorders begins with a set of items on a questionnaire. The response format for these questions is often simply binary (e.g., yes/no) or ordered (e.g., high, medium, or low). During data analysis, these items are frequently summed or used to estimate factor scores. In clinical applications, such assessments are often non-normally distributed in the general population because many respondents are unaffected and therefore asymptomatic. As a result, in many cases these measures violate the statistical assumptions required for subsequent analyses. To reduce the influence of non-normality and quasi-continuous assessment, variables are frequently recoded into binary (affected-unaffected) or ordinal (mild-moderate-severe) diagnoses. Ordinal data therefore present challenges at multiple levels of analysis. Categorizing continuous variables into ordered categories typically results in a loss of statistical power, which gives the data analyst an incentive to assume that the data are normally distributed, even when they are not. Despite earlier conventions suggesting that, e.g., variables with more than 10 ordered categories may be regarded as continuous and analyzed as if they were, we show via simulation studies that this is not generally the case. In particular, using Pearson product-moment correlations instead of maximum likelihood estimates of polychoric correlations biases the estimated correlations towards zero. This bias is especially severe when a plurality of the observations fall into a single observed category, such as a score of zero. By contrast, estimating the ordinal correlation by maximum likelihood yields no estimation bias, although standard errors are (appropriately) larger. We also illustrate how odds ratios depend critically on the proportion or prevalence of affected individuals in the population, and therefore are sub-optimal for studies where comparisons of association metrics are needed. Finally, we extend these analyses to the classical twin model and demonstrate that treating binary data as continuous will underestimate genetic and common environmental variance components, and overestimate unique environment (residual) variance. These biases increase as prevalence declines. While modeling ordinal data appropriately may be more computationally intensive and time consuming, failing to do so will likely yield biased correlations and biased parameter estimates from models fitted to them.
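
The attenuation described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' code. It assumes a bivariate normal liability with a latent correlation of 0.5, dichotomizes both variables at a common prevalence, and computes the Pearson correlation on the resulting binary items. The function name, sample size, seed, and prevalence values are all hypothetical choices made for illustration.

```python
import numpy as np

def binarized_pearson(rho, prevalence, n=200_000, seed=0):
    """Pearson correlation of two binary items cut from a bivariate normal liability.

    Illustrative assumption: both items share the same prevalence and the
    latent liabilities are standard bivariate normal with correlation `rho`.
    """
    rng = np.random.default_rng(seed)
    cov = [[1.0, rho], [rho, 1.0]]
    liability = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    # Empirical thresholds so that about `prevalence` of each column is "affected".
    threshold = np.quantile(liability, 1.0 - prevalence, axis=0)
    affected = (liability > threshold).astype(float)
    return np.corrcoef(affected[:, 0], affected[:, 1])[0, 1]

latent_rho = 0.5
results = {p: binarized_pearson(latent_rho, p) for p in (0.5, 0.2, 0.05)}
for p, r in results.items():
    print(f"prevalence={p:.2f}  Pearson r on binary items={r:.3f}  latent rho={latent_rho}")
```

Under these assumptions the Pearson correlation on the binary items falls below the latent correlation and shrinks further as prevalence declines, which is the direction of bias the abstract reports. Recovering the latent value would require a tetrachoric or polychoric estimator, which this sketch does not implement.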


Subjects
Data Analysis , Statistics as Topic/methods , Statistics as Topic/trends , Bias , Computer Simulation , Humans , Likelihood Functions , Statistical Models , Odds Ratio , Practice Guidelines as Topic
17.
Behav Genet ; 51(3): 264-278, 2021 05.
Article in English | MEDLINE | ID: mdl-33387133

ABSTRACT

Offspring resemble their parents for both genetic and environmental reasons. Understanding the relative magnitude of these alternatives has long been a core interest in behavioral genetics research, but traditional designs, which compare phenotypic covariances to make inferences about unmeasured genetic and environmental factors, have struggled to disentangle them. Recently, Kong et al. (2018) showed that by correlating offspring phenotypic values with the measured polygenic score of parents' nontransmitted alleles, one can estimate the effect of "genetic nurture", a type of passive gene-environment covariation that arises when heritable parental traits directly influence offspring traits. Here, we instantiate this basic idea in a set of causal models that provide novel insights into the estimation of parental influences on offspring. Most importantly, we show how jointly modeling the parental polygenic scores and the offspring phenotypes can provide an unbiased estimate of the variation attributable to the environmental influence of parents on offspring, even when the polygenic score accounts for a small fraction of trait heritability. This model can be further extended to (a) account for the influence of different types of assortative mating, (b) estimate the total variation due to additive genetic effects and their covariance with the familial environment (i.e., the full genetic nurture effect), and (c) model situations where a parental trait influences a different offspring trait. By utilizing structural equation modeling techniques developed for extended twin family designs, our approach provides a general framework for modeling polygenic scores in family studies and allows for various model extensions that can be used to answer old questions about familial influences in new ways.
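
The core idea attributed to Kong et al. (2018) in this abstract, that the nontransmitted parental polygenic score isolates genetic nurture, can be sketched with a toy regression. This is not the structural equation model of the paper. It assumes random mating, independent transmitted and nontransmitted scores, and a nurture effect that acts through the summed parental polygenic score. The effect sizes `delta` and `eta`, the sample size, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
delta, eta = 0.4, 0.2  # hypothetical direct genetic effect and genetic nurture effect

# Per-family scores, each summed over mother and father (assumed independent).
t = rng.normal(size=(n, 2)).sum(axis=1)    # transmitted alleles
nt = rng.normal(size=(n, 2)).sum(axis=1)   # nontransmitted alleles

# Assumption: nurture acts through the full parental score (t + nt),
# while the direct effect acts only through the transmitted alleles.
parental_pgs = t + nt
offspring = delta * t + eta * parental_pgs + rng.normal(size=n)

# Ordinary least squares of the offspring phenotype on both scores.
X = np.column_stack([np.ones(n), t, nt])
beta, *_ = np.linalg.lstsq(X, offspring, rcond=None)
b_t, b_nt = beta[1], beta[2]

print(f"nontransmitted coefficient (nurture estimate): {b_nt:.3f}")
print(f"transmitted minus nontransmitted (direct estimate): {b_t - b_nt:.3f}")
```

In this simplified setting the coefficient on the nontransmitted score estimates `eta`, and the difference between the two coefficients estimates `delta`. Assortative mating, measurement error in the polygenic score, and vertical transmission through the parental phenotype, which the abstract's models address, are deliberately left out.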


Subjects
Maternal Inheritance/genetics , Paternal Inheritance/genetics , Statistics as Topic/methods , Alleles , Gene-Environment Interaction , Genotype , Humans , Genetic Models , Theoretical Models , Multifactorial Inheritance/genetics , Parent-Child Relations , Parents/psychology , Phenotype , Twins/genetics
18.
Behav Genet ; 51(3): 279-288, 2021 05.
Article in English | MEDLINE | ID: mdl-33301082

ABSTRACT

In a companion paper (Balbona et al., Behav Genet, in press), we introduced a series of causal models that use polygenic scores from transmitted and nontransmitted alleles, the offspring trait, and parental traits to estimate the variation due to the environmental influences the parental trait has on the offspring trait (vertical transmission) as well as additive genetic effects. These models also estimate and account for the gene-gene and gene-environment covariation that arises from assortative mating and vertical transmission, respectively. In the current study, we simulated polygenic scores and phenotypes of parents and offspring under genetic and vertical transmission scenarios, assuming two types of assortative mating. We instantiated the models from our companion paper in the OpenMx software, and compared the true values of parameters to maximum likelihood estimates from models fitted on the simulated data to quantify the bias and precision of estimates. We show that parameter estimates from these models are unbiased when assumptions are met, but as expected, they are biased to the degree that assumptions are unmet. Standard errors of the estimated variances due to vertical transmission and to genetic effects decrease with increasing sample sizes and with increasing [Formula: see text] values of the polygenic score. Even when the polygenic score explains a modest amount of trait variation ([Formula: see text]), standard errors of these standardized estimates are reasonable ([Formula: see text]) for [Formula: see text] trios, and can even be reasonable for smaller sample sizes (e.g., down to 4K) when the polygenic score is more predictive. These causal models offer a novel approach for understanding how parents influence their offspring, but their use requires polygenic scores on relevant traits that are modestly predictive (e.g., [Formula: see text]) as well as datasets with genomic and phenotypic information on parents and offspring. The utility of polygenic scores for elucidating parental influences should thus serve as additional motivation for large genomic biobanks to perform GWASs on traits that may be relevant to parenting and to oversample close relatives, particularly parents and offspring.


Subjects
Maternal Inheritance/genetics , Paternal Inheritance/genetics , Statistics as Topic/methods , Alleles , Bias , Gene-Environment Interaction , Genome-Wide Association Study , Genomics , Genotype , Humans , Likelihood Functions , Genetic Models , Theoretical Models , Multifactorial Inheritance/genetics , Parent-Child Relations , Parenting , Phenotype , Twins/genetics
19.
Behav Genet ; 51(3): 250-263, 2021 05.
Article in English | MEDLINE | ID: mdl-33259025

ABSTRACT

We present a procedure to simultaneously fit a genetic covariance structure model and a regression model to multivariate data from mono- and dizygotic twin pairs to test for the prediction of a dependent trait by multiple correlated predictors. We applied the model to aggressive behavior as an outcome trait and investigated the prediction of aggression from inattention (InA) and hyperactivity (HA) in two age groups. Predictions were examined in twins with an average age of 10 years (11,345 pairs), and in adult twins with an average age of 30 years (7433 pairs). All phenotypes were assessed by the same, but age-appropriate, instruments in children and adults. Because of the different genetic architecture of aggression, InA and HA, a model was fitted to these data that specified additive and non-additive genetic factors (A and D) plus common and unique environmental (C and E) influences. Given appropriate identifying constraints, this ADCE model is identified in trivariate data. We obtained different results for the prediction of aggression in children, where HA was the more important predictor, and in adults, where InA was the more important predictor. In children, about 36% of the total aggression variance was explained by the genetic and environmental components of HA and InA. Most of this was explained by the genetic components of HA and InA, i.e., 29.7%, with 22.6% due to the genetic component of HA. In adults, about 21% of the aggression variance was explained. Most of this was again explained by the genetic components of InA and HA (16.2%), with 8.6% due to the genetic component of InA.


Subjects
Aggression/psychology , Attention Deficit Disorder with Hyperactivity/genetics , Statistics as Topic/methods , Adult , Analysis of Variance , Child , Diseases in Twins/genetics , Humans , Mental Disorders/genetics , Genetic Models , Statistical Models , Netherlands , Phenotype , Regression Analysis , Twins/genetics , Dizygotic Twins/genetics , Monozygotic Twins/genetics
20.
Behav Genet ; 51(3): 289-300, 2021 05.
Article in English | MEDLINE | ID: mdl-33454873

ABSTRACT

Disaggregation and estimation of genetic effects from offspring and parents has long been of interest to statistical geneticists. Recently, technical and methodological advances have made the genome-wide and locus-specific estimation of direct offspring and parental genetic nurture effects more feasible. However, unbiased estimation using these methods requires datasets where both parents and at least one child have been genotyped, which are relatively scarce. Our group has recently developed a method and accompanying software (IMPISH; Hwang et al. in PLoS Genet 16:e1009154, 2020) which is able to impute missing parental genotypes from observed data on sibships and estimate their effects on an offspring phenotype conditional on the effects of genetic transmission. However, this method is unable to disentangle maternal and paternal effects, which may differ in magnitude and direction. Here, we introduce an extension to the original IMPISH routine which takes advantage of all available nuclear families to impute parent-specific missing genotypes and obtain asymptotically unbiased estimates of genetic effects on offspring phenotypes. We apply this method to data from related individuals in the UK Biobank, showing concordance with previous estimates of maternal genetic effects on offspring birthweight. We also conduct the first GWAS jointly estimating offspring-, maternal-, and paternal-specific genetic effects on body-mass index.


Subjects
Maternal Inheritance/genetics , Paternal Inheritance/genetics , Statistics as Topic/methods , Alleles , Birth Weight/genetics , Body Mass Index , Family , Gene-Environment Interaction , Genome-Wide Association Study , Genomics , Genotype , Humans , Likelihood Functions , Genetic Models , Theoretical Models , Parents , Phenotype , Siblings , Software