ABSTRACT
BACKGROUND: Heritability and genetic correlation can be estimated from genome-wide single-nucleotide polymorphism (SNP) data using various methods. We recently developed multivariate genomic-relatedness-based restricted maximum likelihood (MGREML) for statistically and computationally efficient estimation of SNP-based heritability ([Formula: see text]) and genetic correlation ([Formula: see text]) across many traits in large datasets. Here, we extend MGREML by allowing it to fit and perform tests on user-specified factor models, while preserving the low computational complexity. RESULTS: Using simulations, we show that MGREML yields consistent estimates and valid inferences for such factor models at low computational cost (e.g., for data on 50 traits and 20,000 individuals, a saturated model involving 50 [Formula: see text]'s, 1225 [Formula: see text]'s, and 50 fixed effects is estimated and compared to a restricted model in less than one hour on a single notebook with two 2.7 GHz cores and 16 GB of RAM). Using repeated measures of height and body mass index from the US Health and Retirement Study, we illustrate the ability of MGREML to estimate a factor model and test whether it fits the data better than a nested model. The MGREML tool, the simulation code, and an extensive tutorial are freely available at https://github.com/devlaming/mgreml/ . CONCLUSION: MGREML can now be used to estimate multivariate factor structures and perform inferences on such factor models at low computational cost. This new feature enables simple structural equation modeling using MGREML, allowing researchers to specify, estimate, and compare genetic factor models of their choosing using SNP data.
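The parameter counts quoted in this abstract follow from simple combinatorics: a saturated model on T traits has one SNP-based heritability per trait and one genetic correlation per unordered pair of traits. The sketch below only reproduces that arithmetic; the function name is hypothetical and is not part of the MGREML tool.

```python
def saturated_counts(n_traits: int) -> dict:
    """Count heritabilities and genetic correlations in a saturated model.

    Illustrative helper (hypothetical name, not an MGREML function):
    one heritability per trait, one genetic correlation per trait pair.
    """
    if n_traits < 1:
        raise ValueError("n_traits must be a positive integer")
    return {
        "heritabilities": n_traits,
        # number of unordered trait pairs: T * (T - 1) / 2
        "genetic_correlations": n_traits * (n_traits - 1) // 2,
    }


counts = saturated_counts(50)
print(counts["heritabilities"], counts["genetic_correlations"])
```

For 50 traits this gives 50 heritabilities and 50 × 49 / 2 = 1225 genetic correlations, matching the figures in the abstract.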
Subject(s)
Genomics, Multifactorial Inheritance, Genome, Genome-Wide Association Study, Genomics/methods, Humans, Genetic Models, Phenotype, Single Nucleotide Polymorphism
ABSTRACT
Large-scale genome-wide association results are typically obtained from a fixed-effects meta-analysis of GWAS summary statistics from multiple studies spanning different regions and/or time periods. This approach averages the estimated effects of genetic variants across studies. If genetic effects are heterogeneous across studies, the statistical power of a GWAS and the predictive accuracy of polygenic scores are attenuated, contributing to the so-called 'missing heritability'. Here, we describe the online Meta-GWAS Accuracy and Power (MetaGAP) calculator (available at www.devlaming.eu) which quantifies this attenuation based on a novel multi-study framework. By means of simulation studies, we show that under a wide range of genetic architectures, the statistical power and predictive accuracy provided by this calculator are accurate. We compare the predictions from the MetaGAP calculator with actual results obtained in the GWAS literature. Specifically, we use genomic-relatedness-matrix restricted maximum likelihood to estimate the SNP heritability and cross-study genetic correlation of height, BMI, years of education, and self-rated health in three large samples. These estimates are used as input parameters for the MetaGAP calculator. Results from the calculator suggest that cross-study heterogeneity has led to attenuation of statistical power and predictive accuracy in recent large-scale GWAS efforts on these traits (e.g., for years of education, we estimate a relative loss of 51-62% in the number of genome-wide significant loci and a relative loss in polygenic score R² of 36-38%). Hence, cross-study heterogeneity contributes to the missing heritability.
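To make the "averaging across studies" step concrete, here is a minimal sketch of a fixed-effects meta-analysis for a single variant. It assumes inverse-variance weights, which is the usual choice but is not stated in the abstract; the function name and the toy numbers are illustrative and have nothing to do with the MetaGAP calculator itself.

```python
import math


def fixed_effects_meta(betas, standard_errors):
    """Pool per-study effect estimates for one variant.

    Assumption: inverse-variance weighting, w_i = 1 / se_i**2.
    Returns the pooled effect and its standard error.
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need one standard error per effect estimate")
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se


# Toy example: two equally precise studies with different effect estimates.
beta, se = fixed_effects_meta([0.1, 0.3], [0.1, 0.1])
print(round(beta, 4), round(se, 4))
```

With equal standard errors the pooled estimate is the plain average of the study estimates. If the true effects differ between studies, that average can sit below the effect in any single study, which is the attenuation the abstract describes.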
Subject(s)
Data Accuracy, Genome-Wide Association Study/standards, Software, Genome-Wide Association Study/methods, Humans, Meta-Analysis as Topic
ABSTRACT
In psychophysiology, an interesting question is how to estimate the reliability of event-related potentials collected by means of the Eriksen Flanker Task or similar tests. A special problem presents itself if the data represent neurological reactions that are associated with some responses (in case of the Flanker Task, responding incorrectly on a trial) but not others (like when providing a correct response), inherently resulting in unequal numbers of observations per subject. The general trend in reliability research here is to use generalizability theory and Bayesian estimation. We show that a new approach based on classical test theory and frequentist estimation can do the job as well and in a simpler way, and even provides additional insight to matters that were unsolved in the generalizability method approach. One of our contributions is the definition of a single, overall reliability coefficient for an entire group of subjects with unequal numbers of observations. Both methods have slightly different objectives. We argue in favor of the classical approach but without rejecting the generalizability approach.
ABSTRACT
Genome-wide association studies are characterized by a huge number of statistical tests performed to discover new disease-related genetic variants [in the form of single-nucleotide polymorphisms (SNPs)] in human DNA. Many SNPs have been identified for cross-sectionally measured phenotypes. However, there is a growing interest in genetic determinants of the evolution of traits over time. Dealing with correlated observations from the same individual, we need to apply advanced statistical techniques. The linear mixed model is popular but also much more computationally demanding than fitting a linear regression model to independent observations. We propose a conditional two-step approach as an approximate method to explore the longitudinal relationship between the trait and the SNP. In a simulation study, we compare several fast methods with respect to their accuracy and speed. The conditional two-step approach is applied to relate SNPs to longitudinal bone mineral density responses collected in the Rotterdam Study.
Subject(s)
Genome-Wide Association Study/methods, Linear Models, Genetic Models, Aged, Bone Density/genetics, Computer Simulation, Humans, Longitudinal Studies, Middle Aged, Osteoporosis/genetics, Single Nucleotide Polymorphism, Heritable Quantitative Trait
ABSTRACT
Risk behavior has substantial consequences for health, well-being, and general behavior. The association between real-world risk behavior and risk behavior on experimental tasks is well documented, but their modeling is challenging for several reasons. First, many experimental risk tasks may end prematurely leading to censored observations. Second, certain outcome values can be more attractive than others. Third, a priori unknown groups of participants can react differently to certain risk-levels. Here, we propose the censored mixture model which models risk taking while dealing with censoring, attractiveness to certain outcomes, and unobserved individual risk preferences, next to experimental conditions.
Subject(s)
Risk-Taking, Humans, Psychometrics
ABSTRACT
Human variation in brain morphology and behavior are related and highly heritable. Yet, it is largely unknown to what extent specific features of brain morphology and behavior are genetically related. Here, we introduce a computationally efficient approach for multivariate genomic-relatedness-based restricted maximum likelihood (MGREML) to estimate the genetic correlation between a large number of phenotypes simultaneously. Using individual-level data (N = 20,190) from the UK Biobank, we provide estimates of the heritability of gray-matter volume in 74 regions of interest (ROIs) in the brain and we map genetic correlations between these ROIs and health-relevant behavioral outcomes, including intelligence. We find four genetically distinct clusters in the brain that are aligned with standard anatomical subdivision in neuroscience. Behavioral traits have distinct genetic correlations with brain morphology which suggests trait-specific relevance of ROIs. These empirical results illustrate how MGREML can be used to estimate internally consistent and high-dimensional genetic correlation matrices in large datasets.
Subject(s)
Behavior, Brain/anatomy & histology, Cerebral Cortex, Female, Human Genome, Humans, Male, Genetic Models, Multivariate Analysis
ABSTRACT
Humans vary substantially in their willingness to take risks. In a combined sample of over 1 million individuals, we conducted genome-wide association studies (GWAS) of general risk tolerance, adventurousness, and risky behaviors in the driving, drinking, smoking, and sexual domains. Across all GWAS, we identified hundreds of associated loci, including 99 loci associated with general risk tolerance. We report evidence of substantial shared genetic influences across risk tolerance and the risky behaviors: 46 of the 99 general risk tolerance loci contain a lead SNP for at least one of our other GWAS, and general risk tolerance is genetically correlated ([Formula: see text] ~ 0.25 to 0.50) with a range of risky behaviors. Bioinformatics analyses imply that genes near SNPs associated with general risk tolerance are highly expressed in brain tissues and point to a role for glutamatergic and GABAergic neurotransmission. We found no evidence of enrichment for genes previously hypothesized to relate to risk tolerance.
Subject(s)
Behavior/physiology, Genetic Loci/genetics, Genetic Predisposition to Disease/genetics, Case-Control Studies, Female, Behavioral Genetics/methods, Genome-Wide Association Study/methods, Genotype, Humans, Male, Single Nucleotide Polymorphism/genetics
ABSTRACT
Values are central to public debates today. Human values convey broad goals that serve as guiding principles in a person's life and value priorities differ across people in society. Groups in society holding opposing values (e.g., universalism versus security) will make different choices when voting in an election. Whereas over time, values are relatively stable, the number and type of political parties as well as the political values they communicate and disseminate have been changing. Groups of people holding the same human values may therefore vote for another (new) party in a later election. We focus on analyzing the relationship between human values and voting in elections, introducing a new methodology to analyze how value profiles relate to political support over time. We investigate the Dutch multi-party political system over five waves of the European Social Survey, spanning 2002 until 2010. Whilst previous research has focused on individual values separately and focused on voters only, we (1) distinguish groups holding a similar set of opposing and compatible values (value profile) instead of focusing on single values in the entire population; (2) incorporate a correction for differences in scale use in our model; (3) compare voting over time; (4) include non-voters, a growing group in Dutch society. We find evidence that specific value profiles are related to voting for a specific set of political parties. We also find that specific value profiles distinguish non-voters from voters and that voters for populist parties resemble non-voters.
Subject(s)
Competitive Behavior, Politics, Humans, Theoretical Models, Netherlands
ABSTRACT
Genome-wide association studies (GWAS) with longitudinal phenotypes provide opportunities to identify genetic variations associated with changes in human traits over time. Mixed models are used to correct for the correlated nature of longitudinal data. GWA studies are notorious for their computational challenges, which are considerable when mixed models for thousands of individuals are fitted to millions of SNPs. We present a new algorithm that speeds up a genome-wide analysis of longitudinal data by several orders of magnitude. It solves the equivalent penalized least squares problem efficiently, computing variances in an initial step. Factorizations and transformations are used to avoid inversion of large matrices. Because the system of equations is bordered, we can re-use components, which can be precomputed for the mixed model without a SNP. Two SNP effects (main and its interaction with time) are obtained. Our method completes the analysis a thousand times faster than the R package lme4, providing an almost identical solution for the coefficients and p-values. We provide an R implementation of our algorithm.
Subject(s)
Algorithms, Genome-Wide Association Study/methods, Genetic Models, Computer Simulation, Cross-Sectional Studies, Data Accuracy, Humans, Least-Squares Analysis, Linear Models, Longitudinal Studies, Phenotype, Single Nucleotide Polymorphism, Software
ABSTRACT
The authors provide a didactic treatment of nonlinear (categorical) principal components analysis (PCA). This method is the nonlinear equivalent of standard PCA and reduces the observed variables to a number of uncorrelated principal components. The most important advantages of nonlinear over linear PCA are that it incorporates nominal and ordinal variables and that it can handle and discover nonlinear relationships between variables. Also, nonlinear PCA can deal with variables at their appropriate measurement level; for example, it can treat Likert-type scales ordinally instead of numerically. Every observed value of a variable can be referred to as a category. While performing PCA, nonlinear PCA converts every category to a numeric value, in accordance with the variable's analysis level, using optimal quantification. The authors discuss how optimal quantification is carried out, what analysis levels are, which decisions have to be made when applying nonlinear PCA, and how the results can be interpreted. The strengths and limitations of the method are discussed. An example applying nonlinear PCA to empirical data using the program CATPCA (J. J. Meulman, W. J. Heiser, & SPSS, 2004) is provided.
Subject(s)
Psychological Models, Child, Humans
ABSTRACT
Principal components analysis (PCA) is used to explore the structure of data sets containing linearly related numeric variables. Alternatively, nonlinear PCA can handle possibly nonlinearly related numeric as well as nonnumeric variables. For linear PCA, the stability of its solution can be established under the assumption of multivariate normality. For nonlinear PCA, however, standard options for establishing stability are not provided. The authors use the nonparametric bootstrap procedure to assess the stability of nonlinear PCA results, applied to empirical data. They use confidence intervals for the variable transformations and confidence ellipses for the eigenvalues, the component loadings, and the person scores. They discuss the balanced version of the bootstrap, bias estimation, and Procrustes rotation. To provide a benchmark, the same bootstrap procedure is applied to linear PCA on the same data. On the basis of the results, the authors advise using at least 1,000 bootstrap samples, using Procrustes rotation on the bootstrap results, examining the bootstrap distributions along with the confidence regions, and merging categories with small marginal frequencies to reduce the variance of the bootstrap results.
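As a rough illustration of the nonparametric bootstrap idea, the sketch below resamples persons with replacement and recomputes the first eigenvalue of a two-variable linear PCA, which for a 2 × 2 correlation matrix equals 1 + |r|. This is a deliberately simplified stand-in: it uses linear rather than nonlinear PCA, a percentile interval rather than confidence ellipses, and omits the balanced bootstrap and Procrustes rotation discussed in the abstract. All names and the toy data are hypothetical.

```python
import random


def first_eigenvalue(xs, ys):
    """First eigenvalue of the 2x2 correlation matrix, i.e. 1 + |r|.

    Returns None when a variable is constant (correlation undefined).
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return 1.0 + abs(sxy / (sxx * syy) ** 0.5)


def bootstrap_ci(xs, ys, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the first eigenvalue."""
    rng = random.Random(seed)
    n = len(xs)
    stats = []
    for _ in range(n_boot):
        # Resample persons (rows) with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        value = first_eigenvalue([xs[i] for i in idx], [ys[i] for i in idx])
        if value is not None:  # skip degenerate resamples
            stats.append(value)
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return lower, upper


# Toy data: two positively related variables (made up for illustration).
toy_x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
toy_y = [2, 1, 4, 3, 6, 5, 9, 7, 8, 12, 10, 11]
print(bootstrap_ci(toy_x, toy_y))
```

The default of 1,000 resamples mirrors the minimum the authors advise; the fixed seed only makes the toy run repeatable.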
Subject(s)
Empirical Research, Psychological Models, Social Environment, Humans
ABSTRACT
A new framework for sequential multiblock component methods is presented. This framework relies on a new version of regularized generalized canonical correlation analysis (RGCCA) where various scheme functions and shrinkage constants are considered. Two types of between-block connections are considered: blocks are either fully connected or connected to the superblock (concatenation of all blocks). The proposed iterative algorithm is monotone convergent and guarantees obtaining at convergence a stationary point of RGCCA. In some cases, the solution of RGCCA is the first eigenvalue/eigenvector of a certain matrix. For the scheme functions x, [Formula: see text], [Formula: see text] or [Formula: see text] and shrinkage constants 0 or 1, many multiblock component methods are recovered.
ABSTRACT
Power differences are ubiquitous in social settings. However, the question of whether groups with higher or lower power disparity achieve better performance has thus far received conflicting answers. To address this issue, we identify 3 underlying assumptions in the literature that may have led to these divergent findings, including a myopic focus on static hierarchies, an assumption that those at the top of hierarchies are competent at group tasks, and an assumption that equality is not possible. We employ a multimethod set of studies to examine these assumptions and to understand when power disparity will help or harm group performance. First, our agent-based simulation analyses show that by unpacking these common implicit assumptions in power research, we can explain earlier disparate findings: power disparity benefits group performance when it is dynamically aligned with the power holder's task competence, and harms group performance when held constant and/or is not aligned with task competence. Second, our empirical findings in both a field study of fraud investigation groups and a multiround laboratory study corroborate the simulation results. We thereby contribute to research on power by highlighting a dynamic understanding of power in groups and explaining how current implicit assumptions may lead to opposing findings.
Subject(s)
Cooperative Behavior, Group Processes, Social Hierarchy, Psychological Power, Adult, Female, Humans, Male, Young Adult
ABSTRACT
In recent years, there has been a considerable amount of research on the use of regularization methods for inference and prediction in quantitative genetics. Such research mostly focuses on selection of markers and shrinkage of their effects. In this review paper, the use of ridge regression for prediction in quantitative genetics using single-nucleotide polymorphism data is discussed. In particular, we consider (i) the theoretical foundations of ridge regression, (ii) its link to commonly used methods in animal breeding, (iii) the computational feasibility, and (iv) the scope for constructing prediction models with nonlinear effects (e.g., dominance and epistasis). Based on a simulation study we gauge the current and future potential of ridge regression for prediction of human traits using genome-wide SNP data. We conclude that, for outcomes with a relatively simple genetic architecture, given current sample sizes in most cohorts (i.e., N < 10,000) the predictive accuracy of ridge regression is slightly higher than the classical genome-wide association study approach of repeated simple regression (i.e., one regression per SNP). However, both capture only a small proportion of the heritability. Nevertheless, we find evidence that for large-scale initiatives, such as biobanks, sample sizes can be achieved where ridge regression compared to the classical approach improves predictive accuracy substantially.
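The contrast drawn above between ridge regression and repeated simple regression (one regression per SNP) can be sketched in a few lines. The example assumes centered genotypes and phenotype (no intercept) and uses the textbook ridge estimator, the solution of (X'X + λI)β = X'y; the function names and the tiny data set are illustrative and are not taken from the review.

```python
def ridge_fit(X, y, lam):
    """Joint ridge estimate: solve (X'X + lam*I) beta = X'y.

    X is a list of rows (individuals) by columns (SNPs).
    Uses Gaussian elimination with partial pivoting; assumes the
    system is nonsingular (always true for lam > 0).
    """
    n, p = len(X), len(X[0])
    A = [[sum(X[i][j] * X[i][k] for i in range(n)) + (lam if j == k else 0.0)
          for k in range(p)] for j in range(p)]
    b = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, p):
            factor = A[row][col] / A[col][col]
            for c in range(col, p):
                A[row][c] -= factor * A[col][c]
            b[row] -= factor * b[col]
    beta = [0.0] * p
    for row in range(p - 1, -1, -1):  # back substitution
        tail = sum(A[row][c] * beta[c] for c in range(row + 1, p))
        beta[row] = (b[row] - tail) / A[row][row]
    return beta


def marginal_fit(X, y):
    """One simple regression per SNP: beta_j = x_j'y / x_j'x_j."""
    n, p = len(X), len(X[0])
    return [sum(X[i][j] * y[i] for i in range(n)) /
            sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]


# Toy design with two correlated, centered "SNPs" (made-up numbers).
X = [[-1.0, -1.0], [0.0, 1.0], [1.0, 0.0]]
y = [-2.0, 0.5, 1.5]
print(ridge_fit(X, y, lam=1.0))
print(marginal_fit(X, y))
```

Because ridge regression estimates all SNP effects jointly, it accounts for correlation between SNPs and shrinks the effects toward zero, whereas the per-SNP regressions ignore the other columns of X.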
Subject(s)
DNA Mutational Analysis/trends, Forecasting, Genetics/trends, Genome-Wide Association Study/trends, Single Nucleotide Polymorphism/genetics, Regression Analysis, Statistical Data Interpretation
ABSTRACT
Dual scaling (DS) is a multivariate exploratory method equivalent to correspondence analysis when analysing contingency tables. However, for the analysis of rating data, different proposals appear in the DS and correspondence analysis literature. It is shown here that a peculiarity of the DS method can be exploited to detect differences in response styles. Response styles occur when respondents use rating scales differently for reasons not related to the questions, often biasing results. A spline-based constrained version of DS is devised which can detect the presence of four prominent types of response styles, and is extended to allow for multiple response styles. An alternating nonnegative least squares algorithm is devised for estimating the parameters. The new method is appraised both by simulation studies and an empirical application.
Subject(s)
Least-Squares Analysis, Algorithms, Bias, Humans, Multivariate Analysis, Psychometrics/statistics & numerical data, Surveys and Questionnaires
ABSTRACT
We broaden the developmental focus of the theory of universals in basic human values (Schwartz, 1992, Advances in Experimental Social Psychology) by presenting supportive evidence on children's values from six countries: Germany, Italy, Poland, Bulgaria, the United States, and New Zealand. 3,088 7-11-year-old children completed the Picture-Based Value Survey for Children (PBVS-C, Döring et al., 2010, J. Pers. Assess., 92, 439). Grade 5 children also completed the Portrait Values Questionnaire (PVQ, Schwartz, 2003, A proposal for measuring value orientations across nations. Chapter 7 in the Questionnaire Development Package of the European Social Survey). Findings reveal that the broad value structures, sex differences in value priorities and pan-cultural value hierarchies typical of adults have already taken form at this early age. We discuss the conceptual implications of these findings for the new field of children's basic values by embedding them in the recent developmental literature.
Subject(s)
Cross-Cultural Comparison, Social Values/ethnology, Adult, Bulgaria, Child, Child Development, Female, Germany, Humans, Italy, Male, New Zealand, Poland, Sex Characteristics, Surveys and Questionnaires, United States
ABSTRACT
Previous research has suggested a positive association between testosterone (T) and entrepreneurial behavior in males. However, this evidence was found in a study with a small sample size and has not been replicated. In the present study, we aimed to verify this association using two large, independent, population-based samples of males. We tested the association of T with entrepreneurial behavior, operationalized as self-employment, using data from the Rotterdam Study (N=587) and the Study of Health in Pomerania (N=1697). Total testosterone (TT) and sex hormone-binding globulin (SHBG) were measured in the serum. Free testosterone (FT), non-SHBG-bound T (non-SHBG-T), and the TT/SHBG ratio were calculated and used as measures of bioactive serum T, in addition to TT adjusted for SHBG. Using logistic regression models, we found no significant associations between any of the serum T measures and self-employment in either of the samples. To our knowledge, this is the first large-scale study on the relationship between serum T and entrepreneurial behavior.
Subject(s)
Entrepreneurship, Testosterone/blood, Aged, Cross-Sectional Studies, Humans, Male, Middle Aged, Sex Hormone-Binding Globulin/metabolism
ABSTRACT
OBJECTIVES: The aim of the present study was to investigate the pattern of emergence of permanent teeth using nonparametric techniques. MATERIALS AND METHODS: Data were obtained from the Signal-Tandmobiel project, a 6-year prospective dental study conducted in Flanders (Belgium) in which 4468 primary school children born in 1989 were annually examined. A new exploratory method for interval-censored data, the IC-biplot, was applied to estimate individual sequences of emergence. In addition, the method renders a nice graphical representation of both children and teeth in the plane where the individual sequences of emergence can easily be visualized. On the basis of the estimated individual sequences, their corresponding prevalences were calculated. RESULTS: The study revealed that between 7 and 13 different sequences of emergence can be expected depending on gender and quadrant. The prevalences of the most frequent sequences in girls varied from 35% to 85% depending on the quadrant, while in boys they varied from 28% to 32%. Most sequences in the maxilla start with 6-1-2 and in the mandible with 1-6-2. CONCLUSIONS: The IC-biplot is a flexible procedure that allows an easy visualization of the pattern of emergence of permanent teeth. Rank orders derived from the IC-biplot confirm rank orders suggested earlier in the literature.