Results 1 - 20 of 702
1.
Hum Mol Genet ; 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38981621

ABSTRACT

Early or late pubertal onset can lead to disease in adulthood, including cancer, obesity, type 2 diabetes, metabolic disorders, bone fractures, and psychopathologies. Thus, knowing the age at which puberty is attained is crucial as it can serve as a risk factor for future diseases. Pubertal development is divided into five stages of sexual maturation in boys and girls according to the standardized Tanner scale. We performed genome-wide association studies (GWAS) on the "Growth and Obesity Chilean Cohort Study" cohort composed of admixed children with mainly European and Native American ancestry. Using joint models that integrate time-to-event data with longitudinal trajectories of body mass index (BMI), we identified genetic variants associated with phenotypic transitions between pairs of Tanner stages. We identified 42 novel significant associations, most of them in boys. The GWAS on the Tanner 3→4 transition in boys captured an association peak around the growth-related genes LARS2 and LIMD1, the former of which causes ovarian dysfunction when mutated. The associated variants are expression and splicing quantitative trait loci regulating gene expression and alternative splicing in multiple tissues. Further, higher individual Native American genetic ancestry proportions predicted a significantly earlier puberty onset in boys but not in girls. Finally, the joint models identified a longitudinal BMI parameter significantly associated with several Tanner stage transitions, confirming the association of BMI with pubertal timing.

2.
Am J Hum Genet ; 110(12): 2077-2091, 2023 Dec 07.
Article in English | MEDLINE | ID: mdl-38065072

ABSTRACT

Understanding the genetic basis of complex phenotypes is a central pursuit of genetics. Genome-wide association studies (GWASs) are a powerful way to find genetic loci associated with phenotypes. GWASs are widely and successfully used, but they face challenges related to the fact that variants are tested for association with a phenotype independently, whereas in reality variants at different sites are correlated because of their shared evolutionary history. One way to model this shared history is through the ancestral recombination graph (ARG), which encodes a series of local coalescent trees. Recent computational and methodological breakthroughs have made it feasible to estimate approximate ARGs from large-scale samples. Here, we explore the potential of an ARG-based approach to quantitative-trait locus (QTL) mapping, echoing existing variance-components approaches. We propose a framework that relies on the conditional expectation of a local genetic relatedness matrix (local eGRM) given the ARG. Simulations show that our method is especially beneficial for finding QTLs in the presence of allelic heterogeneity. By framing QTL mapping in terms of the estimated ARG, we can also facilitate the detection of QTLs in understudied populations. We use local eGRM to analyze two chromosomes containing known body size loci in a sample of Native Hawaiians. Our investigations can provide intuition about the benefits of using estimated ARGs in population- and statistical-genetic methods in general.
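As a simplified illustration of the local-relatedness idea, a local genetic relatedness matrix over a window of SNPs can be computed from standardized genotype dosages. This sketch uses raw genotypes rather than the paper's ARG-based conditional expectation (the eGRM), and all dimensions and frequencies are invented:

```python
import numpy as np

rng = np.random.default_rng(0)
n_ind, n_snp = 100, 50                       # toy sample and local SNP window
freqs = rng.uniform(0.1, 0.9, n_snp)
G = rng.binomial(2, freqs, size=(n_ind, n_snp)).astype(float)  # dosages 0/1/2
G = G[:, G.std(axis=0) > 0]                  # drop any monomorphic SNPs

# Standardize each SNP, then average outer products: K = Z Z' / m is the
# local genetic relatedness matrix (GRM) for this window.
Z = (G - G.mean(axis=0)) / G.std(axis=0)
K = Z @ Z.T / G.shape[1]
print(K.shape)
```

A variance-components QTL test then asks whether trait covariance among individuals tracks K in this window; the paper's local eGRM replaces the empirical Z Z' with its conditional expectation given the estimated ARG.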


Subject(s)
Genetics, Population , Genome-Wide Association Study , Quantitative Trait Loci , Humans , Chromosome Mapping/methods , Models, Genetic , Phenotype , Quantitative Trait Loci/genetics , Native Hawaiian or Other Pacific Islander/genetics
3.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-39007595

ABSTRACT

Biomedical research now commonly integrates diverse data types or views from the same individuals to better understand the pathobiology of complex diseases, but the challenge lies in meaningfully integrating these diverse views. Existing methods often require the same type of data from all views (cross-sectional data only or longitudinal data only) or do not consider any class outcome in the integration method, which presents limitations. To overcome these limitations, we have developed a pipeline that harnesses the power of statistical and deep learning methods to integrate cross-sectional and longitudinal data from multiple sources. In addition, it identifies key variables that contribute to the association between views and the separation between classes, providing deeper biological insights. This pipeline includes variable selection/ranking using linear and nonlinear methods, feature extraction using functional principal component analysis and Euler characteristics, and joint integration and classification using dense feed-forward networks for cross-sectional data and recurrent neural networks for longitudinal data. We applied this pipeline to cross-sectional and longitudinal multiomics data (metagenomics, transcriptomics and metabolomics) from an inflammatory bowel disease (IBD) study and identified microbial pathways, metabolites and genes that discriminate by IBD status, providing information on the etiology of IBD. We conducted simulations to compare the two feature extraction methods.


Subject(s)
Deep Learning , Inflammatory Bowel Diseases , Humans , Cross-Sectional Studies , Inflammatory Bowel Diseases/classification , Inflammatory Bowel Diseases/genetics , Longitudinal Studies , Discriminant Analysis , Metabolomics/methods , Computational Biology/methods
4.
Proc Natl Acad Sci U S A ; 120(49): e2303781120, 2023 Dec 05.
Article in English | MEDLINE | ID: mdl-38011547

ABSTRACT

Given the observed deterioration in mental health among Australians over the past decade, this study investigates to what extent this differs in people born in different decades-i.e., possible birth cohort differences in the mental health of Australians. Using 20 y of data from a large, nationally representative panel survey (N = 27,572), we find strong evidence that cohort effects are driving the increase in population-level mental ill-health. Deteriorating mental health is particularly pronounced among people born in the 1990s and seen to a lesser extent among the 1980s cohort. There is little evidence that mental health is worsening with age for people born prior to the 1980s. The findings from this study highlight that it is the poorer mental health of Millennials that is driving the apparent deterioration in population-level mental health. Understanding the context and changes in society that have differentially affected younger people may inform efforts to ameliorate this trend and prevent it continuing for emerging cohorts.


Subject(s)
Mental Health , Humans , Australia/epidemiology , Surveys and Questionnaires
5.
Am J Hum Genet ; 108(5): 799-808, 2021 05 06.
Article in English | MEDLINE | ID: mdl-33811807

ABSTRACT

The proportion of variation in complex traits that can be attributed to non-additive genetic effects has been a topic of intense debate. The availability of biobank-scale datasets of genotype and trait data from unrelated individuals opens up the possibility of obtaining precise estimates of the contribution of non-additive genetic effects. We present an efficient method to estimate the variation in a complex trait that can be attributed to additive (additive heritability) and dominance deviation (dominance heritability) effects across all genotyped SNPs in a large collection of unrelated individuals. Over a wide range of genetic architectures, our method yields unbiased estimates of additive and dominance heritability. We applied our method, in turn, to array genotypes as well as imputed genotypes (at common SNPs with minor allele frequency [MAF] > 1%) and 50 quantitative traits measured in 291,273 unrelated white British individuals in the UK Biobank. Averaged across these 50 traits, we find that additive heritability on array SNPs is 21.86% while dominance heritability is 0.13% (about 0.48% of the additive heritability) with qualitatively similar results for imputed genotypes. We find no statistically significant evidence for dominance heritability (p<0.05/50 accounting for the number of traits tested) and estimate that dominance heritability is unlikely to exceed 1% for the traits analyzed. Our analyses indicate a limited contribution of dominance heritability to complex trait variation.
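The additive/dominance split above can be illustrated with one simple empirical construction: the dominance deviation as the part of heterozygosity orthogonal to the additive dosage. This residualization is an illustration only, not necessarily the exact encoding used in the study:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
p = 0.3
g = rng.binomial(2, p, n).astype(float)   # additive dosage 0/1/2 at one SNP
het = (g == 1).astype(float)              # heterozygote indicator

# Dominance deviation: the component of heterozygosity orthogonal to the
# additive dosage (residual from regressing het on an intercept and g).
X = np.column_stack([np.ones(n), g])
beta, *_ = np.linalg.lstsq(X, het, rcond=None)
d = het - X @ beta
print(float(d @ (g - g.mean())))          # orthogonal to the additive term
```

Because d is uncorrelated with g by construction, additive and dominance variance components can be attributed separately, which is what makes the reported 21.86% vs. 0.13% decomposition meaningful.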


Subject(s)
Biological Specimen Banks , Datasets as Topic , Genes, Dominant/genetics , Genetic Variation , Multifactorial Inheritance/genetics , Female , Humans , Male , Models, Genetic , Polymorphism, Single Nucleotide/genetics
6.
Hum Brain Mapp ; 45(7): e26699, 2024 May.
Article in English | MEDLINE | ID: mdl-38726907

ABSTRACT

With the steadily increasing abundance of longitudinal neuroimaging studies with large sample sizes and multiple repeated measures, questions arise regarding the appropriate modeling of variance and covariance. The current study examined the influence of standard classes of variance-covariance structures in linear mixed effects (LME) modeling of fMRI data from patients with pediatric mild traumatic brain injury (pmTBI; N = 181) and healthy controls (N = 162). During two visits, participants performed a cognitive control fMRI paradigm that compared congruent and incongruent stimuli. The hemodynamic response function was parsed into peak and late peak phases. Data were analyzed with a 4-way (GROUP×VISIT×CONGRUENCY×PHASE) LME using AFNI's 3dLME and compound symmetry (CS), autoregressive process of order 1 (AR1), and unstructured (UN) variance-covariance matrices. Voxel-wise results dramatically varied both within the cognitive control network (UN>CS for CONGRUENCY effect) and broader brain regions (CS>UN for GROUP:VISIT) depending on the variance-covariance matrix that was selected. Additional testing indicated that both model fit and estimated standard error were superior for the UN matrix, likely as a result of the modeling of individual terms. In summary, current findings suggest that the interpretation of results from complex designs is highly dependent on the selection of the variance-covariance structure using LME modeling.
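The variance-covariance classes compared above differ in how tightly they constrain the within-subject covariance over repeated measures. CS and AR(1) can be built directly from two parameters each; a minimal sketch with invented values:

```python
import numpy as np

def compound_symmetry(t, sigma2, rho):
    """CS: equal variances, equal correlation rho between any two time points."""
    return sigma2 * ((1 - rho) * np.eye(t) + rho * np.ones((t, t)))

def ar1(t, sigma2, rho):
    """AR(1): correlation decays as rho**lag with distance between time points."""
    idx = np.arange(t)
    return sigma2 * rho ** np.abs(idx[:, None] - idx[None, :])

cs = compound_symmetry(4, 2.0, 0.5)
ar = ar1(4, 2.0, 0.5)
print(cs[0, 3], ar[0, 3])   # CS keeps covariance 1.0 at lag 3; AR(1) decays to 0.25
```

An unstructured (UN) matrix instead estimates all t(t+1)/2 variances and covariances freely, which is the "modeling of individual terms" credited for the superior fit in the study above.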


Subject(s)
Magnetic Resonance Imaging , Humans , Male , Female , Adolescent , Child , Brain Concussion/diagnostic imaging , Brain Concussion/physiopathology , Linear Models , Brain/diagnostic imaging , Brain/physiology , Brain Mapping/methods , Executive Function/physiology
7.
Hum Brain Mapp ; 45(2): e26579, 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38339910

ABSTRACT

The linear mixed-effects model (LME) is a versatile approach to account for dependence among observations. Many large-scale neuroimaging datasets with complex designs have increased the need for LME; however, LME has seldom been used in whole-brain imaging analyses due to its heavy computational requirements. In this paper, we introduce a fast and efficient mixed-effects algorithm (FEMA) that makes whole-brain vertex-wise, voxel-wise, and connectome-wide LME analyses in large samples possible. We validate FEMA with extensive simulations, showing that the estimates of the fixed effects are equivalent to standard maximum likelihood estimates but obtained with orders of magnitude improvement in computational speed. We demonstrate the applicability of FEMA by studying the cross-sectional and longitudinal effects of age on region-of-interest level and vertex-wise cortical thickness, as well as connectome-wide functional connectivity values derived from resting state functional MRI, using longitudinal imaging data from the Adolescent Brain Cognitive Development (ABCD) Study release 4.0. Our analyses reveal distinct spatial patterns for the annualized changes in vertex-wise cortical thickness and connectome-wide connectivity values in early adolescence, highlighting a critical time of brain maturation. The simulations and application to real data show that FEMA enables advanced investigation of the relationships between large numbers of neuroimaging metrics and variables of interest while considering complex study designs, including repeated measures and family structures, in a fast and efficient manner. The source code for FEMA is available at https://github.com/cmig-research-group/cmig_tools/.


Subject(s)
Connectome , Magnetic Resonance Imaging , Adolescent , Humans , Magnetic Resonance Imaging/methods , Cross-Sectional Studies , Brain/diagnostic imaging , Neuroimaging/methods , Connectome/methods , Algorithms
8.
Brief Bioinform ; 23(4)2022 07 18.
Article in English | MEDLINE | ID: mdl-35649346

ABSTRACT

With the advances in high-throughput biotechnologies, high-dimensional multi-layer omics data become increasingly available. They can provide both confirmatory and complementary information to disease risk and thus have offered unprecedented opportunities for risk prediction studies. However, the high-dimensionality and complex inter/intra-relationships among multi-omics data have brought tremendous analytical challenges. Here we present a computationally efficient penalized linear mixed model with generalized method of moments estimator (MpLMMGMM) for the prediction analysis on multi-omics data. Our method extends the widely used linear mixed model proposed for genomic risk predictions to model multi-omics data, where kernel functions are used to capture various types of predictive effects from different layers of omics data and penalty terms are introduced to reduce the impact of noise. Compared with existing penalized linear mixed models, the proposed method adopts the generalized method of moments estimator and it is much more computationally efficient. Through extensive simulation studies and the analysis of positron emission tomography imaging outcomes, we have demonstrated that MpLMMGMM can simultaneously consider a large number of variables and efficiently select those that are predictive from the corresponding omics layers. It can capture both linear and nonlinear predictive effects and achieves better prediction performance than competing methods.
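The kernel-per-layer idea described above can be sketched with two standard kernels on invented omics layers (the layer names, sizes, and equal weights here are hypothetical; MpLMMGMM's actual estimator and penalties are not reproduced):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 80
layers = {                                   # hypothetical omics layers
    "genomics": rng.normal(size=(n, 200)),
    "metabolomics": rng.normal(size=(n, 30)),
}

def linear_kernel(X):
    Xc = X - X.mean(axis=0)
    return Xc @ Xc.T / X.shape[1]            # captures linear effects

def rbf_kernel(X):
    gamma = 1.0 / X.shape[1]
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq)               # captures nonlinear effects

# One kernel per omics layer; the combined similarity used by a kernel (mixed)
# model is a weighted sum, with weights acting like variance components.
K = 0.5 * linear_kernel(layers["genomics"]) + 0.5 * rbf_kernel(layers["metabolomics"])
print(K.shape)
```

In a penalized fit, layers whose kernel weight shrinks toward zero are effectively deselected, which is how noise layers can be screened out.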


Subject(s)
Algorithms , Genomics , Genome , Genomics/methods , Linear Models , Research Design
9.
Appl Environ Microbiol ; 90(8): e0103324, 2024 Aug 21.
Article in English | MEDLINE | ID: mdl-39082810

ABSTRACT

Pseudoreplication compromises the validity of research by treating non-independent samples as independent replicates. This review examines the prevalence of pseudoreplication in host-microbiota studies, highlighting the critical need for rigorous experimental design and appropriate statistical analysis. We systematically reviewed 115 manuscripts on host-microbiota interactions. Our analysis revealed that 22% of the papers contained pseudoreplication, primarily due to co-housed organisms, whereas 52% lacked sufficient methodological details. The remaining 26% adequately addressed pseudoreplication through proper experimental design or statistical analysis. The high incidence of pseudoreplication and insufficient information underscores the importance of methodological reporting and statistical rigor to ensure reproducibility of host-microbiota research.
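Co-housing is the canonical source of pseudoreplication flagged above. A minimal sketch (with invented effect sizes and counts) of why the cage, not the animal, is the independent unit, and how aggregating to cage means restores valid replication:

```python
import numpy as np

rng = np.random.default_rng(3)
# Two groups, 4 cages each, 5 co-housed mice per cage: cage-mates share a
# microbial environment, so the cage, not the mouse, is the independent unit.
cages_a = rng.normal(0.0, 1.0, 4)           # cage-level effects, group A
cages_b = rng.normal(0.5, 1.0, 4)           # cage-level effects, group B
group_a = np.concatenate([c + rng.normal(0, 0.3, 5) for c in cages_a])
group_b = np.concatenate([c + rng.normal(0, 0.3, 5) for c in cages_b])

# Treating all 20 mice per group as replicates is pseudoreplication; a valid
# simple analysis first aggregates to the 4 independent cage means per group.
means_a = group_a.reshape(4, 5).mean(axis=1)
means_b = group_b.reshape(4, 5).mean(axis=1)
print(len(group_a), len(means_a))
```

A mixed model with a random cage intercept is the usual alternative when aggregation would discard too much information.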


Subject(s)
Host Microbial Interactions , Microbiota , Animals , Humans , Reproducibility of Results , Research Design
10.
Ophthalmology ; 131(8): 902-913, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38354911

ABSTRACT

PURPOSE: To investigate whether intraocular pressure (IOP) fluctuation is associated independently with the rate of visual field (VF) progression in the United Kingdom Glaucoma Treatment Study. DESIGN: Randomized, double-masked, placebo-controlled multicenter trial. PARTICIPANTS: Participants with ≥5 VFs (213 placebo, 217 treatment). METHODS: Associations between IOP metrics and VF progression rates (mean deviation [MD] and five fastest locations) were assessed with linear mixed models. Fluctuation variables were mean Pascal ocular pulse amplitude (OPA), standard deviation (SD) of diurnal Goldmann IOP (diurnal fluctuation), and SD of Goldmann IOP at all visits (long-term fluctuation). Fluctuation values were normalized for mean IOP to make them independent from the mean IOP. Correlated nonfluctuation IOP metrics (baseline, peak, mean, supine, and peak phasing IOP) were combined with principal component analysis, and principal component 1 (PC1) was included as a covariate. Interactions between covariates and time from baseline modeled the effect of the variables on VF rates. Analyses were conducted separately in the two treatment arms. MAIN OUTCOME MEASURES: Associations between IOP fluctuation metrics and rates of MD and the five fastest test locations. RESULTS: In the placebo arm, only PC1 was associated significantly with the MD rate (estimate, -0.19 dB/year [standard error (SE), 0.04 dB/year]; P < 0.001), whereas normalized IOP fluctuation metrics were not. No variable was associated significantly with MD rates in the treatment arm. For the fastest five locations in the placebo group, PC1 (estimate, -0.58 dB/year [SE, 0.16 dB/year]; P < 0.001), central corneal thickness (estimate, 0.26 dB/year [SE, 0.10 dB/year] for 10 µm thicker; P = 0.01) and normalized OPA (estimate, -3.50 dB/year [SE, 1.04 dB/year]; P = 0.001) were associated with rates of progression; normalized diurnal and long-term IOP fluctuations were not. 
In the treatment group, only PC1 (estimate, -0.27 dB/year [SE, 0.12 dB/year]; P = 0.028) was associated with the rates of progression. CONCLUSIONS: No evidence supports that either diurnal or long-term IOP fluctuation, as measured in clinical practice, are independent factors for glaucoma progression; other aspects of IOP, including mean IOP and peak IOP, may be more informative. Ocular pulse amplitude may be an independent factor for faster glaucoma progression. FINANCIAL DISCLOSURE(S): Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
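Combining correlated IOP metrics into a single covariate via the first principal component, as done for PC1 above, can be sketched as follows (the metrics here are simulated around a shared latent level, not the trial's data):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200
level = rng.normal(size=n)                   # latent overall IOP level
# Five correlated metrics standing in for baseline, peak, mean, supine and
# peak phasing IOP (simulated for illustration).
metrics = np.column_stack([level + rng.normal(0, 0.4, n) for _ in range(5)])

# Standardize, then take the first principal component (PC1) as the single
# covariate summarizing the correlated non-fluctuation IOP metrics.
Z = (metrics - metrics.mean(axis=0)) / metrics.std(axis=0)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
pc1 = Z @ Vt[0]
explained = s[0] ** 2 / (s ** 2).sum()
print(round(float(explained), 2))
```

Using PC1 instead of the raw correlated metrics avoids the collinearity that would otherwise destabilize the mixed-model coefficient estimates.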


Subject(s)
Antihypertensive Agents , Disease Progression , Glaucoma, Open-Angle , Intraocular Pressure , Tonometry, Ocular , Visual Fields , Humans , Intraocular Pressure/physiology , Visual Fields/physiology , Double-Blind Method , Antihypertensive Agents/therapeutic use , Male , Female , Aged , Glaucoma, Open-Angle/physiopathology , Glaucoma, Open-Angle/drug therapy , United Kingdom , Middle Aged , Visual Field Tests , Vision Disorders/physiopathology , Latanoprost/therapeutic use , Circadian Rhythm/physiology
11.
J Exp Bot ; 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38954539

ABSTRACT

Linear mixed models (LMMs) are a commonly used method for genome-wide association studies (GWAS) that aim to detect associations between genetic markers and phenotypic measurements in a population of individuals while accounting for population structure and cryptic relatedness. In a standard GWAS, hundreds of thousands to millions of statistical tests are performed, requiring control for multiple hypothesis testing. Typically, static corrections that penalize the number of tests performed are used to control for the family-wise error rate, which is the probability of making at least one false positive. However, it has been shown that in practice this threshold is too conservative for normally distributed phenotypes and not stringent enough for non-normally distributed phenotypes. Therefore, permutation-based LMM approaches have recently been proposed to provide a more realistic threshold that takes phenotypic distributions into account. In this work, we will discuss the advantages of permutation-based GWAS approaches, including new simulations and results from a re-analysis of all publicly available Arabidopsis thaliana phenotypes from the AraPheno database.
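The permutation idea — rerun the whole genome scan on shuffled phenotypes and take a quantile of the maximum statistic as the family-wise threshold — can be sketched on toy data (plain correlations stand in for the LMM test statistic here, and all sizes are invented):

```python
import numpy as np

rng = np.random.default_rng(5)
n, m = 300, 1000                             # toy sample size and marker count
G = rng.binomial(2, 0.3, size=(n, m)).astype(float)
y = rng.normal(size=n)                       # null phenotype: no true signal
Gz = (G - G.mean(axis=0)) / G.std(axis=0)    # standardized genotypes

def max_abs_corr(y):
    """Largest absolute marker-phenotype correlation in one genome scan."""
    yz = (y - y.mean()) / y.std()
    return float(np.abs(Gz.T @ yz).max() / len(y))

# Permutation null: shuffle the phenotype, redo the full scan, keep the max
# statistic; the 95th percentile is an empirical 5% family-wise threshold
# that automatically reflects the phenotype's actual distribution.
null_max = np.array([max_abs_corr(rng.permutation(y)) for _ in range(200)])
threshold = float(np.quantile(null_max, 0.95))
print(round(threshold, 3))
```

Because the threshold is derived from the phenotype's own permutation distribution, it adapts to non-normal traits where a static Bonferroni cutoff is miscalibrated.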

12.
Biometrics ; 80(1)2024 Jan 29.
Article in English | MEDLINE | ID: mdl-38497825

ABSTRACT

Modern biomedical datasets are increasingly high-dimensional and exhibit complex correlation structures. Generalized linear mixed models (GLMMs) have long been employed to account for such dependencies. However, proper specification of the fixed and random effects in GLMMs is increasingly difficult in high dimensions, and computational complexity grows with increasing dimension of the random effects. We present a novel reformulation of the GLMM using a factor model decomposition of the random effects, enabling scalable computation of GLMMs in high dimensions by reducing the latent space from a large number of random effects to a smaller set of latent factors. We also extend our prior work to estimate model parameters using a modified Monte Carlo Expectation Conditional Minimization algorithm, allowing us to perform variable selection on both the fixed and random effects simultaneously. We show through simulation that through this factor model decomposition, our method can fit high-dimensional penalized GLMMs faster than comparable methods and more easily scale to larger dimensions not previously seen in existing approaches.


Subject(s)
Algorithms , Computer Simulation , Linear Models , Monte Carlo Method
13.
Stat Med ; 43(1): 16-33, 2024 01 15.
Article in English | MEDLINE | ID: mdl-37985966

ABSTRACT

In many medical studies, the outcome measure (such as quality of life, QOL) for some study participants becomes informatively truncated (censored, missing, or unobserved) due to death or other forms of dropout, creating a nonignorable missing data problem. In such cases, the use of a composite outcome or imputation methods that fill in unmeasurable QOL values for those who died rely on strong and untestable assumptions and may be conceptually unappealing to certain stakeholders when estimating a treatment effect. The survivor average causal effect (SACE) is an alternative causal estimand that surmounts some of these issues. While principal stratification has been applied to estimate the SACE in individually randomized trials, methods for estimating the SACE in cluster-randomized trials are currently limited. To address this gap, we develop a mixed model approach along with an expectation-maximization algorithm to estimate the SACE in cluster-randomized trials. We model the continuous outcome measure with a random intercept to account for intracluster correlations due to cluster-level randomization, and model the principal strata membership both with and without a random intercept. In simulations, we compare the performance of our approaches with an existing fixed-effects approach to illustrate the importance of accounting for clustering in cluster-randomized trials. The methodology is then illustrated using a cluster-randomized trial of telecare and assistive technology on health-related QOL in the elderly.


Subject(s)
Models, Statistical , Quality of Life , Humans , Aged , Randomized Controlled Trials as Topic , Outcome Assessment, Health Care , Survivors
14.
Stat Med ; 43(15): 2987-3004, 2024 Jul 10.
Article in English | MEDLINE | ID: mdl-38727205

ABSTRACT

Longitudinal data from clinical trials are commonly analyzed using mixed models for repeated measures (MMRM) when the time variable is categorical or linear mixed-effects models (ie, random effects model) when the time variable is continuous. In these models, statistical inference is typically based on the absolute difference in the adjusted mean change (for categorical time) or the rate of change (for continuous time). Previously, we proposed a novel approach: modeling the percentage reduction in disease progression associated with the treatment relative to the placebo decline using proportional models. This concept of proportionality provides an innovative and flexible method for simultaneously modeling different cohorts, multivariate endpoints, and jointly modeling continuous and survival endpoints. Through simulated data, we demonstrate the implementation of these models using SAS procedures in both frequentist and Bayesian approaches. Additionally, we introduce a novel method for implementing MMRM models (ie, analysis of response profile) using the nlmixed procedure.


Subject(s)
Bayes Theorem , Clinical Trials as Topic , Computer Simulation , Models, Statistical , Humans , Longitudinal Studies , Clinical Trials as Topic/methods , Nonlinear Dynamics , Proportional Hazards Models , Data Interpretation, Statistical
15.
Stat Med ; 43(8): 1527-1548, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38488782

ABSTRACT

When analyzing multivariate longitudinal binary data, we estimate the effects on the responses of the covariates while accounting for three types of complex correlations present in the data. These include the correlations within separate responses over time, cross-correlations between different responses at different times, and correlations between different responses at each time point. The number of parameters thus increases quadratically with the dimension of the correlation matrix, making parameter estimation difficult; the estimated correlation matrix must also meet the positive definiteness constraint. The correlation matrix may additionally be heteroscedastic; however, the matrix structure is commonly considered to be homoscedastic and constrained, such as exchangeable or autoregressive with order one. These assumptions are overly strong, resulting in skewed estimates of the covariate effects on the responses. Hence, we propose probit linear mixed models for multivariate longitudinal binary data, where the correlation matrix is estimated using hypersphere decomposition instead of the strong assumptions noted above. Simulations and real examples are used to demonstrate the proposed methods. An open source R package, BayesMGLM, is made available on GitHub at https://github.com/kuojunglee/BayesMGLM/ with full documentation to produce the results.
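Hypersphere decomposition writes the correlation matrix as R = BBᵀ with a lower-triangular B built from angles, so any angle values in (0, π) yield a unit-diagonal, positive definite matrix without explicit constraints. A minimal sketch of that map (angles chosen arbitrarily):

```python
import numpy as np

def corr_from_angles(theta):
    """Hypersphere decomposition: angles in (0, pi) -> correlation matrix
    R = B B', with B lower triangular and every row of unit length."""
    t = theta.shape[0] + 1
    B = np.zeros((t, t))
    B[0, 0] = 1.0
    for i in range(1, t):
        prod = 1.0
        for j in range(i):
            B[i, j] = prod * np.cos(theta[i - 1, j])
            prod *= np.sin(theta[i - 1, j])
        B[i, i] = prod                       # rows have unit norm by construction
    return B @ B.T

theta = np.full((3, 3), np.pi / 3)           # arbitrary angles for a 4x4 matrix
R = corr_from_angles(theta)
print(np.diag(R))
```

Estimating the unconstrained angles (rather than the correlations themselves) is what lets the proposed models handle heteroscedastic, unstructured correlation without violating positive definiteness.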


Subject(s)
Linear Models , Humans
16.
Stat Med ; 43(14): 2747-2764, 2024 Jun 30.
Article in English | MEDLINE | ID: mdl-38695394

ABSTRACT

Statistical models with random intercepts and slopes (RIAS models) are commonly used to analyze longitudinal data. Fitting such models sometimes results in negative estimates of variance components or estimates on parameter space boundaries. This can be an unlucky chance occurrence, but can also occur because certain marginal distributions are mathematically identical to those from RIAS models with negative intercept and/or slope variance components and/or intercept-slope correlations greater than one in magnitude. We term such parameters "pseudo-variances" and "pseudo-correlations," and the models "non-regular." We use eigenvalue theory to explore how and when such non-regular RIAS models arise, showing: (i) A small number of measurements, short follow-up, and large residual variance increase the parameter space for which data (with a positive semidefinite marginal variance-covariance matrix) are compatible with non-regular RIAS models. (ii) Non-regular RIAS models can arise from model misspecification, when non-linearity in fixed effects is ignored or when random effects are omitted. (iii) A non-regular RIAS model can sometimes be interpreted as a regular linear mixed model with one or more additional random effects, which may not be identifiable from the data. (iv) Particular parameterizations of non-regular RIAS models have no generality for all possible numbers of measurements over time. Because of this lack of generality, we conclude that non-regular RIAS models can only be regarded as plausible data-generating mechanisms in some situations. Nevertheless, fitting a non-regular RIAS model can be acceptable, allowing unbiased inference on fixed effects where commonly recommended alternatives such as dropping the random slope result in bias.
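Points (i)-(iii) rest on the marginal covariance implied by a RIAS model, V = ZGZᵀ + σ²I. A sketch (with invented parameter values) showing how a "pseudo-variance" G that is not itself a valid covariance matrix can still imply a legitimate positive definite marginal V:

```python
import numpy as np

times = np.array([0.0, 1.0, 2.0, 3.0])       # four repeated measurements
Z = np.column_stack([np.ones_like(times), times])

def rias_marginal(G, sigma2):
    """Marginal covariance implied by random intercepts/slopes: Z G Z' + s2 I."""
    return Z @ G @ Z.T + sigma2 * np.eye(len(times))

G_regular = np.array([[1.0, 0.1], [0.1, 0.2]])
V1 = rias_marginal(G_regular, 1.0)

# A non-regular parameterization: the slope "pseudo-variance" is negative, so
# G is not a valid covariance matrix, yet the implied marginal V stays positive
# definite because the residual variance dominates.
G_pseudo = np.array([[1.0, 0.1], [0.1, -0.05]])
V2 = rias_marginal(G_pseudo, 1.0)
print(np.linalg.eigvalsh(V2).min())
```

This is why a short follow-up with large residual variance enlarges the region of data compatible with non-regular fits: σ²I can absorb the negative eigenvalue contributed by ZGZᵀ.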


Subject(s)
Models, Statistical , Humans , Longitudinal Studies , Data Interpretation, Statistical , Computer Simulation , Linear Models
17.
Stat Med ; 43(7): 1397-1418, 2024 Mar 30.
Article in English | MEDLINE | ID: mdl-38297431

ABSTRACT

Postmarket drug safety database like vaccine adverse event reporting system (VAERS) collect thousands of spontaneous reports annually, with each report recording occurrences of any adverse events (AEs) and use of vaccines. We hope to identify signal vaccine-AE pairs, for which certain vaccines are statistically associated with certain adverse events (AE), using such data. Thus, the outcomes of interest are multiple AEs, which are binary outcomes and could be correlated because they might share certain latent factors; and the primary covariates are vaccines. Appropriately accounting for the complex correlation among AEs could improve the sensitivity and specificity of identifying signal vaccine-AE pairs. We propose a two-step approach in which we first estimate the shared latent factors among AEs using a working multivariate logistic regression model, and then use univariate logistic regression model to examine the vaccine-AE associations after controlling for the latent factors. Our simulation studies show that this approach outperforms current approaches in terms of sensitivity and specificity. We apply our approach in analyzing VAERS data and report our findings.


Subject(s)
Adverse Drug Reaction Reporting Systems , Vaccines , Humans , United States , Vaccines/adverse effects , Databases, Factual , Computer Simulation , Software
18.
Stat Med ; 43(5): 890-911, 2024 Feb 28.
Article in English | MEDLINE | ID: mdl-38115805

ABSTRACT

Stepped wedge design is a popular research design that enables a rigorous evaluation of candidate interventions by using a staggered cluster randomization strategy. While analytical methods were developed for designing stepped wedge trials, the prior focus has been solely on testing for the average treatment effect. With a growing interest on formal evaluation of the heterogeneity of treatment effects across patient subpopulations, trial planning efforts need appropriate methods to accurately identify sample sizes or design configurations that can generate evidence for both the average treatment effect and variations in subgroup treatment effects. To fill in that important gap, this article derives novel variance formulas for confirmatory analyses of treatment effect heterogeneity, that are applicable to both cross-sectional and closed-cohort stepped wedge designs. We additionally point out that the same framework can be used for more efficient average treatment effect analyses via covariate adjustment, and allows the use of familiar power formulas for average treatment effect analyses to proceed. Our results further sheds light on optimal design allocations of clusters to maximize the weighted precision for assessing both the average and heterogeneous treatment effects. We apply the new methods to the Lumbar Imaging with Reporting of Epidemiology Trial, and carry out a simulation study to validate our new methods.


Subject(s)
Research Design , Treatment Effect Heterogeneity , Humans , Cross-Sectional Studies , Randomized Controlled Trials as Topic , Computer Simulation , Sample Size , Cluster Analysis
19.
Stat Med ; 43(21): 4148-4162, 2024 Sep 20.
Article in English | MEDLINE | ID: mdl-39013403

ABSTRACT

A nonparametric method proposed by DeLong et al in 1988 for comparing areas under correlated receiver operating characteristic curves is used widely in practice. However, the DeLong method as implemented in popular software quietly deletes individuals with any missing values, yielding potentially invalid and/or inefficient results. We simplify the DeLong algorithm using ranks and extend it to accommodate missing data by using a mixed model approach for multivariate data. Simulation results demonstrate the validity and efficiency of our procedure for data missing at random. We illustrate our proposed procedure in SAS, Stata, and R using the original DeLong data.
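The rank simplification above builds on the Mann-Whitney identity for the AUC, computed from midranks so that tied scores count one half. A self-contained sketch of that identity (not the full DeLong covariance machinery):

```python
import numpy as np

def auc_rank(pos, neg):
    """AUC via the rank-sum (Mann-Whitney) identity, with midranks for ties."""
    allv = np.concatenate([pos, neg])
    order = np.argsort(allv, kind="mergesort")
    ranks = np.empty(len(allv))
    ranks[order] = np.arange(1, len(allv) + 1)
    for v in np.unique(allv):                # midranks: average ranks over ties
        tie = allv == v
        ranks[tie] = ranks[tie].mean()
    m, n = len(pos), len(neg)
    # Sum of case ranks, minus its minimum possible value, scaled to [0, 1].
    return (ranks[:m].sum() - m * (m + 1) / 2) / (m * n)

print(auc_rank(np.array([0.9, 0.4]), np.array([0.5, 0.3])))  # → 0.75
```

The mixed-model extension in the paper then handles missing marker values without the complete-case deletion that the standard implementations perform silently.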


Subject(s)
Algorithms , Area Under Curve , Computer Simulation , ROC Curve , Humans , Models, Statistical , Statistics, Nonparametric , Data Interpretation, Statistical , Multivariate Analysis
20.
BMC Med Res Methodol ; 24(1): 56, 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38429729

ABSTRACT

BACKGROUND: In clinical trials and epidemiological research, mixed-effects models are commonly used to examine population-level and subject-specific trajectories of biomarkers over time. Despite their increasing popularity and application, the specification of these models necessitates a great deal of care when analysing longitudinal data with non-linear patterns and asymmetry. Parametric (linear) mixed-effect models may not capture these complexities flexibly and adequately. Additionally, assuming a Gaussian distribution for random effects and/or model errors may be overly restrictive, as it lacks robustness against deviations from symmetry. METHODS: This paper presents a semiparametric mixed-effects model with flexible distributions for complex longitudinal data in the Bayesian paradigm. The non-linear time effect on the longitudinal response was modelled using a spline approach. The multivariate skew-t distribution, which is a more flexible distribution, is utilized to relax the normality assumptions associated with both random-effects and model errors. RESULTS: To assess the effectiveness of the proposed methods in various model settings, simulation studies were conducted. We then applied these models on chronic kidney disease (CKD) data and assessed the relationship between covariates and estimated glomerular filtration rate (eGFR). First, we compared the proposed semiparametric partially linear mixed-effect (SPPLM) model with the fully parametric one (FPLM), and the results indicated that the SPPLM model outperformed the FPLM model. We then further compared four different SPPLM models, each assuming different distributions for the random effects and model errors. The model with a skew-t distribution exhibited a superior fit to the CKD data compared to the Gaussian model. 
The findings from the application revealed that hypertension, diabetes, and follow-up time had a substantial association with kidney function, specifically leading to a decrease in GFR estimates. CONCLUSIONS: The application and simulation studies have demonstrated that our work has made a significant contribution towards a more robust and adaptable methodology for modeling intricate longitudinal data. We achieved this by proposing a semiparametric Bayesian modeling approach with a spline smoothing function and a skew-t distribution.


Subject(s)
Models, Statistical , Renal Insufficiency, Chronic , Humans , Bayes Theorem , Linear Models , Longitudinal Studies , Renal Insufficiency, Chronic/diagnosis