Results 1 - 20 of 244
1.
Am J Epidemiol ; 2024 May 29.
Article in English | MEDLINE | ID: mdl-38806447

ABSTRACT

Polygenic risk scores (PRS) are rapidly emerging as a way to measure disease risk by aggregating multiple genetic variants. Understanding the interplay of PRS with environmental factors is critical for interpreting and applying PRS in a wide variety of settings. We develop an efficient method for simultaneously modeling gene-environment correlations and interactions using PRS in case-control studies. We use a logistic-normal regression modeling framework to specify the disease risk and PRS distribution in the underlying population and propose joint inference across the two models using the retrospective likelihood of the case-control data. Extensive simulation studies demonstrate the flexibility of the method in trading off bias and efficiency for the estimation of various model parameters compared to the standard logistic regression or a case-only analysis for gene-environment interactions, or a control-only analysis for gene-environment correlations. Finally, using simulated case-control data sets within the UK Biobank study, we demonstrate the ability of our method to recover results from the full prospective cohort for the detection of an interaction between long-term oral contraceptive use and PRS on the risk of breast cancer. This method is computationally efficient and implemented in a user-friendly R package.
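
As a minimal sketch of what "aggregating multiple genetic variants" commonly means, a PRS is often computed as a weighted sum of risk-allele counts. The function name, allele counts, and weights below are hypothetical illustrations; they are not taken from the paper or from its R package.

```python
def polygenic_risk_score(allele_counts, weights):
    """Weighted sum of risk-allele counts (0, 1, or 2 per variant).

    Both arguments are hypothetical inputs: one allele count and one
    effect-size weight per genetic variant, in the same order.
    """
    if len(allele_counts) != len(weights):
        raise ValueError("need exactly one weight per variant")
    return sum(w * g for g, w in zip(allele_counts, weights))


# Illustrative values only: four variants for one individual.
example_counts = [0, 1, 2, 1]
example_weights = [0.10, -0.05, 0.20, 0.03]
example_prs = polygenic_risk_score(example_counts, example_weights)
```

The abstract's method then models how such a score relates to disease risk and to environmental exposures; that modeling step is not reproduced here.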

2.
Am J Epidemiol ; 2024 Apr 06.
Article in English | MEDLINE | ID: mdl-38583943

ABSTRACT

The objective of this study was to examine the impact of methodological changes to the 2018 World Cancer Research Fund/American Institute for Cancer Research (WCRF/AICR) Score on associations with risk for all-cause mortality, cancer mortality, and cancer risk jointly among older adults in the NIH-AARP Diet and Health Study. Weights were incorporated for each Score component; a continuous point scale was developed in place of the Score's fully discrete cut-points; and cut-point values were changed for physical activity and red meat based on evidence-based recommendations. Exploratory aims also examined the impact of separating components with more than one sub-component and whether all components were necessary to retain within this population utilizing a penalized scoring approach. Findings suggested that weighting the original 2018 WCRF/AICR Score improved the score's predictive performance in association with all-cause mortality and provided more precise estimates in relation to cancer risk and mortality outcomes. The importance of healthy weight, physical activity, and plant-based foods in relation to cancer and overall mortality risk was highlighted in this population of older adults. Further studies are needed to better understand the consistency and generalizability of these findings across other populations.

3.
Annu Rev Nutr ; 43: 179-197, 2023 08 21.
Article in English | MEDLINE | ID: mdl-37196365

ABSTRACT

Precise dietary assessment is critical for accurate exposure classification in nutritional research, typically aimed at understanding how diet relates to health. Dietary supplement (DS) use is widespread and represents a considerable source of nutrients. However, few studies have compared the best methods to measure DSs. Our literature review on the relative validity and reproducibility of DS instruments in the United States [e.g., product inventories, questionnaires, and 24-h dietary recalls (24HR)] identified five studies that examined validity (n = 5) and/or reproducibility (n = 4). No gold standard reference method exists for validating DS use; thus, each study's investigators chose the reference instrument used to measure validity. Self-administered questionnaires agreed well with 24HR and inventory methods when comparing the prevalence of commonly used DSs. The inventory method captured nutrient amounts more accurately than the other methods. Reproducibility (over 3 months to 2.4 years) of prevalence of use estimates on the questionnaires was acceptable for common DSs. Given the limited body of research on measurement error in DS assessment, only tentative conclusions on these DS instruments can be drawn at present. Further research is critical to advancing knowledge in DS assessment for research and monitoring purposes.


Subjects
Diet, Dietary Supplements, Humans, United States, Reproducibility of Results, Surveys and Questionnaires, Nutrients

4.
J Biomed Inform ; 150: 104595, 2024 02.
Article in English | MEDLINE | ID: mdl-38244958

ABSTRACT

OBJECTIVE: To characterize the interplay between multiple medical conditions across sites and account for the heterogeneity in patient population characteristics across sites within a distributed research network, we develop a one-shot algorithm that can efficiently utilize summary-level data from various institutions. By applying our proposed algorithm to a large pediatric cohort across four national children's hospitals, we replicated a recently published prospective cohort, the RISK study, and quantified the impact of the risk factors associated with the penetrating or stricturing behaviors of pediatric Crohn's disease (PCD). METHODS: In this study, we introduce the ODACoRH algorithm, a one-shot distributed algorithm designed for the competing risks model with heterogeneity. Our approach considers the variability in baseline hazard functions of multiple endpoints of interest across different sites. To accomplish this, we build a surrogate likelihood function by combining patient-level data from the local site with aggregated data from other external sites. We validated our method through extensive simulation studies and replication of the RISK study to investigate the impact of risk factors on the PCD for adolescents and children from four children's hospitals within the PEDSnet, a National Pediatric Learning Health System. To evaluate our ODACoRH algorithm, we compared results from the ODACoRH algorithms with those from meta-analysis as well as those derived from the pooled data. RESULTS: The ODACoRH algorithm had the smallest relative bias to the gold standard method (-0.2%), outperforming the meta-analysis method (-11.4%). In the PCD association study, the estimated subdistribution hazard ratios obtained through the ODACoRH algorithms are on par with the results derived from pooled data, which demonstrates the high reliability of our federated learning algorithms. From a clinical standpoint, the identified risk factors for PCD align well with the RISK study published in the Lancet in 2017 and other published studies, supporting the validity of our findings. CONCLUSION: With the ODACoRH algorithm, we demonstrate the capability of effectively integrating data from multiple sites in a decentralized data setting while accounting for between-site heterogeneity. Importantly, our study reveals several crucial clinical risk factors for PCD that merit further investigations.


Subjects
Algorithms, Humans, Child, Adolescent, Reproducibility of Results, Computer Simulation, Proportional Hazards Models, Likelihood Functions

5.
Lifetime Data Anal ; 2024 May 28.
Article in English | MEDLINE | ID: mdl-38806842

ABSTRACT

We consider measurement error models for two variables observed repeatedly and subject to measurement error. One variable is continuous, while the other variable is a mixture of continuous and zero measurements. This second variable has two sources of zeros. The first source is episodic zeros, wherein some of the measurements for an individual may be zero and others positive. The second source is hard zeros, i.e., some individuals will always report zero. An example is the consumption of alcohol from alcoholic beverages: some individuals consume alcoholic beverages episodically, while others never consume alcoholic beverages. However, with a small number of repeat measurements from individuals, it is not possible to determine those who are episodic zeros and those who are hard zeros. We develop a new measurement error model for this problem, and use Bayesian methods to fit it. Simulations and data analyses are used to illustrate our methods. Extensions to parametric models and survival analysis are discussed briefly.
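
To make the identifiability problem concrete, the following sketch applies Bayes' rule to a person whose repeat measurements are all zero. It rests on simplifying assumptions that are not from the paper: every episodic consumer has the same consumption probability on each occasion, and occasions are independent. The function name and the numeric inputs are hypothetical.

```python
def prob_hard_zero_given_all_zero(p_hard, p_consume, n_repeats):
    """P(never-consumer | all n_repeats measurements are zero).

    p_hard:    assumed population share of never-consumers (hard zeros)
    p_consume: assumed per-occasion consumption probability for
               episodic consumers, taken as equal for everyone
    """
    # An episodic consumer reports zero on every occasion with this probability.
    p_all_zero_if_episodic = (1.0 - p_consume) ** n_repeats
    # Never-consumers report all zeros with probability 1.
    return p_hard / (p_hard + (1.0 - p_hard) * p_all_zero_if_episodic)


# Illustrative inputs only.
posterior_two_repeats = prob_hard_zero_given_all_zero(0.2, 0.3, 2)
posterior_twenty_repeats = prob_hard_zero_given_all_zero(0.2, 0.3, 20)
```

Under these assumed inputs, two all-zero repeats leave the posterior probability of being a never-consumer well below one half, whereas twenty repeats push it close to one, which is why a small number of repeats cannot separate the two sources of zeros.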

6.
Biostatistics ; 23(4): 1218-1241, 2022 10 14.
Article in English | MEDLINE | ID: mdl-35640937

ABSTRACT

Quantile regression is a semiparametric method for modeling associations between variables. It is most helpful when the covariates have complex relationships with the location, scale, and shape of the outcome distribution. Despite the method's robustness to distributional assumptions and outliers in the outcome, regression quantiles may be biased in the presence of measurement error in the covariates. The impact of function-valued covariates contaminated with heteroscedastic error has not yet been examined, although studies have investigated the case of scalar-valued covariates. We present a two-stage strategy to consistently fit linear quantile regression models with a function-valued covariate that may be measured with error. In the first stage, an instrumental variable is used to estimate the covariance matrix associated with the measurement error. In the second stage, simulation extrapolation (SIMEX) is used to correct for measurement error in the function-valued covariate. Point-wise standard errors are estimated by means of nonparametric bootstrap. We present simulation studies to assess the robustness of the measurement error-corrected functional quantile regression. Our methods are applied to National Health and Nutrition Examination Survey data to assess the relationship between physical activity and body mass index among adults in the United States.


Subjects
Regression Analysis, Computer Simulation, Humans, Linear Models

7.
J Nutr ; 152(12): 2789-2801, 2023 01 14.
Article in English | MEDLINE | ID: mdl-35918260

ABSTRACT

BACKGROUND: Dietary supplement (DS) use is widespread in the United States and contributes large amounts of micronutrients to users. Most studies have relied on data from 1 assessment method to characterize the prevalence of DS use. Combining multiple methods enhances the ability to capture nutrient exposures from DSs and examine trends over time. OBJECTIVES: The objective of this study was to characterize DS use and examine trends in any DS as well as micronutrient-containing (MN) DS use in a nationally representative sample of the US population (≥1 y) from the 2007-2018 NHANES using a combined approach. METHODS: NHANES obtains an in-home inventory with a frequency-based dietary supplement and prescription medicine questionnaire (DSMQ), and two 24-h dietary recalls (24HRs). Trends in the prevalence of use and selected types of products used were estimated for the population and by sex, age, race/Hispanic origin, family income [poverty-to-income ratio (PIR)], and household food security (food-secure vs. food-insecure) using the DSMQ or ≥ 1 24HR. Linear trends were tested using orthogonal polynomials (significance set at P < 0.05). RESULTS: DS use increased from 50% in 2007 to 56% in 2018 (P = 0.001); use of MN products increased from 46% to 49% (P = 0.03), and single-nutrient DS (e.g., magnesium, vitamins B-12 and D) use also increased (all P < 0.001). In contrast, multivitamin-mineral use decreased (70% to 56%; P < 0.001). In adults (≥19 y), any (54% to 61%) and MN (49% to 54%) DS use increased, especially in men, non-Hispanic blacks and Hispanics, and low-income adults (PIR ≤130%). In children (1-18 y), any DS use remained stable (∼38%), as did MN use, except for food-insecure children, whose use increased from 24% to 31% over the decade (P = 0.03). CONCLUSIONS: The prevalence of any and MN DS use increased over time in the United States. This may be partially attributed to increased use of single-nutrient products. Population subgroups differed in their DS use.


Subjects
Micronutrients, Trace Elements, Male, Humans, Adult, Child, United States, Nutrition Surveys, Dietary Supplements, Diet, Vitamins

8.
Crit Rev Food Sci Nutr ; 63(12): 1722-1732, 2023.
Article in English | MEDLINE | ID: mdl-34470512

ABSTRACT

A priori dietary indices provide a standardized, reproducible way to evaluate adherence to dietary recommendations across different populations. Existing nutrient-based indices were developed to reflect food/beverage intake; however, given the high prevalence of dietary supplement (DS) use and its potentially large contribution to nutrient intakes for those that use them, exposure classification without accounting for DS is incomplete. The purpose of this article is to review existing nutrient-based indices and describe the development of the Total Nutrient Index (TNI), an index developed to capture usual intakes from all sources of under-consumed micronutrients among the U.S. population. The TNI assesses U.S. adults' total nutrient intakes relative to recommended nutrient standards for eight under-consumed micronutrients identified by the Dietary Guidelines for Americans: calcium, magnesium, potassium, choline, and vitamins A, C, D, and E. The TNI is scored from 0 to 100 (truncated at 100). The mean TNI score of U.S. adults (≥19 y; n = 9,954), based on dietary data from NHANES 2011-2014, was 75.4; the mean score for the index ignoring DS contributions was only 69.0 (t-test; p < 0.001). The TNI extends existing measures of diet quality by including nutrient intakes from all sources and was developed for research, monitoring, and policy purposes. Supplemental data for this article is available online at https://doi.org/10.1080/10408398.2021.1967872.


Subjects
Diet, Dietary Exposure, Adult, Humans, United States, Nutrition Surveys, Nutritional Requirements, Dietary Supplements, Vitamins, Micronutrients, Energy Intake

9.
Biometrics ; 79(3): 2023-2035, 2023 09.
Article in English | MEDLINE | ID: mdl-35841231

ABSTRACT

We consider analyses of case-control studies assembled from electronic health records (EHRs) where the pool of cases is contaminated by patients who are ineligible for the study. These ineligible patients, referred to as "false cases," should be excluded from the analyses if known. However, the true outcome status of a patient in the case pool is unknown except in a subset whose size may be arbitrarily small compared to the entire pool. To effectively remove the influence of the false cases on estimating odds ratio parameters defined by a working association model of the logistic form, we propose a general strategy to adaptively impute the unknown case status without requiring a correct phenotyping model to help discern the true and false case statuses. Our method estimates the target parameters as the solution to a set of unbiased estimating equations constructed using all available data. It outperforms existing methods by achieving robustness to mismodeling the relationship between the outcome status and covariates of interest, as well as improved estimation efficiency. We further show that our estimator is root-n-consistent and asymptotically normal. Through extensive simulation studies and analysis of real EHR data, we demonstrate that our method has desirable robustness to possible misspecification of both the association and phenotyping models, along with statistical efficiency superior to the competitors.


Subjects
Electronic Health Records, Statistical Models, Humans, Computer Simulation, Case-Control Studies

10.
Int J Mol Sci ; 24(3), 2023 Feb 02.
Article in English | MEDLINE | ID: mdl-36769167

ABSTRACT

Neurological dysfunction following viral infection varies among individuals, largely due to differences in their genetic backgrounds. Gait patterns, which can be evaluated using measures of coordination, balance, posture, muscle function, step-to-step variability, and other factors, are also influenced by genetic background. Accordingly, to some extent gait can be characteristic of an individual, even prior to changes in neurological function. Because neuromuscular aspects of gait are under a certain degree of genetic control, the hypothesis tested was that gait parameters could be predictive of neuromuscular dysfunction following viral infection. The Collaborative Cross (CC) mouse resource was utilized to model genetically diverse populations, and the DigiGait treadmill system was used to provide quantitative and objective measurements of 131 gait parameters in 142 mice from 23 CC and SJL/J strains. DigiGait measurements were taken prior to infection with the neurotropic virus Theiler's Murine Encephalomyelitis Virus (TMEV). Neurological phenotypes were recorded over 90 days post-infection (d.p.i.), and the cumulative frequency of the observation of these phenotypes was statistically associated with discrete baseline DigiGait measurements. These associations represented spatial and postural aspects of gait influenced by the 90 d.p.i. phenotype score. Furthermore, associations were found between these gait parameters and both sex and outcomes considered to show resistance, resilience, or susceptibility to severe neurological symptoms after long-term infection. For example, higher pre-infection measurement values for the Paw Drag parameter corresponded with greater disease severity at 90 d.p.i. Quantitative trait loci significantly associated with these DigiGait parameters revealed potential relationships between 28 differentially expressed genes (DEGs) and different aspects of gait influenced by viral infection. Thus, these potential candidate genes and genetic variations may be predictive of long-term neurological dysfunction. Overall, these findings demonstrate the predictive/prognostic value of quantitative and objective pre-infection DigiGait measurements for viral-induced neuromuscular dysfunction.


Subjects
Theilovirus, Virus Diseases, Mice, Animals, Virus Diseases/genetics, Inbred Mice, Quantitative Trait Loci, Gait

11.
Biostatistics ; 22(4): 819-835, 2021 10 13.
Article in English | MEDLINE | ID: mdl-31999331

ABSTRACT

Huntington disease is an autosomal dominant, neurodegenerative disease without clearly identified biomarkers for when motor-onset occurs. Current standards to determine motor-onset rely on a clinician's subjective judgment that a patient's extrapyramidal signs are unequivocally associated with Huntington disease. This subjectivity can lead to error, which could be overcome using an objective, data-driven metric that determines motor-onset. Recent studies of motor-sign decline (the longitudinal degeneration of motor-ability in patients) have revealed that motor-onset is closely related to an inflection point in its longitudinal trajectory. We propose a nonlinear location-shift marker model that captures this motor-sign decline and assesses how its inflection point is linked to other markers of Huntington disease progression. We propose two procedures to estimate this model and its inflection point: one is a parametric method using a nonlinear mixed effects model, and the other is a multi-stage nonparametric approach, which we developed. In an empirical study, the parametric approach was sensitive to correct specification of the mean structure of the longitudinal data. In contrast, our multi-stage nonparametric procedure consistently produced unbiased estimates regardless of the true mean structure. Applying our multi-stage nonparametric estimator to Neurobiological Predictors of Huntington Disease, a large observational study of Huntington disease, leads to earlier prediction of motor-onset compared to the clinician's subjective judgment.


Subjects
Huntington Disease, Neurodegenerative Diseases, Biomarkers, Disease Progression, Humans, Huntington Disease/diagnosis, Huntington Disease/genetics, Nonlinear Dynamics

12.
J Nutr ; 152(3): 863-871, 2022 03 03.
Article in English | MEDLINE | ID: mdl-34928350

ABSTRACT

BACKGROUND: Most dietary indices reflect foods and beverages and do not include exposures from dietary supplements (DS) that provide substantial amounts of micronutrients. A nutrient-based approach that captures total intake inclusive of DS can strengthen exposure assessment. OBJECTIVES: We examined the construct and criterion validity of the Total Nutrient Index (TNI) among US adults (≥19 years; nonpregnant or lactating). METHODS: The TNI includes 8 underconsumed micronutrients identified by the Dietary Guidelines for Americans: calcium; magnesium; potassium; choline; and vitamins A, C, D, and E. The TNI is expressed as a percentage of the RDA or Adequate Intake to compute micronutrient component scores; the mean of the component scores yields the TNI score, ranging from 0 to 100. Data from exemplary menus and the 2003-2006 (≥19 years; n = 8861) and 2011-2014 NHANES (≥19 years; n = 9954) were employed. Exemplary menus were used to determine whether the TNI yielded high scores from dietary sources (women, 31-50 years; men ≥ 70 years). TNI scores were correlated with Healthy Eating Index (HEI) 2015 overall and component scores for dairy, fruits, and vegetables; TNI component scores for vitamins A, C, D, and E were correlated with respective biomarker data. TNI scores were compared between groups with known differences in nutrient intake based on the literature. RESULTS: The TNI yielded high scores on exemplary menus (84.8-93.3/100) and was moderately correlated (r = 0.48) with the HEI-2015. Mean TNI scores were significantly different for DS users (83.5) compared with nonusers (67.1); nonsmokers (76.8) compared with smokers (70.3); and those living with food security (76.6) compared with food insecurity (69.1). Correlations of TNI vitamin component scores with available biomarkers ranged from 0.12 (α-tocopherol) to 0.36 (serum 25-hydroxyvitamin D), and were significantly higher than correlations obtained from the diet alone. CONCLUSIONS: The evaluation of validity supports that the TNI is a useful construct to assess total micronutrient exposures of underconsumed micronutrients among US adults.
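
The scoring rule described in the METHODS can be sketched as follows: each component is the intake expressed as a percentage of its standard, capped at 100 (the truncation stated in entry 8 above), and the TNI is the mean of the components. This is a simplified illustration only. The function name, the three nutrients, and all intake and standard values are hypothetical, and the usual-intake estimation used in the actual studies is not modeled.

```python
from statistics import mean


def tni_score(intakes, standards):
    """Return (overall score, per-nutrient component scores).

    Each component is intake as a percentage of its standard
    (RDA or Adequate Intake), capped at 100; the overall score is
    the unweighted mean of the components.
    """
    components = {
        nutrient: min(100.0, 100.0 * intakes[nutrient] / standards[nutrient])
        for nutrient in standards
    }
    return mean(components.values()), components


# Hypothetical standards and total intakes (food plus supplements).
example_standards = {"calcium_mg": 1000.0, "magnesium_mg": 400.0, "vitamin_c_mg": 80.0}
example_intakes = {"calcium_mg": 800.0, "magnesium_mg": 500.0, "vitamin_c_mg": 40.0}
example_score, example_components = tni_score(example_intakes, example_standards)
```

Because of the cap, an intake above the standard for one nutrient (magnesium in this example) cannot offset a shortfall in another.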


Subjects
Micronutrients, Trace Elements, Adult, Diet, Dietary Supplements, Female, Humans, Lactation, Male, Nutrients, Nutrition Surveys, United States, Vitamin A, Vitamins

13.
Biometrics ; 78(3): 894-907, 2022 09.
Article in English | MEDLINE | ID: mdl-33881782

ABSTRACT

Data with a huge size present great challenges in modeling, inferences, and computation. In handling big data, much attention has been directed to settings with "large p small n", and relatively less work has been done to address problems with p and n being both large, though data with such a feature have now become more accessible than before, where p represents the number of variables and n stands for the sample size. The big volume of data does not automatically ensure good quality of inferences because a large number of unimportant variables may be collected in the process of gathering informative variables. To carry out valid statistical analysis, it is imperative to screen out noisy variables that have no predictive value for explaining the outcome variable. In this paper, we develop a screening method for handling large-sized survival data, where the sample size n is large and the dimension p of covariates is of non-polynomial order of the sample size n, or the so-called NP-dimension. We rigorously establish theoretical results for the proposed method and conduct numerical studies to assess its performance. Our research offers multiple extensions of existing work and enlarges the scope of high-dimensional data analysis. The proposed method capitalizes on the connections among useful regression settings and offers a computationally efficient screening procedure. Our method can be applied to different situations with large-scale data including genomic data.


Subjects
Genome, Genomics, Proportional Hazards Models, Sample Size

14.
Biometrics ; 78(1): 9-23, 2022 03.
Article in English | MEDLINE | ID: mdl-33021738

ABSTRACT

The identification of valid surrogate markers of disease or disease progression has the potential to decrease the length and costs of future studies. Most available methods that assess the value of a surrogate marker ignore the fact that surrogates are often measured with error. Failing to adjust for measurement error can erroneously identify a useful surrogate marker as not useful or vice versa. We investigate and propose robust methods to correct for the effect of measurement error when evaluating a surrogate marker using multiple estimators developed for parametric and nonparametric estimates of the proportion of treatment effect explained by the surrogate marker. In addition, we quantify the attenuation bias induced by measurement error and develop inference procedures to allow for variance and confidence interval estimation. Through a simulation study, we show that our proposed estimators correct for measurement error in the surrogate marker and that our inference procedures perform well in finite samples. We illustrate these methods by examining a potential surrogate marker that is measured with error, hemoglobin A1c, using data from the Diabetes Prevention Program clinical trial.


Subjects
Statistical Models, Research Design, Bias, Biomarkers, Computer Simulation

15.
Stat Med ; 41(7): 1191-1204, 2022 03 30.
Article in English | MEDLINE | ID: mdl-34806208

ABSTRACT

We develop a generalized partially additive model to build a single semiparametric risk scoring system for physical activity across multiple populations. A score comprised of distinct and objective physical activity measures is a new concept that offers challenges due to the nonlinear relationship between physical behaviors and various health outcomes. We overcome these challenges by modeling each score component as a smooth term, an extension of generalized partially linear single-index models. We use penalized splines and propose two inferential methods, one using profile likelihood and a nonparametric bootstrap, the other using a full Bayesian model, to solve additional computational problems. Both methods exhibit similar and accurate performance in simulations. These models are applied to the National Health and Nutrition Examination Survey and quantify nonlinear and interpretable shapes of score components for all-cause mortality.


Subjects
Exercise, Statistical Models, Bayes Theorem, Humans, Linear Models, Nutrition Surveys, Risk Factors

16.
J Econom ; 230(2): 221-239, 2022 Oct.
Article in English | MEDLINE | ID: mdl-36017081

ABSTRACT

When predicting crop yield using both functional and multivariate predictors, the prediction performances benefit from the inclusion of the interactions between the two sets of predictors. We assume the interaction depends on a nonparametric, single-index structure of the multivariate predictor and reduce each functional predictor's dimension using functional principal component analysis (FPCA). Allowing the number of FPCA scores to diverge to infinity, we consider a sequence of semiparametric working models with a diverging number of predictors, which are FPCA scores with estimation errors. We show that the parametric component of the model is root-n consistent and asymptotically normal, the overall prediction error is dominated by the estimation of the nonparametric interaction function, and justify a CV-based procedure to select the tuning parameters.

17.
Adv Exp Med Biol ; 1332: 211-227, 2021.
Article in English | MEDLINE | ID: mdl-34251646

ABSTRACT

Measuring usual dietary intake in freely living humans is difficult to accomplish. As a part of our recent study, a food frequency questionnaire was completed by healthy adult men and women at days 0 and 90 of the study. Data from the food questionnaire were analyzed with a nutrient analysis program (www.Harvardsffq.date). Healthy men and women consumed protein as 19-20% and 17-19% of their total energy intakes, respectively, with animal protein representing about 75% and 70% of their total protein intakes, respectively. The intake of each nutritionally essential amino acid (EAA) exceeded that recommended for healthy adults with minimal physical activity. In all individuals, the dietary intake of leucine was the highest, followed by lysine, valine, and isoleucine in descending order, and the ingestion of amino acids that are synthesizable de novo in animal cells (AASAs) was about 20% greater than that of total EAAs. The intake of each AASA met that recommended for healthy adults with minimal physical activity. Intakes of some AASAs (alanine, arginine, aspartate, glutamate, and glycine) from a typical diet providing 90-110 g food protein/day do not meet the requirements of adults with intensive physical activity. Within the male or female group, there were no significant differences in the dietary intakes of all amino acids between days 0 and 90 of the study, and this was also true for nearly all other essential nutrients. Our findings will help to improve amino acid nutrition and health in both the general population and exercising individuals.


Subjects
Amino Acids, Diet, Adult, Eating, Energy Intake, Female, Humans, Male, Nutrients

18.
Biometrics ; 76(3): 811-820, 2020 09.
Article in English | MEDLINE | ID: mdl-31863595

ABSTRACT

In biomedical studies, testing for homogeneity between two groups, where one group is modeled by mixture models, is often of great interest. This paper considers the semiparametric exponential family mixture model proposed by Hong et al. (2017) and studies the score test for homogeneity under this model. The score test is nonregular in the sense that nuisance parameters disappear under the null hypothesis. To address this difficulty, we propose a modification of the score test, so that the resulting test enjoys the Wilks phenomenon. In finite samples, we show that with fixed nuisance parameters the score test is locally most powerful. In large samples, we establish the asymptotic power functions under two types of local alternative hypotheses. Our simulation studies illustrate that the proposed score test is powerful and computationally fast. We apply the proposed score test to UK ovarian cancer DNA methylation data for identification of differentially methylated CpG sites.


Subjects
Statistical Models, Computer Simulation

19.
Stat Med ; 39(16): 2232-2263, 2020 07 20.
Article in English | MEDLINE | ID: mdl-32246531

ABSTRACT

We continue our review of issues related to measurement error and misclassification in epidemiology. We further describe methods of adjusting for biased estimation caused by measurement error in continuous covariates, covering likelihood methods, Bayesian methods, moment reconstruction, moment-adjusted imputation, and multiple imputation. We then describe which methods can also be used with misclassification of categorical covariates. Methods of adjusting estimation of distributions of continuous variables for measurement error are then reviewed. Illustrative examples are provided throughout these sections. We provide lists of available software for implementing these methods and also provide the code for implementing our examples in the Supporting Information. Next, we present several advanced topics, including data subject to both classical and Berkson error, modeling continuous exposures with measurement error, and categorical exposures with misclassification in the same model, variable selection when some of the variables are measured with error, adjusting analyses or design for error in an outcome variable, and categorizing continuous variables measured with error. Finally, we provide some advice for the often met situations where variables are known to be measured with substantial error, but there is only an external reference standard or partial (or no) information about the type or magnitude of the error.
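
The distinction between classical and Berkson error mentioned above can be illustrated with a small simulation. In the classical model the observed value scatters around the truth (W = X + U), which attenuates a simple linear regression slope; in the Berkson model the truth scatters around the assigned value (X = W + U), which leaves the slope approximately unbiased. The sketch below is not the authors' Supporting Information code, and the sample size, true slope, and error variances are hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(42)
n, beta, sigma_u = 5000, 2.0, 1.0  # hypothetical settings

# Classical error: the observed W scatters around the true X.
x_true = [rng.gauss(0, 1) for _ in range(n)]
w_classical = [x + rng.gauss(0, sigma_u) for x in x_true]
y_classical = [beta * x + rng.gauss(0, 0.5) for x in x_true]
slope_classical = ols_slope(w_classical, y_classical)

# Berkson error: the true X scatters around the assigned W.
w_berkson = [rng.gauss(0, 1) for _ in range(n)]
x_berkson = [w + rng.gauss(0, sigma_u) for w in w_berkson]
y_berkson = [beta * x + rng.gauss(0, 0.5) for x in x_berkson]
slope_berkson = ols_slope(w_berkson, y_berkson)
```

With equal variances for X and U, theory puts the classical-error slope near half the true slope, while the Berkson-error slope stays near the true value at the cost of extra residual variance.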


Subjects
Bayes Theorem, Bias, Humans

20.
Stat Med ; 39(16): 2197-2231, 2020 07 20.
Article in English | MEDLINE | ID: mdl-32246539

ABSTRACT

Measurement error and misclassification of variables frequently occur in epidemiology and involve variables important to public health. Their presence can strongly affect the results of statistical analyses involving such variables. However, investigators commonly fail to pay attention to biases resulting from such mismeasurement. We provide, in two parts, an overview of the types of error that occur, their impacts on analytic results, and statistical methods to mitigate the biases that they cause. In this first part, we review different types of measurement error and misclassification, emphasizing the classical, linear, and Berkson models, and the concepts of nondifferential and differential error. We describe the impacts of these types of error in covariates and in outcome variables on various analyses, including estimation and testing in regression models and estimating distributions. We outline types of ancillary studies required to provide information about such errors and discuss the implications of covariate measurement error for study design. Methods for ascertaining sample size requirements are outlined, both for ancillary studies designed to provide information about measurement error and for main studies where the exposure of interest is measured with error. We describe two of the simpler methods, regression calibration and simulation extrapolation (SIMEX), that adjust for bias in regression coefficients caused by measurement error in continuous covariates, and illustrate their use through examples drawn from the Observing Protein and Energy (OPEN) dietary validation study. Finally, we review software available for implementing these methods. The second part of the article deals with more advanced topics.
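
A minimal sketch of the SIMEX idea for a simple linear regression slope follows. Extra noise with variance lambda * sigma_u^2 is added to the error-prone covariate for several values of lambda, the naive slope is re-estimated each time, and a quadratic fitted to the slope-versus-lambda curve is extrapolated back to lambda = -1, the no-error case. The code assumes classical additive error with a known standard deviation `sigma_u`; the simulated data, function names, lambda grid, and replicate count are all hypothetical, and this is not the software reviewed in the article.

```python
import random


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def fit_quadratic(xs, ys):
    """Least-squares coefficients (c0, c1, c2) of c0 + c1*x + c2*x**2."""
    # Normal equations for the basis 1, x, x^2, stored as an augmented matrix.
    m = [[sum(x ** (i + j) for x in xs) for j in range(3)]
         + [sum(y * x ** i for x, y in zip(xs, ys))] for i in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            m[r] = [a - f * c for a, c in zip(m[r], m[col])]
    # Back substitution.
    coef = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        tail = sum(m[i][j] * coef[j] for j in range(i + 1, 3))
        coef[i] = (m[i][3] - tail) / m[i][i]
    return coef


def simex_slope(w, y, sigma_u, lambdas=(0.0, 0.5, 1.0, 1.5, 2.0), n_rep=50, seed=0):
    """SIMEX estimate of the slope with quadratic extrapolation."""
    rng = random.Random(seed)
    mean_slopes = []
    for lam in lambdas:
        reps = 1 if lam == 0 else n_rep  # lambda = 0 is the naive fit itself
        slopes = []
        for _ in range(reps):
            # Simulation step: inflate the error variance by lam * sigma_u**2.
            w_lam = [wi + (lam ** 0.5) * sigma_u * rng.gauss(0, 1) for wi in w]
            slopes.append(ols_slope(w_lam, y))
        mean_slopes.append(sum(slopes) / len(slopes))
    c0, c1, c2 = fit_quadratic(list(lambdas), mean_slopes)
    # Extrapolation step: evaluate the fitted quadratic at lambda = -1.
    return c0 - c1 + c2


# Simulated example with a known true slope.
rng = random.Random(1)
n, beta, sigma_u = 2000, 1.0, 0.5
x = [rng.gauss(0, 1) for _ in range(n)]
w = [xi + sigma_u * rng.gauss(0, 1) for xi in x]  # error-prone covariate
y = [beta * xi + 0.5 * rng.gauss(0, 1) for xi in x]
naive_slope = ols_slope(w, y)
simex_corrected_slope = simex_slope(w, y, sigma_u)
```

In this setting the naive slope is attenuated toward zero, and the quadratic extrapolant typically recovers most, but not all, of the attenuation, because the true slope-versus-lambda curve is not exactly quadratic.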


Subjects
Statistical Models, Research Design, Bias, Calibration, Causality, Computer Simulation, Humans