Results 1 - 11 of 11
1.
Stat Med ; 42(16): 2729-2745, 2023 07 20.
Article in English | MEDLINE | ID: mdl-37075804

ABSTRACT

The National Alzheimer's Coordinating Center Uniform Data Set includes test results from a battery of cognitive exams. Motivated by the need to model the cognitive ability of low-performing patients, we create a composite score from ten tests and propose to model this score using a partially linear quantile regression model for longitudinal studies with non-ignorable dropouts. Quantile regression allows for modeling non-central tendencies. The partially linear model accommodates nonlinear relationships between some of the covariates and cognitive ability. The data set includes patients who leave the study before its conclusion. Ignoring such dropouts will result in biased estimates if the probability of dropout depends on the response. To handle this challenge, we propose a weighted quantile regression estimator where the weights are inversely proportional to the estimated probability a subject remains in the study. We prove that this weighted estimator is a consistent and efficient estimator of both linear and nonlinear effects.


Subjects
Cognitive Dysfunction , Humans , Linear Models , Regression Analysis , Longitudinal Studies , Probability
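
The weighting idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' partially linear estimator: it assumes an intercept-only model, and it assumes the probabilities of remaining in the study (the hypothetical argument `stay_prob`) have already been estimated elsewhere. The function names are illustrative only.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile (check) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def ipw_quantile(y, stay_prob, tau):
    """Intercept-only inverse-probability-weighted quantile estimate.

    Minimizes sum_i w_i * rho_tau(y_i - q) over q, with w_i = 1 / stay_prob_i.
    The objective is piecewise linear and convex in q, so a minimizer is
    attained at one of the observed responses; searching over them suffices.
    `stay_prob` is assumed to be given (hypothetical input).
    """
    y = np.asarray(y, dtype=float)
    w = 1.0 / np.asarray(stay_prob, dtype=float)
    losses = [np.sum(w * check_loss(y - q, tau)) for q in y]
    return y[int(np.argmin(losses))]
```

Observations from subjects who were unlikely to remain in the study receive larger weights, so they stand in for similar subjects who dropped out.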
2.
Hum Hered ; 81(2): 88-105, 2016.
Article in English | MEDLINE | ID: mdl-28076869

ABSTRACT

Technological advances have led to an explosive growth of high-throughput functional genomic data. Exploiting the correlation among different data types, it is possible to predict one functional genomic data type from other data types. Prediction tools are valuable in understanding the relationship among different functional genomic signals. They also provide a cost-efficient solution to inferring the unknown functional genomic profiles when experimental data are unavailable due to resource or technological constraints. The predicted data may be used for generating hypotheses, prioritizing targets, interpreting disease variants, facilitating data integration, quality control, and many other purposes. This article reviews various applications of prediction methods in functional genomics, discusses analytical challenges, and highlights some common and effective strategies used to develop prediction methods for functional genomic data.


Subjects
Computational Biology/methods , Genomics/methods , Chromatin/metabolism , Epigenesis, Genetic , Humans , Models, Genetic , Transcriptome/genetics
3.
J Comput Graph Stat ; 33(1): 138-151, 2024.
Article in English | MEDLINE | ID: mdl-38706715

ABSTRACT

Modern multivariate machine learning and statistical methodologies estimate parameters of interest while leveraging prior knowledge of the association between outcome variables. The methods that do allow for estimation of relationships do so typically through an error covariance matrix in multivariate regression which does not generalize to other types of models. In this article, we propose the MinPen framework to simultaneously estimate regression coefficients associated with the multivariate regression model and the relationships between outcome variables using common assumptions. The MinPen framework utilizes a novel penalty based on the minimum function to simultaneously detect and exploit relationships between responses. An iterative algorithm is proposed as a solution to the non-convex optimization. Theoretical results such as high dimensional convergence rates, model selection consistency, and a framework for post selection inference are provided. We extend the proposed MinPen framework to other exponential family loss functions, with a specific focus on multiple binomial responses. Tuning parameter selection is also addressed. Finally, simulations and two data examples are presented to show the finite sample properties of this framework. Supplemental materials providing proofs, additional simulations, code, and data sets are available online.

4.
Stat Med ; 32(28): 4967-79, 2013 Dec 10.
Article in English | MEDLINE | ID: mdl-23836597

ABSTRACT

Analysis of health care cost data is often complicated by a high level of skewness, heteroscedastic variances and the presence of missing data. Most of the existing literature on cost data analysis has focused on modeling the conditional mean. In this paper, we study a weighted quantile regression approach for estimating the conditional quantiles of health care cost data with missing covariates. The weighted quantile regression estimator is consistent, unlike the naive estimator, and asymptotically normal. Furthermore, we propose a modified BIC for variable selection in quantile regression when the covariates are missing at random. The quantile regression framework allows us to obtain a more complete picture of the effects of the covariates on the health care cost and is naturally adapted to the skewness and heterogeneity of the cost data. The method is semiparametric in the sense that it does not require specifying the likelihood function for the random error or the covariates. We investigate the weighted quantile regression procedure and the modified BIC via extensive simulations. We illustrate the application by analyzing a real data set from a health care cost study.


Subjects
Health Care Costs/statistics & numerical data , Regression Analysis , Computer Simulation , Female , Humans , Male , Pharmacists
6.
J Am Stat Assoc ; 113(523): 1243-1254, 2018.
Article in English | MEDLINE | ID: mdl-30416233

ABSTRACT

Finding the optimal treatment regime (or a series of sequential treatment regimes) based on individual characteristics has important applications in areas such as precision medicine, government policies and active labor market interventions. In the current literature, the optimal treatment regime is usually defined as the one that maximizes the average benefit in the potential population. This paper studies a general framework for estimating the quantile-optimal treatment regime, which is of importance in many real-world applications. Given a collection of treatment regimes, we consider robust estimation of the quantile-optimal treatment regime, which does not require the analyst to specify an outcome regression model. We propose an alternative formulation of the estimator as a solution of an optimization problem with an estimated nuisance parameter. This novel representation allows us to investigate the asymptotic theory of the estimated optimal treatment regime using empirical process techniques. We derive theory involving a nonstandard convergence rate and a non-normal limiting distribution. The same nonstandard convergence rate would also occur if the mean optimality criterion is applied, but this has not been studied. Thus, our results fill an important theoretical gap for a general class of policy search methods in the literature. The paper investigates both static and dynamic treatment regimes. In addition, doubly robust estimation and alternative optimality criteria, such as those based on Gini's mean difference or weighted quantiles, are investigated. Numerical simulations demonstrate the performance of the proposed estimator. A data example from a trial in HIV+ patients is used to illustrate the application.
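
As a rough illustration of searching a collection of regimes for the best outcome quantile, the sketch below uses a generic inverse-probability-weighted quantile. It is an assumption-laden example, not necessarily the estimator studied in the paper: it assumes a binary treatment, a static regime, and known or previously estimated propensities P(A = 1 | X) (the hypothetical argument `propensity`). All function names are illustrative.

```python
import numpy as np

def regime_quantile(y, a, x, regime, propensity, tau):
    """IPW estimate of the tau-th outcome quantile if everyone followed `regime`.

    y: outcomes; a: received treatments (0/1); x: covariates (one item per subject);
    regime: function mapping a covariate to a recommended treatment (0/1);
    propensity: P(A = 1 | X) per subject, assumed given (hypothetical input).
    """
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=int)
    propensity = np.asarray(propensity, dtype=float)
    d = np.asarray([regime(xi) for xi in x], dtype=int)
    follows = (a == d)  # subjects whose treatment matches the regime
    p_follow = np.where(d == 1, propensity, 1.0 - propensity)
    w = follows / p_follow  # zero weight for subjects who did not follow
    # Weighted quantile: smallest y whose normalized cumulative weight reaches tau.
    order = np.argsort(y)
    cum = np.cumsum(w[order]) / np.sum(w)
    return y[order][np.searchsorted(cum, tau)]

def best_regime(y, a, x, regimes, propensity, tau):
    """Pick the regime in `regimes` with the largest estimated tau-th quantile."""
    values = [regime_quantile(y, a, x, r, propensity, tau) for r in regimes]
    best = int(np.argmax(values))
    return regimes[best], values[best]
```

Changing `tau` changes the optimality criterion: a low `tau` favors regimes that protect the worst-off subjects, while `tau = 0.5` targets the median outcome.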

7.
Epigenetics ; 13(2): 163-172, 2018.
Article in English | MEDLINE | ID: mdl-28165855

ABSTRACT

Preterm birth (PTB) affects one in six Black babies in the United States. Epigenetics is believed to play a role in PTB; however, only a limited number of epigenetic studies of PTB have been reported, most of which have focused on cord blood DNA methylation (DNAm) and/or were conducted in white populations. Here we conducted, by far, the largest epigenome-wide DNAm analysis in 300 Black women who delivered early spontaneous preterm (sPTB, n = 150) or full-term babies (n = 150) and replicated the findings in an independent set of Black mother-newborn pairs from the Boston Birth Cohort. DNAm in maternal blood and/or cord blood was measured using the Illumina HumanMethylation450 BeadChip. We identified 45 DNAm loci in maternal blood associated with early sPTB, with a false discovery rate (FDR) <5%. Replication analyses confirmed sPTB associations for cg03915055 and cg06804705, located in the promoter regions of the CYTIP and LINC00114 genes, respectively. Both loci had comparable associations with early sPTB and early medically-indicated PTB, but attenuated associations with late sPTB. These associations could not be explained by cell composition, gestational complications, and/or nearby maternal genetic variants. Analyses in the newborns of the 110 Black women showed that cord blood methylation levels at both loci had no associations with PTB. The findings from this study underscore the role of maternal DNAm in PTB risk, and provide a set of maternal loci that may serve as biomarkers for PTB. Longitudinal studies are needed to clarify temporal relationships between maternal DNAm and PTB risk.


Subjects
Black or African American/genetics , DNA Methylation , Premature Birth/genetics , Adult , Biomarkers/blood , Female , Fetal Blood/metabolism , Genetic Loci , Genome-Wide Association Study/standards , Humans , Infant, Newborn , Infant, Premature/blood , Male , Premature Birth/blood
8.
Nat Commun ; 8(1): 1038, 2017 10 19.
Article in English | MEDLINE | ID: mdl-29051481

ABSTRACT

We evaluate the feasibility of using a biological sample's transcriptome to predict its genome-wide regulatory element activities measured by DNase I hypersensitivity (DH). We develop BIRD, Big Data Regression for predicting DH, to handle this high-dimensional problem. Applying BIRD to the Encyclopedia of DNA Elements (ENCODE) data, we found that to a large extent gene expression predicts DH, and information useful for prediction is contained in the whole transcriptome rather than limited to a regulatory element's neighboring genes. We show applications of BIRD-predicted DH in predicting transcription factor-binding sites (TFBSs), turning publicly available gene expression samples in Gene Expression Omnibus (GEO) into a regulome database, predicting differential regulatory element activities, and facilitating regulome data analyses by serving as pseudo-replicates. Besides improving our understanding of the regulome-transcriptome relationship, this study suggests that transcriptome-based prediction can provide a useful new approach for regulome mapping.


Subjects
Deoxyribonuclease I/metabolism , Genome, Human , Databases, Genetic , Gene Expression Profiling , Genomics , Humans , Transcriptome
9.
Nat Commun ; 8: 15608, 2017 06 09.
Article in English | MEDLINE | ID: mdl-28598419

ABSTRACT

Preterm birth (PTB) contributes significantly to infant mortality and morbidity with lifelong impact. Few robust genetic factors of PTB have been identified. Such 'missing heritability' may be partly due to gene × environment interactions (G × E), which are largely unexplored. Here we conduct genome-wide G × E analyses of PTB in 1,733 African-American women (698 mothers of PTB; 1,035 of term birth) from the Boston Birth Cohort. We show that maternal COL24A1 variants have a significant genome-wide interaction with maternal pre-pregnancy overweight/obesity on PTB risk, with rs11161721 ($P_{G \times E} = 1.8 \times 10^{-8}$; empirical $P_{G \times E} = 1.2 \times 10^{-8}$) as the top hit. This interaction is replicated in African-American mothers ($P_{G \times E} = 0.01$) from an independent cohort and in meta-analysis ($P_{G \times E} = 3.6 \times 10^{-9}$), but is not replicated in Caucasians. In adipose tissue, rs11161721 is significantly associated with altered COL24A1 expression. Our findings may provide new insight into the aetiology of PTB and improve our ability to predict and prevent PTB.


Subjects
Gene-Environment Interaction , Genetic Predisposition to Disease/genetics , Non-Fibrillar Collagens/genetics , Obesity/genetics , Premature Birth/genetics , Adult , Black or African American/genetics , Body Mass Index , Female , Humans , Infant, Newborn , Non-Fibrillar Collagens/biosynthesis , Polymorphism, Single Nucleotide/genetics , Pregnancy , Premature Birth/prevention & control , Risk Factors , Young Adult
10.
Article in English | MEDLINE | ID: mdl-27239531

ABSTRACT

INTRODUCTION: The Uniform Data Set (UDS) contains neuropsychological test scores and demographic information for participants at Alzheimer's disease centers across the United States funded by the National Institute on Aging. Mean regression analysis of neuropsychological tests has been proposed to detect cognitive decline, but the approach requires stringent assumptions. METHODS: We propose using quantile regression to directly model conditional percentiles of neuropsychological test scores. An online application allows users to easily implement the proposed method. RESULTS: Scores from 13 different neuropsychological tests were analyzed for 5413 cognitively normal participants in the UDS. Quantile and mean regression models were fit using age, gender, and years of education. Differences between the mean and quantile regression estimates were found on the individual measures. DISCUSSION: Quantile regression provides more robust estimates of baseline percentiles for cognitively normal adults. These estimates can then serve as standards against which to detect individual cognitive decline.

11.