Results 1-20 of 22
1.
Biometrics ; 79(3): 2010-2022, 2023 09.
Article in English | MEDLINE | ID: mdl-36377514

ABSTRACT

Clustered data frequently arise in biomedical studies, where observations, or subunits, measured within a cluster are associated. The cluster size is said to be informative if the outcome variable is associated with the number of subunits in a cluster. In most existing work, the informative cluster size issue is handled by marginal approaches based on within-cluster resampling, or cluster-weighted generalized estimating equations. Although these approaches yield consistent estimation of the marginal models, they do not allow estimation of within-cluster associations and are generally inefficient. In this paper, we propose a semiparametric joint model for clustered interval-censored event time data with informative cluster size. We use a random effect to account for the association among event times of the same cluster as well as the association between event times and the cluster size. For estimation, we propose a sieve maximum likelihood approach and devise a computationally efficient expectation-maximization algorithm for implementation. The estimators are shown to be strongly consistent, with the Euclidean components being asymptotically normal and achieving semiparametric efficiency. Extensive simulation studies are conducted to evaluate the finite-sample performance, efficiency and robustness of the proposed method. We also illustrate our method via application to a motivating periodontal disease dataset.


Subjects
Algorithms, Statistical Models, Likelihood Functions, Regression Analysis, Computer Simulation
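
As a rough illustration of the marginal approaches that the abstract above contrasts with its joint model, the sketch below weights each subunit by the inverse of its cluster size, the weighting commonly used in cluster-weighted generalized estimating equations. The function name and the toy data are hypothetical and are not taken from the paper; the paper's own sieve maximum likelihood method is not shown.

```python
from collections import defaultdict


def cluster_weighted_mean(cluster_ids, outcomes):
    """Mean outcome with each subunit weighted by 1 / (size of its cluster)."""
    sizes = defaultdict(int)
    for cid in cluster_ids:
        sizes[cid] += 1
    weights = [1.0 / sizes[cid] for cid in cluster_ids]
    total_weight = sum(weights)  # equals the number of clusters
    return sum(w * y for w, y in zip(weights, outcomes)) / total_weight


# Hypothetical toy data: cluster "a" has 1 subunit, cluster "b" has 3,
# and the larger cluster has the worse outcomes (informative cluster size).
ids = ["a", "b", "b", "b"]
y = [0, 1, 1, 1]

naive = sum(y) / len(y)                  # every subunit counts equally
weighted = cluster_weighted_mean(ids, y)  # every cluster counts equally
```

With these toy values the naive mean is 0.75, while the cluster-weighted mean is about 0.5, because the large cluster no longer dominates the estimate.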
2.
Stat Sin ; 33(2): 633-662, 2023 Apr.
Article in English | MEDLINE | ID: mdl-37197479

ABSTRACT

Recent technological advances have made it possible to measure multiple types of many features in biomedical studies. However, some data types or features may not be measured for all study subjects because of cost or other constraints. We use a latent variable model to characterize the relationships across and within data types and to infer missing values from observed data. We develop a penalized-likelihood approach for variable selection and parameter estimation and devise an efficient expectation-maximization algorithm to implement our approach. We establish the asymptotic properties of the proposed estimators when the number of features increases at a polynomial rate of the sample size. Finally, we demonstrate the usefulness of the proposed methods using extensive simulation studies and provide an application to a motivating multi-platform genomics study.

3.
Biom J ; 65(1): e2100139, 2023 01.
Article in English | MEDLINE | ID: mdl-35837982

ABSTRACT

Recent technological advances have made it possible to collect high-dimensional genomic data along with clinical data on a large number of subjects. In studies of chronic diseases such as cancer, it is of great interest to integrate clinical and genomic data to build a comprehensive understanding of the disease mechanisms. Despite extensive studies on integrative analysis, it remains an ongoing challenge to model the interaction effects between clinical and genomic variables, due to the high dimensionality of the data and heterogeneity across data types. In this paper, we propose an integrative approach that models interaction effects using a single-index varying-coefficient model, where the effects of genomic features can be modified by clinical variables. We propose a penalized approach for separate selection of main and interaction effects. Notably, the proposed methods can be applied to right-censored survival outcomes based on a Cox proportional hazards model. We demonstrate the advantages of the proposed methods through extensive simulation studies and provide applications to a motivating cancer genomic study.


Subjects
Genomics, Neoplasms, Humans, Proportional Hazards Models, Computer Simulation, Neoplasms/genetics
4.
Lifetime Data Anal ; 29(1): 87-114, 2023 01.
Article in English | MEDLINE | ID: mdl-35831702

ABSTRACT

The incubation period is a key characteristic of an infectious disease. In the outbreak of a novel infectious disease, accurate evaluation of the incubation period distribution is critical for designing effective prevention and control measures. Estimation of the incubation period distribution based on limited information from retrospective inspection of infected cases is highly challenging due to censoring and truncation. In this paper, we consider a semiparametric regression model for the incubation period and propose a sieve maximum likelihood approach for estimation based on the symptom onset time, travel history, and basic demographics of reported cases. The approach properly accounts for the pandemic growth and selection bias in data collection. We also develop an efficient computation method and establish the asymptotic properties of the proposed estimators. We demonstrate the feasibility and advantages of the proposed methods through extensive simulation studies and provide an application to a dataset on the outbreak of COVID-19.


Subjects
COVID-19, Infectious Disease Incubation Period, Humans, Likelihood Functions, Retrospective Studies, COVID-19/epidemiology, Regression Analysis, Computer Simulation
5.
Biometrics ; 78(1): 165-178, 2022 03.
Article in English | MEDLINE | ID: mdl-33140426

ABSTRACT

A flexible class of semiparametric partly linear frailty transformation models is considered for analyzing clustered interval-censored data, which arise naturally in complex diseases and dental research. This class of models features two nonparametric components, resulting in a nonparametric baseline survival function and a potential nonlinear effect of a continuous covariate. The dependence among failure times within a cluster is induced by a shared, unobserved frailty term. A sieve maximum likelihood estimation method based on piecewise linear functions is proposed. The proposed estimators of the regression, dependence, and transformation parameters are shown to be strongly consistent and asymptotically normal, whereas the estimators of the two nonparametric functions are strongly consistent with optimal rates of convergence. An extensive simulation study is conducted to study the finite-sample performance of the proposed estimators. We provide an application to a dental study for illustration.


Subjects
Frailty, Computer Simulation, Humans, Likelihood Functions, Linear Models, Statistical Models
6.
Ann Stat ; 50(1): 487-510, 2022 Feb.
Article in English | MEDLINE | ID: mdl-35813218

ABSTRACT

In long-term follow-up studies, data are often collected on repeated measures of multivariate response variables as well as on time to the occurrence of a certain event. To jointly analyze such longitudinal data and survival time, we propose a general class of semiparametric latent-class models that accommodates a heterogeneous study population with flexible dependence structures between the longitudinal and survival outcomes. We combine nonparametric maximum likelihood estimation with sieve estimation and devise an efficient EM algorithm to implement the proposed approach. We establish the asymptotic properties of the proposed estimators through novel use of modern empirical process theory, sieve estimation theory, and semiparametric efficiency theory. Finally, we demonstrate the advantages of the proposed methods through extensive simulation studies and provide an application to the Atherosclerosis Risk in Communities study.

7.
Stat Med ; 40(10): 2400-2412, 2021 05 10.
Article in English | MEDLINE | ID: mdl-33586218

ABSTRACT

This research is motivated by a periodontal disease dataset that possesses certain special features. The dataset consists of clustered current status time-to-event observations with large and varying cluster sizes, where the cluster size is associated with the disease outcome. Also, heavy censoring is present in the data even with long follow-up time, suggesting the presence of a cured subpopulation. In this paper, we propose a computationally efficient marginal approach, namely the cluster-weighted generalized estimating equation approach, to analyze the data based on a class of semiparametric transformation cure models. The parametric and nonparametric components of the model are estimated using a Bernstein-polynomial-based sieve maximum pseudo-likelihood approach. The asymptotic properties of the proposed estimators are studied. Simulation studies are conducted to evaluate the performance of the proposed estimators in scenarios with different degrees of informative clustering and within-cluster dependence. The proposed method is applied to the motivating periodontal disease data for illustration.


Subjects
Statistical Models, Cluster Analysis, Computer Simulation, Cost-Benefit Analysis, Humans, Likelihood Functions
8.
Biometrics ; 72(4): 1173-1183, 2016 12.
Article in English | MEDLINE | ID: mdl-27060984

ABSTRACT

In many classical estimation problems, the parameter space has a boundary. In most cases, the standard asymptotic properties of the estimator do not hold when some of the underlying true parameters lie on the boundary. However, without knowledge of the true parameter values, confidence intervals constructed assuming that the parameters lie in the interior are generally over-conservative. A penalized estimation method is proposed in this article to address this issue. An adaptive lasso procedure is employed to shrink the parameters to the boundary, yielding oracle inference that adapts to whether or not the true parameters are on the boundary. When the true parameters are on the boundary, the inference is equivalent to that which would be achieved with a priori knowledge of the boundary, while if the converse is true, the inference is equivalent to that which is obtained in the interior of the parameter space. The method is demonstrated under two practical scenarios, namely the frailty survival model and linear regression with order-restricted parameters. Simulation studies and real data analyses show that the method performs well with realistic sample sizes and exhibits certain advantages over standard methods.


Subjects
Linear Models, Statistical Models, Survival Analysis, Computer Simulation, Factual Databases, Humans, Likelihood Functions, Lung Neoplasms/mortality, Regression Analysis, Sample Size, United States, United States Department of Veterans Affairs
9.
Stat Methods Med Res ; 33(9): 1673-1685, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39105419

ABSTRACT

The case-cohort design is a commonly used cost-effective sampling strategy for large cohort studies, where some covariates are expensive to measure or obtain. In this paper, we consider regression analysis under a case-cohort study with interval-censored failure time data, where the failure time is only known to fall within an interval instead of being exactly observed. A common approach to analyzing data from a case-cohort study is the inverse probability weighting approach, where only subjects in the case-cohort sample are used in estimation, and the subjects are weighted based on the probability of inclusion into the case-cohort sample. This approach, though consistent, is generally inefficient as it does not incorporate information outside the case-cohort sample. To improve efficiency, we first develop a sieve maximum weighted likelihood estimator under the Cox model based on the case-cohort sample and then propose a procedure to update this estimator by using information in the full cohort. We show that the updated estimator is consistent, asymptotically normal, and at least as efficient as the original estimator. The proposed method can flexibly incorporate auxiliary variables to improve estimation efficiency. A weighted bootstrap procedure is employed for variance estimation. Simulation results indicate that the proposed method works well in practical situations. An application to a Phase 3 HIV vaccine efficacy trial is provided for illustration.


Subjects
Proportional Hazards Models, Humans, Cohort Studies, Likelihood Functions, HIV Infections/drug therapy, AIDS Vaccines/therapeutic use, Case-Control Studies, Statistical Models, Regression Analysis, Statistical Data Interpretation
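
The inverse probability weighting step described in the abstract above can be sketched as follows. The sketch assumes a simple design in which every case is included with probability 1 and non-cases enter only through a subcohort sampled with a known fraction. That design, the function name, and the toy inputs are assumptions for illustration, and neither the sieve estimator nor the update step from the paper is shown.

```python
def case_cohort_weights(is_case, in_subcohort, subcohort_fraction):
    """Inverse-probability-of-inclusion weights under an assumed simple design.

    Cases are always sampled (weight 1), non-cases in the subcohort get
    1 / subcohort_fraction, and everyone else is excluded (weight 0).
    """
    if not 0.0 < subcohort_fraction <= 1.0:
        raise ValueError("subcohort_fraction must be in (0, 1]")
    weights = []
    for case, sub in zip(is_case, in_subcohort):
        if case:
            weights.append(1.0)
        elif sub:
            weights.append(1.0 / subcohort_fraction)
        else:
            weights.append(0.0)
    return weights


# Hypothetical toy cohort: one case outside the subcohort, one sampled
# non-case, and one unsampled non-case, with a 25% subcohort fraction.
weights = case_cohort_weights(
    is_case=[True, False, False],
    in_subcohort=[False, True, False],
    subcohort_fraction=0.25,
)
```

Subjects with weight 0 contribute nothing to the weighted likelihood, which is the source of the inefficiency the abstract points out.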
10.
Stat Methods Med Res ; 33(3): 498-514, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38400526

ABSTRACT

In cancer studies, it is commonplace that a fraction of patients participating in the study are cured, such that not all of them will experience a recurrence, or death due to cancer. Also, it is plausible that some covariates, such as the treatment assigned to the patients or demographic characteristics, could affect both the patients' survival rates and cure/incidence rates. A common approach to accommodate these features in survival analysis is to consider a mixture cure survival model with the incidence rate modeled by a logistic regression model and latency part modeled by the Cox proportional hazards model. These modeling assumptions, though typical, restrict the structure of covariate effects on both the incidence and latency components. As a plausible recourse to attain flexibility, we study a class of semiparametric mixture cure models in this article, which incorporates two single-index functions for modeling the two regression components. A hybrid nonparametric maximum likelihood estimation method is proposed, where the cumulative baseline hazard function for uncured subjects is estimated nonparametrically, and the two single-index functions are estimated via Bernstein polynomials. Parameter estimation is carried out via a curated expectation-maximization algorithm. We also conducted a large-scale simulation study to assess the finite-sample performance of the estimator. The proposed methodology is illustrated via application to two cancer datasets.


Subjects
Statistical Models, Neoplasms, Humans, Incidence, Proportional Hazards Models, Survival Analysis, Computer Simulation, Algorithms, Likelihood Functions
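
For orientation, the conventional mixture cure model that the abstract above takes as its starting point (logistic incidence, Cox latency) is usually written as below. The symbols are assumed notation for illustration rather than the article's own: $\pi(z)$ is the probability of being uncured, $S_u$ is the survival function of uncured subjects, and $\Lambda_0$ is the cumulative baseline hazard. The article's class instead models the two components with single-index functions, whose exact form the abstract does not give.

```latex
\begin{align*}
S_{\mathrm{pop}}(t \mid x, z) &= 1 - \pi(z) + \pi(z)\, S_u(t \mid x), \\
\pi(z) &= \frac{\exp(\gamma^\top z)}{1 + \exp(\gamma^\top z)}, \\
S_u(t \mid x) &= \exp\!\left\{ -\Lambda_0(t)\, \exp(\beta^\top x) \right\}.
\end{align*}
```

Because $S_u(t \mid x) \to 0$ as $t \to \infty$, the population survival function levels off at the cure fraction $1 - \pi(z)$.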
11.
J Med Chem ; 67(19): 17542-17550, 2024 Oct 10.
Article in English | MEDLINE | ID: mdl-39340453

ABSTRACT

Target identification is crucial for elucidating the mechanisms of bioactive molecules in drug discovery. However, traditional methods assess compounds individually, making it challenging to efficiently examine multiple compounds in parallel, especially for structurally diverse compounds. This study reports a novel strategy called chemical genomics-facilitated chemical proteomics (CGCP) for multiplexing the target identification of bioactive small molecules. CGCP correlates compounds' perturbation of global transcription, or chemical genomic profiles, with their reactivity toward target proteins, enabling simultaneous identification of targets. We demonstrated the utility of CGCP by studying the targets of celastrol (Cel) and four other electrophilic compounds with varying levels of similarity to Cel based on their chemical genomic profiles. We identified multiple novel targets and binding sites shared by the compounds in a single experiment. CGCP enabled multiplexity and improved the efficiency of target identification for structurally distinct compounds, indicating its potential to accelerate drug discovery.


Subjects
Genomics, Pentacyclic Triterpenes, Proteomics, Proteomics/methods, Genomics/methods, Pentacyclic Triterpenes/chemistry, Humans, Drug Discovery/methods, Triterpenes/chemistry, Triterpenes/pharmacology, Small Molecule Libraries/chemistry, Small Molecule Libraries/pharmacology
12.
HGG Adv ; 5(1): 100245, 2024 Jan 11.
Article in English | MEDLINE | ID: mdl-37817410

ABSTRACT

Mendelian randomization has been widely used to assess the causal effect of a heritable exposure variable on an outcome of interest, using genetic variants as instrumental variables. In practice, data on the exposure variable can be incomplete due to the high cost of measurement and technical limits of detection. In this paper, we propose a valid and efficient method to handle both unmeasured and undetectable values of the exposure variable in one-sample Mendelian randomization analysis with individual-level data. We estimate the causal effect of the exposure variable on the outcome using maximum likelihood estimation and develop an expectation-maximization algorithm for the computation of the estimator. Simulation studies show that the proposed method performs well in making inference on the causal effect. We apply our method to the Hispanic Community Health Study/Study of Latinos, a community-based prospective cohort study, and estimate the causal effect of several metabolites on phenotypes of interest.


Subjects
Mendelian Randomization Analysis, Public Health, Humans, Mendelian Randomization Analysis/methods, Prospective Studies, Causality, Hispanic or Latino/genetics
13.
HGG Adv ; 5(4): 100338, 2024 Oct 10.
Article in English | MEDLINE | ID: mdl-39095990

ABSTRACT

Multivariable Mendelian randomization allows simultaneous estimation of direct causal effects of multiple exposure variables on an outcome. When the exposure variables of interest are quantitative omic features, obtaining complete data can be economically and technically challenging: the measurement cost is high, and the measurement devices may have inherent detection limits. In this paper, we propose a valid and efficient method to handle unmeasured and undetectable values of the exposure variables in a one-sample multivariable Mendelian randomization analysis with individual-level data. We estimate the direct causal effects with maximum likelihood estimation and develop an expectation-maximization algorithm to compute the estimators. We show the advantages of the proposed method through simulation studies and provide an application to the Hispanic Community Health Study/Study of Latinos, which has a large amount of unmeasured exposure data.


Subjects
Hispanic or Latino, Mendelian Randomization Analysis, Humans, Hispanic or Latino/genetics, Mendelian Randomization Analysis/methods, Algorithms, Likelihood Functions, Computer Simulation, Multivariate Analysis
14.
Stat Med ; 32(8): 1283-93, 2013 Apr 15.
Article in English | MEDLINE | ID: mdl-22987667

ABSTRACT

In medical research, count data often contain excess zeros, which make the standard Poisson regression model inadequate. We proposed a covariate-dependent random effect model to accommodate the excess zeros and the heterogeneity in the population simultaneously. This work is motivated by a data set from a survey on the dental health status of Hong Kong preschool children where the response variable is the number of decayed, missing, or filled teeth. The random effect has a sound biological interpretation as the overall oral health status or other personal qualities of an individual child that are unobserved and cannot be quantified easily. The overall measure of oral health status, responsible for accommodating the excess zeros and also the heterogeneity among the children, is covariate dependent. This covariate-dependent random effect model allows one to distinguish whether a potential covariate has an effect on the conceived overall oral health condition of the children, that is, the random effect, or has a direct effect on the magnitude of the counts, or both. We proposed a multiple imputation approach for estimation of the parameters. We discussed the choice of the imputation size. We evaluated the performance of the proposed estimation method through simulation studies, and we applied the model and method to the dental data.


Subjects
Algorithms, Statistical Data Interpretation, Statistical Models, Preschool Child, Computer Simulation, Hong Kong, Humans, Oral Health
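
To make the excess-zeros problem in the abstract above concrete, the sketch below compares the observed share of zeros in a count sample with the share a Poisson distribution with the same mean would imply, exp(-mean). The counts are hypothetical and are not data from the survey, and the check only motivates the issue; it is not the covariate-dependent random effect model proposed in the paper.

```python
import math


def poisson_zero_probability(mean_count):
    """P(Y = 0) for a Poisson variable with the given mean."""
    return math.exp(-mean_count)


# Hypothetical counts (e.g. affected teeth per child), with many zeros.
counts = [0, 0, 0, 0, 0, 0, 4, 5, 6, 5]

mean_count = sum(counts) / len(counts)
observed_zero_fraction = counts.count(0) / len(counts)
poisson_zero = poisson_zero_probability(mean_count)

# A large gap between the two suggests more zeros than Poisson allows.
excess_zeros = observed_zero_fraction - poisson_zero
```

In this toy sample 60% of the counts are zero, while a Poisson distribution with the same mean of 2 would put only about 14% of its mass at zero.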
15.
Biom J ; 55(5): 771-88, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23720128

ABSTRACT

There is growing interest in the analysis of survival data with a cured proportion, particularly in tumor recurrence studies. Biologically, it is reasonable to assume that the recurrence time is mainly affected by the overall health condition of the patient, which depends on covariates such as age, sex, or treatment type received. We propose a semiparametric frailty-Cox cure model to quantify the overall health condition of the patient by a covariate-dependent frailty that has a discrete mass at zero to characterize the cured patients, and a positive continuous part to characterize the heterogeneous health conditions among the uncured patients. A multiple imputation estimation method is proposed for the right-censored case, which is further extended to accommodate interval-censored data. Simulation studies show that the performance of the proposed method is highly satisfactory. For illustration, the model is fitted to a set of right-censored melanoma incidence data and a set of interval-censored breast cosmesis data. Our analysis suggests that patients receiving radiotherapy with adjuvant chemotherapy have a significantly higher probability of breast retraction, but also a lower hazard rate of breast retraction among those patients with similar health conditions who will eventually experience the event. This interpretation differs markedly from that based on models without a cure component, under which radiotherapy with adjuvant chemotherapy significantly increases the risk of breast retraction.


Subjects
Biometry/methods, Statistical Models, Neoplasms/therapy, Humans, Kaplan-Meier Estimate, Neoplasms/drug therapy, Neoplasms/radiotherapy, Survival Analysis, Treatment Outcome
16.
Stat Methods Med Res ; 32(11): 2083-2095, 2023 11.
Article in English | MEDLINE | ID: mdl-37559549

ABSTRACT

Contemporary works in change-point survival models mainly focus on an unknown universal change-point shared by the whole study population. However, in some situations, the change-point is plausibly individual-specific, such as when it corresponds to the telomere length or menopausal age. Also, maximum-likelihood-based inference for the fixed change-point parameter is notoriously complicated. The asymptotic distribution of the maximum-likelihood estimator is non-standard, and computationally intensive bootstrap techniques are commonly used to retrieve its sampling distribution. This article is motivated by a breast cancer study, where the disease-free survival time of the patients is postulated to be regulated by the menopausal age, which is unobserved. As menopausal age varies across patients, a fixed change-point survival model may be inadequate. Therefore, we propose a novel proportional hazards model with a random change-point. We develop a nonparametric maximum-likelihood estimation approach and devise a stable expectation-maximization algorithm to compute the estimators. Because the model is regular, we employ conventional likelihood theory for inference based on the asymptotic normality of the Euclidean parameter estimators, and the variance of the asymptotic distribution can be consistently estimated by a profile-likelihood approach. A simulation study demonstrates the satisfactory finite-sample performance of the proposed methods, which yield small bias and proper coverage probabilities. The methods are applied to the motivating breast cancer study.


Subjects
Breast Neoplasms, Humans, Female, Likelihood Functions, Survival Analysis, Proportional Hazards Models, Computer Simulation
17.
J Am Stat Assoc ; 114(528): 1778-1786, 2019.
Article in English | MEDLINE | ID: mdl-31920211

ABSTRACT

Analysis of genomic data is often complicated by the presence of missing values, which may arise due to cost or other reasons. The prevailing approach of single imputation is generally invalid if the imputation model is misspecified. In this paper, we propose a robust score statistic based on imputed data for testing the association between a phenotype and a genomic variable with (partially) missing values. We fit a semiparametric regression model for the genomic variable against an arbitrary function of the linear predictor in the phenotype model and impute each missing value by its estimated posterior expectation. We show that the score statistic with such imputed values is asymptotically unbiased under general missing-data mechanisms, even when the imputation model is misspecified. We develop a spline-based method to estimate the semiparametric imputation model and derive the asymptotic distribution of the corresponding score statistic with a consistent variance estimator using sieve approximation theory and empirical process theory. The proposed test is computationally feasible regardless of the number of independent variables in the imputation model. We demonstrate the advantages of the proposed method over existing methods through extensive simulation studies and provide an application to a major cancer genomics study.

18.
Genome Biol ; 20(1): 52, 2019 03 07.
Article in English | MEDLINE | ID: mdl-30845957

ABSTRACT

We propose a statistical boosting method, termed I-Boost, to integrate multiple types of high-dimensional genomics data with clinical data for predicting survival time. I-Boost provides substantially higher prediction accuracy than existing methods. By applying I-Boost to The Cancer Genome Atlas, we show that the integration of multiple genomics platforms with clinical variables improves the prediction of survival time over the use of clinical variables alone; gene expression values are typically more prognostic of survival time than other genomics data types; and gene modules/signatures are at least as prognostic as the collection of individual gene expression data.


Subjects
Gene Expression Profiling/methods, Neoplastic Gene Expression Regulation, Gene Regulatory Networks, Genomics/methods, Neoplasms/mortality, Software, Humans, Statistical Models, Neoplasms/genetics, Prognosis, Survival Rate
19.
J Am Stat Assoc ; 113(522): 893-905, 2018.
Article in English | MEDLINE | ID: mdl-30083023

ABSTRACT

Structural equation modeling is commonly used to capture complex structures of relationships among multiple variables, both latent and observed. We propose a general class of structural equation models with a semiparametric component for potentially censored survival times. We consider nonparametric maximum likelihood estimation and devise a combined Expectation-Maximization and Newton-Raphson algorithm for its implementation. We establish conditions for model identifiability and prove the consistency, asymptotic normality, and semiparametric efficiency of the estimators. Finally, we demonstrate the satisfactory performance of the proposed methods through simulation studies and provide an application to a motivating cancer study that contains a variety of genomic variables. Supplementary materials for this article are available online.

20.
PLoS One ; 9(1): e84406, 2014.
Article in English | MEDLINE | ID: mdl-24400088

ABSTRACT

OBJECTIVES: To investigate Hong Kong secondary school students' knowledge of emergency management of dental trauma. METHOD: A questionnaire survey of randomly selected secondary school students using cluster sampling. RESULTS: Only 36.6% (209/571) of the respondents were able to correctly identify the appropriate place for treatment of dental injury. 55.2% of the respondents knew the suitable time for treatment. Only 24.7% of the respondents knew how to correctly manage fractured teeth. Only 23.6% of them knew how to manage displaced teeth. 62.5% of them correctly answered that knocked-out deciduous teeth should not be replanted in the original position, but few of them (23.6%) knew that permanent teeth should be replanted. Moreover, 37.1% of the respondents correctly identified at least one of the appropriate media for storing a knocked-out tooth. Prior first-aid training and acquisition of dental injury information from other sources were significant factors associated with higher scores. CONCLUSION: Hong Kong secondary school students' knowledge of emergency management of dental trauma is considered insufficient. An educational campaign in secondary schools dedicated to students is recommended. Prior first-aid training and acquisition of dental injury information from other sources positively relate to the level of knowledge. It is recommended that dental trauma emergency management be added to first-aid publications and taught to students and health professionals. TRIAL REGISTRATION: Hong Kong Clinical Trial Centre HKCTR-1344.


Subjects
First Aid, Health Knowledge, Attitudes and Practice, Students, Tooth Injuries/epidemiology, Wounds and Injuries/epidemiology, Adolescent, Child, Dental Care, Dental Health Surveys, Female, Hong Kong/epidemiology, Humans, Male, Schools, Surveys and Questionnaires, Young Adult