Results 1 - 20 of 56
1.
Stat Med ; 2024 Jun 10.
Article in English | MEDLINE | ID: mdl-38857600

ABSTRACT

Analysis of competing risks data has been an important topic in survival analysis due to the need to account for the dependence among the competing events. Also, event times are often recorded on discrete time scales, rendering the models tailored for discrete-time nature useful in the practice of survival analysis. In this work, we focus on regression analysis with discrete-time competing risks data, and consider the errors-in-variables issue where the covariates are prone to measurement errors. Viewing the true covariate value as a parameter, we develop the conditional score methods for various discrete-time competing risks models, including the cause-specific and subdistribution hazards models that have been popular in competing risks data analysis. The proposed estimators can be implemented by efficient computation algorithms, and the associated large sample theories can be simply obtained. Simulation results show satisfactory finite sample performances, and the application with the competing risks data from the scleroderma lung study reveals the utility of the proposed methods.

2.
J Epidemiol ; 33(1): 52-61, 2023 01 05.
Article in English | MEDLINE | ID: mdl-34053962

ABSTRACT

BACKGROUND: This cohort was established to evaluate whether 38-year radiation exposure (since the start of nuclear reactor operations) is related to cancer risk in residents near three nuclear power plants (NPPs). METHODS: This cohort study enrolled all residents who lived within 8 km of any of the three NPPs in Taiwan from 1978 to 2016 (n = 214,502; person-years = 4,660,189). The control population (n = 257,475; person-years = 6,282,390) from three towns comprised all residents having lived more than 15 km from all three NPPs. Radiation exposure will be assessed via computer programs GASPAR-II and LADTAP-II by following methodologies provided in the United States Nuclear Regulatory Commission regulatory guides. We calculated the cumulative individual tissue organ equivalent dose and cumulative effective dose for each resident. This study presents the number of new cancer cases and prevalence in the residence-nearest NPP group and control group in the 38-year research observation period. CONCLUSION: TNPECS provides a valuable platform for research and opens unique possibilities for testing whether radiation exposure since the start of operations of nuclear reactors will affect health across the life course. The release of radioactive nuclear species caused by the operation of NPPs caused residents to have an effective dose between 10^-7 and 10^-3 mSv/year. The mean cumulative medical radiation exposure dose between the residence-nearest NPP group and the control group was not different (7.69; standard deviation, 18.39 mSv and 7.61; standard deviation, 19.17 mSv; P = 0.114).


Subjects
Neoplasms; Radiation Exposure; Humans; Cohort Studies; Japan; Neoplasms/epidemiology; Nuclear Power Plants; Radiation Exposure/adverse effects; Taiwan/epidemiology; United States
3.
Biom J ; 65(3): e2100361, 2023 03.
Article in English | MEDLINE | ID: mdl-36285659

ABSTRACT

Joint analysis of recurrent and nonrecurrent terminal events has attracted substantial attention in the literature. However, formal methodology for such analysis is lacking when the event time data are on discrete scales, even though some modeling and inference strategies have been developed for discrete-time survival analysis. We propose a discrete-time joint modeling approach for the analysis of recurrent and terminal events where the two types of events may be correlated with each other. The proposed joint modeling assumes a shared frailty to account for the dependence among recurrent events and between the recurrent and the terminal events. Also, the joint modeling allows for time-dependent covariates and rich families of transformation models for the recurrent and terminal events. A major advantage of our approach is that it does not assume a distribution for the frailty, nor does it assume a Poisson process for the analysis of the recurrent event. The utility of the proposed analysis is illustrated by simulation studies and two real applications, where the application to the biochemists' rank promotion data jointly analyzes the biochemists' citation numbers and times to rank promotion, and the application to the scleroderma lung study data jointly analyzes the adverse events and off-drug time among patients with the symptomatic scleroderma-related interstitial lung disease.


Subjects
Frailty; Models, Statistical; Humans; Recurrence; Computer Simulation; Survival Analysis
4.
BMC Bioinformatics ; 23(1): 202, 2022 May 30.
Article in English | MEDLINE | ID: mdl-35637439

ABSTRACT

BACKGROUND: In the context of biomedical and epidemiological research, gene-environment (G-E) interaction is of great significance to the etiology and progression of many complex diseases. In high-dimensional genetic data, two general models, marginal and joint models, are proposed to identify important interaction factors. Most existing approaches for identifying G-E interactions are limited owing to the lack of robustness to outliers/contamination in response and predictor data. In particular, right-censored survival outcomes make the associated feature screening even more challenging. In this article, we utilize the overlapping group screening (OGS) approach to select important G-E interactions related to clinical survival outcomes by incorporating the gene pathway information under a joint modeling framework. RESULTS: Simulation studies under various scenarios are carried out to compare the performances of our proposed method with some commonly used methods. In the real data applications, we use our proposed method to identify G-E interactions related to the clinical survival outcomes of patients with head and neck squamous cell carcinoma, and esophageal carcinoma in The Cancer Genome Atlas clinical survival genetic data, and further establish corresponding survival prediction models. Both simulation and real data studies show that our method performs well and outperforms existing methods in the G-E interaction selection, effect estimation, and survival prediction accuracy. CONCLUSIONS: The OGS approach is useful for selecting important environmental factors, genes and G-E interactions in the ultra-high dimensional feature space. The prediction ability of OGS with the Lasso penalty is better than existing methods. The same idea of the OGS approach can apply to other outcome models, such as the proportional odds survival time model, the logistic regression model for binary outcomes, and the multinomial logistic regression model for multi-class outcomes.


Subjects
Gene-Environment Interaction; Neoplasms; Computer Simulation; Genomics; Humans; Neoplasms/genetics; Research
5.
Bioinformatics ; 37(15): 2150-2156, 2021 Aug 09.
Article in English | MEDLINE | ID: mdl-33595070

ABSTRACT

MOTIVATION: In high-dimensional genetic/genomic data, the identification of genes related to a clinical survival trait is a challenging and important issue. In particular, right-censored survival outcomes and contaminated biomarker data make the relevant feature screening difficult. Several independence screening methods have been developed, but they fail to account for gene-gene dependency information, and may be sensitive to outlying feature data. RESULTS: We improve the inverse probability-of-censoring weighted (IPCW) Kendall's tau statistic by using Google's PageRank Markov matrix to incorporate feature dependency network information. Also, to tackle outlying feature data, the nonparanormal approach, which transforms the feature data to multivariate normal variates, is utilized in the graphical lasso procedure to estimate the network structure in feature data. Simulation studies under various scenarios show that the proposed network-adjusted weighted Kendall's tau approach leads to more accurate feature selection and survival prediction than the methods without accounting for feature dependency network information and outlying feature data. The applications on the clinical survival outcome data of diffuse large B-cell lymphoma and of The Cancer Genome Atlas lung adenocarcinoma patients demonstrate clearly the advantages of the new proposal over the alternative methods. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
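
For readers unfamiliar with the PageRank Markov matrix named in entry 5, the following is a minimal sketch of the standard Google matrix and its stationary vector obtained by power iteration. It does not reproduce how the authors combine this matrix with the IPCW Kendall's tau statistic, which the abstract does not spell out. The toy network, the damping factor of 0.85, and the function names `google_matrix` and `pagerank` are illustrative assumptions.

```python
def google_matrix(adjacency, damping=0.85):
    """Row-stochastic Google matrix built from a 0/1 adjacency matrix."""
    n = len(adjacency)
    matrix = []
    for row in adjacency:
        degree = sum(row)
        if degree == 0:
            # Isolated (dangling) node: jump uniformly to any node.
            matrix.append([1.0 / n] * n)
        else:
            matrix.append(
                [damping * a / degree + (1.0 - damping) / n for a in row]
            )
    return matrix


def pagerank(adjacency, damping=0.85, tol=1e-12, max_iter=1000):
    """Stationary distribution of the Google matrix by power iteration."""
    n = len(adjacency)
    g = google_matrix(adjacency, damping)
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        # One step of the Markov chain: new_rank = rank * G.
        new_rank = [sum(rank[i] * g[i][j] for i in range(n)) for j in range(n)]
        if sum(abs(a - b) for a, b in zip(new_rank, rank)) < tol:
            return new_rank
        rank = new_rank
    return rank


# Hypothetical 4-feature dependency network: feature 0 is linked to all others.
network = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
scores = pagerank(network)
```

In this toy network the hub feature receives the largest score and the three peripheral features share equal scores, which is the kind of network information such a matrix can contribute to a screening statistic.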

6.
Bioinformatics ; 36(9): 2763-2769, 2020 05 01.
Article in English | MEDLINE | ID: mdl-31926011

ABSTRACT

MOTIVATION: In gene expression and genome-wide association studies, the identification of interaction effects is an important and challenging issue owing to its ultrahigh-dimensional nature. In particular, contaminated data and right-censored survival outcomes make the associated feature screening even more challenging. RESULTS: In this article, we propose an inverse probability-of-censoring weighted Kendall's tau statistic to measure association of a survival trait with biomarkers, as well as a Kendall's partial correlation statistic to measure the relationship of a survival trait with an interaction variable conditional on the main effects. The Kendall's partial correlation is then used to conduct interaction screening. Simulation studies under various scenarios are performed to compare the performance of our proposal with some commonly available methods. In the real data application, we utilize our proposed method to identify epistasis associated with the clinical survival outcomes of non-small-cell lung cancer, diffuse large B-cell lymphoma and lung adenocarcinoma patients. Both simulation and real data studies demonstrate that our method performs well and outperforms existing methods in identifying main and interaction biomarkers. AVAILABILITY AND IMPLEMENTATION: R-package 'IPCWK' is available to implement this method, together with a reference manual describing how to use the 'IPCWK' package. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Carcinoma, Non-Small-Cell Lung; Lung Neoplasms; Genome-Wide Association Study; Humans; Lung Neoplasms/genetics; Phenotype
7.
Stat Med ; 39(29): 4372-4385, 2020 12 20.
Article in English | MEDLINE | ID: mdl-32871614

ABSTRACT

Survival analysis has been conventionally performed on a continuous time scale. In practice, the survival time is often recorded or handled on a discrete scale; when this is the case, the discrete-time survival analysis would provide analysis results more relevant to the actual data scale. Besides, data on time-dependent covariates in the survival analysis are usually collected through intermittent follow-ups, resulting in missing and mismeasured covariate data. In this work, we propose the sufficient discrete hazard (SDH) approach to discrete-time survival analysis with longitudinal covariates that are subject to missingness and mismeasurement. The SDH method employs the conditional score idea available for dealing with mismeasured covariates, and the penalized least squares for estimating the missing covariate value using the regression spline basis. The SDH method is developed for the single event analysis with the logistic discrete hazard model, and for the competing risks analysis with the multinomial logit model. Simulation results reveal good finite-sample performances of the proposed estimator and the associated asymptotic theory. The proposed SDH method is applied to the scleroderma lung study data, where the time to medication withdrawal and time to death were recorded discretely in months, for illustration.


Subjects
Research Design; Computer Simulation; Humans; Proportional Hazards Models; Risk Assessment; Survival Analysis
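
The logistic discrete hazard model named in entry 7 is commonly fitted on a person-period layout, with one row for each period a subject is at risk. The sketch below shows only that standard data expansion and the covariate-free empirical hazard. It does not implement the SDH method, the conditional score, or the spline step, and the record format and function names are hypothetical.

```python
def person_period(subjects):
    """Expand (id, last_period, event) records into one row per period at risk."""
    rows = []
    for subject_id, last_period, event in subjects:
        for period in range(1, last_period + 1):
            # The response is 1 only in the final period of a subject with an event.
            response = 1 if (event and period == last_period) else 0
            rows.append((subject_id, period, response))
    return rows


def empirical_hazard(rows):
    """Events divided by number at risk, per period (covariate-free hazard)."""
    at_risk, events = {}, {}
    for _, period, response in rows:
        at_risk[period] = at_risk.get(period, 0) + 1
        events[period] = events.get(period, 0) + response
    return {t: events[t] / at_risk[t] for t in sorted(at_risk)}


# Hypothetical data: (subject id, last observed month, 1 = event, 0 = censored).
subjects = [("a", 1, 1), ("b", 2, 0), ("c", 3, 1), ("d", 3, 0)]
rows = person_period(subjects)
hazard = empirical_hazard(rows)
```

A logistic regression of the response column on period indicators and covariates, fitted to these rows, gives a logistic discrete hazard model. Censored subjects simply contribute rows with response 0 up to their last observed period.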
8.
Stat Med ; 39(22): 2936-2948, 2020 09 30.
Article in English | MEDLINE | ID: mdl-32578241

ABSTRACT

In controlled trials, "treatment switching" occurs when patients in one treatment group switch to alternative treatments during the trial, and poses challenges to treatment effect evaluation owing to crossover of the treatment groups. In this work, we assume that treatment switching can occur after some disease progression event and view the progression and death events as two semicompeting risks. The proposed model consists of a copula model for the joint distribution of time-to-progression (TTP) and overall survival (OS) up to the earlier of the two events, as well as a conditional hazard model for OS subsequent to progression. The copula model facilitates assessing the marginal distributions of TTP and OS separately from the association between the two events, and, in particular, the treatment effect on OS in the absence of treatment switching. The proposed conditional hazard model for death subsequent to progression allows us to assess the treatment switching (crossover) effect on OS given occurrence of progression and covariates. Semiparametric proportional hazards models are employed in the marginal models for TTP and OS. A nonparametric maximum likelihood procedure is developed for model inference, which is verified through asymptotic theory and simulation studies. The proposed analysis is applied to a lung cancer dataset to illustrate its real utility.


Subjects
Models, Statistical; Treatment Switching; Computer Simulation; Humans; Probability; Proportional Hazards Models
9.
Biom J ; 62(5): 1164-1175, 2020 09.
Article in English | MEDLINE | ID: mdl-32022280

ABSTRACT

We propose a joint analysis of recurrent and nonrecurrent event data subject to general types of interval censoring. The proposed analysis allows for general semiparametric models, including the Box-Cox transformation and inverse Box-Cox transformation models for the recurrent and nonrecurrent events, respectively. A frailty variable is used to account for the potential dependence between the recurrent and nonrecurrent event processes, while leaving the distribution of the frailty unspecified. We apply the pseudolikelihood for interval-censored recurrent event data, usually termed as panel count data, and the sufficient likelihood for interval-censored nonrecurrent event data by conditioning on the sufficient statistic for the frailty and using the working assumption of independence over examination times. Large sample theory and a computation procedure for the proposed analysis are established. We illustrate the proposed methodology by a joint analysis of the numbers of occurrences of basal cell carcinoma over time and time to the first recurrence of squamous cell carcinoma based on a skin cancer dataset, as well as a joint analysis of the numbers of adverse events and time to premature withdrawal from study medication based on a scleroderma lung disease dataset.


Subjects
Frailty; Models, Statistical; Carcinoma, Basal Cell; Carcinoma, Squamous Cell; Chronic Disease; Frailty/diagnosis; Frailty/epidemiology; Humans; Lung Diseases; Neoplasm Recurrence, Local; Scleroderma, Localized; Skin Neoplasms
10.
BMC Bioinformatics ; 19(1): 335, 2018 Sep 21.
Article in English | MEDLINE | ID: mdl-30241463

ABSTRACT

BACKGROUND: The development of a disease is a complex process that may result from joint effects of multiple genes. In this article, we propose the overlapping group screening (OGS) approach to determining active genes and gene-gene interactions incorporating prior pathway information. The OGS method is developed to overcome the challenges in genome-wide data analysis that the number of the genes and gene-gene interactions is far greater than the sample size, and the pathways generally overlap with one another. The OGS method is further proposed for patients' survival prediction based on gene expression data. RESULTS: Simulation studies demonstrate that the performance of the OGS approach in identifying the true main and interaction effects is good and the survival prediction accuracy of OGS with the Lasso penalty is better than the ordinary Lasso method. In real data analysis, we identify several significant genes and/or epistasis interactions that are associated with clinical survival outcomes of diffuse large B-cell lymphoma (DLBCL) and non-small-cell lung cancer (NSCLC) by utilizing prior pathway information from the KEGG pathway and the GO biological process databases, respectively. CONCLUSIONS: The OGS approach is useful for selecting important genes and epistasis interactions in the ultra-high dimensional feature space. The prediction ability of OGS with the Lasso penalty is better than existing methods. The OGS approach is generally applicable to various types of outcome data (quantitative, qualitative, censored event time data) and regression models (e.g. linear, logistic, and Cox's regression models).


Subjects
Carcinoma, Non-Small-Cell Lung/mortality; Epistasis, Genetic; Genetic Loci; Lung Neoplasms/mortality; Lymphoma, Large B-Cell, Diffuse/mortality; Transcriptome; Algorithms; Carcinoma, Non-Small-Cell Lung/genetics; Computer Simulation; Databases, Factual; Gene Expression Profiling; Humans; Lung Neoplasms/genetics; Lymphoma, Large B-Cell, Diffuse/genetics; Predictive Value of Tests; Survival Rate
11.
PLoS Comput Biol ; 13(6): e1005601, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28622336

ABSTRACT

Approaches to identify significant pathways from high-throughput quantitative data have been developed in recent years. Still, the analysis of proteomic data remains difficult because of limited sample size. This limitation also leads to the practice of using a competitive null as a common approach, which fundamentally implies genes or proteins as independent units. The independence assumption ignores the associations among biomolecules with similar functions or cellular localization, as well as the interactions among them manifested as changes in expression ratios. Consequently, these methods often underestimate the associations among biomolecules and cause false positives in practice. Some studies incorporate the sample covariance matrix into the calculation to address this issue. However, sample covariance may not be a precise estimation if the sample size is very limited, which is usually the case for the data produced by mass spectrometry. In this study, we introduce a multivariate test under a self-contained null to perform pathway analysis for quantitative proteomic data. The covariance matrix used in the test statistic is constructed from the confidence scores retrieved from the STRING database or the HitPredict database. We also design an integrating procedure to retain pathways of sufficient evidence as a pathway group. The performance of the proposed T2-statistic is demonstrated using five published experimental datasets: the T-cell activation, the cAMP/PKA signaling, the myoblast differentiation, and the effect of dasatinib on the BCR-ABL pathway are proteomic datasets produced by mass spectrometry; and the protective effect of myocilin via the MAPK signaling pathway is a gene expression dataset of limited sample size. Compared with other popular statistics, the proposed T2-statistic yields more accurate descriptions in agreement with the discussion of the original publication. We implemented the T2-statistic into an R package T2GA, which is available at https://github.com/roqe/T2GA.


Subjects
Data Interpretation, Statistical; Knowledge Bases; Models, Biological; Models, Statistical; Proteome/metabolism; Signal Transduction/physiology; Algorithms; Computer Simulation; Proteomics/methods
12.
Biometrics ; 74(4): 1223-1231, 2018 12.
Article in English | MEDLINE | ID: mdl-29665618

ABSTRACT

We develop a joint analysis approach for recurrent and nonrecurrent event processes subject to case I interval censorship, which are also known in the literature as current count and current status data, respectively. We use a shared frailty to link the recurrent and nonrecurrent event processes, while leaving the distribution of the frailty fully unspecified. Conditional on the frailty, the recurrent event is assumed to follow a nonhomogeneous Poisson process, and the mean function of the recurrent event and the survival function of the nonrecurrent event are assumed to follow some general form of semiparametric transformation models. Estimation of the models is based on the pseudo-likelihood and the conditional score techniques. The resulting estimators for the regression parameters and the unspecified baseline functions are shown to be consistent with rates of square and cubic roots of the sample size, respectively. Asymptotic normality with closed-form asymptotic variance is derived for the estimator of the regression parameters. We apply the proposed method to fracture-osteoporosis survey data to identify risk factors jointly for fracture and osteoporosis in elders, while accounting for association between the two events within a subject.


Subjects
Biometry/methods; Data Interpretation, Statistical; Frail Elderly; Likelihood Functions; Poisson Distribution; Aged; Aged, 80 and over; Computer Simulation; Fractures, Bone; Frailty; Humans; Osteoporosis; Recurrence; Risk Factors; Sample Size
13.
Biometrics ; 74(3): 934-943, 2018 09.
Article in English | MEDLINE | ID: mdl-29534287

ABSTRACT

We propose a model selection criterion for semiparametric marginal mean regression based on generalized estimating equations. The work is motivated by a longitudinal study on the physical frailty outcome in the elderly, where the cluster size, that is, the number of the observed outcomes in each subject, is "informative" in the sense that it is related to the frailty outcome itself. The new proposal, called Resampling Cluster Information Criterion (RCIC), is based on the resampling idea utilized in the within-cluster resampling method (Hoffman, Sen, and Weinberg, 2001, Biometrika 88, 1121-1134) and accommodates informative cluster size. The implementation of RCIC, however, is free of performing actual resampling of the data and hence is computationally convenient. Compared with the existing model selection methods for marginal mean regression, the RCIC method incorporates an additional component accounting for variability of the model over within-cluster subsampling, and leads to remarkable improvements in selecting the correct model, regardless of whether the cluster size is informative or not. Applying the RCIC method to the longitudinal frailty study, we identify being female, old age, low income and life satisfaction, and chronic health conditions as significant risk factors for physical frailty in the elderly.


Subjects
Cluster Analysis; Models, Statistical; Regression Analysis; Aged; Frailty; Humans; Longitudinal Studies; Risk Factors; Sample Size
14.
Biom J ; 60(1): 20-33, 2018 01.
Article in English | MEDLINE | ID: mdl-28910499

ABSTRACT

This work develops a joint model selection criterion for simultaneously selecting the marginal mean regression and the correlation/covariance structure in longitudinal data analysis where both the outcome and the covariate variables may be subject to general intermittent patterns of missingness under the missing at random mechanism. The new proposal, termed "joint longitudinal information criterion" (JLIC), is based on the expected quadratic error for assessing model adequacy, and the second-order weighted generalized estimating equation (WGEE) estimation for mean and covariance models. Simulation results reveal that JLIC outperforms existing methods performing model selection for the mean regression and the correlation structure in a two stage and hence separate manner. We apply the proposal to a longitudinal study to identify factors associated with life satisfaction in the elderly of Taiwan.


Subjects
Biometry/methods; Humans; Longitudinal Studies; Models, Theoretical; Multivariate Analysis; Regression Analysis
15.
Stat Med ; 36(21): 3380-3397, 2017 Sep 20.
Article in English | MEDLINE | ID: mdl-28574584

ABSTRACT

Childhood and adolescenthood overweight or obesity, which may be quantified through the body mass index (BMI), is strongly associated with adult obesity and other health problems. Motivated by the child and adolescent behaviors in long-term evolution (CABLE) study, we are interested in individual, family, and school factors associated with marginal quantiles of longitudinal adolescent BMI values. We propose a new method for composite marginal quantile regression analysis for longitudinal outcome data, which performs marginal quantile regressions at multiple quantile levels simultaneously. The proposed method extends the quantile regression coefficient modeling method introduced by Frumento and Bottai (Biometrics 2016; 72:74-84) to longitudinal data accounting suitably for the correlation structure in longitudinal observations. A goodness-of-fit test for the proposed modeling is also developed. Simulation results show that the proposed method can be much more efficient than the analysis without taking correlation into account and the analysis performing separate quantile regressions at different quantile levels. The application to the longitudinal adolescent BMI data from the CABLE study demonstrates the practical utility of our proposal. Copyright © 2017 John Wiley & Sons, Ltd.


Subjects
Body Mass Index; Epidemiologic Methods; Longitudinal Studies; Regression Analysis; Adolescent; Computer Simulation; Humans; Obesity/epidemiology; Parents; Risk Assessment/methods; Risk Factors; Schools; Students
16.
Lifetime Data Anal ; 23(1): 83-101, 2017 01.
Article in English | MEDLINE | ID: mdl-27325570

ABSTRACT

We consider ordered bivariate gap time while data on the first gap time are unobservable. This study is motivated by the HIV infection and AIDS study, where the initial HIV contracting time is unavailable, but the diagnosis times for HIV and AIDS are available. We are interested in studying the risk factors for the gap time between initial HIV contraction and HIV diagnosis, and gap time between HIV and AIDS diagnoses. Besides, the association between the two gap times is also of interest. Accordingly, in the data analysis we are faced with two-fold complexity, namely data on the first gap time is completely missing, and the second gap time is subject to induced informative censoring due to dependence between the two gap times. We propose a modeling framework for regression analysis of bivariate gap time under the complexity of the data. The estimating equations for the covariate effects on, as well as the association between, the two gap times are derived through maximum likelihood and suitable counting processes. Large sample properties of the resulting estimators are developed by martingale theory. Simulations are performed to examine the performance of the proposed analysis procedure. An application of data from the HIV and AIDS study mentioned above is reported for illustration.


Subjects
HIV Infections; Models, Statistical; Regression Analysis; Humans; Probability; Time Factors
17.
Biostatistics ; 16(4): 740-53, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26012353

ABSTRACT

Missing observations and covariate measurement error commonly arise in longitudinal data. However, existing methods for model selection in marginal regression analysis of longitudinal data fail to address the potential bias resulting from these issues. To tackle this problem, we propose a new model selection criterion, the Generalized Longitudinal Information Criterion, which is based on an approximately unbiased estimator for the expected quadratic error of a considered marginal model accounting for both data missingness and covariate measurement error. The simulation results reveal that the proposed method performs quite well in the presence of missing data and covariate measurement error. On the contrary, the naive procedures without taking care of such complexity in data may perform quite poorly. The proposed method is applied to data from the Taiwan Longitudinal Study on Aging to assess the relationship of depression with health and social status in the elderly, accommodating measurement error in the covariate as well as missing observations.


Subjects
Data Interpretation, Statistical; Models, Statistical; Regression Analysis; Research Design; Aging; Humans; Longitudinal Studies
18.
Stat Med ; 35(2): 202-13, 2016 Jan 30.
Article in English | MEDLINE | ID: mdl-26250879

ABSTRACT

The receiver operating characteristic (ROC) curve and the area under the ROC curve have been popularly employed in evaluating the diagnostic accuracy for diseases with binary outcome categories and have been naturally used as the utility measures for finding the 'optimal' linear combination of multiple biomarkers, in the hope of improving the diagnostic accuracy based on each single biomarker. For diseases with more than two outcome categories, the ROC manifold and the hypervolume under the ROC manifold (HUM) have been analogously proposed as diagnostic accuracy measures. However, finding optimal combinations of biomarkers based on the HUM criterion is computationally less feasible, especially when the number of disease categories is more than three and the number of biomarkers is large. In this study, we propose two new indices for evaluating the diagnostic accuracy for multi-category diagnosis, which are related to the lower and upper bounds of HUM, and involve only diagnostic accuracies for comparing adjacent pairs of outcome categories. We then propose finding the optimal linear combinations of biomarkers for multi-category diagnosis using the new indices as the criterion functions. Simulations and real data examples show that the optimal linear combinations identified by the new proposal perform quite well in diagnostic accuracy and can be much more efficient in computation than the HUM-based method.


Subjects
Biomarkers/analysis; Diagnostic Techniques and Procedures/statistics & numerical data; ROC Curve; Alzheimer Disease/diagnosis; Biostatistics/methods; Computer Simulation; Heart Diseases/diagnosis; Humans; Linear Models; Multivariate Analysis; Statistics, Nonparametric
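
For the binary case that entry 18 starts from, the area under the ROC curve of a linear combination of biomarkers can be estimated as the share of correctly ordered case-control pairs (the Mann-Whitney form). The sketch below illustrates only that building block. It does not implement the HUM or the two new indices proposed in the article, and the data, weights, and function names are hypothetical.

```python
def empirical_auc(cases, controls):
    """Share of case-control pairs ranked correctly; ties count one half."""
    total = 0.0
    for case in cases:
        for control in controls:
            if case > control:
                total += 1.0
            elif case == control:
                total += 0.5
    return total / (len(cases) * len(controls))


def combine(markers, weights):
    """Linear combination score for each subject's biomarker vector."""
    return [sum(w * m for w, m in zip(weights, subject)) for subject in markers]


# Hypothetical two-biomarker data for diseased and healthy subjects.
diseased = [(2.0, 1.0), (1.0, 3.0), (3.0, 2.0)]
healthy = [(1.0, 1.0), (2.0, 0.0), (0.0, 2.0)]

# AUC using the first biomarker alone versus an equal-weight combination.
auc_first_marker = empirical_auc(combine(diseased, (1, 0)), combine(healthy, (1, 0)))
auc_combined = empirical_auc(combine(diseased, (1, 1)), combine(healthy, (1, 1)))
```

Searching over the weights for the largest empirical AUC is the binary analogue of the combination problem described above. With more than two categories, restricting attention to adjacent pairs of categories keeps each evaluation a pairwise comparison of this kind.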
19.
Am J Epidemiol ; 180(3): 308-17, 2014 Aug 01.
Article in English | MEDLINE | ID: mdl-24966224

ABSTRACT

Bias caused by missing or incomplete information on confounding factors constitutes an important challenge in observational studies. The incorporation of external data on more detailed confounding information into the main study data may help remove confounding bias. This work was motivated by a study of the association between chronic obstructive pulmonary disease and herpes zoster. Analyses were based on administrative databases in which information on important confounders (cigarette smoking and alcohol consumption) was lacking. We consider adjusting for the confounding bias arising from missing confounders by incorporating a validation sample with data on smoking and alcohol consumption obtained from a small-scale National Health Interview Survey study. We propose a 2-stage calibration (TSC) method, which summarizes the confounding information through propensity scores and combines the analysis results from the main and the validation study samples, where the propensity score adjustment from the main sample is crude and that from the validation sample is more precise. Unlike the existing methods, the validity of the TSC approach does not rely on any specific measurement error model. When applying the TSC method to the motivating study above, the odds ratio of herpes zoster associated with chronic obstructive pulmonary disease is 1.91 (95% confidence interval: 1.62, 2.26) after adjustment for cumulative smoking and alcohol consumption.


Subjects
Confounding Factors, Epidemiologic; Herpes Zoster/complications; Propensity Score; Pulmonary Disease, Chronic Obstructive/complications; Alcohol Drinking; Calibration; Comorbidity; Databases, Factual; Humans; Observational Studies as Topic; Odds Ratio; Research Design; Smoking; Validation Studies as Topic
20.
Ann Hum Genet ; 78(4): 299-305, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24766627

ABSTRACT

One of the greatest challenges in genetic studies is the determination of gene-environment interactions due to underlying complications and inadequate statistical power. With the increased sample size gained by using case-parent trios and unrelated cases and controls, the performance may be much improved. Focusing on a dichotomous trait, a two-stage approach was previously proposed to deal with gene-environment interaction when utilizing mixed study samples. Theoretically, the two-stage association analysis uses likelihood functions such that the computational algorithms may not converge in the maximum likelihood estimation with small study samples. In an effort to avoid such convergence issues, we propose a logistic regression framework model, based on the combined haplotype relative risk (CHRR) method, which intuitively pools the case-parent trios and unrelated subjects in a two by two table. A positive feature of the logistic regression model is the effortless adjustment for either discrete or continuous covariates. According to computer simulations, under the circumstances in which the two-stage test converges in larger sample sizes, we discovered that the performances of the two tests were quite similar; the two-stage test is more powerful under the dominant and additive disease models, but the extended CHRR is more powerful under the recessive disease model.


Subjects
Gene-Environment Interaction; Logistic Models; Algorithms; Alleles; Case-Control Studies; Computer Simulation; Gene Frequency; Humans; Models, Genetic; Models, Statistical