Abstract
Recent technological advances have made it possible to collect high-dimensional genomic data along with clinical data on a large number of subjects. In studies of chronic diseases such as cancer, it is of great interest to integrate clinical and genomic data to build a comprehensive understanding of disease mechanisms. Despite extensive work on integrative analysis, modeling the interaction effects between clinical and genomic variables remains a challenge because of the high dimensionality of the data and the heterogeneity across data types. In this paper, we propose an integrative approach that models interaction effects using a single-index varying-coefficient model, in which the effects of genomic features can be modified by clinical variables. We propose a penalized approach for separate selection of main and interaction effects. Notably, the proposed methods can be applied to right-censored survival outcomes based on a Cox proportional hazards model. We demonstrate the advantages of the proposed methods through extensive simulation studies and provide applications to a motivating cancer genomic study.
Subjects
Genomics , Neoplasms , Humans , Proportional Hazards Models , Computer Simulation , Neoplasms/genetics
Abstract
Estimating optimal individualized treatment rules (ITRs) in single- or multi-stage clinical trials is a key step toward personalized medicine and has received increasing attention in the statistical community. Recent developments suggest that machine learning approaches can significantly improve estimation over model-based methods. However, proper inference for the estimated ITRs has not been well established for machine learning based approaches. In this paper, we propose an entropy learning approach to estimate the optimal ITRs. We obtain the asymptotic distributions of the estimated rules and thereby provide valid inference. The proposed approach is shown to perform well in finite samples through extensive simulation studies. Finally, we analyze data from a multi-stage clinical trial for patients with depression. Our results offer novel findings that are not revealed by existing approaches.
Abstract
High-throughput profiling is now common in biomedical research. In this paper we consider an etiology study with a failure time response and gene expression measurements. In current practice, a widely adopted approach is to select genes by a preliminary marginal screening followed by penalized regression for model building. Confounders, such as clinical risk factors and environmental exposures, usually exist and need to be properly accounted for. We propose covariate-adjusted screening and variable selection procedures under the accelerated failure time model. While penalizing the high-dimensional coefficients to achieve parsimonious model forms, our procedure also properly adjusts for the low-dimensional confounder effects to achieve more accurate estimation of the regression coefficients. We establish the asymptotic properties of the proposed methods and carry out simulation studies to assess their finite-sample performance. Our methods are illustrated with a real gene expression data analysis in which proper adjustment for confounders produces more meaningful results.
Subjects
High-Throughput Nucleotide Sequencing , Survival Analysis , Gene Expression Profiling , Humans
Abstract
Comparisons of cognitive processing in monolinguals and bilinguals have revealed a bilingual advantage in inhibitory control. Recent studies have demonstrated advantages associated with exposure to two languages in infancy. However, the domain specificity and scope of the infant bilingual advantage remain unclear. In the present study, 114 monolingual and bilingual infants were compared on a very basic information-processing task, visual habituation, at 6 months of age. Bilingual infants demonstrated greater efficiency in stimulus encoding as well as improved recognition memory for familiar stimuli compared with monolinguals. The findings reveal a generalized cognitive advantage in bilingual infants that is broad in scope, early to emerge, and not specific to language.
Subjects
Child Development/physiology , Psychophysiologic Habituation/physiology , Multilingualism , Psychological Recognition/physiology , Visual Perception/physiology , Female , Humans , Infant , Male
Abstract
In this paper, we extend the definitions of the net reclassification improvement (NRI) and the integrated discrimination improvement (IDI) to the context of multicategory classification. Both measures were proposed in Pencina and others (2008. Evaluating the added predictive ability of a new marker: from area under the receiver operating characteristic (ROC) curve to reclassification and beyond. Statistics in Medicine 27, 157-172) as numeric characterizations of accuracy improvement for binary diagnostic tests and were shown to have certain advantages over analyses based on ROC curves or other regression approaches. Estimation and inference procedures for the multiclass NRI and IDI are provided in this paper, along with the necessary asymptotic distributional results. Simulations are conducted to study the finite-sample properties of the proposed estimators. Two medical examples are considered to illustrate our methodology.
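For orientation, the binary NRI and IDI that this abstract takes as its starting point can be sketched as follows. This is a minimal illustration of the standard two-class definitions only, not the multicategory extension or the inference procedures proposed in the paper. The function names, the single risk cut-off of 0.5, and the toy data are all assumptions made for the example.

```python
from statistics import mean


def risk_category(p, cutoffs):
    """Index of the risk category that predicted probability p falls into."""
    return sum(p >= c for c in cutoffs)


def nri(y, p_old, p_new, cutoffs):
    """Category-based NRI for a binary outcome (1 = event, 0 = non-event)."""
    up_e = down_e = up_n = down_n = 0
    n_event = sum(y)
    n_nonevent = len(y) - n_event
    for yi, po, pn in zip(y, p_old, p_new):
        move = risk_category(pn, cutoffs) - risk_category(po, cutoffs)
        if yi == 1:
            up_e += move > 0      # event moved to a higher category: good
            down_e += move < 0    # event moved to a lower category: bad
        else:
            up_n += move > 0      # non-event moved up: bad
            down_n += move < 0    # non-event moved down: good
    return (up_e - down_e) / n_event + (down_n - up_n) / n_nonevent


def idi(y, p_old, p_new):
    """IDI: gain in mean predicted risk among events minus that among non-events."""
    ev = [i for i, yi in enumerate(y) if yi == 1]
    ne = [i for i, yi in enumerate(y) if yi == 0]
    gain_events = mean(p_new[i] for i in ev) - mean(p_old[i] for i in ev)
    gain_nonevents = mean(p_new[i] for i in ne) - mean(p_old[i] for i in ne)
    return gain_events - gain_nonevents


# Hypothetical toy data: outcomes and predicted risks from an old and a new model.
y = [1, 1, 1, 0, 0, 0]
p_old = [0.4, 0.6, 0.3, 0.4, 0.6, 0.2]
p_new = [0.6, 0.7, 0.4, 0.3, 0.4, 0.1]
cutoffs = [0.5]  # assumed single threshold separating low from high risk

print("NRI:", nri(y, p_old, p_new, cutoffs))
print("IDI:", idi(y, p_old, p_new))
```

Both quantities are positive when the new model moves events toward higher predicted risk and non-events toward lower predicted risk.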
Subjects
Routine Diagnostic Tests/statistics & numerical data , Statistical Models , Area Under Curve , Biomarkers/analysis , Biostatistics , Computer Simulation , Gene Expression , Humans , Leukemia/classification , Leukemia/genetics , Logistic Models , ROC Curve , Synovitis/diagnosis
Abstract
Background: Fournier's gangrene (FG) is a rare, life-threatening form of necrotizing fasciitis. The risk factors for septic shock in patients with FG are unclear. This study aimed to identify potential risk factors and develop a prediction model for septic shock in patients with FG. Methods: This retrospective cohort study included patients who were treated for FG between May 2013 and May 2020 at the Sixth Affiliated Hospital, Sun Yat-sen University (Guangzhou, China). The patients were divided into a septic shock group and a non-septic shock group. An L1-penalized logistic regression model was used to detect the main effects of important factors, and a penalized quadratic discriminant analysis method was used to identify possible interaction effects between factors. The selected main factors and interactions were used to obtain a logistic regression model based on the Bayesian information criterion. Results: A total of 113 patients with FG were enrolled and allocated to the septic shock group (n = 24) or the non-septic shock group (n = 89). The best model, identified by backward logistic regression based on the Bayesian information criterion, selected temperature, platelets, total bilirubin (TBIL) level, and pneumatosis on pelvic computed tomography/magnetic resonance images as the main linear effects and Na+ × TBIL as the interaction effect. The area under the ROC curve for the model-predicted probability of septic shock in FG was 0.84 (95% confidence interval, 0.78-0.95). Harrell's concordance index for the nomogram was 0.864 (95% confidence interval, 0.78-0.95). Conclusion: We have developed a prediction model for evaluating the risk of septic shock in patients with FG that could help clinicians identify critically ill patients with FG and prevent them from reaching a crisis state.
Abstract
BACKGROUND: Older adults have been reported to be at high risk of death in the COVID-19 outbreak. Rapid detection of high-risk patients is crucial to reduce mortality in this population. The aim of this study was to evaluate the prognostic accuracy of the Modified Early Warning Score (MEWS) for in-hospital mortality in older adults with COVID-19. METHODS: A retrospective cohort study was conducted in Wuhan Hankou Hospital in China from 1 January 2020 to 29 February 2020. Receiver operating characteristic (ROC) analysis was used to evaluate the predictive value of MEWS, Acute Physiology and Chronic Health Evaluation II (APACHE II), Sequential Organ Failure Assessment (SOFA), quick Sequential Organ Failure Assessment (qSOFA), Pneumonia Severity Index (PSI), Combination of Confusion, Urea, Respiratory Rate, Blood Pressure, and Age ≥65 (CURB-65), and the Systemic Inflammatory Response Syndrome Criteria (SIRS) for in-hospital mortality. Logistic regression models were fitted to identify high-risk older adults with COVID-19. RESULTS: Among the 235 patients included in this study, 37 (15.74%) died and 131 (55.74%) were male, with an average age of 70.61 years (SD 8.02). ROC analysis suggested that the capacity of MEWS to predict in-hospital mortality was as good as that of APACHE II, SOFA, PSI and qSOFA (Difference in AUROC: MEWS vs. APACHE II, -0.025 (95% CI [-0.075 to 0.026]); MEWS vs. SOFA, -0.013 (95% CI [-0.049 to 0.024]); MEWS vs. PSI, -0.015 (95% CI [-0.065 to 0.035]); MEWS vs. qSOFA, 0.024 (95% CI [-0.029 to 0.076]), all P > 0.05), but was significantly higher than that of SIRS and CURB-65 (Difference in AUROC: MEWS vs. SIRS, 0.218 (95% CI [0.156-0.279]); MEWS vs. CURB-65, 0.064 (95% CI [0.002-0.125]), all P < 0.05). Logistic regression models implied that male patients aged ≥75 years had a higher risk of death than the other older adults (estimated coefficient: 1.16, P = 0.044). Our analysis further suggests that the cut-off points of the MEWS score for the subpopulation of male patients aged ≥75 years and for the other elderly patients should be 2.5 and 3.5, respectively. CONCLUSIONS: MEWS is an efficient tool for rapid assessment of elderly COVID-19 patients. MEWS has promising performance in predicting in-hospital mortality and identifying the high-risk group among elderly patients with COVID-19.
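The ROC quantities reported above can be illustrated with a small sketch. It computes the AUROC of an integer score as a Mann-Whitney probability and picks a cut-off by maximizing the Youden index. The abstract does not state how its cut-off points were chosen, so the Youden criterion here is an assumption, as are the function names and the toy data. Taking candidate cut-offs at midpoints between adjacent observed scores is one way half-integer values such as 2.5 or 3.5 can arise for an integer-valued score.

```python
def auroc(scores, died):
    """AUROC as the probability that a death outranks a survivor (ties count 1/2)."""
    pos = [s for s, d in zip(scores, died) if d == 1]
    neg = [s for s, d in zip(scores, died) if d == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores, died):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A patient is classified as high risk when score > cutoff.
    """
    values = sorted(set(scores))
    # Candidate cut-offs are midpoints between adjacent observed score values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    n_pos = sum(died)
    n_neg = len(died) - n_pos
    best = None
    for c in candidates:
        tp = sum(1 for s, d in zip(scores, died) if d == 1 and s > c)
        tn = sum(1 for s, d in zip(scores, died) if d == 0 and s <= c)
        j = tp / n_pos + tn / n_neg - 1
        if best is None or j > best[1]:
            best = (c, j)
    return best


# Hypothetical toy data: an integer warning score and in-hospital death (1 = died).
scores = [1, 2, 2, 3, 4, 5, 3, 6]
died = [0, 0, 0, 1, 0, 1, 0, 1]

print("AUROC:", auroc(scores, died))
print("Youden cut-off and J:", youden_cutoff(scores, died))
```

Comparing two scores on the same patients, as in the abstract, would additionally require an interval for the difference in AUROC (for example by bootstrap or DeLong's method), which this sketch does not attempt.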