Pesquisa | BVS Educação Profissional em Saúde

1.

Predicting the onset of breast cancer using mammogram imaging data with irregular boundary.

Jiang, Shu; Cao, Jiguo; Colditz, Graham A; Rosner, Bernard.

Biostatistics ; 24(2): 358-371, 2023 04 14.

Artigo em Inglês | MEDLINE | ID: mdl-34435196

RESUMO

With mammography being the primary breast cancer screening strategy, it is essential to make full use of the mammogram imaging data to better identify women who are at higher and lower than average risk. Our primary goal in this study is to extract mammogram-based features that augment the well-established breast cancer risk factors to improve prediction accuracy. In this article, we propose a supervised functional principal component analysis (sFPCA) over triangulations method for extracting features that are ordered by the magnitude of association with the failure time outcome. The proposed method accommodates the irregular boundary issue posed by the breast area within the mammogram imaging data with flexible bivariate splines over triangulations. We also provide an eigenvalue decomposition algorithm that is computationally efficient. Compared to the conventional unsupervised FPCA method, the proposed method results in a lower Brier Score and higher area under the ROC curve (AUC) in simulation studies. We apply our method to data from the Joanne Knight Breast Health Cohort at Siteman Cancer Center. Our approach not only obtains the best prediction performance comparing to unsupervised FPCA and benchmark models but also reveals important risk patterns within the mammogram images. This demonstrates the importance of utilizing additional supervised image-based features to clarify breast cancer risk.

Assuntos

Neoplasias da Mama , Mamografia , Humanos , Feminino , Neoplasias da Mama/diagnóstico por imagem , Mamografia/métodos , Algoritmos , Análise de Componente Principal

2.

Neuroimaging feature extraction using a neural network classifier for imaging genetics.

Beaulac, Cédric; Wu, Sidi; Gibson, Erin; Miranda, Michelle F; Cao, Jiguo; Rocha, Leno; Beg, Mirza Faisal; Nathoo, Farouk S.

BMC Bioinformatics ; 24(1): 271, 2023 Jun 30.

Artigo em Inglês | MEDLINE | ID: mdl-37391692

RESUMO

BACKGROUND: Dealing with the high dimension of both neuroimaging data and genetic data is a difficult problem in the association of genetic data to neuroimaging. In this article, we tackle the latter problem with an eye toward developing solutions that are relevant for disease prediction. Supported by a vast literature on the predictive power of neural networks, our proposed solution uses neural networks to extract from neuroimaging data features that are relevant for predicting Alzheimer's Disease (AD) for subsequent relation to genetics. The neuroimaging-genetic pipeline we propose is comprised of image processing, neuroimaging feature extraction and genetic association steps. We present a neural network classifier for extracting neuroimaging features that are related with the disease. The proposed method is data-driven and requires no expert advice or a priori selection of regions of interest. We further propose a multivariate regression with priors specified in the Bayesian framework that allows for group sparsity at multiple levels including SNPs and genes. RESULTS: We find the features extracted with our proposed method are better predictors of AD than features used previously in the literature suggesting that single nucleotide polymorphisms (SNPs) related to the features extracted by our proposed method are also more relevant for AD. Our neuroimaging-genetic pipeline lead to the identification of some overlapping and more importantly some different SNPs when compared to those identified with previously used features. CONCLUSIONS: The pipeline we propose combines machine learning and statistical methods to benefit from the strong predictive performance of blackbox models to extract relevant features while preserving the interpretation provided by Bayesian models for genetic association. Finally, we argue in favour of using automatic feature extraction, such as the method we propose, in addition to ROI or voxelwise analysis to find potentially novel disease-relevant SNPs that may not be detected when using ROIs or voxels alone.

Assuntos

Doença de Alzheimer , Neuroimagem , Humanos , Teorema de Bayes , Processamento de Imagem Assistida por Computador , Doença de Alzheimer/diagnóstico por imagem , Doença de Alzheimer/genética , Redes Neurais de Computação

3.

Supervised two-dimensional functional principal component analysis with time-to-event outcomes and mammogram imaging data.

Jiang, Shu; Cao, Jiguo; Rosner, Bernard; Colditz, Graham A.

Biometrics ; 79(2): 1359-1369, 2023 06.

Artigo em Inglês | MEDLINE | ID: mdl-34854477

RESUMO

Screening mammography aims to identify breast cancer early and secondarily measures breast density to classify women at higher or lower than average risk for future breast cancer in the general population. Despite the strong association of individual mammography features to breast cancer risk, the statistical literature on mammogram imaging data is limited. While functional principal component analysis (FPCA) has been studied in the literature for extracting image-based features, it is conducted independently of the time-to-event response variable. With the consideration of building a prognostic model for precision prevention, we present a set of flexible methods, supervised FPCA (sFPCA) and functional partial least squares (FPLS), to extract image-based features associated with the failure time while accommodating the added complication from right censoring. Throughout the article, we hope to demonstrate that one method is favored over the other under different clinical setups. The proposed methods are applied to the motivating data set from the Joanne Knight Breast Health cohort at Siteman Cancer Center. Our approaches not only obtain the best prediction performance compared to the benchmark model, but also reveal different risk patterns within the mammograms.

Assuntos

Neoplasias da Mama , Mamografia , Feminino , Humanos , Neoplasias da Mama/diagnóstico , Análise de Componente Principal , Detecção Precoce de Câncer/métodos

4.

A Bayesian spatial model for imaging genetics.

Song, Yin; Ge, Shufei; Cao, Jiguo; Wang, Liangliang; Nathoo, Farouk S.

Biometrics ; 78(2): 742-753, 2022 06.

Artigo em Inglês | MEDLINE | ID: mdl-33765325

RESUMO

We develop a Bayesian bivariate spatial model for multivariate regression analysis applicable to studies examining the influence of genetic variation on brain structure. Our model is motivated by an imaging genetics study of the Alzheimer's Disease Neuroimaging Initiative (ADNI), where the objective is to examine the association between images of volumetric and cortical thickness values summarizing the structure of the brain as measured by magnetic resonance imaging (MRI) and a set of 486 single nucleotide polymorphism (SNPs) from 33 Alzheimer's disease (AD) candidate genes obtained from 632 subjects. A bivariate spatial process model is developed to accommodate the correlation structures typically seen in structural brain imaging data. First, we allow for spatial correlation on a graph structure in the imaging phenotypes obtained from a neighborhood matrix for measures on the same hemisphere of the brain. Second, we allow for correlation in the same measures obtained from different hemispheres (left/right) of the brain. We develop a mean-field variational Bayes algorithm and a Gibbs sampling algorithm to fit the model. We also incorporate Bayesian false discovery rate (FDR) procedures to select SNPs. We implement the methodology in a new release of the R package bgsmtr. We show that the new spatial model demonstrates superior performance over a standard model in our application. Data used in the preparation of this article were obtained from the ADNI database (https://adni.loni.usc.edu).

Assuntos

Doença de Alzheimer , Doença de Alzheimer/diagnóstico por imagem , Doença de Alzheimer/genética , Doença de Alzheimer/patologia , Teorema de Bayes , Encéfalo/diagnóstico por imagem , Encéfalo/patologia , Humanos , Imageamento por Ressonância Magnética , Neuroimagem

5.

Dynamic prediction with time-dependent marker in survival analysis using supervised functional principal component analysis.

Shi, Haolun; Jiang, Shu; Cao, Jiguo.

Stat Med ; 41(18): 3547-3560, 2022 08 15.

Artigo em Inglês | MEDLINE | ID: mdl-35574725

RESUMO

Time-varying biomarkers reflect important information on disease progression over time. Dynamic prediction for event occurrence on a real-time basis, utilizing time-varying information, is crucial in making accurate clinical decisions. Functional principal component analysis (FPCA) has been widely adopted in the literature for extracting features from time-varying biomarker trajectories. However, feature extraction via FPCA is conducted independent of the time-to-event response, which may not produce optimal results when the goal lies in prediction. With this consideration, we propose a novel supervised FPCA, where the functional principal components are determined to optimize the association between the time-varying biomarker and time-to-event outcome. The proposed framework also accommodates irregularly spaced and sparse longitudinal data. Our method is empirically shown to retain better discrimination and calibration performance than the unsupervised FPCA method in simulation studies. Application of the proposed method is also illustrated in the Alzheimer's Disease Neuroimaging Initiative database.

Assuntos

Neuroimagem , Biomarcadores/análise , Progressão da Doença , Humanos , Análise de Componente Principal , Análise de Sobrevida

6.

uTPI: A utility-based toxicity probability interval design for phase I/II dose-finding trials.

Shi, Haolun; Cao, Jiguo; Yuan, Ying; Lin, Ruitao.

Stat Med ; 40(11): 2626-2649, 2021 05 20.

Artigo em Inglês | MEDLINE | ID: mdl-33650708

RESUMO

Unlike chemotherapy, the maximum tolerated dose (MTD) of molecularly targeted agents and immunotherapy may not pose significant clinical benefit over the lower doses. By simultaneously considering both toxicity and efficacy endpoints, phase I/II trials can identify a more clinically meaningful dose for subsequent phase II trials than traditional toxicity-based phase I trials in terms of risk-benefit tradeoff. To strengthen and simplify the current practice of phase I/II trials, we propose a utility-based toxicity probability interval (uTPI) design for finding the optimal biological dose, based on a numerical utility that provides a clinically meaningful, one-dimensional summary representation of the patient's bivariate toxicity and efficacy outcome. The uTPI design does not rely on any parametric specification of the dose-response relationship, and it directly models the dose desirability through a quasi binomial likelihood. Toxicity probability intervals are used to screen out overly toxic dose levels, and then the dose escalation/de-escalation decisions are made adaptively by comparing the posterior desirability distributions of the adjacent levels of the current dose. The uTPI design is flexible in accommodating various dose desirability formulations, while only requiring minimum design parameters. It has a clear decision structure such that a dose-assignment decision table can be calculated before the trial starts and can be used throughout the trial, which simplifies the practical implementation of the design. Extensive simulation studies demonstrate that the proposed uTPI design yields desirable as well as robust performance under various scenarios.

Assuntos

Antineoplásicos , Teorema de Bayes , Ensaios Clínicos Fase I como Assunto , Ensaios Clínicos Fase II como Assunto , Simulação por Computador , Relação Dose-Resposta a Droga , Humanos , Dose Máxima Tolerável , Modelos Estatísticos , Probabilidade , Projetos de Pesquisa

7.

Functional principal component analysis for longitudinal data with informative dropout.

Shi, Haolun; Dong, Jianghu; Wang, Liangliang; Cao, Jiguo.

Stat Med ; 40(3): 712-724, 2021 02 10.

Artigo em Inglês | MEDLINE | ID: mdl-33179286

RESUMO

In longitudinal studies, the values of biomarkers are often informatively missing due to dropout. The conventional functional principal component analysis typically disregards the missing information and simply treats the unobserved data points as missing completely at random. As a result, the estimation of the mean function and the covariance surface might be biased, resulting in a biased estimation of the functional principal components. We propose the informatively missing functional principal component analysis (imFunPCA), which is well suited for cases where the longitudinal trajectories are subject to informative missingness. Computation of the functional principal components in our approach is based on the likelihood of the data, where information of both the observed and missing data points are incorporated. We adopt a regression-based orthogonal approximation method to decompose the latent stochastic process based on a set of orthonormal empirical basis functions. Under the case of informative missingness, we show via simulation studies that the performance of our approach is superior to that of the conventional ones. We apply our method on a longitudinal dataset of kidney glomerular filtration rates for patients post renal transplantation.

Assuntos

Modelos Estatísticos , Humanos , Estudos Longitudinais , Análise de Componente Principal , Probabilidade , Análise de Regressão

8.

Spectral dynamic causal modelling of resting-state fMRI: an exploratory study relating effective brain connectivity in the default mode network to genetics.

Nie, Yunlong; Opoku, Eugene; Yasmin, Laila; Song, Yin; Wang, Jie; Wu, Sidi; Scarapicchia, Vanessa; Gawryluk, Jodie; Wang, Liangliang; Cao, Jiguo; Nathoo, Farouk S.

Stat Appl Genet Mol Biol ; 19(3)2020 08 31.

Artigo em Inglês | MEDLINE | ID: mdl-32866136

RESUMO

We conduct an imaging genetics study to explore how effective brain connectivity in the default mode network (DMN) may be related to genetics within the context of Alzheimer's disease and mild cognitive impairment. We develop an analysis of longitudinal resting-state functional magnetic resonance imaging (rs-fMRI) and genetic data obtained from a sample of 111 subjects with a total of 319 rs-fMRI scans from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. A Dynamic Causal Model (DCM) is fit to the rs-fMRI scans to estimate effective brain connectivity within the DMN and related to a set of single nucleotide polymorphisms (SNPs) contained in an empirical disease-constrained set which is obtained out-of-sample from 663 ADNI subjects having only genome-wide data. We relate longitudinal effective brain connectivity estimated using spectral DCM to SNPs using both linear mixed effect (LME) models as well as function-on-scalar regression (FSR). In both cases we implement a parametric bootstrap for testing SNP coefficients and make comparisons with p-values obtained from asymptotic null distributions. In both networks at an initial q-value threshold of 0.1 no effects are found. We report on exploratory patterns of associations with relatively high ranks that exhibit stability to the differing assumptions made by both FSR and LME.

Assuntos

Doença de Alzheimer/diagnóstico por imagem , Mapeamento Encefálico/métodos , Encéfalo/diagnóstico por imagem , Disfunção Cognitiva/diagnóstico por imagem , Conectoma/métodos , Imageamento por Ressonância Magnética/métodos , Idoso , Idoso de 80 Anos ou mais , Doença de Alzheimer/genética , Encéfalo/patologia , Disfunção Cognitiva/genética , Bases de Dados Genéticas , Feminino , Humanos , Modelos Lineares , Masculino , Modelos Teóricos , Polimorfismo de Nucleotídeo Único

9.

Quantitative assessment of field strength, total intracranial volume, sex, and age effects on the goodness of harmonization for volumetric analysis on the ADNI database.

Ma, Da; Popuri, Karteek; Bhalla, Mahadev; Sangha, Oshin; Lu, Donghuan; Cao, Jiguo; Jacova, Claudia; Wang, Lei; Beg, Mirza Faisal.

Hum Brain Mapp ; 40(5): 1507-1527, 2019 04 01.

Artigo em Inglês | MEDLINE | ID: mdl-30431208

RESUMO

When analyzing large multicenter databases, the effects of multiple confounding covariates increase the variability in the data and may reduce the ability to detect changes due to the actual effect of interest, for example, changes due to disease. Efficient ways to evaluate the effect of covariates toward the data harmonization are therefore important. In this article, we showcase techniques to assess the "goodness of harmonization" of covariates. We analyze 7,656 MR images in the multisite, multiscanner Alzheimer's Disease Neuroimaging Initiative (ADNI) database. We present a comparison of three methods for estimating total intracranial volume to assess their robustness and correct the brain structure volumes using the residual method and the proportional (normalization by division) method. We then evaluated the distribution of brain structure volumes over the entire ADNI database before and after accounting for multiple covariates such as total intracranial volume, scanner field strength, sex, and age using two techniques: (a) Zscapes, a panoramic visualization technique to analyze the entire database and (b) empirical cumulative distributions functions. The results from this study highlight the importance of assessing the goodness of data harmonization as a necessary preprocessing step when pooling large data set with multiple covariates, prior to further statistical data analysis.

Assuntos

Doença de Alzheimer/diagnóstico por imagem , Encéfalo/diagnóstico por imagem , Idoso , Idoso de 80 Anos ou mais , Envelhecimento , Disfunção Cognitiva/diagnóstico por imagem , Estudos Transversais , Interpretação Estatística de Dados , Bases de Dados Factuais , Progressão da Doença , Feminino , Humanos , Processamento de Imagem Assistida por Computador , Estudos Longitudinais , Imageamento por Ressonância Magnética , Masculino , Reprodutibilidade dos Testes , Caracteres Sexuais

10.

Standardisation and application of the single-breath determination of nitric oxide uptake in the lung.

Zavorsky, Gerald S; Hsia, Connie C W; Hughes, J Michael B; Borland, Colin D R; Guénard, Hervé; van der Lee, Ivo; Steenbruggen, Irene; Naeije, Robert; Cao, Jiguo; Dinh-Xuan, Anh Tuan.

Eur Respir J ; 49(2)2017 02.

Artigo em Inglês | MEDLINE | ID: mdl-28179436

RESUMO

Diffusing capacity of the lung for nitric oxide (DLNO), otherwise known as the transfer factor, was first measured in 1983. This document standardises the technique and application of single-breath DLNO This panel agrees that 1) pulmonary function systems should allow for mixing and measurement of both nitric oxide (NO) and carbon monoxide (CO) gases directly from an inspiratory reservoir just before use, with expired concentrations measured from an alveolar "collection" or continuously sampled via rapid gas analysers; 2) breath-hold time should be 10âs with chemiluminescence NO analysers, or 4-6âs to accommodate the smaller detection range of the NO electrochemical cell; 3) inspired NO and oxygen concentrations should be 40-60âppm and close to 21%, respectively; 4) the alveolar oxygen tension (PAO2 ) should be measured by sampling the expired gas; 5) a finite specific conductance in the blood for NO (Î¸NO) should be assumed as 4.5âmL·min-1·mmHg-1·mL-1 of blood; 6) the equation for 1/Î¸CO should be (0.0062·PAO2 +1.16)·(ideal haemoglobin/measured haemoglobin) based on breath-holding PAO2 and adjusted to an average haemoglobin concentration (male 14.6âg·dL-1, female 13.4âg·dL-1); 7) a membrane diffusing capacity ratio (DMNO/DMCO) should be 1.97, based on tissue diffusivity.

Assuntos

Volume Sanguíneo , Óxido Nítrico/sangue , Alvéolos Pulmonares/irrigação sanguínea , Capacidade de Difusão Pulmonar/normas , Adolescente , Adulto , Idoso , Idoso de 80 Anos ou mais , Permeabilidade Capilar , Monóxido de Carbono/sangue , Feminino , Hemoglobinas/análise , Humanos , Modelos Lineares , Masculino , Pessoa de Meia-Idade , Oxigênio/sangue , Adulto Jovem

11.

Estimating varying coefficients for partial differential equation models.

Zhang, Xinyu; Cao, Jiguo; Carroll, Raymond J.

Biometrics ; 73(3): 949-959, 2017 09.

Artigo em Inglês | MEDLINE | ID: mdl-28076654

RESUMO

Partial differential equations (PDEs) are used to model complex dynamical systems in multiple dimensions, and their parameters often have important scientific interpretations. In some applications, PDE parameters are not constant but can change depending on the values of covariates, a feature that we call varying coefficients. We propose a parameter cascading method to estimate varying coefficients in PDE models from noisy data. Our estimates of the varying coefficients are shown to be consistent and asymptotically normally distributed. The performance of our method is evaluated by a simulation study and by an empirical study estimating three varying coefficients in a PDE model arising from LIDAR data.

Assuntos

Modelos Teóricos , Simulação por Computador

12.

Estimating time-varying directed gene regulation networks.

Nie, Yunlong; Wang, LiangLiang; Cao, Jiguo.

Biometrics ; 73(4): 1231-1242, 2017 12.

Artigo em Inglês | MEDLINE | ID: mdl-28369708

RESUMO

The problem of modeling the dynamical regulation process within a gene network has been of great interest for a long time. We propose to model this dynamical system with a large number of nonlinear ordinary differential equations (ODEs), in which the regulation function is estimated directly from data without any parametric assumption. Most current research assumes the gene regulation network is static, but in reality, the connection and regulation function of the network may change with time or environment. This change is reflected in our dynamical model by allowing the regulation function varying with the gene expression and forcing this regulation function to be zero if no regulation happens. We introduce a statistical method called functional SCAD to estimate a time-varying sparse and directed gene regulation network, and simultaneously, to provide a smooth estimation of the regulation function and identify the interval in which no regulation effect exists. The finite sample performance of the proposed method is investigated in a Monte Carlo simulation study. Our method is demonstrated by estimating a time-varying directed gene regulation network of 20 genes involved in muscle development during the embryonic stage of Drosophila melanogaster.

Assuntos

Redes Reguladoras de Genes , Método de Monte Carlo , Dinâmica não Linear , Animais , Drosophila melanogaster/embriologia , Desenvolvimento Muscular , Fatores de Tempo

13.

Parametric functional principal component analysis.

Sang, Peijun; Wang, Liangliang; Cao, Jiguo.

Biometrics ; 73(3): 802-810, 2017 09.

Artigo em Inglês | MEDLINE | ID: mdl-28295173

RESUMO

Functional principal component analysis (FPCA) is a popular approach in functional data analysis to explore major sources of variation in a sample of random curves. These major sources of variation are represented by functional principal components (FPCs). Most existing FPCA approaches use a set of flexible basis functions such as B-spline basis to represent the FPCs, and control the smoothness of the FPCs by adding roughness penalties. However, the flexible representations pose difficulties for users to understand and interpret the FPCs. In this article, we consider a variety of applications of FPCA and find that, in many situations, the shapes of top FPCs are simple enough to be approximated using simple parametric functions. We propose a parametric approach to estimate the top FPCs to enhance their interpretability for users. Our parametric approach can also circumvent the smoothing parameter selecting process in conventional nonparametric FPCA methods. In addition, our simulation study shows that the proposed parametric FPCA is more robust when outlier curves exist. The parametric FPCA method is demonstrated by analyzing several datasets from a variety of applications.

Assuntos

Análise de Componente Principal

14.

Interpretable functional principal component analysis.

Lin, Zhenhua; Wang, Liangliang; Cao, Jiguo.

Biometrics ; 72(3): 846-54, 2016 09.

Artigo em Inglês | MEDLINE | ID: mdl-26683051

RESUMO

Functional principal component analysis (FPCA) is a popular approach to explore major sources of variation in a sample of random curves. These major sources of variation are represented by functional principal components (FPCs). The intervals where the values of FPCs are significant are interpreted as where sample curves have major variations. However, these intervals are often hard for naïve users to identify, because of the vague definition of "significant values". In this article, we develop a novel penalty-based method to derive FPCs that are only nonzero precisely in the intervals where the values of FPCs are significant, whence the derived FPCs possess better interpretability than the FPCs derived from existing methods. To compute the proposed FPCs, we devise an efficient algorithm based on projection deflation techniques. We show that the proposed interpretable FPCs are strongly consistent and asymptotically normal under mild conditions. Simulation studies confirm that with a competitive performance in explaining variations of sample curves, the proposed FPCs are more interpretable than the traditional counterparts. This advantage is demonstrated by analyzing two real datasets, namely, electroencephalography data and Canadian weather data.

Assuntos

Modelos Estatísticos , Análise de Componente Principal/métodos , Algoritmos , Canadá , Simulação por Computador , Eletroencefalografia/estatística & dados numéricos , Humanos , Tempo (Meteorologia)

15.

On the selection of ordinary differential equation models with application to predator-prey dynamical models.

Zhang, Xinyu; Cao, Jiguo; Carroll, Raymond J.

Biometrics ; 71(1): 131-138, 2015 Mar.

Artigo em Inglês | MEDLINE | ID: mdl-25287611

RESUMO

We consider model selection and estimation in a context where there are competing ordinary differential equation (ODE) models, and all the models are special cases of a "full" model. We propose a computationally inexpensive approach that employs statistical estimation of the full model, followed by a combination of a least squares approximation (LSA) and the adaptive Lasso. We show the resulting method, here called the LSA method, to be an (asymptotically) oracle model selection method. The finite sample performance of the proposed LSA method is investigated with Monte Carlo simulations, in which we examine the percentage of selecting true ODE models, the efficiency of the parameter estimation compared to simply using the full and true models, and coverage probabilities of the estimated confidence intervals for ODE parameters, all of which have satisfactory performances. Our method is also demonstrated by selecting the best predator-prey ODE to model a lynx and hare population dynamical system among some well-known and biologically interpretable ODE models.

Assuntos

Biometria/métodos , Lynx/fisiologia , Modelos Biológicos , Modelos Estatísticos , Comportamento Predatório/fisiologia , Coelhos/fisiologia , Algoritmos , Animais , Simulação por Computador , Humanos

16.

Estimating the intensity of ward admission and its effect on emergency department access block.

Luo, Wei; Cao, Jiguo; Gallagher, Marcus; Wiles, Janet.

Stat Med ; 32(15): 2681-94, 2013 Jul 10.

Artigo em Inglês | MEDLINE | ID: mdl-23172783

RESUMO

Emergency department access block is an urgent problem faced by many public hospitals today. When access block occurs, patients in need of acute care cannot access inpatient wards within an optimal time frame. A widely held belief is that access block is the end product of a long causal chain, which involves poor discharge planning, insufficient bed capacity, and inadequate admission intensity to the wards. This paper studies the last link of the causal chain-the effect of admission intensity on access block, using data from a metropolitan hospital in Australia. We applied several modern statistical methods to analyze the data. First, we modeled the admission events as a nonhomogeneous Poisson process and estimated time-varying admission intensity with penalized regression splines. Next, we established a functional linear model to investigate the effect of the time-varying admission intensity on emergency department access block. Finally, we used functional principal component analysis to explore the variation in the daily time-varying admission intensities. The analyses suggest that improving admission practice during off-peak hours may have most impact on reducing the number of ED access blocks.

Assuntos

Bioestatística/métodos , Serviço Hospitalar de Emergência/estatística & dados numéricos , Admissão do Paciente/estatística & dados numéricos , Austrália , Acessibilidade aos Serviços de Saúde , Hospitais Urbanos , Humanos , Funções Verossimilhança , Modelos Lineares , Modelos Estatísticos , Distribuição de Poisson , Análise de Componente Principal , Fatores de Tempo

17.

Automatic search intervals for the smoothing parameter in penalized splines.

Li, Zheyuan; Cao, Jiguo.

Stat Comput ; 33(1): 1, 2023.

Artigo em Inglês | MEDLINE | ID: mdl-36415568

RESUMO

The selection of smoothing parameter is central to the estimation of penalized splines. The best value of the smoothing parameter is often the one that optimizes a smoothness selection criterion, such as generalized cross-validation error (GCV) and restricted likelihood (REML). To correctly identify the global optimum rather than being trapped in an undesired local optimum, grid search is recommended for optimization. Unfortunately, the grid search method requires a pre-specified search interval that contains the unknown global optimum, yet no guideline is available for providing this interval. As a result, practitioners have to find it by trial and error. To overcome such difficulty, we develop novel algorithms to automatically find this interval. Our automatic search interval has four advantages. (i) It specifies a smoothing parameter range where the associated penalized least squares problem is numerically solvable. (ii) It is criterion-independent so that different criteria, such as GCV and REML, can be explored on the same parameter range. (iii) It is sufficiently wide to contain the global optimum of any criterion, so that for example, the global minimum of GCV and the global maximum of REML can both be identified. (iv) It is computationally cheap compared with the grid search itself, carrying no extra computational burden in practice. Our method is ready to use through our recently developed R package gps ( ≥ version 1.1). It may be embedded in more advanced statistical modeling methods that rely on penalized splines. Supplementary Information: The online version contains supplementary material available at 10.1007/s11222-022-10178-z.

18.

Identifying regions of interest in mammogram images.

Jiang, Shu; Cao, Jiguo; Colditz, Graham A.

Stat Methods Med Res ; 32(5): 895-903, 2023 05.

Artigo em Inglês | MEDLINE | ID: mdl-36951095

RESUMO

Screening mammography is the primary preventive strategy for early detection of breast cancer and an essential input to breast cancer risk prediction and application of prevention/risk management guidelines. Identifying regions of interest within mammogram images that are associated with 5- or 10-year breast cancer risk is therefore clinically meaningful. The problem is complicated by the irregular boundary issue posed by the semi-circular domain of the breast area within mammograms. Accommodating the irregular domain is especially crucial when identifying regions of interest, as the true signal comes only from the semi-circular domain of the breast region, and noise elsewhere. We address these challenges by introducing a proportional hazards model with imaging predictors characterized by bivariate splines over triangulation. The model sparsity is enforced with the group lasso penalty function. We apply the proposed method to the motivating Joanne Knight Breast Health Cohort to illustrate important risk patterns and show that the proposed method is able to achieve higher discriminatory performance.

Assuntos

Neoplasias da Mama , Mamografia , Humanos , Feminino , Mamografia/métodos , Neoplasias da Mama/diagnóstico , Detecção Precoce de Câncer , Risco , Modelos de Riscos Proporcionais

19.

Jointly modelling multiple transplant outcomes by a competing risk model via functional principal component analysis.

Dong, Jianghu James; Shi, Haolun; Wang, Liangliang; Zhang, Ying; Cao, Jiguo.

J Appl Stat ; 50(1): 43-59, 2023.

Artigo em Inglês | MEDLINE | ID: mdl-36530777

RESUMO

In many clinical studies, longitudinal biomarkers are often used to monitor the progression of a disease. For example, in a kidney transplant study, the glomerular filtration rate (GFR) is used as a longitudinal biomarker to monitor the progression of the kidney function and the patient's state of survival that is characterized by multiple time-to-event outcomes, such as kidney transplant failure and death. It is known that the joint modelling of longitudinal and survival data leads to a more accurate and comprehensive estimation of the covariates' effect. While most joint models use the longitudinal outcome as a covariate for predicting survival, very few models consider the further decomposition of the variation within the longitudinal trajectories and its effect on survival. We develop a joint model that uses functional principal component analysis (FPCA) to extract useful features from the longitudinal trajectories and adopt the competing risk model to handle multiple time-to-event outcomes. The longitudinal trajectories and the multiple time-to-event outcomes are linked via the shared functional features. The application of our model on a real kidney transplant data set reveals the significance of these functional features, and a simulation study is carried out to validate the accurateness of the estimation method.

20.

Computational Efficiency and Precision for Replicated-Count and Batch-Marked Hidden Population Models.

Parker, Matthew R P; Cowen, Laura L E; Cao, Jiguo; Elliott, Lloyd T.

J Agric Biol Environ Stat ; 28(1): 43-58, 2023.

Artigo em Inglês | MEDLINE | ID: mdl-36065440

RESUMO

We address two computational issues common to open-population N-mixture models, hidden integer-valued autoregressive models, and some hidden Markov models. The first issue is computation time, which can be dramatically improved through the use of a fast Fourier transform. The second issue is tractability of the model likelihood function for large numbers of hidden states, which can be solved by improving numerical stability of calculations. As an illustrative example, we detail the application of these methods to the open-population N-mixture models. We compare computational efficiency and precision between these methods and standard methods employed by state-of-the-art ecological software. We show faster computing times (a â¼ 6 to â¼ 30 times speed improvement for population size upper bounds of 500 and 1000, respectively) over state-of-the-art ecological software for N-mixture models. We also apply our methods to compute the size of a large elk population using an N-mixture model and show that while our methods converge, previous software cannot produce estimates due to numerical issues. These solutions can be applied to many ecological models to improve precision when logs of sums exist in the likelihood function and to improve computational efficiency when convolutions are present in the likelihood function. Supplementary materials accompanying this paper appear online. Supplementary materials for this article are available at 10.1007/s13253-022-00509-y.

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA