ABSTRACT
Quantile regression as an alternative to modeling the conditional mean function provides a comprehensive picture of the relationship between a response and covariates. It is particularly attractive in applications focused on the upper or lower conditional quantiles of the response. However, conventional quantile regression estimators are often unstable at the extreme tails, owing to data sparsity, especially for heavy-tailed distributions. Assuming that the functional predictor has a linear effect on the upper quantiles of the response, we develop a novel estimator for extreme conditional quantiles using a functional composite quantile regression based on a functional principal component analysis and an extrapolation technique from extreme value theory. We establish the asymptotic normality of the proposed estimator under some regularity conditions, and compare it with other estimation methods using Monte Carlo simulations. Finally, we demonstrate the proposed method by empirically analyzing two real data sets.
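The abstract does not specify which extrapolation technique it uses. As a rough illustration of the general idea only, the sketch below applies two standard tools from extreme value theory to a univariate sample: the Hill estimator of the extreme value index and Weissman-type extrapolation from an intermediate order statistic. It ignores covariates and the functional principal component step entirely, and the function names and the deterministic test sample are assumptions made for illustration.

```python
import math


def hill_estimator(sample, k):
    """Hill estimate of the extreme value index from the k largest observations."""
    xs = sorted(sample)
    n = len(xs)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n")
    threshold = xs[n - k - 1]  # the (k+1)-th largest observation
    if threshold <= 0:
        raise ValueError("the threshold must be positive")
    top = xs[n - k:]  # the k largest observations
    return sum(math.log(x / threshold) for x in top) / k


def weissman_quantile(sample, k, p):
    """Extrapolate the (1 - p) quantile beyond the intermediate level k / n."""
    xs = sorted(sample)
    n = len(xs)
    gamma = hill_estimator(xs, k)
    threshold = xs[n - k - 1]
    return threshold * (k / (n * p)) ** gamma


# Illustrative data (an assumption): exact Pareto quantiles with index 0.5,
# so the result can be compared with the known true quantile p ** -0.5.
n, true_gamma = 1000, 0.5
sample = [(1 - (i - 0.5) / n) ** (-true_gamma) for i in range(1, n + 1)]
gamma_hat = hill_estimator(sample, k=100)
q_hat = weissman_quantile(sample, k=100, p=0.001)
```

The point of the extrapolation is that the target level p = 0.001 lies at the very edge of a sample of size 1000, where an empirical quantile would rest on a single observation; the estimate instead borrows strength from the 100 largest values.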
ABSTRACT
Abundance estimation from capture-recapture data is of great importance in many disciplines. Analysis of capture-recapture data is often complicated by the existence of one-inflation and heterogeneity problems. Simultaneously taking these issues into account, existing abundance estimation methods are usually constructed on the basis of conditional likelihood under one-inflated zero-truncated count models. However, the resulting Horvitz-Thompson-type estimators may be unstable, and the resulting Wald-type confidence intervals may exhibit severe undercoverage. In this paper, we propose a semiparametric empirical likelihood (EL) approach to abundance estimation under one-inflated binomial and Poisson regression models. To facilitate the computation of the EL method, we develop an expectation-maximization algorithm. We also propose a new score test for the existence of one-inflation and prove its asymptotic normality. Our simulation studies indicate that compared with existing estimators, the proposed score test is more powerful and the maximum EL estimator has a smaller mean square error. The advantages of our approaches are further demonstrated by analyses of prinia data from Hong Kong and drug user data from Bangkok.
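For orientation, the conventional baseline that the abstract criticizes can be sketched in a few lines. The code below is not the proposed empirical likelihood method: it fits a plain zero-truncated Poisson model with no covariates and no one-inflation, then forms a Horvitz-Thompson-type abundance estimate. The function names and the example counts are assumptions made for illustration.

```python
import math


def ztp_mean(lam):
    """Mean of a zero-truncated Poisson distribution with rate lam."""
    return lam / -math.expm1(-lam)


def fit_zero_truncated_poisson(counts):
    """Maximum likelihood rate, found by bisection on ztp_mean(lam) = sample mean."""
    mean = sum(counts) / len(counts)
    if mean <= 1:
        raise ValueError("the sample mean must exceed 1 for a positive rate")
    lo, hi = 1e-12, mean  # ztp_mean is increasing and ztp_mean(mean) > mean
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if ztp_mean(mid) < mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def horvitz_thompson_abundance(counts):
    """Observed individuals divided by the estimated probability of capture."""
    lam = fit_zero_truncated_poisson(counts)
    p_captured = -math.expm1(-lam)  # 1 - exp(-lam)
    return len(counts) / p_captured


# Hypothetical capture counts for 100 observed individuals.
counts = [1] * 50 + [2] * 30 + [3] * 15 + [4] * 5
lam_hat = fit_zero_truncated_poisson(counts)
n_hat = horvitz_thompson_abundance(counts)
```

Because the estimate divides by an estimated capture probability, small errors in the fitted rate can be amplified when that probability is low, which is one way to read the instability the abstract mentions.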
Subjects
Algorithms, Statistical Models, Computer Simulation, Likelihood Functions, Probability, Thailand
ABSTRACT
In this article, we propose a novel estimator of extreme conditional quantiles in partial functional linear regression models with heavy-tailed distributions. The conventional quantile regression estimators are often unstable at the extreme tails due to data sparsity, especially for heavy-tailed distributions. We first estimate the slope function and the partially linear coefficient using a functional quantile regression based on functional principal component analysis, which is a robust alternative to the ordinary least squares regression. The extreme conditional quantiles are then estimated by using a new extrapolation technique from extreme value theory. We establish the asymptotic normality of the proposed estimator and illustrate its finite sample performance by simulation studies and an empirical analysis of diffusion tensor imaging data from a cognitive disorder study.
In this article, a new estimator of extreme conditional quantiles is developed for partial functional linear regression models with heavy-tailed distributions. It is well known that the sparsity of observations in the extreme tails of heavy-tailed distributions often makes the usual quantile regression estimators unstable. To address the non-robustness of classical least squares, the authors first estimate the slope function and the partially linear coefficient by quantile regression, using an approach based on functional principal component analysis. They then estimate the extreme conditional quantiles using a new extrapolation technique from extreme value theory. In addition to establishing the asymptotic normality of the proposed estimator, the authors illustrate its good finite-sample performance through a simulation study and a practical application to diffusion tensor imaging data from a study of cognitive disorders.
ABSTRACT
The Sharpe ratio function is a commonly used risk/return measure in financial econometrics. To estimate this function, most existing methods take a two-step procedure that first estimates the mean and volatility functions separately and then applies the plug-in method. In this paper, we propose a direct method via local maximum likelihood to simultaneously estimate the Sharpe ratio function and the negative log-volatility function, as well as their derivatives. We establish the joint limiting distribution of the proposed estimators and extend the proposed method to estimate the multivariate Sharpe ratio function. We also evaluate the numerical performance of the proposed estimators through simulation studies and compare them with existing methods. Finally, we apply the proposed method to three-month US Treasury bill data, where it captures a well-known covariate-dependent effect on the Sharpe ratio.
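The two-step plug-in procedure that the abstract attributes to existing methods can be sketched as follows. This is not the proposed local maximum likelihood estimator. It is a minimal illustration that assumes Nadaraya-Watson smoothing with a Gaussian kernel, treats the response as an excess return, and uses hypothetical function names and toy data.

```python
import math


def nw_smooth(x, values, x0, h):
    """Nadaraya-Watson estimate at x0 with a Gaussian kernel and bandwidth h."""
    weights = [math.exp(-0.5 * ((xi - x0) / h) ** 2) for xi in x]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def plug_in_sharpe(x, y, x0, h):
    """Two-step estimate: smooth the mean, smooth squared residuals, take the ratio."""
    mean_hat = nw_smooth(x, y, x0, h)
    fitted = [nw_smooth(x, y, xi, h) for xi in x]            # step 1: mean function
    sq_resid = [(yi - fi) ** 2 for yi, fi in zip(y, fitted)]
    var_hat = nw_smooth(x, sq_resid, x0, h)                  # step 2: variance function
    return mean_hat / math.sqrt(var_hat)


# Toy data (an assumption): mean 2 and volatility 1 everywhere, so the
# Sharpe ratio function is constant and equal to 2.
x = [i / 200 for i in range(201)]
y = [2.0 + (1.0 if i % 2 == 0 else -1.0) for i in range(201)]
sharpe_hat = plug_in_sharpe(x, y, x0=0.5, h=0.1)
```

The ratio inherits the errors of both smoothing steps, which is the motivation the abstract gives for estimating the Sharpe ratio function directly.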
ABSTRACT
In this paper, by virtue of the inverse probability weighting technique, we consider jump-preserving estimation in nonparametric regression models with missing data on the response variable. First, we use local piecewise-linear expansions with left and right kernels, respectively, to approximate the unknown regression function. Second, we obtain the left- and right-limit estimates of the regression function at each observed point and then determine the final estimators by comparing the residual sums of squares. Third, we present the convergence rates of the estimators and the residual sums of squares. Finally, we illustrate the performance of the proposed method through simulation studies and a conjunctivitis example from The Affiliated Hospital of Hangzhou Normal University.
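The steps listed in the abstract can be illustrated with a small sketch. It makes several assumptions not stated in the source: an Epanechnikov kernel, a known constant observation probability, and a selection rule that simply keeps the one-sided fit with the smaller weighted residual sum of squares. All function names are hypothetical.

```python
def one_sided_local_linear(x, y, ipw, x0, h, side):
    """Weighted local linear fit using only data on one side of x0.

    Returns the estimate at x0 and the normalized weighted residual sum of squares.
    """
    pts = []
    for xi, yi, wi in zip(x, y, ipw):
        u = xi - x0
        inside = (-h <= u <= 0) if side == "left" else (0 <= u <= h)
        if inside and wi > 0:
            # Epanechnikov kernel weight times the inverse probability weight
            pts.append((u, yi, wi * 0.75 * (1 - (u / h) ** 2)))
    s0 = sum(w for _, _, w in pts)
    s1 = sum(w * u for u, _, w in pts)
    s2 = sum(w * u * u for u, _, w in pts)
    t0 = sum(w * yi for _, yi, w in pts)
    t1 = sum(w * u * yi for u, yi, w in pts)
    denom = s0 * s2 - s1 ** 2
    if denom <= 1e-15:
        raise ValueError("too few observed points in the window")
    a = (s2 * t0 - s1 * t1) / denom  # intercept: the estimate at x0
    b = (s0 * t1 - s1 * t0) / denom  # slope
    rss = sum(w * (yi - a - b * u) ** 2 for u, yi, w in pts) / s0
    return a, rss


def jump_preserving_estimate(x, y, ipw, x0, h):
    """Keep the one-sided fit whose residual sum of squares is smaller."""
    left, rss_left = one_sided_local_linear(x, y, ipw, x0, h, "left")
    right, rss_right = one_sided_local_linear(x, y, ipw, x0, h, "right")
    return left if rss_left <= rss_right else right


# Toy data (an assumption): a unit jump at 0.5, with every fourth response
# missing and an observation probability of 0.75.
x = [i / 100 for i in range(101)]
observed = [i % 4 != 0 for i in range(101)]
y = [(0.0 if xi < 0.5 else 1.0) if d else None for xi, d in zip(x, observed)]
ipw = [1 / 0.75 if d else 0.0 for d in observed]
before_jump = jump_preserving_estimate(x, y, ipw, x0=0.48, h=0.1)
after_jump = jump_preserving_estimate(x, y, ipw, x0=0.52, h=0.1)
```

Near the jump, the window that straddles the discontinuity fits poorly and has a large residual sum of squares, so the rule selects the side that lies entirely within one smooth piece and the jump is not blurred.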
ABSTRACT
Recent techniques have achieved remarkable improvements in fine-grained visual classification (FGVC) by mining subtle yet distinctive features. While prior works directly combine discriminative features extracted from different parts, we argue that the potential interactions between different parts and their contributions to category prediction should be taken into account, so that significant parts contribute more to the sub-category decision. To this end, we present a weakly supervised Cross-Part Convolutional Neural Network (CP-CNN) to explore cross-learning among multi-regional features. Specifically, a context transformer is implemented to encourage joint feature learning across different parts under the guidance of a navigator. The part with the highest confidence is regarded as the navigator and delivers distinguishing characteristics to the parts with lower confidence, while the complementary information is retained. To locate discriminative but subtle parts precisely, a part proposal generator (PPG) is designed with feature enhancement blocks, which effectively alleviate the complex scale variations caused by viewpoint diversity. Extensive experiments on three benchmark datasets demonstrate that our proposed method consistently outperforms existing state-of-the-art methods.
ABSTRACT
BACKGROUND: Gene copy number variations (CNVs) contribute to genetic diversity and disease prevalence across populations. Substantial efforts have been made to decipher the relationship between CNVs and pathogenesis, but with limited success. RESULTS: We have developed a novel computational framework, X-CNV (www.unimd.org/XCNV), to predict the pathogenicity of CNVs by integrating more than 30 informative features such as allele frequency (AF), CNV length, CNV type, and some deleterious scores. Notably, over 14 million CNVs across various ethnic groups, covering nearly 93% of the human genome, were unified to calculate the AF. X-CNV, which yielded area under the curve (AUC) values of 0.96 and 0.94 in the training and validation sets, respectively, was demonstrated to outperform other available tools in terms of CNV pathogenicity prediction. A meta-voting prediction (MVP) score, based on the probabilistic value generated by the XGBoost algorithm, was developed to quantitatively measure the pathogenic effect. The proposed MVP score demonstrated high discriminative power in determining pathogenic CNVs for inherited traits/diseases in different ethnic groups. CONCLUSIONS: The ability of the X-CNV framework to quantitatively prioritize functional, deleterious, and disease-causing CNVs on a genome-wide basis outperformed current CNV-annotation tools and will have broad utility in population genetics, disease-association studies, and diagnostic screening.
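As a reminder of what the reported AUC values measure, the sketch below computes the area under the ROC curve directly from labels and predicted scores. It uses the rank-based (Mann-Whitney) definition and is not taken from X-CNV. The function name and the four example scores are illustrative assumptions.

```python
def auc(labels, scores):
    """Probability that a random positive scores above a random negative (ties count half)."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical pathogenicity scores: 1 = pathogenic, 0 = benign.
example_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 3 of 4 pairs ordered correctly
```

An AUC of 0.96 therefore means that a randomly chosen pathogenic CNV receives a higher score than a randomly chosen benign one about 96% of the time.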
Subjects
Computational Biology/methods, DNA Copy Number Variations, Genetic Predisposition to Disease, Genome-Wide Association Study/methods, Software, Algorithms, Genetic Databases, High-Throughput Nucleotide Sequencing, Humans, Machine Learning, ROC Curve, Web Browser, Workflow
ABSTRACT
The heart is the first organ to form during embryogenesis, and its development is a complex process. In this study, we identified 120 ligand-receptor pairs, including 65 ligands and 58 receptors, specifically expressed in one of the nine cell types. Correlation analysis of the cell proportions revealed that cell-to-cell contact exhibited spatial patterns in the human fetal heart. Specifically, the proportion of cardiomyocytes (CMs) might be negatively correlated with the proportion of endothelial cells in the left atrium and ventricle during heart development. In contrast, fibroblast-like cells and macrophages increased jointly as gestation progressed. Furthermore, the ligand in CMs, NPPA (Natriuretic Peptide A), and the receptor in endothelial cells (ECs), NPR3 (Natriuretic Peptide Receptor 3), were specifically expressed in atrial CMs and endocardial cells, respectively, indicating that atrial CMs might communicate with endocardial cells via the NPPA-NPR3 interaction. Moreover, interplay between fibroblast-like cells and macrophages was observed in both the left and right atria via the ligand-receptor interactions COL1A1/COL1A2 (Collagen Type I Alpha 1/2 Chain)-CD36 and CTGF (connective tissue growth factor)-ITGB2 (Integrin Subunit Beta 2). Functional enrichment analysis revealed that the ligand-receptor interactions might be associated with intracellular activation of the cGMP-PKG signaling pathway in ECs, the PDGF-beta signaling pathway in fibroblast-like cells, and Toll-like receptor signaling in macrophages, respectively. Collectively, the present study unveils potential cell-cell communication and the underlying mechanisms involved in cardiac development, broadening our insight into the developmental biology of the heart.
Subjects
Heart, Single-Cell Analysis, Toll-Like Receptors/metabolism, Cell Communication, Collagen Type I Alpha 1 Chain, Humans, Ligands, RNA-Seq
ABSTRACT
PURPOSE: Resistance exercise (RE) can improve many cardiovascular disease (CVD) risk factors, but specific data on its effects on CVD events and mortality are lacking. We investigated the associations of RE with CVD and all-cause mortality and further examined the mediation effect of body mass index (BMI) between RE and CVD outcomes. METHODS: We included 12,591 participants (mean age, 47 yr) who received at least two clinical examinations between 1987 and 2006. RE was assessed by a self-reported medical history questionnaire. RESULTS: During a mean follow-up of 5.4 and 10.5 yr, 205 total CVD events (morbidity and mortality combined) and 276 all-cause deaths occurred, respectively. Compared with no RE, weekly RE frequencies of one, two, or three times, or a total amount of 1-59 min, were associated with an approximately 40%-70% decreased risk of total CVD events, independent of aerobic exercise (AE) (all P values <0.05). However, there was no significant risk reduction for higher weekly RE of more than four times or ≥60 min. Similar results were observed for CVD morbidity and all-cause mortality. In the analyses stratified by AE, weekly RE of one time or 1-59 min was associated with lower risks of total CVD events and CVD morbidity regardless of whether the AE guidelines were met. Our mediation analysis showed that RE was associated with the risk of total CVD events in two ways: RE had a direct U-shaped association with CVD risk (P value for quadratic trend <0.001), and RE indirectly lowered CVD risk by decreasing BMI. CONCLUSION: Even one time or less than 1 h·wk⁻¹ of RE, independent of AE, is associated with reduced risks of CVD and all-cause mortality. BMI mediates the association of RE with total CVD events.
Subjects
Cardiovascular Diseases/epidemiology, Mortality, Resistance Training, Adolescent, Adult, Aged, Aged 80 and over, Body Mass Index, Female, Humans, Longitudinal Studies, Male, Middle Aged, Morbidity, Risk Factors, Texas, Young Adult
ABSTRACT
It is well known that exposure to ambient air pollution can cause serious respiratory illnesses and that weather conditions may contribute to their severity. However, quantifying the effects of pollution and weather is difficult because these effects are nonlinear. The problem is further complicated by the possibly cumulative nature of these effects. In this paper, nonparametric additive (NPA) models, which have the advantage of ease of interpretation and forecasting, are employed to model the effects of pollution and weather. All models are estimated by the local linear method. The variables in the final NPA model are chosen by cross-validation together with a bootstrap test, using data from Hong Kong. For comparison, a final linear regression (LR) model selected by backward elimination is also considered. Interestingly, the variables selected by the nonparametric method differ from those selected by the usual backward elimination method for linear models. Furthermore, a comparison of the forecasts from the NPA and LR models with the true values shows that the final NPA model outperforms the LR model.