Results 1 - 6 of 6
1.
Energy Econ; 125: 106788, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37361516

ABSTRACT

Since the onset of the COVID-19 pandemic, energy price predictability has worsened. We evaluate the effectiveness of two classes of machine learning methods, shrinkage and forecast combination, for predicting crude oil spot prices before and during the COVID-19 pandemic. The results demonstrate that COVID-19 increased economic uncertainty and diminished the predictive capacity of numerous models. Shrinkage methods are generally regarded as delivering excellent out-of-sample forecast performance. During the COVID-19 period, however, combination methods provide more accurate forecasts than shrinkage methods. The reason is that the outbreak altered the correlations between specific predictors and crude oil prices, and shrinkage methods cannot detect this change, resulting in a loss of information.
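The abstract contrasts shrinkage estimators (such as the LASSO) with forecast combination (averaging the predictions of many simple models). A minimal sketch of the two approaches on synthetic data follows; the toy data, predictor count, and penalty value are illustrative assumptions, not the paper's setup.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(0)
n, p = 200, 8
X = rng.normal(size=(n, p))                          # toy predictors (illustrative only)
y = 0.5 * X[:, 0] + rng.normal(scale=0.5, size=n)    # toy target series

X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# Shrinkage: a single LASSO model over all predictors.
lasso = Lasso(alpha=0.1).fit(X_train, y_train)
shrinkage_forecast = lasso.predict(X_test)

# Combination: average the forecasts of one univariate model per predictor.
single_forecasts = [
    LinearRegression().fit(X_train[:, [j]], y_train).predict(X_test[:, [j]])
    for j in range(p)
]
combination_forecast = np.mean(single_forecasts, axis=0)

for name, pred in [("shrinkage", shrinkage_forecast),
                   ("combination", combination_forecast)]:
    rmse = np.sqrt(np.mean((y_test - pred) ** 2))
    print(f"{name} RMSE: {rmse:.3f}")
```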

2.
Ann Inst Stat Math; 67(1): 93-127, 2015 Feb 01.
Article in English | MEDLINE | ID: mdl-25620808

ABSTRACT

We consider model selection and estimation for partial spline models and propose a new regularization method in the context of smoothing splines. The regularization method has a simple yet elegant form, consisting of a roughness penalty on the nonparametric component and a shrinkage penalty on the parametric components, which can achieve function smoothing and sparse estimation simultaneously. We establish the convergence rate and oracle properties of the estimator under weak regularity conditions. Remarkably, the estimated parametric components are sparse and efficient, and the nonparametric component can be estimated at the optimal rate. The procedure also has attractive computational properties. Using the representer theorem for smoothing splines, we reformulate the objective function as a LASSO-type problem, which enables us to use the LARS algorithm to compute the solution path. We then extend the procedure to the situation where the number of predictors increases with the sample size and investigate its asymptotic properties in that context. Finite-sample performance is illustrated by simulations.
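The penalized objective is described only in words; one plausible form, assuming a linear parametric part x_i'beta, a smooth nonparametric part f, and an l1 shrinkage penalty (our notation, not the paper's), is:

```latex
\min_{\beta,\, f}\;\; \frac{1}{n}\sum_{i=1}^{n}\Bigl(y_i - x_i^{\top}\beta - f(t_i)\Bigr)^{2}
\;+\; \lambda_1 \int \bigl(f''(t)\bigr)^{2}\,dt
\;+\; \lambda_2 \sum_{j=1}^{d}\lvert\beta_j\rvert
```

Here the roughness term controls the smoothness of f, while the l1 term shrinks and sparsifies the parametric coefficients; this is the structure that permits a representer-theorem reformulation into a LASSO-type problem solvable by LARS.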

3.
Genet Epidemiol; 37(7): 704-14, 2013 Nov.
Article in English | MEDLINE | ID: mdl-23893343

ABSTRACT

To date, numerous genetic variants have been identified as associated with diverse phenotypic traits. However, identified associations generally explain only a small proportion of trait heritability, and the predictive power of models incorporating only known associated variants has been limited. Multiple regression is a popular framework in which to consider the joint effect of many genetic variants simultaneously. Ordinary multiple regression is seldom appropriate in the context of genetic data, due to the high dimensionality of the data and the correlation structure among the predictors. There has been a resurgence of interest in the use of penalised regression techniques to circumvent these difficulties. In this paper, we focus on ridge regression, a penalised regression approach that has been shown to offer good performance in multivariate prediction problems. One challenge in the application of ridge regression is the choice of the ridge parameter that controls the amount of shrinkage of the regression coefficients. We present a method to determine the ridge parameter based on the data, with the aim of good performance in high-dimensional prediction problems. We establish a theoretical justification for our approach, and demonstrate its performance on simulated genetic data and on a real data example. Fitting a ridge regression model to hundreds of thousands to millions of genetic variants simultaneously presents computational challenges. We have developed an R package, ridge, which addresses these issues. The package implements the automatic choice of ridge parameter presented in this paper and is freely available from CRAN.
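The paper's automatic, data-driven choice of the ridge parameter is implemented in the R package ridge; as a stand-in illustration only, the sketch below selects the penalty by cross-validation with scikit-learn (not the paper's rule) on a synthetic genotype-like matrix.

```python
import numpy as np
from sklearn.linear_model import RidgeCV

rng = np.random.default_rng(1)
n, p = 300, 2000                                      # many more predictors than samples
X = rng.binomial(2, 0.3, size=(n, p)).astype(float)   # toy 0/1/2 genotype matrix
beta = np.zeros(p)
beta[:20] = rng.normal(scale=0.2, size=20)            # a few true effects
y = X @ beta + rng.normal(size=n)

# Choose the ridge penalty from the data (cross-validation here, as a
# substitute for the automatic rule described in the abstract).
model = RidgeCV(alphas=np.logspace(-2, 4, 25)).fit(X, y)
print("selected ridge parameter:", model.alpha_)
```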


Subjects
Genetic Variation/genetics, Genetic Models, Phenotype, Algorithms, Bipolar Disorder/genetics, Genetic Predisposition to Disease/genetics, Genotype, Humans, Single Nucleotide Polymorphism/genetics, ROC Curve, Regression Analysis, Software
4.
Article in English | MEDLINE | ID: mdl-36674257

ABSTRACT

It is commonly recognized that setting a reasonable carbon price can promote the healthy development of a carbon trading market, so improving the accuracy of carbon price forecasting is especially important. In this paper, we propose and evaluate a hybrid carbon price prediction model based on so-called double shrinkage methods, which combine factor screening, dimensionality reduction, and model prediction. To verify the effectiveness and superiority of the proposed model, we conduct an empirical analysis on data from the Guangdong carbon trading market, with a sample period from 5 August 2013 to 25 March 2022. Several main findings emerge. First, the double shrinkage methods proposed in this paper yield more accurate predictions than various alternative models that apply factor screening or dimensionality reduction methods directly, as measured by R2, root-mean-square error (RMSE), and root absolute error (RAE). Second, LSTM-based double shrinkage methods outperform LR-based double shrinkage methods. Third, these findings are robust to the use of normalized data, different data frequencies, different carbon trading markets, and different dataset divisions. This study provides new ideas for carbon price prediction and may make theoretical and practical contributions to complex and non-linear time series analysis.
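The abstract does not spell out which estimators sit behind "factor screening" and "dimensionality reduction"; the sketch below chains LASSO screening, PCA, and a linear predictor as one plausible reading. The component choices are assumptions, and the toy series is synthetic rather than Guangdong carbon price data.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
n, p = 500, 30
X = rng.normal(size=(n, p))                              # toy candidate factors
y = X[:, :5] @ rng.normal(size=5) + rng.normal(scale=0.5, size=n)

X_tr, X_te, y_tr, y_te = X[:400], X[400:], y[:400], y[400:]

# Step 1 (screening): keep factors with nonzero LASSO coefficients.
screen = Lasso(alpha=0.05).fit(X_tr, y_tr)
keep = np.flatnonzero(screen.coef_)

# Step 2 (dimensionality reduction): PCA on the screened factors.
pca = PCA(n_components=min(3, keep.size)).fit(X_tr[:, keep])
Z_tr, Z_te = pca.transform(X_tr[:, keep]), pca.transform(X_te[:, keep])

# Step 3 (prediction): a linear model on the reduced factors
# (the paper also reports an LSTM variant, omitted here for brevity).
model = LinearRegression().fit(Z_tr, y_tr)
rmse = np.sqrt(np.mean((y_te - model.predict(Z_te)) ** 2))
print(f"kept {keep.size} factors, test RMSE = {rmse:.3f}")
```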


Subjects
Carbon, Commerce, Research Design, Forecasting
5.
Cancer Inform; 12: 143-53, 2013.
Article in English | MEDLINE | ID: mdl-23966761

ABSTRACT

BACKGROUND: Microarray techniques provide promising tools for cancer diagnosis using gene expression profiles. However, molecular diagnosis based on high-throughput platforms presents great challenges due to the overwhelming number of variables relative to the small sample size and the complex nature of multi-type tumors. Support vector machines (SVMs) have shown superior performance in cancer classification due to their ability to handle high-dimensional, low-sample-size data. The multi-class SVM algorithm of Crammer and Singer provides a natural framework for multi-class learning. Despite its effective performance, the procedure uses all variables without selection. In this paper, we propose to improve the procedure by imposing shrinkage penalties during learning to enforce solution sparsity. RESULTS: The original multi-class SVM of Crammer and Singer is effective for multi-class classification but does not conduct variable selection. We improve the method by introducing soft-thresholding-type penalties to incorporate variable selection into multi-class classification for high-dimensional data. The new methods were applied to simulated data and two cancer gene expression data sets. The results demonstrate that the new methods can select a small number of genes for building accurate multi-class classification rules. Furthermore, the important genes selected by the methods overlap significantly, suggesting general agreement among the different variable selection schemes. CONCLUSIONS: High accuracy and sparsity make the new methods attractive for cancer diagnostics with gene expression data and for defining targets of therapeutic intervention. AVAILABILITY: The source MATLAB code is available from http://math.arizona.edu/~hzhang/software.html.
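scikit-learn's LinearSVC implements the Crammer and Singer multi-class loss but does not combine it with a sparsity penalty, so the sketch below uses an l1-penalized one-vs-rest linear SVM to illustrate penalty-driven gene selection; it is a stand-in for, not an implementation of, the paper's soft-thresholded Crammer-Singer formulation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

# Toy "gene expression" data: many features, few informative ones.
X, y = make_classification(n_samples=120, n_features=500, n_informative=15,
                           n_classes=3, n_clusters_per_class=1, random_state=0)

# L1-penalized linear SVM (one-vs-rest): the penalty zeroes out most
# coefficients, so the selected "genes" are the columns carrying any
# nonzero weight across the three class-specific weight vectors.
clf = LinearSVC(penalty="l1", loss="squared_hinge", dual=False, C=0.5,
                max_iter=5000).fit(X, y)
selected = np.flatnonzero(np.any(clf.coef_ != 0, axis=0))
print(f"selected {selected.size} of {X.shape[1]} features")
```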

6.
J Biom Biostat; 1: 005, 2013 Jun 01.
Article in English | MEDLINE | ID: mdl-24511433

ABSTRACT

In this article, we present a selective overview of some recent developments in Bayesian model and variable selection methods for high-dimensional linear models. While most reviews in the literature cover conventional methods, we focus on recently developed methods that have proven successful for high-dimensional variable selection. First, we give a brief overview of the traditional model selection criteria (viz. Mallows' Cp, AIC, BIC, DIC), followed by a discussion of some recently developed methods (viz. EBIC, regularization) that have attracted considerable attention from statisticians. We then review high-dimensional Bayesian methods, with a particular emphasis on Bayesian regularization methods, which have been used extensively in recent years. We conclude by briefly addressing the asymptotic behavior of Bayesian variable selection methods for high-dimensional linear models under different regularity conditions.
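As a small worked illustration of the classical criteria the review starts from, the sketch below compares AIC and BIC across nested linear models on toy data (our own example, using statsmodels; not drawn from the article).

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 200
X = rng.normal(size=(n, 4))
y = 1.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(size=n)  # only x1, x2 matter

# Compare AIC/BIC across nested models with 1..4 predictors; both criteria
# penalize model size, BIC more heavily (log n vs 2 per parameter).
for k in range(1, 5):
    Xk = sm.add_constant(X[:, :k])
    fit = sm.OLS(y, Xk).fit()
    print(f"k={k}: AIC={fit.aic:.1f}  BIC={fit.bic:.1f}")
```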
