Results 1 - 20 of 37
1.
Mol Biol Evol ; 41(7)2024 Jul 03.
Article in English | MEDLINE | ID: mdl-38916040

ABSTRACT

Phylogenomic analyses of long sequences, consisting of many genes and genomic segments, reconstruct organismal relationships with high statistical confidence. However, inferred relationships can be sensitive to excluding just a few sequences. Currently, there is no direct way to identify fragile relationships and the associated individual gene sequences in species. Here, we introduce novel metrics for gene-species sequence concordance and clade probability derived from evolutionary sparse learning models. We validated these metrics using fungal, plant, and animal phylogenomic datasets, highlighting the ability of the new metrics to pinpoint fragile clades and the sequences responsible. The new approach does not necessitate the investigation of alternative phylogenetic hypotheses, substitution models, or repeated data subset analyses. Our methodology offers a streamlined approach to evaluating major inferred clades and identifying sequences that may distort reconstructed phylogenies using large datasets.


Subjects
Genomics, Phylogeny, Animals, Genomics/methods, Genetic Models, Molecular Evolution, Plants/genetics, Fungi/genetics
2.
BMC Bioinformatics ; 23(Suppl 3): 128, 2022 Apr 12.
Article in English | MEDLINE | ID: mdl-35413798

ABSTRACT

BACKGROUND: With the development of noninvasive imaging technology, collecting different imaging measurements of the same brain has become increasingly easy. These multimodal imaging data carry complementary information about the same brain, with specific and shared information intertwined. Within these multimodal data, it is essential to discriminate the specific information from the shared information, since doing so helps to comprehensively characterize brain diseases. Because most existing methods are not suited to this task, in this paper we propose a parameter decomposition based sparse multi-view canonical correlation analysis (PDSMCCA) method. PDSMCCA can identify both modality-shared and modality-specific information in multimodal data, leading to an in-depth understanding of the complex pathology of brain disease. RESULTS: Compared with the SMCCA method, our method obtains higher correlation coefficients and better canonical weights on both synthetic data and real neuroimaging data. This indicates that, coupled with modality-shared and modality-specific feature selection, PDSMCCA improves multi-view association identification and shows meaningful feature selection capability with desirable interpretability. CONCLUSIONS: The novel PDSMCCA confirms that parameter decomposition is a suitable strategy to identify both modality-shared and modality-specific imaging features. The multimodal association and the diverse information of multimodal imaging data enable us to better understand brain diseases such as Alzheimer's disease.


Subjects
Alzheimer Disease, Canonical Correlation Analysis, Algorithms, Alzheimer Disease/diagnostic imaging, Brain/diagnostic imaging, Humans, Magnetic Resonance Imaging/methods, Neuroimaging/methods
3.
Sensors (Basel) ; 22(18)2022 Sep 13.
Article in English | MEDLINE | ID: mdl-36146264

ABSTRACT

Space-time adaptive processing (STAP) is a well-known technique for slow-moving target detection in clutter-spreading environments. For an airborne conformal array radar, conventional STAP methods are unable to suppress clutter well because of geometry-induced range-dependent clutter, non-uniform spatial steering vectors, and polarization sensitivity. In this paper, a knowledge-aided STAP method based on sparse learning via iterative minimization (SLIM) combined with the Laplace distribution is proposed to improve STAP performance for a conformal array. The proposed method avoids the need to select a user parameter. It constructs a dictionary matrix composed of space-time steering vectors by using prior knowledge of the range cell under test (CUT) distributed along the clutter ridge. Then, the estimated sparse parameters and noise power can be used to calculate a relatively accurate clutter plus noise covariance matrix (CNCM). This method can achieve superior clutter suppression performance for a conformal array. Simulation results demonstrate the effectiveness of this method.

4.
Appl Soft Comput ; 115: 108088, 2022 Jan.
Article in English | MEDLINE | ID: mdl-34840541

ABSTRACT

The coronavirus disease 2019 (COVID-19) pandemic caused by the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has led to a sharp increase in hospitalized patients with multi-organ disease pneumonia. Early and automatic diagnosis of COVID-19 is essential to slow down the spread of this epidemic and reduce the mortality of patients infected with SARS-CoV-2. In this paper, we propose a joint multi-center sparse learning (MCSL) and decision fusion scheme exploiting chest CT images for automatic COVID-19 diagnosis. Specifically, considering the inconsistency of data in multiple centers, we first convert CT images into histogram of oriented gradient (HOG) images to reduce the structural differences between multi-center data and enhance the generalization performance. We then exploit a 3-dimensional convolutional neural network (3D-CNN) model to learn the useful information between and within 3D HOG image slices and extract multi-center features. Furthermore, we employ the proposed MCSL method, which learns the intrinsic structure between multiple centers and within each center and selects discriminative features to jointly train multi-center classifiers. Finally, we fuse the decisions made by these classifiers. Extensive experiments are performed on chest CT images from five centers to validate the effectiveness of the proposed method. The results demonstrate that the proposed method can improve COVID-19 diagnosis performance and outperform the state-of-the-art methods.

5.
Sensors (Basel) ; 21(19)2021 Sep 25.
Article in English | MEDLINE | ID: mdl-34640730

ABSTRACT

Deep learning models, especially recurrent neural networks (RNNs), have been successfully applied to automatic modulation classification (AMC) problems recently. However, deep neural networks are usually overparameterized, i.e., most of the connections between neurons are redundant. The large model size hinders the deployment of deep neural networks in applications such as Internet-of-Things (IoT) networks. Therefore, reducing parameters without compromising the network performance via sparse learning is often desirable since it can alleviate the computational and storage burdens of deep learning models. In this paper, we propose a sparse learning algorithm that can directly train a sparsely connected neural network based on the statistics of weight magnitude and gradient momentum. We first used the MNIST and CIFAR10 datasets to demonstrate the effectiveness of this method. Subsequently, we applied it to RNNs with different pruning strategies on recurrent and non-recurrent connections for AMC problems. Experimental results demonstrated that the proposed method can effectively reduce the parameters of the neural networks while maintaining model performance. Moreover, we show that appropriate sparsity can further improve network generalization ability.


Subjects
Internet of Things, Computer Neural Networks, Algorithms, Motion (Physics), Neurons
6.
BMC Bioinformatics ; 21(1): 182, 2020 May 11.
Article in English | MEDLINE | ID: mdl-32393178

ABSTRACT

BACKGROUND: In addition to causing the pandemic influenza outbreaks of 1918 and 2009, subtype H1N1 influenza A viruses (IAVs) have caused seasonal epidemics since 1977. The antigenic properties of influenza viruses are determined by both the protein sequence and the N-linked glycosylation of influenza glycoproteins, especially hemagglutinin (HA). Currently available computational methods consider only features in the protein sequence and not N-linked glycosylation. RESULTS: A multi-task learning sparse group least absolute shrinkage and selection operator (LASSO) (MTL-SGL) regression method was developed and applied to derive two types of predominant features, protein sequence and N-linked glycosylation in hemagglutinin (HA), affecting variations in serologic data for human and swine H1N1 IAVs. Results suggested that mutations and changes in N-linked glycosylation sites are associated with the rise of antigenic variants of H1N1 IAVs. Furthermore, the implicated mutations are predominantly located at five reported antibody-binding sites, and within or close to the HA receptor binding site. All three N-linked glycosylation sites (i.e., sequons NCSV at HA 54, NHTV at HA 125, and NLSK at HA 160) identified by MTL-SGL as determining antigenic changes were experimentally validated in the H1N1 antigenic variants using mass spectrometry analyses. Compared with conventional sparse learning methods, MTL-SGL achieved a lower prediction error and higher accuracy, indicating that grouped features and MTL in the MTL-SGL method are not only able to handle serologic data generated from multiple reagents, supplies, and protocols, but also perform better in genetic sequence-based antigenic quantification. CONCLUSIONS: In summary, the results of this study suggest that mutations and variations in N-glycosylation in HA caused antigenic variations in H1N1 IAVs and that the sequence-based antigenicity predictive model will be useful in understanding antigenic evolution of IAVs.


Subjects
Algorithms, Viral Antigens/immunology, Influenza Virus Hemagglutinin Glycoproteins/genetics, Influenza A Virus H1N1 Subtype/genetics, Influenza A Virus H1N1 Subtype/immunology, Mutation/genetics, Amino Acid Sequence, Animals, Base Sequence, Viral Genome, Glycosylation, Influenza Virus Hemagglutinin Glycoproteins/chemistry, Humans, Influenza A Virus/immunology, Human Influenza/virology, Polysaccharides/immunology, Reproducibility of Results, Swine
7.
Sensors (Basel) ; 20(17)2020 Aug 30.
Article in English | MEDLINE | ID: mdl-32872609

ABSTRACT

In recent years, a series of matching pursuit and hard thresholding algorithms have been proposed to solve the sparse representation problem with an ℓ0-norm constraint. In addition, some stochastic hard thresholding methods have been proposed, such as stochastic gradient hard thresholding (SG-HT) and stochastic variance reduced gradient hard thresholding (SVRGHT). However, each iteration of all these algorithms requires one hard thresholding operation, which leads to a high per-iteration complexity and slow convergence, especially for high-dimensional problems. To address this issue, we propose a new stochastic recursive gradient support pursuit (SRGSP) algorithm, in which only one hard thresholding operation is required in each outer-iteration. Thus, SRGSP has a significantly lower computational complexity than existing methods such as SG-HT and SVRGHT. Moreover, we also provide the convergence analysis of SRGSP, which shows that SRGSP attains a linear convergence rate. Our experimental results on large-scale synthetic and real-world datasets verify that SRGSP outperforms state-of-the-art related methods for tackling various sparse representation problems. Moreover, we conduct many experiments on two real-world sparse representation applications, image denoising and face recognition, and all the results also validate that our SRGSP algorithm obtains much better performance than other sparse representation learning optimization methods in terms of PSNR and recognition rates.
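
The hard thresholding operation that this abstract refers to can be illustrated with a small sketch. The code below is plain (deterministic) iterative hard thresholding for a least-squares loss, not the SRGSP algorithm itself; the function names `hard_threshold` and `iht`, the step size, and the toy data are assumptions made for illustration only.

```python
from typing import List


def hard_threshold(x: List[float], k: int) -> List[float]:
    """Keep the k largest-magnitude entries of x and zero out the rest."""
    if k <= 0:
        return [0.0] * len(x)
    # Indices sorted by decreasing magnitude; ties keep their original order.
    order = sorted(range(len(x)), key=lambda i: -abs(x[i]))
    keep = set(order[:k])
    return [x[i] if i in keep else 0.0 for i in range(len(x))]


def iht(A: List[List[float]], y: List[float], k: int,
        step: float, iters: int) -> List[float]:
    """Plain iterative hard thresholding for
    min 0.5 * ||A x - y||^2  subject to  ||x||_0 <= k.
    (Illustrative baseline only; this is not SRGSP.)"""
    n = len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # Residual r = A x - y.
        r = [sum(a * xi for a, xi in zip(row, x)) - yi
             for row, yi in zip(A, y)]
        # Gradient of the least-squares loss is A^T r.
        g = [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(n)]
        # Gradient step followed by one hard thresholding operation.
        x = hard_threshold([xj - step * gj for xj, gj in zip(x, g)], k)
    return x
```

Note that this baseline applies one thresholding per iteration, which is the per-iteration cost the abstract says SRGSP reduces by thresholding only once per outer iteration.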

8.
Sensors (Basel) ; 20(16)2020 Aug 08.
Article in English | MEDLINE | ID: mdl-32784460

ABSTRACT

As a spontaneous facial expression, a micro-expression can reveal the psychological responses of human beings. Thus, micro-expression recognition has been widely studied for its potential in clinical diagnosis, psychological research, and security. However, micro-expression recognition is a formidable challenge due to the short duration and low intensity of the facial actions. In this paper, a sparse spatiotemporal descriptor for micro-expression recognition is developed by using the Enhanced Local Cube Binary Pattern (Enhanced LCBP). The proposed Enhanced LCBP is composed of three complementary binary features: Spatial Difference Local Cube Binary Patterns (Spatial Difference LCBP), Temporal Direction Local Cube Binary Patterns (Temporal Direction LCBP), and Temporal Gradient Local Cube Binary Patterns (Temporal Gradient LCBP). Enhanced LCBP thus provides binary features with spatiotemporal domain complementarity to capture subtle facial changes. In addition, because redundant information among the division grids affects the ability of descriptors to distinguish micro-expressions, Multi-Regional Joint Sparse Learning is designed to perform feature selection over the division grids, thus paying more attention to the critical local regions. Finally, a Multi-kernel Support Vector Machine (SVM) is employed to fuse the selected features for the final classification. The proposed method exhibits clear advantages and achieves promising results on four spontaneous micro-expression datasets. Further parameter evaluation and confusion matrix analysis support the sufficiency and effectiveness of the proposed method.


Subjects
Automated Facial Recognition, Facial Expression, Support Vector Machine, Face, Humans
9.
Sensors (Basel) ; 20(18)2020 Sep 04.
Article in English | MEDLINE | ID: mdl-32899751

ABSTRACT

This study focuses on driver-behavior identification and its application to finding embedded solutions in a connected car environment. We present a lightweight, end-to-end deep-learning framework for performing driver-behavior identification using in-vehicle controller area network (CAN-BUS) sensor data. The proposed method outperforms the state-of-the-art driver-behavior profiling models. In particular, it exhibits significantly reduced computations (i.e., reduced numbers of both floating-point operations and parameters), more efficient memory usage (compact model size), and less inference time. The proposed architecture features depth-wise convolution, along with augmented recurrent neural networks (long short-term memory or gated recurrent unit), for time-series classification. The minimum time-step length (window size) required in the proposed method is significantly lower than that required by recent algorithms. We compared our results with compressed versions of existing models by applying efficient channel pruning on several layers of current models. Furthermore, our network can adapt to new classes using sparse-learning techniques, that is, by freezing relatively strong nodes at the fully connected layer for the existing classes and improving the weaker nodes by retraining them using data regarding the new classes. We successfully deploy the proposed method in a container environment using NVIDIA Docker in an embedded system (Xavier, TX2, and Nano) and comprehensively evaluate it with regard to numerous performance metrics.

10.
Mol Pharm ; 16(7): 3157-3166, 2019 07 01.
Article in English | MEDLINE | ID: mdl-31136190

ABSTRACT

As microRNAs (miRNAs) have been reported to be a novel type of high-value small molecule (SM) drug target for disease treatment, many researchers are exploring new SM-miRNA associations. Nevertheless, the high cost of traditional biological experiments limits the efficiency of discovering new associations between SMs and miRNAs. Therefore, as an important auxiliary tool, reliable computational models will be of great help in revealing SM-miRNA associations. In this article, we developed a computational model of sparse learning and heterogeneous graph inference for small molecule-miRNA association prediction (SLHGISMMA). Initially, the sparse learning method (SLM) was implemented to decompose the SM-miRNA adjacency matrix. Then, we integrated the reacquired association information together with the similarity information of SMs and miRNAs into a heterogeneous graph to infer potential SM-miRNA associations. Here, the main innovation of SLHGISMMA lies in the introduction of SLM to eliminate, to some extent, noise in the original adjacency matrix, which plays an important role in performance improvement. In addition, to assess SLHGISMMA's performance, four different kinds of cross-validation were performed based on two datasets. As a result, based on dataset 1 (dataset 2), SLHGISMMA achieved areas under the curve of 0.9273 (0.7774), 0.9365 (0.7973), 0.7703 (0.6556), and 0.9241 ± 0.0052 (0.7724 ± 0.0032) in global leave-one-out cross-validation (LOOCV), miRNA-fixed local LOOCV, SM-fixed local LOOCV, and 5-fold cross-validation, respectively. Moreover, in the case study on three important SMs via removing their known associations, the results showed that most of the top 50 predicted miRNAs were confirmed by the database SM2miR v1.0 or the experimental literature.


Subjects
Computational Biology/methods, Decitabine/therapeutic use, Estradiol/therapeutic use, Fluorouracil/therapeutic use, MicroRNAs/metabolism, Neoplasms/drug therapy, Neoplasms/metabolism, Algorithms, Area Under Curve, Computer Simulation, Humans, ROC Curve
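
The area-under-the-curve figures in this entry measure how often a held-out known association is scored above an unknown pair. The sketch below computes that quantity directly from scores and binary labels; it is a generic illustration, not the authors' evaluation code, and the function name `auc` and the example scores are assumptions.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive is ranked higher.
    Ties count as half a win."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

With hypothetical scores `[0.9, 0.2, 0.8, 0.1]` and labels `[1, 1, 0, 0]`, three of the four positive-negative pairs are ordered correctly, so the function returns 0.75. The pairwise loop is quadratic in the number of samples, which is acceptable for a sketch but slower than a rank-based computation on large datasets.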
11.
Sensors (Basel) ; 16(10)2016 Oct 17.
Article in English | MEDLINE | ID: mdl-27763511

ABSTRACT

During the acquisition process, hyperspectral images (HSI) are inevitably corrupted by various kinds of noise, which greatly affect their visual quality and subsequent applications. In this paper, a novel Bayesian approach integrating hierarchical sparse learning and spectral-spatial information is proposed for HSI denoising. Based on the structure correlations, spectral bands with similar and continuous features are segmented into the same band-subset. To exploit local similarity, each subset is then divided into overlapping cubic patches. All patches can be regarded as consisting of a clean image component, a Gaussian noise component, and a sparse noise component. The first term is depicted by a linear combination of dictionary elements, where a Gaussian process with Gamma distribution is applied to impose spatial consistency on the dictionary. The last two terms are utilized to fully depict the noise characteristics. Furthermore, the sparseness of the model is adaptively manifested through a Beta-Bernoulli process. Calculated by a Gibbs sampler, the proposed model can directly predict the noise and dictionary without a priori information about the degraded HSI. The experimental results on both synthetic and real HSI demonstrate that the proposed approach suppresses the various kinds of noise and preserves the structure/spectral-spatial information better than the compared state-of-the-art approaches.

12.
Neuroimage ; 100: 91-105, 2014 Oct 15.
Article in English | MEDLINE | ID: mdl-24911377

ABSTRACT

Recent studies on AD/MCI diagnosis have shown that the tasks of identifying brain disease and predicting clinical scores are highly related to each other. Furthermore, it has been shown that feature selection with manifold learning or a sparse model can handle the problems of high feature dimensionality and small sample size. However, the tasks of clinical score regression and clinical label classification were often conducted separately in previous studies. Regarding feature selection, to the best of our knowledge, most previous work considered a loss function defined as an element-wise difference between the target values and the predicted ones. In this paper, we consider the problems of joint regression and classification for AD/MCI diagnosis and propose a novel matrix-similarity based loss function that uses high-level information inherent in the target response matrix and requires that information to be preserved in the predicted response matrix. The newly devised loss function is combined with a group lasso method for joint feature selection across tasks, i.e., predictions of clinical scores and a class label. To validate the effectiveness of the proposed method, we conducted experiments on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset and showed that the newly devised loss function helped enhance the performance of both clinical score prediction and disease status identification, outperforming the state-of-the-art methods.


Subjects
Alzheimer Disease/diagnosis, Brain/pathology, Cognitive Dysfunction/diagnosis, Statistical Data Interpretation, Mathematical Computing, Neuroimaging/methods, Aged, Aged 80 and over, Alzheimer Disease/cerebrospinal fluid, Alzheimer Disease/diagnostic imaging, Brain/diagnostic imaging, Cognitive Dysfunction/cerebrospinal fluid, Cognitive Dysfunction/diagnostic imaging, Computer Simulation, Female, Humans, Magnetic Resonance Imaging, Male, Prognosis, Radionuclide Imaging
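
The group lasso step in this entry selects features jointly across tasks: each feature owns one row of the weight matrix (one weight per task), and the penalty either shrinks the whole row or removes it. A minimal sketch of the standard row-wise proximal operator follows. It illustrates the generic group lasso mechanism only, not the matrix-similarity loss proposed in the paper; the name `group_soft_threshold` and the example matrix are assumptions.

```python
import math
from typing import List


def group_soft_threshold(W: List[List[float]], lam: float) -> List[List[float]]:
    """Proximal operator of lam * (sum of row-wise Euclidean norms).

    Rows are features and columns are tasks, so zeroing a row drops
    that feature for every task at once."""
    out = []
    for row in W:
        norm = math.sqrt(sum(v * v for v in row))
        # Rows with norm <= lam are zeroed; longer rows shrink toward zero.
        scale = max(0.0, 1.0 - lam / norm) if norm > 0.0 else 0.0
        out.append([scale * v for v in row])
    return out
```

For example, with `lam = 1.0` the row `[3.0, 4.0]` (norm 5) is scaled by 0.8, while the row `[0.3, 0.4]` (norm 0.5) is set to zero for both tasks.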
13.
Neural Netw ; 178: 106407, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38823068

ABSTRACT

Support tensor machine (STM), as a higher-order extension of support vector machine, is adept at addressing tensorial data classification problems: it maintains the inherent structure in tensors and mitigates the curse of dimensionality. However, it needs to resort to the alternating projection iterative technique, which is very time-consuming. To overcome this shortcoming, we propose an efficient sequential safe static and dynamic screening rule (SS-SDSR) for accelerating STM in this paper. Its main idea is to reduce every projection iterative sub-model by identifying and deleting the redundant variables before and during the training process without sacrificing accuracy. Its construction mainly consists of two parts: (1) The static screening rule and dynamic screening rule are first built based on the variational inequality and duality gap, respectively. (2) The sequential screening process is achieved by using the static screening rule with the different adjacent parameters and applying the dynamic screening rule under the same parameter. In the experiments, on the one hand, to verify the influence of different parameter intervals, screening frequencies, and forms of data on the effectiveness of our method, three experiments on artificial datasets are conducted, which indicate that our method is effective for any form of data when the parameter interval is small and the screening frequency is appropriate. On the other hand, to demonstrate the feasibility and validity of our SS-SDSR, numerical experiments on eleven vector-based datasets and six tensor-based datasets are conducted and compared with five other algorithms. Experimental results illustrate the effectiveness and safety of our SS-SDSR.


Subjects
Support Vector Machine, Algorithms, Computer Neural Networks, Humans
14.
Neural Netw ; 175: 106295, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38614023

ABSTRACT

Multi-view unsupervised feature selection (MUFS) is an efficient approach for dimensionality reduction of heterogeneous data. However, existing MUFS approaches mostly assign the samples the same weight, so the diversity of samples is not utilized efficiently. Additionally, due to the presence of various regularizations, the resulting MUFS problems are often non-convex, making it difficult to find the optimal solutions. To address these issues, a novel MUFS method named Self-paced Regularized Adaptive Multi-view Unsupervised Feature Selection (SPAMUFS) is proposed. Specifically, the proposed approach first trains the MUFS model with simple samples and gradually learns complex samples by using a self-paced regularizer. l2,p-norm (0

Subjects
Algorithms, Unsupervised Machine Learning, Humans, Computer Neural Networks
15.
Neural Netw ; 167: 775-786, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37729791

ABSTRACT

Much mathematical effort has been devoted to developing Principal Component Analysis (PCA), which is the most popular feature extraction method. To suppress the negative effect of noise on PCA performance, there have been extensive studies and applications of a large number of robust PCAs achieving outstanding results. However, existing methods suffer from at least two shortcomings: (1) They expressed PCA as a reconstruction model measured by Euclidean distance, which only considers the relationship between the data and its reconstruction and ignores the differences between different data points; (2) They did not consider the class-specificity distribution information contained in the data itself, thus lacking discriminative properties. To overcome the above problems, we propose a Sparse Discriminant Principal Components Analysis (SDPCA) model based on contrastive learning and class-specificity distribution. Specifically, we use contrastive learning to measure the relationship between samples and their reconstructions, which fully takes the discriminative information between data into account in PCA. In order to make the extracted low-dimensional features profoundly reflect the class-specificity distribution of the data, we minimize the squared ℓ1,2-norm of the low-dimensional embedding. In addition, to reduce the effects of redundant features and noise and to improve the interpretability of PCA at the same time, we impose sparsity constraints on the projection matrix using the squared ℓ1,2-norm. Our experimental results on different types of benchmark databases demonstrate that our model has state-of-the-art performance.


Subjects
Machine Learning, Principal Component Analysis, Factual Databases
16.
Comput Biol Med ; 152: 106367, 2023 01.
Article in English | MEDLINE | ID: mdl-36516575

ABSTRACT

Alzheimer's disease (AD) is highly prevalent and a significant cause of dementia and death in elderly individuals. Motivated by breakthroughs in multi-task learning (MTL), efforts have been made to extend MTL to improve Alzheimer's disease cognitive score prediction by exploiting structure correlation. Though important and well-studied, three key aspects are yet to be fully handled in a unified framework: (i) appropriately modeling the inherent task relationship; (ii) fully exploiting the task relatedness by considering the underlying feature structure; and (iii) automatically determining the weight of each task. To this end, we present the Bi-Graph guided self-Paced Multi-Task Feature Learning (BGP-MTFL) framework for exploring the relationship among multiple tasks to improve overall learning performance of cognitive score prediction. The framework consists of two correlation regularizations for features and tasks, ℓ2,1 regularization, and a self-paced learning scheme. Moreover, we design an efficient optimization method to solve the non-smooth objective function of our approach based on the Alternating Direction Method of Multipliers (ADMM) combined with accelerated proximal gradient (APG). The proposed model is comprehensively evaluated on the Alzheimer's Disease Neuroimaging Initiative (ADNI) datasets. Overall, the proposed algorithm achieves an nMSE (normalized Mean Squared Error) of 3.923 and a wR (weighted R-value) of 0.416 for predicting eighteen cognitive scores. The empirical study demonstrates that the proposed BGP-MTFL model outperforms the state-of-the-art AD prediction approaches and enables identifying more stable biomarkers.


Subjects
Alzheimer Disease, Cognitive Dysfunction, Humans, Aged, Alzheimer Disease/diagnostic imaging, Magnetic Resonance Imaging/methods, Neuroimaging/methods, Learning, Cognition
17.
Patterns (N Y) ; 4(12): 100890, 2023 Dec 08.
Article in English | MEDLINE | ID: mdl-38106611

ABSTRACT

Predictive pattern mining is an approach used to construct prediction models when the input is represented by structured data, such as sets, graphs, and sequences. The main idea behind predictive pattern mining is to build a prediction model by considering substructures, such as subsets, subgraphs, and subsequences (referred to as patterns), present in the structured data as features of the model. The primary challenge in predictive pattern mining lies in the exponential growth of the number of patterns with the complexity of the structured data. In this study, we propose the safe pattern pruning method to address the explosion of pattern numbers in predictive pattern mining. We also discuss how it can be effectively employed throughout the entire model building process in practical data analysis. To demonstrate the effectiveness of the proposed method, we conduct numerical experiments on regression and classification problems involving sets, graphs, and sequences.

18.
Neural Netw ; 155: 523-535, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36166979

ABSTRACT

The L1-regularized regression with Kullback-Leibler divergence (KL-L1R) is a popular regression technique. Although many efforts have been devoted to its efficient implementation, it remains challenging when the number of features is extremely large. In this paper, to accelerate KL-L1R, we introduce a novel and fast sequential safe feature elimination rule (FER) based on its sparsity, local regularity properties, and duality theory. It takes negligible time to select and delete most redundant features before and during the training process. Only one reduced model needs to be solved, which shortens the computational time. To further speed up solving the reduced model, the Newton coordinate descent method (Newton-CDM) is chosen as the solver. The key advantage of FER is its safety, i.e., its solution is exactly the same as that of the original KL-L1R. Numerical experiments on three artificial datasets, five real-world datasets, and one handwritten digit dataset demonstrate the feasibility and validity of our FER.


Subjects
Algorithms
19.
Neural Netw ; 156: 160-169, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36270199

ABSTRACT

Fully connected deep neural networks (DNN) often include redundant weights leading to overfitting and high memory requirements. Additionally, in tabular data classification, DNNs are challenged by the often superior performance of traditional machine learning models. This paper proposes periodic perturbations (prune and regrow) of DNN weights, especially at the self-supervised pre-training stage of deep autoencoders. The proposed weight perturbation strategy outperforms dropout learning or weight regularization (L1 or L2) for four out of six tabular data sets in downstream classification tasks. Unlike dropout learning, the proposed weight perturbation routine additionally achieves 15% to 40% sparsity across six tabular data sets, resulting in compressed pretrained models. The proposed pretrained model compression improves the accuracy of downstream classification, unlike traditional weight pruning methods that trade off performance for model compression. Our experiments reveal that a pretrained deep autoencoder with weight perturbation can outperform traditional machine learning in tabular data classification, whereas baseline fully-connected DNNs yield the worst classification accuracy. However, traditional machine learning models are superior to any deep model when a tabular data set contains uncorrelated variables. Therefore, the performance of deep models with tabular data is contingent on the types and statistics of constituent variables.


Subjects
Data Compression, Computer Neural Networks, Machine Learning, Physical Phenomena
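
The "prune and regrow" perturbation named in this entry can be sketched on a flat list of weights with a boolean mask marking which connections are active. The abstract does not say how connections are chosen for regrowth, so the sketch assumes magnitude-based pruning and uniformly random regrowth that keeps the number of active weights constant; the function name `prune_and_regrow` and the `prune_frac` parameter are hypothetical.

```python
import random
from typing import List


def prune_and_regrow(weights: List[float], mask: List[bool],
                     prune_frac: float, rng: random.Random) -> List[bool]:
    """One perturbation step: deactivate the smallest-magnitude active
    weights, then reactivate as many inactive positions chosen at random.
    (Assumed scheme for illustration; not the paper's exact routine.)"""
    active = [i for i, m in enumerate(mask) if m]
    inactive = [i for i, m in enumerate(mask) if not m]
    n_prune = int(prune_frac * len(active))
    # Prune: the active weights with the smallest absolute value.
    pruned = sorted(active, key=lambda i: abs(weights[i]))[:n_prune]
    # Regrow: sample from positions that were inactive before this step.
    regrown = rng.sample(inactive, min(n_prune, len(inactive)))
    new_mask = list(mask)
    for i in pruned:
        new_mask[i] = False
    for i in regrown:
        new_mask[i] = True
    return new_mask
```

In a training loop this step would be applied periodically, with the mask multiplied into the weights after each update so that inactive connections stay at zero.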
20.
Comput Biol Med ; 140: 105090, 2021 Dec 01.
Article in English | MEDLINE | ID: mdl-34875406

ABSTRACT

Alzheimer's disease (AD) is a gradually progressive neurodegenerative disease affecting cognitive functions. Predicting the cognitive scores from neuroimaging measures and identifying relevant imaging biomarkers are important research topics in the study of AD. Despite machine learning algorithms having many successful applications, the prediction model suffers from the so-called curse of dimensionality. Multi-task feature learning (MTFL) has helped tackle this problem by incorporating the correlations among multiple clinical cognitive scores. However, MTFL neglects the inherent correlation among brain imaging measures. In order to better predict the cognitive scores and identify stable biomarkers, we first propose a generalized multi-task formulation framework that incorporates the task and feature correlation structures simultaneously. Second, we present a novel feature-aware sparsity-inducing norm (FAS-norm) penalty to incorporate a useful correlation between brain regions by exploiting correlations among features. Three multi-task learning models that incorporate the FAS-norm penalty are proposed following our framework. Finally, an algorithm based on the alternating direction method of multipliers (ADMM) is developed to optimize the non-smooth problems. We comprehensively evaluate the proposed models on the cross-sectional and longitudinal Alzheimer's Disease Neuroimaging Initiative datasets. The inputs are the thickness measures and the volume measures of the cortical regions of interest. Compared with MTFL, our methods achieve an average decrease of 4.28% in overall error in the cross-sectional analysis and an average decrease of 7.97% in the Alzheimer's Disease Assessment Scale cognitive total score longitudinal analysis. Moreover, our methods identify sensitive and stable biomarkers for physicians, such as the hippocampus, lateral ventricle, and corpus callosum.
