ABSTRACT
Growing evidence has shown that the brain connectivity network is altered in complex diseases such as Alzheimer's disease (AD). Network comparison, also known as differential network analysis, is thus particularly powerful for revealing disease pathologies and identifying clinical biomarkers for medical diagnosis (classification). Data from neurophysiological measurements are multidimensional and in matrix form. Naive vectorization is inadequate because it ignores the structural information within the matrix. In this article, we adopt the Kronecker product covariance matrix framework to capture both the spatial and temporal correlations of the matrix-variate data, while the temporal covariance matrix is treated as a nuisance parameter. Recognizing that the strengths of network connections may vary across subjects, we develop an ensemble-learning procedure that identifies the differential interaction patterns of brain regions between the case group and the control group and simultaneously conducts medical diagnosis (classification) of the disease. Simulation studies are conducted to assess the performance of the proposed method. We apply the proposed procedure to the functional connectivity analysis of a functional magnetic resonance imaging study on AD. The hub nodes and differential interaction patterns identified are consistent with existing experimental studies, and satisfactory out-of-sample classification performance is achieved for medical diagnosis of AD.
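The Kronecker product covariance structure mentioned in this abstract can be illustrated with a minimal NumPy sketch. It assumes a p x q matrix observation (p brain regions, q time points) whose column-stacked vectorization has covariance equal to the Kronecker product of a temporal and a spatial covariance matrix. The function names, the simulated data and the simple averaging estimator are illustrative assumptions, not the authors' procedure.

```python
import numpy as np

def kronecker_covariance(sigma_spatial, sigma_temporal):
    """Covariance of vec(X) (columns stacked) for a p x q matrix-variate X."""
    return np.kron(sigma_temporal, sigma_spatial)

def sample_matrix_normal(rng, sigma_spatial, sigma_temporal, n):
    """Draw n zero-mean p x q matrices X = A Z B^T, where A A^T is the
    spatial covariance and B B^T is the temporal covariance."""
    p, q = sigma_spatial.shape[0], sigma_temporal.shape[0]
    a = np.linalg.cholesky(sigma_spatial)
    b = np.linalg.cholesky(sigma_temporal)
    z = rng.standard_normal((n, p, q))
    return a @ z @ b.T  # broadcasts over the leading sample axis

def spatial_covariance_estimate(x):
    """Average X X^T over samples and time points (hypothetical estimator).

    The expectation is sigma_spatial * trace(sigma_temporal) / q, so the
    temporal covariance only rescales the result (a nuisance parameter).
    """
    n, p, q = x.shape
    xc = x - x.mean(axis=0)
    return np.einsum("ipt,ikt->pk", xc, xc) / (n * q)
```

If the temporal covariance has a unit diagonal, the scale factor is one and the estimate targets the spatial covariance directly; otherwise the spatial covariance is identified only up to scale, which is the usual identifiability caveat for Kronecker models.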
Subjects
Alzheimer Disease, Brain, Alzheimer Disease/diagnostic imaging, Brain/diagnostic imaging, Humans, Magnetic Resonance Imaging/methods
ABSTRACT
Graphical models play an important role in neuroscience studies, particularly in brain connectivity analysis. Typically, observations/samples are from several heterogeneous groups and the group membership of each observation/sample is unavailable, which poses a great challenge for graph structure learning. In this paper, we propose a method which can achieve Simultaneous Clustering and Estimation of Heterogeneous Graphs (briefly denoted as SCEHG) for matrix-variate functional magnetic resonance imaging (fMRI) data. Unlike conventional clustering methods, which rely on the mean differences of various groups, the proposed SCEHG method fully exploits the group differences in conditional dependence relationships among brain regions for learning cluster structure. In essence, by constructing individual-level between-region network measures, we formulate clustering as penalized regression with grouping and sparsity pursuit, which transforms the unsupervised learning problem into a supervised one. A modified difference of convex programming with the alternating direction method of multipliers (DC-ADMM) algorithm is proposed to solve the corresponding optimization problem. We also propose a generalized criterion to specify the number of clusters. Extensive simulation studies illustrate the superiority of the SCEHG method over some state-of-the-art methods in terms of both clustering and graph recovery accuracy. We also apply the SCEHG procedure to analyze fMRI data associated with attention-deficit hyperactivity disorder (ADHD), which illustrates its empirical usefulness.
Subjects
Brain, Magnetic Resonance Imaging, Magnetic Resonance Imaging/methods, Brain/diagnostic imaging, Brain Mapping/methods, Computer Simulation, Algorithms, Cluster Analysis
ABSTRACT
Handling missing values in matrix data is an important step in data analysis. To date, many methods to estimate missing values based on data pattern similarity have been proposed. Most previously proposed methods perform missing value imputation based on data trends over the entire feature space. However, individual missing values are likely to show similarity to data patterns in local feature space. In addition, most existing methods focus on single class data, while multiclass analysis is frequently required in various fields. Missing value imputation for multiclass data must consider the characteristics of each class. In this paper, we propose two methods based on closed itemsets, CIimpute and ICIimpute, to achieve missing value imputation using local feature space for multiclass matrix data. CIimpute estimates missing values using closed itemsets extracted from each class. ICIimpute is an improved method of CIimpute in which an attribute reduction process is introduced. Experimental results demonstrate that attribute reduction considerably reduces computational time and improves imputation accuracy. Furthermore, it is shown that, compared to existing methods, ICIimpute provides superior imputation accuracy but requires more computational time.
ABSTRACT
With the rapid growth of neuroimaging technologies, a great effort has been dedicated recently to investigate the dynamic changes in brain activity. Examples include time course calcium imaging and dynamic brain functional connectivity. In this paper, we propose a novel nonparametric matrix response regression model to characterize the nonlinear association between 2D image outcomes and predictors such as time and patient information. Our estimation procedure can be formulated as a nuclear norm regularization problem, which can capture the underlying low-rank structure of the dynamic 2D images. We present a computationally efficient algorithm, derive the asymptotic theory, and show that the method outperforms other existing approaches in simulations. We then apply the proposed method to a calcium imaging study for estimating the change of fluorescent intensities of neurons, and an electroencephalography study for a comparison in the dynamic connectivity covariance matrices between alcoholic and control individuals. For both studies, the method leads to a substantial improvement in prediction error.
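The way nuclear norm regularization favors low-rank estimates can be sketched with its proximal operator, singular value soft-thresholding. This is a generic building block of many nuclear-norm solvers, shown here under the assumption of a simple denoising objective. It is not the algorithm of the paper, whose details are not given in the abstract, and the function names are illustrative.

```python
import numpy as np

def nuclear_norm(b):
    """Sum of the singular values of b."""
    return np.linalg.svd(b, compute_uv=False).sum()

def singular_value_threshold(y, lam):
    """Minimizer of 0.5 * ||B - Y||_F^2 + lam * ||B||_* over matrices B.

    Each singular value of Y is shrunk by lam and clipped at zero, so small
    singular values vanish and the result has reduced rank.
    """
    u, s, vt = np.linalg.svd(y, full_matrices=False)
    s_shrunk = np.maximum(s - lam, 0.0)
    return (u * s_shrunk) @ vt

def denoising_objective(b, y, lam):
    """Objective value that singular_value_threshold minimizes."""
    return 0.5 * np.sum((b - y) ** 2) + lam * nuclear_norm(b)
```

In a proximal gradient scheme for a regression loss, the same operator would be applied after each gradient step, with `lam` scaled by the step size.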
Subjects
Data Analysis, Computer-Assisted Image Processing, Brain/diagnostic imaging, Brain/physiology, Calcium, Humans, Computer-Assisted Image Processing/methods, Magnetic Resonance Imaging/methods, Neuroimaging
ABSTRACT
Brain functional connectivity reveals the synchronization of brain systems through correlations in neurophysiological measures of brain activities. Growing evidence now suggests that the brain connectivity network experiences alterations with the presence of numerous neurological disorders, thus differential brain network analysis may provide new insights into disease pathologies. The data from neurophysiological measurement are often multidimensional and in a matrix form, posing a challenge in brain connectivity analysis. Existing graphical model estimation methods either assume a vector normal distribution that in essence requires the columns of the matrix data to be independent or fail to address the estimation of differential networks across different populations. To tackle these issues, we propose an innovative matrix-variate differential network (MVDN) model. We exploit the D-trace loss function and a Lasso-type penalty to directly estimate the spatial differential partial correlation matrix and use an alternating direction method of multipliers algorithm for the optimization problem. Theoretical and simulation studies demonstrate that MVDN significantly outperforms other state-of-the-art methods in dynamic differential network analysis. We illustrate with a functional connectivity analysis of an attention deficit hyperactivity disorder dataset. The hub nodes and differential interaction patterns identified are consistent with existing experimental studies.
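A D-trace loss with a Lasso-type penalty can be sketched as follows. The sketch assumes the symmetrized D-trace form commonly used for differential networks, whose unpenalized minimizer is the difference of the two precision matrices, inv(S2) - inv(S1). It swaps in plain proximal gradient descent for the ADMM algorithm named in the abstract, and it works on generic covariance matrices rather than the spatial partial correlation matrices of MVDN. All names are illustrative.

```python
import numpy as np

def dtrace_loss(delta, s1, s2):
    """Symmetrized D-trace loss (assumed form):
    0.25 * (<S1 D, D S2> + <S2 D, D S1>) - <D, S1 - S2>."""
    quad = np.sum((s1 @ delta) * (delta @ s2)) + np.sum((s2 @ delta) * (delta @ s1))
    return 0.25 * quad - np.sum(delta * (s1 - s2))

def dtrace_grad(delta, s1, s2):
    """Gradient of dtrace_loss for symmetric s1 and s2."""
    return 0.5 * (s1 @ delta @ s2 + s2 @ delta @ s1) - (s1 - s2)

def soft_threshold(a, t):
    """Entrywise proximal operator of t * ||.||_1."""
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)

def fit_differential_network(s1, s2, lam, n_iter=500):
    """Proximal gradient (ISTA) for the L1-penalized loss; not ADMM."""
    # The Hessian is bounded above by the product of the largest eigenvalues.
    step = 1.0 / (np.linalg.eigvalsh(s1)[-1] * np.linalg.eigvalsh(s2)[-1])
    delta = np.zeros(s1.shape)
    for _ in range(n_iter):
        delta = soft_threshold(delta - step * dtrace_grad(delta, s1, s2), step * lam)
    return delta
```

Estimating the difference directly avoids inverting each covariance matrix separately, which is the main appeal of this kind of loss when only the difference is assumed to be sparse.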
Subjects
Brain, Magnetic Resonance Imaging, Algorithms, Brain/diagnostic imaging, Brain Mapping/methods, Magnetic Resonance Imaging/methods, Normal Distribution
ABSTRACT
This paper proposes a new method for identifying mixed gases, based on a convolutional neural network for time series classification. Motivated by the success of convolutional neural networks in computer vision, we apply them to classify time series data for five mixed gases collected by an array of eight MOX gas sensors. Existing convolutional neural networks are mostly designed for visual data and are rarely applied to gas data classification, where they have great limitations. We therefore propose mapping the time series data into image-like matrix data. Five convolutional neural networks (VGG-16, VGG-19, ResNet18, ResNet34 and ResNet50) were then used to classify the five mixed gases, and their results were compared. After adjusting the parameters of the convolutional neural networks, the final gas recognition rate is 96.67%. The experimental results show that the method classifies the gas data quickly and effectively and successfully combines gas time series data with classical convolutional neural networks, which provides a new idea for the identification of mixed gases.
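One plausible way to map multichannel sensor time series into an image-like matrix is sketched below. The abstract does not describe the actual mapping, so everything here is an assumption: each sensor becomes one image row, the time axis is resampled to a fixed width, and each row is min-max scaled to 8-bit gray levels. The function name and parameters are hypothetical.

```python
import numpy as np

def series_to_image(series, width):
    """Convert an (n_sensors, n_samples) array to an (n_sensors, width) uint8 image.

    Hypothetical mapping: linear resampling along time, then per-sensor
    min-max scaling to the range 0..255.
    """
    series = np.asarray(series, dtype=float)
    n_sensors, n_samples = series.shape
    old_t = np.linspace(0.0, 1.0, n_samples)
    new_t = np.linspace(0.0, 1.0, width)
    resampled = np.vstack([np.interp(new_t, old_t, row) for row in series])
    lo = resampled.min(axis=1, keepdims=True)
    span = resampled.max(axis=1, keepdims=True) - lo
    span[span == 0] = 1.0  # constant channels map to all zeros
    scaled = (resampled - lo) / span
    return np.round(255 * scaled).astype(np.uint8)
```

The resulting array could be tiled or resized to the input size an image classifier expects. Per-sensor scaling discards amplitude differences between sensors, which may or may not be desirable for gas identification.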
ABSTRACT
BACKGROUND: The chemometric processing of second-order chromatographic-spectral data is usually carried out with the aid of multivariate curve resolution-alternating least-squares (MCR-ALS). Recently, an alternative procedure was described based on the estimation of image moments for each data matrix and subsequent application of multiple linear regression after suitable variable selection. RESULTS: The analysis of both simulated and experimental data leads to the conclusion that the image moment method, although it can cope with the lack of chromatographic reproducibility across injections, performs well only in the absence of uncalibrated interferents. MCR-ALS, on the other hand, provides good analytical results in all studied situations, whether the test samples contain uncalibrated interferents or not. SIGNIFICANCE: The results help to assess the real usefulness of newly proposed methodologies for second-order calibration in the case of chromatographic-spectral data sets, especially when samples contain unexpected chemical constituents.
ABSTRACT
BACKGROUND: The chemometric processing of second-order chromatographic-spectral data is usually carried out with the aid of multivariate curve resolution-alternating least-squares (MCR-ALS). When baseline contributions occur in the data, the background profile retrieved with MCR-ALS may show abnormal lumps or negative dips at the position of the remaining component peaks. RESULTS: The phenomenon is shown to be due to remaining rotational ambiguity in the obtained profiles, as confirmed by the estimation of the boundaries of the range of feasible bilinear profiles. To avoid the abnormal features in the retrieved profile, a new background interpolation constraint is proposed and described in detail. Both simulated and experimental data are employed to support the need for the new MCR-ALS constraint. In the latter case, the estimated analyte concentrations agreed with those previously reported. SIGNIFICANCE: The developed procedure helps to reduce the extent of rotational ambiguity in the solution and to better interpret the results on physicochemical grounds.
ABSTRACT
We present a new measure for evaluating the performance of control charts in detecting abrupt changes in finite matrix sequences. The objective is to minimize the probability that the control chart fails to raise an alarm at the unknown change-point time for a given in-control average run length. We construct a control chart with dynamic control limits under different pre- and post-change distributions and prove its optimality. We validate the optimality of the proposed chart by conducting exhaustive experiments on both simulated and real-world data.
ABSTRACT
An optimized method for bacterial strain differentiation, based on a combination of Repeated Sequences and Whole Genome Alignment Differential Analysis (RS&WGADA), is presented in this report. In this analysis, 51 multidrug-resistant Acinetobacter baumannii strains from one hospital environment and patients from 14 hospital wards were classified on the basis of polymorphisms of repeated sequences located in the CRISPR region, variation in the gene encoding the EmrA-homologue of E. coli, and antibiotic resistance patterns, in combination with three newly identified polymorphic regions in the genomes of A. baumannii clinical isolates. Differential analysis of two similarity matrices between different genotypes and resistance patterns allowed us to distinguish three significant correlations (p < 0.05) between a 172 bp DNA insertion combined with resistance to chloramphenicol and gentamycin. Interestingly, 45 and 55 bp DNA insertions within the CRISPR region were identified and were combined during analyses with resistance/susceptibility to trimethoprim/sulfamethoxazole. Moreover, 184 or 1374 bp DNA length polymorphisms in the genomic region located upstream of the GTP cyclohydrolase I gene, associated mainly with imipenem susceptibility, were identified. In addition, considerable nucleotide polymorphism of the gene encoding the gamma/tau subunit of DNA polymerase III, an enzyme crucial for bacterial DNA replication, was discovered. The differentiation analysis performed using the above-described approach allowed us to monitor the distribution of A. baumannii isolates in different wards of the hospital over several years, indicating that the optimized method may be useful in hospital epidemiological studies, particularly in identifying the source of primary infections.
Subjects
Acinetobacter Infections, Acinetobacter baumannii, Bacterial Multiple Drug Resistance, Acinetobacter Infections/microbiology, Acinetobacter baumannii/drug effects, Acinetobacter baumannii/genetics, Anti-Bacterial Agents/pharmacology, Bacterial Multiple Drug Resistance/genetics, Hospitals, Humans, Whole Genome Sequencing
ABSTRACT
We propose a novel linear discriminant analysis (LDA) approach for the classification of high-dimensional matrix-valued data that commonly arises from imaging studies. Motivated by the equivalence of the conventional LDA and the ordinary least squares, we consider an efficient nuclear norm penalized regression that encourages a low-rank structure. Theoretical properties including a nonasymptotic risk bound and a rank consistency result are established. Simulation studies and an application to electroencephalography data show the superior performance of the proposed method over the existing approaches.
ABSTRACT
Alzheimer's disease is rapidly becoming endemic among people over the age of 65. A vital path towards reversing this ominous trend is the building of reliable diagnostic devices for definite and early diagnoses, in lieu of the longitudinal, usually inconclusive and non-generalizable methods currently in use. In this article, we present a survey of methods for mining pools of mass spectrometer saliva data in relation to diagnosing Alzheimer's disease. The computational methods provide new approaches for appropriately gleaning latent information from mass spectra data. They improve on traditional machine learning algorithms and are well suited to handling matrix data points, including solving problems beyond protein identification and biomarker discovery.
ABSTRACT
As the number of compounds and the volume of bioactivity data rapidly grow, advanced computational methods are required to study structure-activity relationships (SARs) on a large scale. Herein, the SAR matrix (SARM) methodology is described that was designed to systematically extract structural relationships between bioactive compounds from large databases, explore structure-activity relationships, and navigate multitarget activity spaces, which is one of the core tasks in chemogenomics. In addition, the SARM approach was designed to visualize structural and structure-activity relationships, which is often of critical importance for making this information available in an intuitive form for practical applications.