Results 1 - 20 of 65
1.
J Integr Neurosci ; 23(4): 81, 2024 Apr 17.
Article in English | MEDLINE | ID: mdl-38682217

ABSTRACT

BACKGROUND: Alzheimer's disease (AD) is an irreversible primary brain disease with insidious onset. The rise of imaging genetics research has led numerous researchers to examine the complex association between genes and brain phenotypes from the perspective of computational biology. METHODS: Most previous studies have assumed that imaging data and genetic data are linearly related and therefore cannot explore their nonlinear relationship. To address this, our study applied a joint deep semi-supervised nonnegative matrix factorization (JDSNMF) algorithm, which jointly decomposes multimodal imaging genetics data into a shared basis matrix and multiple feature matrices. During the decomposition, a multilayer nonlinear transformation of the coefficient matrix is performed by a neural network to capture nonlinear features. RESULTS: Results on a real dataset demonstrated that the algorithm can fully exploit the association between strongly correlated imaging genetics data and effectively detect biomarkers of AD. Our results may provide a reference for identifying biologically significant imaging-genetic correlations and help elucidate disease-related mechanisms. CONCLUSIONS: The diagnostic model constructed from the top features of the three data modalities mined by the algorithm has high accuracy, and these features are promising new therapeutic targets for AD.
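
For readers unfamiliar with joint factorizations, the sketch below shows the shared-basis idea that JDSNMF builds on: several modality matrices are factored against one common basis via standard multiplicative updates. This is a minimal illustration only; the neural-network layers that give JDSNMF its deep, nonlinear structure and its semi-supervision are omitted, and all names and shapes are hypothetical.

```python
import numpy as np

def joint_nmf(Xs, k, iters=200, eps=1e-9):
    """Factor each modality X_i (n x m_i) as W @ H_i with one shared basis W (n x k)."""
    n = Xs[0].shape[0]
    rng = np.random.default_rng(0)
    W = rng.random((n, k))
    Hs = [rng.random((k, X.shape[1])) for X in Xs]
    for _ in range(iters):
        for i, X in enumerate(Xs):
            # multiplicative update for each modality's feature matrix
            Hs[i] *= (W.T @ X) / (W.T @ W @ Hs[i] + eps)
        # shared basis update pools gradients over all modalities
        num = sum(X @ H.T for X, H in zip(Xs, Hs))
        den = sum(W @ H @ H.T for H in Hs)
        W *= num / (den + eps)
    return W, Hs
```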


Subject(s)
Alzheimer Disease, Machine Learning, Alzheimer Disease/genetics, Alzheimer Disease/diagnostic imaging, Humans, Neuroimaging/methods, Genetic Markers, Magnetic Resonance Imaging, Brain/diagnostic imaging, Brain/metabolism, Algorithms, Aged, Neural Networks, Computer
2.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33834202

ABSTRACT

The low capture rate of expressed RNAs in single-cell sequencing technology is one of the major obstacles to downstream functional genomics analyses. Recently, a number of imputation methods have emerged for single-cell transcriptome data; however, recovering missing values in very sparse expression matrices remains a substantial challenge. Here, we propose a new algorithm, WEDGE (WEighted Decomposition of Gene Expression), to impute gene expression matrices using a biased low-rank matrix decomposition method. WEDGE successfully recovered expression matrices, reproduced the cell-wise and gene-wise correlations, and improved the clustering of cells, performing impressively on sparse datasets. Overall, this study presents a potent approach for imputing sparse expression matrix data, and our WEDGE algorithm should help many researchers to more profitably explore the biological meanings embedded in their single-cell RNA sequencing datasets. The source code of WEDGE has been released at https://github.com/QuKunLab/WEDGE.
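
As a rough sketch of the underlying idea, the loop below performs generic iterative low-rank imputation: alternate a truncated SVD with re-imposing the observed entries. This is not WEDGE's biased weighting scheme (see the paper and repository for the real algorithm); it only shows the low-rank completion mechanic it builds on.

```python
import numpy as np

def iterative_svd_impute(X, mask, rank, iters=100):
    """Fill missing entries (mask == False) with a rank-r SVD reconstruction."""
    Z = np.where(mask, X, 0.0)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        Z = np.where(mask, X, low_rank)  # keep observed values, impute the rest
    return Z
```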


Asunto(s)
Algoritmos , Biología Computacional/métodos , Perfilación de la Expresión Génica/métodos , RNA-Seq/métodos , Análisis de la Célula Individual/métodos , COVID-19/sangre , COVID-19/genética , COVID-19/virología , Análisis por Conglomerados , Simulación por Computador , Genómica/métodos , Humanos , Leucocitos Mononucleares/clasificación , Leucocitos Mononucleares/metabolismo , Reproducibilidad de los Resultados , SARS-CoV-2/fisiología , Índice de Severidad de la Enfermedad
3.
BMC Med Res Methodol ; 23(1): 15, 2023 01 16.
Article in English | MEDLINE | ID: mdl-36647014

ABSTRACT

INTRODUCTION: Surveys are common research tools, and questionnaire revisions are a common occurrence in longitudinal studies. Revisions can, at times, introduce systematic shifts in measures of interest. We formulate questionnaire revision as a stochastic process with transition matrices; revision shifts can therefore be reduced by first estimating these transition matrices and then using them when estimating the measures of interest. MATERIALS AND METHOD: An ideal survey response model is defined by mapping the true value of a participant's response to an interval on the grouped-data scale. A population completing surveys multiple times is modeled with multiple stochastic processes, including processes for the true values and for the intervals. While multiple factors contribute to changes in survey responses, here we explore a method that can mitigate the effects of questionnaire revision. We propose the Version Alignment Method (VAM), a data preprocessing tool that separates revision-related transitions from all transitions by solving an optimization problem, and then uses the revision-related transitions to remove the revision effect. To verify VAM, we used simulated data to study the estimation error, and a real-life MJ dataset containing a large volume of long-term questionnaire responses spanning several questionnaire revisions to study its feasibility. RESULT: We compared the difference in the annual average between consecutive years. Without adjustment, the difference was 0.593 when a revision occurred; VAM brought it down to 0.115, whereas differences between years without a revision ranged from 0.005 to 0.125. Furthermore, our method maps responses onto the same set of intervals, making it possible to compare the relative frequencies of items before and after revisions. The average estimation error in the L-infinity norm was 0.0044, which fell within the 95% CI constructed by bootstrap analysis. CONCLUSION: Questionnaire revisions can induce different response biases and information loss, causing inconsistencies in the estimated measures. Conventional methods can only partly remedy this issue. Our proposal, VAM, estimates the aggregate difference across all revision-related systematic errors and reduces these differences, thereby reducing inconsistencies in the final estimates of longitudinal studies.
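
To make the transition-matrix idea concrete, here is a toy illustration (all numbers hypothetical, and this is not the VAM optimization itself): given an estimated column-stochastic matrix T mapping old-scale intervals to revised-scale intervals, a pre-revision response distribution consistent with the post-revision frequencies can be recovered by non-negative least squares.

```python
import numpy as np
from scipy.optimize import nnls

# Hypothetical 4-interval old scale mapped onto a 3-interval revised scale.
# Column j gives P(new interval | old interval j); columns sum to 1.
T = np.array([[1.0, 0.3, 0.0, 0.0],
              [0.0, 0.7, 0.8, 0.0],
              [0.0, 0.0, 0.2, 1.0]])
q = np.array([0.25, 0.55, 0.20])   # observed post-revision frequencies

p, _ = nnls(T, q)                  # nonnegative solution of T p ~ q
p /= p.sum()                       # renormalize to a distribution
```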


Subject(s)
Prosthesis Failure, Humans, Time, Surveys and Questionnaires, Reoperation
4.
Sensors (Basel) ; 23(4)2023 Feb 11.
Article in English | MEDLINE | ID: mdl-36850660

ABSTRACT

Anomaly detection in hyperspectral remote sensing data has recently become more attractive in hyperspectral image processing. The low-rank and sparse matrix decomposition-based anomaly detection algorithm (LRaSMD) exhibits poor detection performance in complex scenes with multiple background edges and noise. This study therefore proposes a weighted sparse hyperspectral anomaly detection method. First, using matrix decomposition, the original hyperspectral data matrix is reconstructed as three sub-matrices: a low-rank matrix, a sparse matrix, and a noise matrix. Second, to suppress noise interference from the complex background, we use the low-rank background image as a reference, build a local spectral and spatial dictionary through a sliding-window strategy, reconstruct the HSI pixels of the original data, and extract the sparse coefficients. We propose the sparse coefficient divergence evaluation index (SCDI) as a weighting factor for the sparse anomaly map; the resulting saliency-weighted anomaly map suppresses background edges, noise, and other residues caused by the decomposition and enhances the anomalous targets. Finally, anomalous pixels are segmented with an adaptive threshold. Experimental results demonstrate that, on a real-scene hyperspectral dataset with a complicated background, the proposed method outperforms existing representative algorithms in detection performance.
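
The two-term relative of this decomposition, robust PCA via principal component pursuit, conveys the mechanics: singular-value thresholding yields the low-rank background and entrywise shrinkage yields the sparse part. LRaSMD additionally models an explicit noise term, which this sketch does not; the parameter defaults below are common heuristics, not the paper's choices.

```python
import numpy as np

def soft(X, tau):
    """Entrywise soft-thresholding (shrinkage) operator."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def rpca(M, lam=None, mu=None, iters=200, tol=1e-7):
    """Split M into low-rank L + sparse S by inexact-ALM principal component pursuit."""
    M = np.asarray(M, dtype=float)
    m, n = M.shape
    lam = lam or 1.0 / np.sqrt(max(m, n))
    mu = mu or 0.25 * m * n / (np.abs(M).sum() + 1e-12)
    L = np.zeros_like(M); S = np.zeros_like(M); Y = np.zeros_like(M)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * soft(s, 1.0 / mu)) @ Vt        # singular-value thresholding
        S = soft(M - L + Y / mu, lam / mu)      # entrywise shrinkage
        R = M - L - S                           # residual
        Y += mu * R                             # dual update
        if np.linalg.norm(R) <= tol * np.linalg.norm(M):
            break
    return L, S
```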

5.
Sensors (Basel) ; 23(4)2023 Feb 19.
Article in English | MEDLINE | ID: mdl-36850908

ABSTRACT

Zero-shot image classification (ZSIC) addresses classification when samples are very scarce or categories are missing. A common method is to use attribute or word vectors as a priori category features (auxiliary information) and complete the domain transfer from training on seen classes to recognition of unseen classes by building a mapping between image features and the a priori category features. However, feature extraction over the whole image lacks discrimination, and single attribute features or word-vector features of categories carry insufficient information, so the match between image features and prior class features is poor, which limits the accuracy of the ZSIC model. To this end, a spatial attention mechanism is designed, and an image feature extraction module based on this attention mechanism is constructed to screen critical, discriminative features. A semantic information fusion method based on matrix decomposition is proposed, which first decomposes the attribute features and then fuses them with the extracted word-vector features of a dataset to expand the information. Through these two improvements, the classification accuracy of the ZSIC model on unseen images is raised. Experimental results on public datasets verify the effectiveness and superiority of the proposed methods.
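
A minimal sketch of the fusion step as described, decomposing the class-attribute matrix and concatenating its latent factors with class word vectors. The use of a truncated SVD for the decomposition is our assumption, not the paper's stated method, and all names and shapes are hypothetical.

```python
import numpy as np

def fuse_semantics(attr, word_vecs, k=32):
    """attr: (classes x attributes) matrix; word_vecs: (classes x dim) word vectors.
    Returns expanded per-class semantic features."""
    U, s, Vt = np.linalg.svd(attr, full_matrices=False)
    latent = U[:, :k] * s[:k]            # per-class latent attribute factors
    return np.hstack([latent, word_vecs])  # concatenation expands the information
```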

6.
BMC Genomics ; 23(Suppl 1): 269, 2022 Apr 06.
Article in English | MEDLINE | ID: mdl-35387615

ABSTRACT

BACKGROUND: In biological systems, metabolomics can not only contribute to the discovery of metabolic signatures for disease diagnosis, but is also very helpful for illustrating the underlying molecular disease-causing mechanisms. Therefore, identification of disease-related metabolites is of great significance for comprehensively understanding the pathogenesis of diseases and improving clinical medicine. RESULTS: In this paper, we propose a disease- and literature-driven metabolism prediction model (DLMPM) to identify potential associations between metabolites and diseases based on a latent factor model. We build a disease glossary with disease terms from different databases and an association matrix based on the mapping between diseases and metabolites. The similarity of diseases and metabolites is used to complete the association matrix. Finally, we predict potential associations between metabolites and diseases with the matrix decomposition method. In total, 1,406 direct associations between diseases and metabolites were found, and 119,206 unknown associations were predicted, with a coverage rate of 80.88%. Subsequently, we extracted training and testing sets based on data increments from the database of disease-related metabolites and assessed the performance of DLMPM on 19 diseases. DLMPM proved successful in predicting potential metabolic signatures for human diseases, with an average AUC of 82.33%. CONCLUSION: This paper proposes a computational model for exploring metabolite-disease pairs that performs well in predicting potential disease-related metabolites under adequate validation. The results show that DLMPM prioritizes candidate disease-related metabolites better than previous methods and should help researchers reveal more information about human diseases.
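
The latent-factor completion at the core of such models can be sketched as a plain matrix factorization fitted by stochastic gradient descent on the observed disease-metabolite entries. The similarity-based completion DLMPM performs beforehand is omitted, and names and hyperparameters are hypothetical.

```python
import numpy as np

def mf_complete(A, mask, k=10, lr=0.01, reg=0.05, epochs=100):
    """Latent-factor completion: A ~ U @ V.T, fitted only on observed entries."""
    rng = np.random.default_rng(0)
    U = 0.1 * rng.standard_normal((A.shape[0], k))
    V = 0.1 * rng.standard_normal((A.shape[1], k))
    rows, cols = np.nonzero(mask)
    for _ in range(epochs):
        for i, j in zip(rows, cols):
            err = A[i, j] - U[i] @ V[j]
            u = U[i].copy()                      # update both factors from old values
            U[i] += lr * (err * V[j] - reg * u)
            V[j] += lr * (err * u - reg * V[j])
    return U @ V.T                               # scores for all pairs, known and unknown
```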


Subject(s)
Metabolomics, Publications, Computational Biology/methods, Databases, Factual, Humans, Metabolomics/methods
7.
Article in English | MEDLINE | ID: mdl-35992040

ABSTRACT

Independent component analysis (ICA) is an unsupervised learning method popular in functional magnetic resonance imaging (fMRI). Group ICA has been used to search for biomarkers in neurological disorders including autism spectrum disorder and dementia. However, current methods use a principal component analysis (PCA) step that may remove low-variance features. Linear non-Gaussian component analysis (LNGCA) enables simultaneous dimension reduction and feature estimation including low-variance features in single-subject fMRI. A group LNGCA model is proposed to extract group components shared by more than one subject. Unlike group ICA methods, this novel approach also estimates individual (subject-specific) components orthogonal to the group components. To determine the total number of components in each subject, a parametric resampling test is proposed that samples spatially correlated Gaussian noise to match the spatial dependence observed in data. In simulations, estimated group components achieve higher accuracy compared to group ICA. The method is applied to a resting-state fMRI study on autism spectrum disorder in 342 children (252 typically developing, 90 with autism), where the group signals include resting-state networks. The discovered group components appear to exhibit different levels of temporal engagement in autism versus typically developing children, as revealed using group LNGCA. This novel approach to matrix decomposition is a promising direction for feature detection in neuroimaging.
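
One standard recipe for the resampling step, drawing Gaussian noise whose spatial covariance matches an estimate from the data, uses a Cholesky factor. This is a sketch of the general technique, not the paper's exact test.

```python
import numpy as np

def spatially_correlated_noise(C, n_samples, seed=0):
    """Draw Gaussian noise columns with spatial covariance C (voxels x voxels)."""
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(C + 1e-10 * np.eye(C.shape[0]))  # jitter for stability
    return L @ rng.standard_normal((C.shape[0], n_samples))
```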

8.
Sensors (Basel) ; 22(21)2022 Oct 24.
Article in English | MEDLINE | ID: mdl-36365821

ABSTRACT

The work presented here develops a view-angle-independent computer vision framework for vehicle segmentation and classification from roadway traffic systems installed by the Virginia Department of Transportation (VDOT). An automated technique for extracting a region of interest is discussed to speed up processing. The VDOT traffic videos are analyzed for vehicle segmentation using an improved robust low-rank matrix decomposition technique. The framework presents a new and effective thresholding method that improves segmentation accuracy and simultaneously speeds up segmentation processing. Size and shape descriptors from morphological properties, and textural features from the Histogram of Oriented Gradients (HOG), are extracted from the segmented vehicles. A multi-class support vector machine classifier is then employed to categorize traffic vehicle types, including passenger cars, passenger trucks, motorcycles, buses, and small and large utility trucks. Multiple vehicle detections are handled through an iterative k-means clustering over-segmentation process. The proposed algorithm reduced the processed data by an average of 40%. Compared to recent techniques, it showed an average improvement of 15% in segmentation accuracy and was, on average, 55% faster than the compared segmentation techniques. Moreover, a comparative analysis of 23 different deep learning architectures is presented. The resulting algorithm outperformed the compared deep learning algorithms in vehicle classification accuracy. Furthermore, the timing analysis showed that it can operate in real-time scenarios.
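
The classification stage can be sketched with off-the-shelf pieces: HOG descriptors feeding a multi-class SVM. This is a generic reconstruction under our own assumptions (equal-sized grayscale crops, default HOG parameters), not the authors' tuned pipeline.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC

def classify_vehicles(train_patches, labels, test_patches):
    """train_patches/test_patches: equal-sized grayscale crops of segmented vehicles."""
    def feats(patches):
        # same patch size everywhere so descriptor lengths match
        return np.array([hog(p, orientations=9, pixels_per_cell=(8, 8),
                             cells_per_block=(2, 2)) for p in patches])
    clf = SVC(kernel="rbf", decision_function_shape="ovr")  # one-vs-rest multi-class
    clf.fit(feats(train_patches), labels)
    return clf.predict(feats(test_patches))
```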


Subject(s)
Algorithms, Support Vector Machine, Cluster Analysis
9.
Entropy (Basel) ; 24(11)2022 Oct 25.
Article in English | MEDLINE | ID: mdl-36359619

ABSTRACT

Multi-focus image fusion integrates images of the same scene captured with different focus regions to produce a single, fully focused image. However, accurately carrying the focused pixels through to the fusion result remains a major challenge. This study proposes a multi-focus image fusion algorithm based on Hessian matrix decomposition and salient-difference focus detection, which effectively retains the sharp pixels in the focus region of a source image. First, the source image is decomposed using a Hessian matrix to obtain a feature map containing the structural information. A focus-difference analysis scheme based on an improved sum of modified Laplacian is designed to determine the focusing information at corresponding positions of the structural feature map and the source image. For decision-map optimization, and considering the variability of image sizes, an adaptive multiscale consistency verification algorithm is designed, which helps the final fused image retain the focusing information of the source images. Experimental results show that our method performs better than several state-of-the-art methods in both subjective and quantitative evaluation.
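
The focus measure at the heart of the scheme, the sum of modified Laplacian (SML), is compact enough to show directly. The version below is the textbook form (the paper uses an improved variant) with a uniform window for local aggregation; edge handling via wrap-around rolling is a simplification.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def sml(img, window=5):
    """Sum of modified Laplacian: a standard per-pixel focus measure."""
    f = img.astype(float)
    ml = (np.abs(2 * f - np.roll(f, 1, axis=0) - np.roll(f, -1, axis=0))
          + np.abs(2 * f - np.roll(f, 1, axis=1) - np.roll(f, -1, axis=1)))
    return uniform_filter(ml, size=window)   # aggregate over a local window

# Toy decision map: per pixel, keep the source image with the higher focus measure.
# fused = np.where(sml(img_a) >= sml(img_b), img_a, img_b)
```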

10.
BMC Bioinformatics ; 22(1): 152, 2021 Mar 24.
Article in English | MEDLINE | ID: mdl-33761868

ABSTRACT

BACKGROUND: Recent studies have confirmed that N7-methylguanosine (m7G) modification plays an important role in regulating various biological processes and is associated with multiple diseases. Wet-lab experiments are too costly and time-consuming for identifying disease-associated m7G sites. To date, tens of thousands of m7G sites have been identified by high-throughput sequencing approaches, and the information is publicly available in bioinformatics databases, which can be leveraged to predict potential disease-associated m7G sites computationally. Computational methods for m7G-disease association prediction are thus urgently needed, but none are currently available. RESULTS: To fill this gap, we collected association information between m7G sites and diseases, genomic information of m7G sites, and phenotypic information of diseases from different databases to build an m7G-disease association dataset. To infer potential disease-associated m7G sites, we then proposed a heterogeneous network-based model, the m7G Sites and Diseases Associations Inference (m7GDisAI) model. m7GDisAI predicts potential disease-associated m7G sites by applying a matrix decomposition method to heterogeneous networks that integrate comprehensive similarity information of m7G sites and diseases. To evaluate prediction performance, 10 runs of tenfold cross-validation were first conducted, in which m7GDisAI achieved the highest AUC of 0.740 (± 0.0024). Global and local leave-one-out cross-validation (LOOCV) experiments were then implemented to evaluate the model's accuracy in global and local situations, yielding AUCs of 0.769 and 0.635, respectively. A case study was finally conducted to identify the most promising ovarian cancer-related m7G sites for further functional analysis, and Gene Ontology (GO) enrichment analysis was performed to explore the associations between host genes of m7G sites and GO terms. The results showed that the disease-associated m7G sites identified by m7GDisAI, and their host genes, are consistently related to the pathogenesis of ovarian cancer, which may provide clues to the pathogenesis of diseases. CONCLUSION: The m7GDisAI web server can be accessed at http://180.208.58.66/m7GDisAI/ , which provides a user-friendly interface to query disease-associated m7G sites. A list of the top 20 m7G sites predicted to be associated with 177 diseases can be obtained, and detailed information about specific m7G sites and diseases is also shown.


Subject(s)
Computational Biology, Neoplasms, Phosphatidylinositol 3-Kinases, Gene Ontology, Guanosine/analogs & derivatives, Humans, Neoplasms/diagnosis
11.
J Proteome Res ; 20(5): 2291-2298, 2021 05 07.
Article in English | MEDLINE | ID: mdl-33661642

ABSTRACT

Recent advances in liquid chromatography/mass spectrometry (LC/MS) technology have improved the sensitivity, resolution, and speed of proteome analysis, resulting in increasing demand for more sophisticated algorithms to interpret complex mass spectrograms. Here, we propose a novel statistical method, proteomic mass spectrogram decomposition (ProtMSD), for joint identification and quantification of peptides and proteins. Given the proteomic mass spectrogram and, as a dictionary, the reference mass spectra of all possible peptide ions associated with proteins, ProtMSD estimates the chromatograms of those peptide ions under a group sparsity constraint without the conventional careful preprocessing (e.g., thresholding and peak picking). The method was significantly improved by using protein-peptide hierarchical relationships, isotopic distribution profiles, reference retention times of peptide ions, and pre-learned mass spectra of noise. We examined the concepts of database search, library search, and match-between-runs. ProtMSD showed excellent agreement with Mascot/Skyline for an Escherichia coli proteome sample (3277 peptide ions, 94.79%; 493 proteins, 98.21%) and with match-between-runs by MaxQuant for a yeast proteome sample (4460 peptide ions, 103%; 588 proteins, 101%). This is the first attempt to use a matrix decomposition technique as a tool for LC/MS-based proteome identification and quantification.
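
Stripped of the group-sparsity constraint and the protein-peptide hierarchy, the core decomposition is a per-scan non-negative least squares against the dictionary of reference spectra. This simplified sketch uses hypothetical shapes and is not the ProtMSD estimator itself.

```python
import numpy as np
from scipy.optimize import nnls

def decompose_spectrogram(X, D):
    """Estimate nonnegative ion chromatograms C so that X ~ D @ C.
    X: (mz_bins x scans) mass spectrogram; D: (mz_bins x ions) reference spectra."""
    C = np.zeros((D.shape[1], X.shape[1]))
    for t in range(X.shape[1]):
        C[:, t], _ = nnls(D, X[:, t])   # non-negative least squares per scan
    return C
```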


Subject(s)
Proteome, Proteomics, Chromatography, Liquid, Mass Spectrometry, Peptides
12.
Entropy (Basel) ; 23(7)2021 Jul 02.
Article in English | MEDLINE | ID: mdl-34356393

ABSTRACT

One of the tasks of data science is the decomposition of large matrices in order to understand their structure. A special case is decomposing relations, i.e., logical matrices. In this paper, we present a method based on the similarity of rows and columns, which uses correlation clustering to cluster the rows and columns of the matrix, facilitating visualization of the relation by rearranging them. We compare our method against Gunther Schmidt's problems and solutions: it reproduces the original solutions when its parameters are selected from a small set, and with other parameters it provides solutions with even lower entropy.
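
A quick stand-in for the reordering idea: cluster rows and columns by similarity and permute the matrix so block structure becomes visible. The sketch uses hierarchical clustering with Hamming distance rather than the paper's correlation clustering.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list

def reorder_relation(R):
    """Rearrange rows and columns of a 0/1 relation matrix to expose blocks."""
    row_order = leaves_list(linkage(R, method="average", metric="hamming"))
    col_order = leaves_list(linkage(R.T, method="average", metric="hamming"))
    return R[np.ix_(row_order, col_order)]
```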

13.
Hum Genomics ; 13(Suppl 1): 46, 2019 10 22.
Article in English | MEDLINE | ID: mdl-31639067

ABSTRACT

BACKGROUND: As one of the most popular data representation methods, non-negative matrix factorization (NMF) has received wide attention in clustering and feature selection tasks. However, most previously proposed NMF-based methods do not adequately explore the hidden geometric structure in the data, and noise and outliers are inevitably present. RESULTS: To alleviate these problems, we present a novel NMF framework named robust hypergraph-regularized non-negative matrix factorization (RHNMF). In particular, hypergraph Laplacian regularization is imposed to capture the geometric information of the original data. Unlike graph Laplacian regularization, which captures relationships between pairs of sample points, it captures high-order relationships among multiple sample points. Moreover, the robustness of RHNMF is enhanced by using the L2,1-norm constraint when estimating the residual, because the L2,1-norm is insensitive to noise and outliers. CONCLUSIONS: Clustering and common abnormal expression gene (com-abnormal expression gene) selection were conducted to test the validity of the RHNMF model. Extensive experimental results on multi-view datasets reveal that our proposed model outperforms other state-of-the-art methods.
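
For orientation, here are multiplicative updates for the simpler graph-regularized NMF with a pairwise Laplacian and squared loss. RHNMF replaces the graph with a hypergraph Laplacian and the squared loss with the L2,1-norm, which changes these updates; the sketch shows only the regularization pattern.

```python
import numpy as np

def gnmf(X, A, k=10, lam=1.0, iters=200, eps=1e-9):
    """Graph-regularized NMF: X (m x n) ~ U @ V.T, with Tr(V.T (D - A) V)
    pulling factor rows of similar samples (affinity matrix A) together."""
    D = np.diag(A.sum(axis=1))
    rng = np.random.default_rng(0)
    U = rng.random((X.shape[0], k))
    V = rng.random((X.shape[1], k))
    for _ in range(iters):
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U + lam * (A @ V)) / (V @ (U.T @ U) + lam * (D @ V) + eps)
    return U, V
```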


Subject(s)
Algorithms, Databases, Genetic, Gene Expression Regulation, Neoplastic, Cluster Analysis, Humans, Neoplasms/genetics
14.
Stat Med ; 39(18): 2403-2422, 2020 08 15.
Article in English | MEDLINE | ID: mdl-32346898

ABSTRACT

Many challenging problems in biomedical research rely on understanding how variables are associated with each other and influenced by genetic and environmental factors. Probabilistic graphical models (PGMs) are widely acknowledged as a very natural and formal language to describe relationships among variables and have been extensively used for studying complex diseases and traits. In this work, we propose methods that leverage observational Gaussian family data for learning a decomposition of undirected and directed acyclic PGMs according to the influence of genetic and environmental factors. Many structure learning algorithms are strongly based on a conditional independence test. For independent measurements of normally distributed variables, conditional independence can be tested through standard tests for zero partial correlation. In family data, the assumption of independent measurements does not hold since related individuals are correlated due to mainly genetic factors. Based on univariate polygenic linear mixed models, we propose tests that account for the familial dependence structure and allow us to assess the significance of the partial correlation due to genetic (between-family) factors and due to other factors, denoted here as environmental (within-family) factors, separately. Then, we extend standard structure learning algorithms, including the IC/PC and the really fast causal inference (RFCI) algorithms, to Gaussian family data. The algorithms learn the most likely PGM and its decomposition into two components, one explained by genetic factors and the other by environmental factors. The proposed methods are evaluated by simulation studies and applied to the Genetic Analysis Workshop 13 simulated dataset, which captures significant features of the Framingham Heart Study.
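
The building block being generalized here is the standard zero-partial-correlation test for independent Gaussian data, shown below. The paper's contribution replaces it with mixed-model tests that split the correlation into between-family (genetic) and within-family (environmental) parts, which this sketch does not do.

```python
import numpy as np
from scipy import stats

def partial_corr_test(x, y, Z):
    """Test zero partial correlation of x and y given covariates Z (i.i.d. samples)."""
    Z1 = np.column_stack([np.ones(len(x)), Z])          # add intercept
    rx = x - Z1 @ np.linalg.lstsq(Z1, x, rcond=None)[0]  # residualize x on Z
    ry = y - Z1 @ np.linalg.lstsq(Z1, y, rcond=None)[0]  # residualize y on Z
    r = np.corrcoef(rx, ry)[0, 1]
    df = len(x) - Z1.shape[1] - 1                        # n - 2 - (#covariates)
    t = r * np.sqrt(df / (1.0 - r**2))
    p = 2 * stats.t.sf(abs(t), df)
    return r, p
```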


Subject(s)
Algorithms, Models, Statistical, Computer Simulation, Humans, Models, Genetic, Models, Theoretical, Normal Distribution
15.
Biomed Eng Online ; 19(1): 37, 2020 May 28.
Article in English | MEDLINE | ID: mdl-32466753

ABSTRACT

Vessel diseases are often accompanied by abnormalities in vascular shape and size, so a clear visualization of the vasculature is of high clinical significance. Ultrasound color flow imaging (CFI) is one of the prominent techniques for flow visualization; however, clutter signals originating from slow-moving tissue are one of the main obstacles to obtaining a clear view of the vascular network. Enhancing the vasculature by suppressing clutter is a significant and irreplaceable step in many applications of ultrasound CFI. Currently, this task is often performed by singular value decomposition (SVD) of the data matrix. This approach exhibits two well-known limitations. First, the performance of SVD is sensitive to the proper manual selection of the ranks corresponding to the clutter and blood subspaces. Second, SVD is prone to failure in the presence of large random noise in the dataset. A potential solution to these issues is the decomposition into low-rank and sparse matrices (DLSM) framework. SVD is one algorithm for solving the minimization problem under the DLSM framework; many others avoid full SVD and use approximated SVD or SVD-free ideas, which may offer better performance, higher robustness, and less computing time. In practice, these models separate blood from clutter on the assumption that steady clutter has a low-rank structure and the moving blood component is sparse. In this paper, we present a comprehensive review of ultrasound clutter suppression techniques and explore the feasibility of low-rank and sparse decomposition schemes for ultrasound clutter suppression. We conducted this review by adapting 106 DLSM algorithms and validating them against simulation, phantom, and in vivo rat datasets. Two conventional quality metrics, signal-to-noise ratio (SNR) and contrast-to-noise ratio (CNR), are used for performance evaluation, and the computation times required by the different algorithms to generate clutter-suppressed images are reported. Our extensive analysis shows that the DLSM framework can be successfully applied to ultrasound clutter suppression.
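
The SVD filter that serves as the baseline in this framework fits in a few lines: stack the frame ensemble as a Casorati matrix and zero out the largest singular components, which the low-rank assumption attributes to tissue clutter. Rank selection, the first limitation discussed above, is left manual in this sketch.

```python
import numpy as np

def svd_clutter_filter(casorati, clutter_rank):
    """Suppress tissue clutter by zeroing the largest singular components.
    casorati: (pixels x frames) Doppler ensemble reshaped as a Casorati matrix."""
    U, s, Vt = np.linalg.svd(casorati, full_matrices=False)
    s[:clutter_rank] = 0.0       # low-order components ~ slow-moving tissue
    return (U * s) @ Vt          # residual ~ blood signal
```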


Subject(s)
Image Processing, Computer-Assisted/methods, Ultrasonography, Algorithms, Animals, Humans, Signal-To-Noise Ratio
16.
Molecules ; 25(8)2020 Apr 23.
Article in English | MEDLINE | ID: mdl-32340308

ABSTRACT

Conventional proton nuclear magnetic resonance (1H-NMR) has been widely used for identification and quantification of small molecular components in food. However, identifying major soluble macromolecular components from conventional 1H-NMR spectra is difficult, because the baseline appearance is masked by the dense, high-intensity signals of small molecular components present in the sample mixtures. In this study, we introduce an integrated analytical strategy that combines an additional measurement using a diffusion filter, covariation peak separation, and matrix decomposition on a small-scale training dataset. The strategy aims to extract the signal profiles of soluble macromolecular components from conventional 1H-NMR spectral data in a large-scale dataset without requiring re-measurement. We applied this method to conventional 1H-NMR spectra of water-soluble fish muscle extracts and investigated the distribution characteristics of fish diversity and muscle soluble macromolecular components, such as lipids and collagens. We identified a cluster of fish species whose muscle is low in lipids and high in collagens, showing great potential for the development of functional foods. Because this mechanical data processing method requires additional measurement of only a small-scale training dataset, without special sample pretreatment, it should be immediately applicable for extracting macromolecular signals from accumulated conventional 1H-NMR databases of other complex gelatinous food mixtures.


Subject(s)
Fishes, Macromolecular Substances, Muscles/chemistry, Proton Magnetic Resonance Spectroscopy, Animals, Databases, Factual, Macromolecular Substances/analysis, Macromolecular Substances/chemistry, Solubility
17.
BMC Bioinformatics ; 20(1): 185, 2019 Apr 15.
Article in English | MEDLINE | ID: mdl-30987598

ABSTRACT

BACKGROUND: For many practical hypothesis testing (H-T) applications, the data are correlated and/or have a heterogeneous variance structure. The regression t-test for weighted linear mixed-effects regression (LMER) is a legitimate choice because it accounts for complex covariance structure; however, high computational costs and occasional convergence issues make it impractical for analyzing high-throughput data. In this paper, we propose computationally efficient parametric and semiparametric tests based on a set of specialized matrix techniques dubbed the PB-transformation. The PB-transformation has two advantages: (1) the PB-transformed data have a scalar variance-covariance matrix, and (2) the original H-T problem is reduced to an equivalent one-sample H-T problem. The transformed problem can then be approached by either the one-sample Student's t-test or the Wilcoxon signed rank test. RESULTS: In simulation studies, the proposed methods outperform commonly used alternatives under both normal and double exponential distributions. In particular, the PB-transformed t-test produces notably better results than the weighted LMER test, especially in the high-correlation case, using only a small fraction of the computational cost (3 versus 933 seconds). We apply these two methods to a set of RNA-seq gene expression data collected in a breast cancer study. Pathway analyses show that the PB-transformed t-test reveals more biologically relevant findings in relation to breast cancer than the weighted LMER test. CONCLUSIONS: As fast and numerically stable replacements for the weighted LMER test, the PB-transformed tests are especially suitable for "messy" high-throughput data that include both independent and matched/repeated samples. With our method, practitioners no longer have to choose between using partial data (applying paired tests to only the matched samples) and ignoring the correlation in the data (applying two-sample tests to data with some correlated samples). Our method is implemented as an R package 'PBtest' and is available at https://github.com/yunzhang813/PBtest-R-Package .
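
The first advantage, transforming to a scalar covariance, is ordinary whitening. The sketch below shows just that step followed by a one-sample t-test; the actual PB-transformation also maps the original H-T problem onto this one-sample form, which is not reproduced here, and the function name is ours.

```python
import numpy as np
from scipy import stats
from scipy.linalg import sqrtm

def whiten_and_test(y, Sigma):
    """Decorrelate y (n,) with known covariance Sigma (n x n), then one-sample t-test."""
    W = np.linalg.inv(sqrtm(Sigma)).real   # Sigma^{-1/2}; whitened data have
    z = W @ y                              # a scalar variance-covariance matrix
    return stats.ttest_1samp(z, 0.0)
```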


Subject(s)
Breast Neoplasms/genetics, Computer Simulation, Female, Gene Expression Profiling, Gene Expression Regulation, Neoplastic, Humans, Models, Genetic, ROC Curve, Regression Analysis
18.
Sensors (Basel) ; 19(8)2019 Apr 23.
Article in English | MEDLINE | ID: mdl-31018490

ABSTRACT

Accurate and sufficient node location information is crucial for Wireless Sensor Network (WSN) applications. However, existing range-based localization methods often suffer from incomplete and distorted range measurements. To address this issue, methods based on low-rank matrix recovery have been proposed, but they usually assume that noise follows a single Gaussian and/or a single Laplacian distribution and thus cannot handle wider noise distributions. In this paper, a novel Anomaly-aware Node Localization (ANLoC) method is proposed to simultaneously impute missing range measurements and detect node anomalies in complex environments. Specifically, exploiting the inherent low-rank property of the Euclidean Distance Matrix (EDM), we formulate the range measurement imputation problem as a Robust ℓ2,1-norm Regularized Matrix Decomposition (RRMD) model, where complex noise is fitted by a Mixture of Gaussians (MoG) distribution and node anomalies are sifted out by ℓ2,1-norm regularization. An efficient optimization algorithm based on the Expectation Maximization (EM) method is designed to solve the proposed RRMD model. With the imputed EDM, all unknown nodes can then be positioned using the Multi-Dimensional Scaling (MDS) method. Finally, experiments designed to evaluate the proposed method demonstrate that it outperforms three state-of-the-art node localization methods.
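
The final positioning step is classical multidimensional scaling, which recovers coordinates (up to rotation and translation) from a completed squared-distance matrix by double centering and an eigendecomposition. This is the standard textbook procedure, not ANLoC's full pipeline.

```python
import numpy as np

def classical_mds(D2, dim=2):
    """Recover node coordinates from a squared Euclidean distance matrix D2."""
    n = D2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ D2 @ J                           # double centering -> Gram matrix
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:dim]                 # top eigenpairs
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))
```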

19.
Sensors (Basel) ; 19(9)2019 May 10.
Article in English | MEDLINE | ID: mdl-31083296

ABSTRACT

To address the problems of visual surveillance for anti-UAV applications, a new small flying target detection method is proposed based on Gaussian mixture background modeling in a compressive sensing domain and low-rank and sparse matrix decomposition of local images. First, images captured by stationary visual sensors are broken into patches, and candidate patches that may contain targets are identified using a Gaussian mixture background model in the compressive sensing domain. Subsequently, candidate patches within a finite time period are separated into background images and target images by low-rank and sparse matrix decomposition. Finally, small flying targets are detected in the separated target images by threshold segmentation. Experimental results on visible and infrared image sequences of flying UAVs demonstrate that the proposed methods detect effectively and outperform the baseline methods in precision and recall.
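
For the candidate-screening stage, OpenCV's built-in Gaussian-mixture background subtractor is a convenient stand-in; the paper applies its model in a compressive-sensing domain, which this sketch does not do, and the patch size and threshold are hypothetical.

```python
import cv2

subtractor = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=16)

def candidate_patches(frame, patch=32, min_fg=0.02):
    """Return top-left corners of patches with enough foreground pixels."""
    fg = subtractor.apply(frame)                     # per-pixel foreground mask
    hits = []
    for y in range(0, fg.shape[0] - patch + 1, patch):
        for x in range(0, fg.shape[1] - patch + 1, patch):
            if (fg[y:y + patch, x:x + patch] > 0).mean() > min_fg:
                hits.append((y, x))
    return hits
```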

20.
Biostatistics ; 18(4): 651-665, 2017 Oct 01.
Article in English | MEDLINE | ID: mdl-28369170

ABSTRACT

This article proposes a procedure for describing the relationship between high-dimensional data sets, such as multimodal brain images and genetic data. We propose a supervised technique that incorporates the clinical outcome to determine a score, a linear combination of variables with hierarchical structure across modalities. This approach is expected to yield interpretable and predictive scores. The proposed method was applied to a study of Alzheimer's disease (AD). We propose a diagnostic method for AD that uses whole-brain magnetic resonance imaging (MRI) and positron emission tomography (PET); we select brain regions effective for the diagnostic probability and investigate their genome-wide association using single nucleotide polymorphisms (SNPs). The two-step dimension reduction method we previously introduced is applicable to such a study and allows us to partially incorporate the proposed method. We show that the proposed method offers classification functions with feasibility and reasonable prediction accuracy based on receiver operating characteristic (ROC) analysis, together with plausible regions of the brain and genome. Our simulation study on a synthetic structured data set showed that the proposed method outperformed the original method and recovered the characteristics of the supervised features.


Subject(s)
Alzheimer Disease/diagnosis, Genome-Wide Association Study/methods, Models, Statistical, Neuroimaging/methods, Alzheimer Disease/diagnostic imaging, Alzheimer Disease/genetics, Humans, Magnetic Resonance Imaging, Positron-Emission Tomography, Supervised Machine Learning