1.
BMC Bioinformatics ; 24(1): 435, 2023 Nov 16.
Article in English | MEDLINE | ID: mdl-37974081

ABSTRACT

Biclustering of biologically meaningful binary information is essential in many applications related to drug discovery, such as protein-protein interactions and gene expression. However, for robust performance on recently emerging large health datasets, new biclustering algorithms must be scalable and fast. We present a rapid unsupervised biclustering (RUBic) algorithm that achieves this objective with a novel encoding and search strategy. RUBic significantly reduces computational overhead on both synthetic and experimental datasets relative to several state-of-the-art biclustering algorithms. On 100 synthetic binary datasets, our method took [Formula: see text] s to extract 494,872 biclusters. On the human PPI database of size [Formula: see text], our method generates 1840 biclusters in [Formula: see text] s. On a central nervous system embryonic tumor gene expression dataset of size 712,940, our algorithm takes 101 min to produce 747,069 biclusters, while recent competing algorithms take significantly more time to produce the same result. RUBic is also evaluated on five different gene expression datasets and shows a significant speed-up in execution time over existing approaches when extracting significant KEGG-enriched biclusters. RUBic can operate in two modes, base and flex: base mode generates maximal biclusters, while flex mode runs faster and generates fewer biclusters, selected by their biological significance with respect to KEGG pathways. The code is available at https://github.com/CMATERJU-BIOINFO/RUBic for academic use only.
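To make the notion of a maximal bicluster in binary data concrete, here is a minimal brute-force sketch. It is not the RUBic encoding or search strategy, which the abstract does not describe; the function name, the size thresholds, and the toy matrix are assumptions for illustration, and the enumeration is exponential in the number of rows, so it only suits tiny inputs.

```python
from itertools import combinations

def maximal_biclusters(matrix, min_rows=2, min_cols=2):
    """Enumerate maximal all-ones biclusters of a small binary matrix.

    Brute-force illustration (hypothetical helper, not RUBic): for every
    row subset, take the columns shared by all its rows, then extend the
    row set to every row that contains those columns.
    """
    n_rows = len(matrix)
    # Column indices holding a 1, per row.
    col_sets = [frozenset(j for j, v in enumerate(row) if v) for row in matrix]
    found = set()
    for k in range(min_rows, n_rows + 1):
        for rows in combinations(range(n_rows), k):
            cols = frozenset.intersection(*(col_sets[r] for r in rows))
            if len(cols) < min_cols:
                continue
            # Closing the row set makes the (rows, cols) pair maximal.
            closed_rows = frozenset(
                r for r in range(n_rows) if cols <= col_sets[r]
            )
            found.add((closed_rows, cols))
    return found

# Toy binary matrix (made-up values, not from any dataset in the abstract).
toy = [
    [1, 1, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
    [1, 1, 0, 1],
]
for rows, cols in sorted(maximal_biclusters(toy), key=lambda rc: sorted(rc[0])):
    print(sorted(rows), sorted(cols))
```

Each reported pair is a set of rows and a set of columns whose submatrix contains only ones and cannot be extended by another row or column without introducing a zero.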


Subjects
Algorithms; Data Management; Humans; Databases, Factual; Cluster Analysis; Gene Expression Profiling/methods
2.
Chronic Stress (Thousand Oaks) ; 7: 24705470231203655, 2023.
Article in English | MEDLINE | ID: mdl-37780807

ABSTRACT

Background: Posttraumatic stress disorder (PTSD) is a significant burden among combat Veterans returning from the wars in Iraq and Afghanistan. While empirically supported treatments have demonstrated reductions in PTSD symptomatology, there remains a need to improve treatment effectiveness. Functional magnetic resonance imaging (fMRI) neurofeedback has emerged as a possible treatment to ameliorate PTSD symptom severity. Virtual reality (VR) approaches have also shown promise in increasing treatment compliance and outcomes. To facilitate fMRI neurofeedback-associated therapies, it would be advantageous to accurately classify internal brain stress levels while Veterans are exposed to trauma-associated VR imagery. Methods: Across 2 sessions, we used fMRI to collect neural responses to trauma-associated VR-like stimuli among male combat Veterans with PTSD symptoms (N = 8). Veterans reported their self-perceived stress level on a scale from 1 to 8 every 15 s throughout the fMRI sessions. In our proposed framework, we precisely sampled the fMRI data on cortical gray matter, blurring the data along the gray-matter manifold to reduce noise and dimensionality while preserving maximum neural information. Then, we independently applied 3 machine learning (ML) algorithms to the fMRI data collected across 2 sessions, separately for each Veteran, to build individualized ML models that predicted their internal brain states (self-reported stress responses). Results: We accurately classified the 8-class self-reported stress responses with a mean (± standard error) root mean square error of 0.6 (± 0.1) across all Veterans using the best ML approach. Conclusions: The findings demonstrate the predictive ability of ML algorithms applied to whole-brain cortical fMRI data collected during individual Veteran sessions. The framework we have developed to preprocess whole-brain cortical fMRI data and train ML models across sessions would provide a valuable tool to enable individualized real-time fMRI neurofeedback during VR-like exposure therapy for PTSD.
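The reported metric, a per-Veteran root mean square error summarized as mean (± standard error) across Veterans, can be computed as in the sketch below. The helper names and the toy ratings are assumptions for illustration only; they are not the study's data or code.

```python
import math
import statistics

def rmse(predicted, reported):
    """Root mean square error between predicted and self-reported ratings."""
    if len(predicted) != len(reported):
        raise ValueError("sequences must have the same length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reported)]
    return math.sqrt(sum(squared) / len(squared))

def mean_and_standard_error(values):
    """Mean and standard error of the mean (sample SD divided by sqrt(n))."""
    n = len(values)
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)

# Hypothetical (predicted, reported) ratings on the 1-8 scale, one pair of
# lists per participant. These numbers are invented for the example.
toy_participants = [
    ([3, 4, 6, 2], [3, 5, 6, 2]),
    ([1, 7, 8, 4], [2, 7, 7, 4]),
    ([5, 5, 2, 3], [5, 4, 2, 3]),
]
per_participant = [rmse(pred, rep) for pred, rep in toy_participants]
group_mean, group_se = mean_and_standard_error(per_participant)
print(f"RMSE = {group_mean:.2f} (± {group_se:.2f})")
```

Computing the error separately for each participant before averaging mirrors the individualized models described in the abstract, where each Veteran has their own model.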

3.
Sensors (Basel) ; 22(9), 2022 Apr 20.
Article in English | MEDLINE | ID: mdl-35590822

ABSTRACT

Inpatient gait analysis is an essential part of rehabilitation for foot amputees and includes the ground contact time (GCT) difference of both legs as an essential component. Doctors communicate improvement advice to patients regarding their gait pattern based on a few steps taken at the doctor's visit. A wearable sensor system, called Suralis, consisting of an inertial measurement unit (IMU) and a pressure measuring sock, including algorithms calculating GCT, is presented. Two data acquisitions were conducted to implement and validate initial contact (IC) and toe-off (TO) event detection algorithms as the basis for the GCT difference determination for able-bodied and prosthesis wearers. The results of the algorithms show a median GCT error of -51.7 ms (IMU) and 14.7 ms (sensor sock) compared to the ground truth and thus represent a suitable possibility for wearable gait analysis. The wearable system presented, therefore, enables a continuous feedback system for patients and, above all, a remote diagnosis of spatio-temporal aspects of gait behaviour based on reliable data collected in everyday life.
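As a rough illustration of how ground contact time follows from the two detected events, the sketch below pairs each initial contact (IC) with the next toe-off (TO) and compares the median GCT of the two legs. It assumes event timestamps are already available in milliseconds; the event detection itself, the function names, and the toy timestamps are hypothetical and not taken from the Suralis system.

```python
import statistics

def ground_contact_times(initial_contacts, toe_offs):
    """Pair each IC with the next TO and return contact durations in ms."""
    durations = []
    to_iter = iter(sorted(toe_offs))
    to_time = next(to_iter, None)
    for ic in sorted(initial_contacts):
        # Skip toe-offs that happened before this initial contact.
        while to_time is not None and to_time <= ic:
            to_time = next(to_iter, None)
        if to_time is None:
            break  # No toe-off left to close this stance phase.
        durations.append(to_time - ic)
        to_time = next(to_iter, None)
    return durations

def gct_difference_ms(left_events, right_events):
    """Median GCT of the left leg minus median GCT of the right leg."""
    left = ground_contact_times(*left_events)
    right = ground_contact_times(*right_events)
    return statistics.median(left) - statistics.median(right)

# Hypothetical event timestamps in ms: (initial contacts, toe-offs).
left_leg = ([0, 1000, 2000], [600, 1620, 2610])
right_leg = ([500, 1500, 2500], [1050, 2070, 3060])
print(ground_contact_times(*left_leg))
print(gct_difference_ms(left_leg, right_leg))
```

Using the median rather than the mean keeps a single mis-detected step from dominating the comparison, which matches the abstract's choice of reporting median errors.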


Subjects
Gait; Wearable Electronic Devices; Algorithms; Biomechanical Phenomena; Gait Analysis; Humans
4.
PeerJ Comput Sci ; 7: e355, 2021.
Article in English | MEDLINE | ID: mdl-33817005

ABSTRACT

Tremendous quantities of numeric data are generated as streams in various cyber ecosystems. Sorting is one of the most fundamental operations for gaining knowledge from data. However, because data storage, both inside and outside the CPU, is limited in size relative to massive streaming data sources, the data can overflow the storage. Consequently, classic sorting algorithms cannot obtain a correctly sorted sequence, because the data to be sorted cannot all be held in storage. This paper proposes a new sorting algorithm, called streaming data sort, for streaming data on a uniprocessor constrained by a limited storage size and by the correctness of the sorted order. Data continuously flow into the storage as consecutive chunks, each smaller than the storage size. A theoretical analysis of the space bound and the time complexity is provided. The sorting time complexity is O(n), where n is the number of incoming data items. The space complexity is O(M), where M is the storage size. The experimental results show that streaming data sort can handle a million permuted data items using storage whose size is set as low as 35% of the data size. The proposed concept can be applied in practice to applications in different fields where the data always overflow the working storage and sorting is needed.

5.
J Theor Biol ; 455: 131-139, 2018 Oct 14.
Article in English | MEDLINE | ID: mdl-30036526

ABSTRACT

Functionally similar non-coding RNAs are expected to be similar in certain regions of their secondary structures. These similar regions are called common structure motifs, and they are structurally conserved throughout evolution to maintain their functional roles. Common structure motif identification is one of the critical tasks in RNA secondary structure analysis. Nevertheless, current approaches suffer from several limitations and/or do not scale with both structure size and the number of input secondary structures. In this work, we present a method that transforms the conserved base pair stems into transaction items and applies frequent itemset mining to identify common structure motifs existing in a majority of input structures. Our experimental results on telomerase and ribosomal RNA secondary structures report frequent stem patterns that are of biological significance. Moreover, the algorithms utilized in our method are scalable, and frequent stem patterns can be identified efficiently among many large structures.
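The idea of treating each structure as a transaction of stem items and keeping itemsets supported by a majority of structures can be sketched with a small level-wise (Apriori-style) miner. This is a generic illustration, not the authors' implementation; how stems are encoded as items is not given in the abstract, so the labels S1 to S4 are placeholders.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets in at least min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    current = [frozenset([item]) for item in items]
    while current:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets that differ in exactly one item.
        current = list(
            {a | b for a, b in combinations(level, 2) if len(a | b) == len(a) + 1}
        )
    return frequent

# Each transaction is the set of stem items found in one input structure.
# The labels are hypothetical placeholders, not real stem encodings.
structures = [
    {"S1", "S2", "S3"},
    {"S1", "S2"},
    {"S1", "S2", "S4"},
    {"S2", "S3"},
]
majority = len(structures) // 2 + 1  # present in more than half of the inputs
for itemset, count in frequent_itemsets(structures, majority).items():
    print(sorted(itemset), count)
```

An itemset that survives the majority threshold corresponds to a group of stems that co-occur in most input structures, which is the sense in which the abstract speaks of frequent stem patterns.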


Subjects
Algorithms; Computer Simulation; Nucleic Acid Conformation; RNA, Ribosomal/chemistry; RNA/chemistry; Sequence Analysis, RNA; Telomerase/chemistry; RNA/genetics; RNA, Ribosomal/genetics; Telomerase/genetics
6.
IEEE Trans Neural Netw Learn Syst ; 28(11): 2775-2788, 2017 Nov.
Article in English | MEDLINE | ID: mdl-28113384

ABSTRACT

The use of Markov blankets in a Bayesian network for feature selection has received much attention in recent years. The Markov blanket of a class attribute in a Bayesian network is a unique yet minimal feature subset for optimal feature selection if the probability distribution of a data set can be faithfully represented by this Bayesian network. However, if a data set violates the faithfulness condition, Markov blankets of a class attribute may not be unique. To tackle this issue, we propose a new concept of representative sets and then design the selection via group alpha-investing (SGAI) algorithm to perform Markov blanket feature selection with representative sets for classification. Using a comprehensive set of real data, our empirical studies demonstrate that SGAI outperforms state-of-the-art Markov blanket feature selectors and other well-established feature selection methods.
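For readers unfamiliar with the term, the Markov blanket of a node in a known Bayesian network structure consists of its parents, its children, and the other parents of its children. The sketch below reads that set off a given directed acyclic graph. It is not the SGAI algorithm, which learns from data and handles non-unique blankets; the graph and node names are invented for illustration.

```python
def markov_blanket(parents_of, target):
    """Markov blanket of target in a DAG given as {node: set of parents}."""
    parents = set(parents_of.get(target, ()))
    # Children are the nodes that list the target among their parents.
    children = {node for node, ps in parents_of.items() if target in ps}
    # Spouses are the other parents of those children.
    spouses = {p for child in children for p in parents_of[child]} - {target}
    return parents | children | spouses

# Hypothetical network: A -> class, class -> B, C -> B, B -> D.
toy_network = {
    "A": set(),
    "C": set(),
    "class": {"A"},
    "B": {"class", "C"},
    "D": {"B"},
}
print(sorted(markov_blanket(toy_network, "class")))
```

In this toy graph the feature D is left out: once A, B, and C are known, D carries no further information about the class attribute, which is why the blanket is attractive as a feature subset.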

7.
IEEE Trans Neural Netw Learn Syst ; 28(11): 2466-2478, 2017 Nov.
Article in English | MEDLINE | ID: mdl-27514067

ABSTRACT

In spam and malware detection, attackers exploit randomization to obfuscate malicious data and increase their chances of evading detection at test time, e.g., malware code is typically obfuscated using random strings or byte sequences to hide known exploits. Interestingly, randomization has also been proposed to improve security of learning algorithms against evasion attacks, as it results in hiding information about the classifier from the attacker. Recent work has proposed game-theoretical formulations to learn secure classifiers, by simulating different evasion attacks and modifying the classification function accordingly. However, both the classification function and the simulated data manipulations have been modeled in a deterministic manner, without accounting for any form of randomization. In this paper, we overcome this limitation by proposing a randomized prediction game, namely, a noncooperative game-theoretic formulation in which the classifier and the attacker make randomized strategy selections according to some probability distribution defined over the respective strategy set. We show that our approach allows one to improve the tradeoff between attack detection and false alarms with respect to the state-of-the-art secure classifiers, even against attacks that are different from those hypothesized during design, on application examples including handwritten digit recognition, spam, and malware detection.

8.
IEEE Trans Neural Netw Learn Syst ; 28(11): 2660-2673, 2017 Nov.
Article in English | MEDLINE | ID: mdl-27576267

ABSTRACT

Deep neural networks (DNNs) have demonstrated impressive performance in complex machine learning tasks such as image classification or speech recognition. However, due to their multilayer nonlinear structure, they are not transparent, i.e., it is hard to grasp what makes them arrive at a particular classification or recognition decision, given a new unseen data sample. Recently, several approaches have been proposed enabling one to understand and interpret the reasoning embodied in a DNN for a single test image. These methods quantify the "importance" of individual pixels with respect to the classification decision and allow a visualization in terms of a heatmap in pixel/input space. While the usefulness of heatmaps can be judged subjectively by a human, an objective quality measure is missing. In this paper, we present a general methodology based on region perturbation for evaluating ordered collections of pixels such as heatmaps. We compare heatmaps computed by three different methods on the SUN397, ILSVRC2012, and MIT Places data sets. Our main result is that the recently proposed layer-wise relevance propagation algorithm qualitatively and quantitatively provides a better explanation of what made a DNN arrive at a particular classification decision than the sensitivity-based approach or the deconvolution method. We provide theoretical arguments to explain this result and discuss its practical implications. Finally, we investigate the use of heatmaps for unsupervised assessment of the neural network performance.
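The general idea of scoring a heatmap by region perturbation can be sketched as follows: perturb the input regions in order of decreasing relevance and track how quickly the classifier score falls. The sketch makes several simplifying assumptions that are not from the paper: each "region" is a single input entry, perturbation sets the entry to a fixed value, the model is a toy linear scorer, and the summary statistic is a plain average drop.

```python
def perturbation_curve(score_fn, x, relevance, replacement=0.0):
    """Scores after successively replacing the most relevant entries first."""
    order = sorted(range(len(x)), key=lambda i: relevance[i], reverse=True)
    perturbed = list(x)
    scores = [score_fn(perturbed)]
    for i in order:
        perturbed[i] = replacement
        scores.append(score_fn(perturbed))
    return scores

def mean_score_drop(scores):
    """Average drop relative to the unperturbed score (first entry)."""
    return sum(scores[0] - s for s in scores) / len(scores)

# Toy linear "classifier" with made-up weights; stands in for a DNN output.
weights = [0.5, 2.0, 0.0, 1.0]

def toy_score(x):
    return sum(w * v for w, v in zip(weights, x))

sample = [1.0, 1.0, 1.0, 1.0]
informative = [w * v for w, v in zip(weights, sample)]  # true contributions
uninformative = [-r for r in informative]               # reversed ranking

curve_good = perturbation_curve(toy_score, sample, informative)
curve_bad = perturbation_curve(toy_score, sample, uninformative)
print(mean_score_drop(curve_good), mean_score_drop(curve_bad))
```

A heatmap that ranks the truly influential regions first produces a steeper score decline, so a larger average drop indicates a more faithful explanation under this kind of measure.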
