Results 1 - 8 of 8
1.
Bioinformatics ; 30(21): 3086-92, 2014 Nov 01.
Article in English | MEDLINE | ID: mdl-25028722

ABSTRACT

MOTIVATION: Boolean network models are suitable for simulating gene regulatory networks (GRNs) in the absence of detailed kinetic information. However, reducing the biological reality implies making assumptions on how genes interact (interaction rules) and how their state is updated during the simulation (update scheme). The exact choice of the assumptions largely determines the outcome of the simulations. In most cases, however, the biologically correct assumptions are unknown. An ideal simulation thus implies testing different rules and schemes to determine those that best capture an observed biological phenomenon. This is not trivial because most current methods to simulate Boolean network models of GRNs and to compute their attractors impose specific assumptions that cannot be easily altered, as they are built into the system. RESULTS: To allow for a more flexible simulation framework, we developed ASP-G. We show the correctness of ASP-G in simulating Boolean network models and obtaining attractors under different assumptions by successfully recapitulating the detection of attractors of previously published studies. We also provide an example of how simulating network models under different settings helps determine the assumptions under which a certain conclusion holds. The main added value of ASP-G is in its modularity and declarativity, making it more flexible and less error-prone than traditional approaches. The declarative nature of ASP-G makes it slower than more dedicated systems, but it still achieves good efficiency with respect to computational time. AVAILABILITY AND IMPLEMENTATION: The source code of ASP-G is available at http://bioinformatics.intec.ugent.be/kmarchal/Supplementary_Information_Musthofa_2014/asp-g.zip.


Subjects
Gene Regulatory Networks , Algorithms , Computational Biology/methods
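
The dependence on interaction rules and update scheme described in this abstract can be illustrated with a minimal Python sketch. It is not ASP-G (which is written in Answer Set Programming); the three-gene network, its rules, and all function names are hypothetical and chosen only to show how a synchronous and an asynchronous scheme treat the same rules differently.

```python
from itertools import product

# Hypothetical three-gene network; these rules are illustrative assumptions.
RULES = {
    "A": lambda s: not s["C"],         # C represses A
    "B": lambda s: s["A"],             # A activates B
    "C": lambda s: s["A"] and s["B"],  # A and B jointly activate C
}
GENES = sorted(RULES)  # fixed gene order: states are tuples (A, B, C)


def sync_step(state):
    """Synchronous scheme: all genes are updated at the same time."""
    s = dict(zip(GENES, state))
    return tuple(bool(RULES[g](s)) for g in GENES)


def async_successors(state):
    """Asynchronous scheme: exactly one gene is updated per step,
    so a state can have several possible successors."""
    s = dict(zip(GENES, state))
    successors = set()
    for i, g in enumerate(GENES):
        nxt = list(state)
        nxt[i] = bool(RULES[g](s))
        successors.add(tuple(nxt))
    return successors


def sync_attractors():
    """Follow every start state until a state repeats and collect the
    cycles that are reached (a cycle of length 1 is a steady state)."""
    attractors = set()
    for start in product([False, True], repeat=len(GENES)):
        seen = []
        state = start
        while state not in seen:
            seen.append(state)
            state = sync_step(state)
        cycle = seen[seen.index(state):]
        k = cycle.index(min(cycle))  # rotate so each cycle is stored once
        attractors.add(tuple(cycle[k:] + cycle[:k]))
    return attractors


if __name__ == "__main__":
    for cycle in sync_attractors():
        print("synchronous attractor of length", len(cycle))
    print("asynchronous successors of (0,0,0):",
          async_successors((False, False, False)))
```

Swapping `RULES` or replacing `sync_step` with a walk over `async_successors` is the kind of change that, per the abstract, is hard to make in tools where these assumptions are built in.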
2.
Stud Health Technol Inform ; 316: 839-840, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176923

ABSTRACT

Artificial Intelligence (AI) has the potential to "bridge the gap" between healthcare provider and patient needs in low-resource settings to deliver timely, personalized, and empathetic care to individuals with active tuberculosis.


Subjects
Artificial Intelligence , Tuberculosis , Humans , Tuberculosis/therapy , Clinical Decision Support Systems
3.
PLoS One ; 19(8): e0304476, 2024.
Article in English | MEDLINE | ID: mdl-39196905

ABSTRACT

Domain Generation Algorithms (DGAs) are used by malware to generate pseudorandom domain names to establish communication between infected bots and command and control servers. While DGAs can be detected by machine learning (ML) models with great accuracy, offering DGA detection as a service raises privacy concerns when requiring network administrators to disclose their DNS traffic to the service provider. The main scientific contribution of this paper is to propose the first end-to-end framework for privacy-preserving classification as a service of domain names into DGA (malicious) or non-DGA (benign) domains. Our framework achieves these goals by carefully designed protocols that combine two privacy-enhancing technologies (PETs), namely secure multi-party computation (MPC) and differential privacy (DP). Through MPC, our framework enables an enterprise network administrator to outsource the problem of classifying a DNS (Domain Name System) domain as DGA or non-DGA to an external organization without revealing any information about the domain name. Moreover, the service provider's ML model used for DGA detection is never revealed to the network administrator. Furthermore, by using DP, we also ensure that the classification result cannot be used to learn information about individual entries of the training data. Finally, we leverage post-training float16 quantization of deep learning models in MPC to achieve efficient, secure DGA detection. We demonstrate that quantization achieves a significant speed-up, resulting in a 23% to 42% reduction in inference runtime without reducing accuracy, using a three-party secure computation protocol tolerating one corruption. Previous solutions are not end-to-end private, do not provide differential privacy guarantees for the model's outputs, and assume that model embeddings are publicly known. Our best protocol in terms of accuracy runs in about 0.22s.


Subjects
Algorithms , Computer Security , Machine Learning , Privacy , Humans
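
As a rough illustration of what post-training float16 quantization does to model weights, the Python sketch below rounds already-trained weights to IEEE 754 half precision and measures the change in a dot product. It runs in the clear, not inside MPC, and the weights, inputs, and function names are hypothetical; it is not the paper's protocol.

```python
import struct


def to_float16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value
    by packing and unpacking it with the struct 'e' format."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantize(weights):
    """Post-training quantization: weights are rounded after training,
    without any retraining."""
    return [to_float16(w) for w in weights]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Hypothetical weights and inputs, for illustration only.
weights = [0.1, -0.25, 0.333, 1.5]
inputs = [1.0, 2.0, -1.0, 0.5]

quantized = quantize(weights)
error = abs(dot(weights, inputs) - dot(quantized, inputs))

if __name__ == "__main__":
    print("quantized weights:", quantized)
    print("absolute change in the dot product:", error)
```

Values such as -0.25 and 1.5 are exactly representable in half precision, whereas 0.1 and 0.333 are rounded, which is where the small error comes from.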
4.
BMC Med Genomics ; 14(1): 23, 2021 01 20.
Article in English | MEDLINE | ID: mdl-33472626

ABSTRACT

BACKGROUND: In biomedical applications, valuable data is often split between owners who cannot openly share the data because of privacy regulations and concerns. Training machine learning models on the joint data without violating privacy is a major technology challenge that can be addressed by combining techniques from machine learning and cryptography. When collaboratively training machine learning models with the cryptographic technique named secure multi-party computation, the price paid for keeping the data of the owners private is an increase in computational cost and runtime. A careful choice of machine learning techniques, together with algorithmic and implementation optimizations, is necessary to enable practical secure machine learning over distributed data sets. Such optimizations can be tailored to the kind of data and machine learning problem at hand. METHODS: Our setup involves secure two-party computation protocols, along with a trusted initializer that distributes correlated randomness to the two computing parties. We use a gradient-descent-based algorithm for training a logistic-regression-like model with a clipped ReLU activation function, and we break down the algorithm into corresponding cryptographic protocols. Our main contributions are a new protocol for computing the activation function that requires neither secure comparison protocols nor Yao's garbled circuits, and a series of cryptographic engineering optimizations to improve the performance. RESULTS: For our largest gene expression data set, we train a model that requires over 7 billion secure multiplications; the training completes in about 26.90 s in a local area network. The implementation in this work is a further optimized version of the implementation with which we won first place in Track 4 of the iDASH 2019 secure genome analysis competition. CONCLUSIONS: In this paper, we present a secure logistic regression training protocol and its implementation, with a new subprotocol to securely compute the activation function. To the best of our knowledge, we present the fastest existing secure multi-party computation implementation for training logistic regression models on high dimensional genome data distributed across a local area network.


Subjects
Genomics , Privacy , Algorithms , Computer Security , Logistic Models
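
The "secure multiplications" counted in this abstract can be pictured with the standard multiplication-triple technique, in which a trusted initializer hands the two parties correlated randomness ahead of time. The Python sketch below simulates both parties in one process; the modulus `Q` and all function names are assumptions made for illustration, and this is not the authors' implementation or their new activation-function protocol.

```python
import secrets

Q = 2**61 - 1  # hypothetical modulus, chosen only for this illustration


def share(x):
    """Split x into two additive shares modulo Q."""
    r = secrets.randbelow(Q)
    return r, (x - r) % Q


def reconstruct(s0, s1):
    """Recombine two additive shares."""
    return (s0 + s1) % Q


def trusted_initializer():
    """Deal shares of a random triple (a, b, c) with c = a * b mod Q."""
    a, b = secrets.randbelow(Q), secrets.randbelow(Q)
    return share(a), share(b), share(a * b % Q)


def secure_multiply(x_sh, y_sh):
    """Return shares of x * y given shares of x and y."""
    (a0, a1), (b0, b1), (c0, c1) = trusted_initializer()
    # Each party masks its own shares; only d = x - a and e = y - b are opened.
    d = reconstruct((x_sh[0] - a0) % Q, (x_sh[1] - a1) % Q)
    e = reconstruct((y_sh[0] - b0) % Q, (y_sh[1] - b1) % Q)
    # x*y = c + d*b + e*a + d*e; party 0 adds the public d*e term once.
    z0 = (c0 + d * b0 + e * a0 + d * e) % Q
    z1 = (c1 + d * b1 + e * a1) % Q
    return z0, z1


if __name__ == "__main__":
    product_shares = secure_multiply(share(6), share(7))
    print(reconstruct(*product_shares))
```

Because `a` and `b` are uniformly random, the opened values `d` and `e` reveal nothing about `x` and `y`. Working with real-valued model weights would additionally require a fixed-point encoding, which this sketch omits.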
5.
BMC Bioinformatics ; 10: 374, 2009 Nov 12.
Article in English | MEDLINE | ID: mdl-19909518

ABSTRACT

BACKGROUND: The rapid growth in the number of publicly available reports on biomedical experimental results has recently boosted text mining approaches for protein interaction extraction. Most approaches rely implicitly or explicitly on linguistic, i.e., lexical and syntactic, data extracted from text. However, only a few attempts have been made to evaluate the contribution of the different feature types. In this work, we contribute to this evaluation by studying the relative importance of deep syntactic features, i.e., grammatical relations, shallow syntactic features (part-of-speech information) and lexical features. For this purpose, we use a recently proposed approach that uses support vector machines with structured kernels. RESULTS: Our results reveal that the contribution of the different feature types varies for the different data sets on which the experiments were conducted. The smaller the training corpus compared to the test data, the more important the role of grammatical relations becomes. Moreover, classifiers based on deep syntactic information prove to be more robust on heterogeneous texts where no or only limited common vocabulary is shared. CONCLUSION: Our findings suggest that grammatical relations play an important role in the interaction extraction task. Moreover, the net advantage of adding lexical and shallow syntactic features is small relative to the number of added features. This implies that efficient classifiers can be built by using only a small fraction of the features that are typically being used in recent approaches.


Subjects
Computational Biology/methods , Data Mining/methods , Automated Pattern Recognition/methods , Proteins/chemistry , Factual Databases , Information Storage and Retrieval , Linguistics , Proteins/metabolism , Controlled Vocabulary
6.
IEEE Trans Neural Syst Rehabil Eng ; 27(8): 1546-1555, 2019 08.
Article in English | MEDLINE | ID: mdl-31283483

ABSTRACT

Machine learning (ML) is revolutionizing research and industry. Many ML applications rely on the use of large amounts of personal data for training and inference. Among the most intimate exploited data sources is electroencephalogram (EEG) data, a kind of data that is so rich with information that application developers can easily gain knowledge beyond the professed scope from unprotected EEG signals, including passwords, ATM PINs, and other intimate data. The challenge we address is how to engage in meaningful ML with EEG data while protecting the privacy of users. Hence, we propose cryptographic protocols based on secure multiparty computation (SMC) to perform linear regression over EEG signals from many users in a fully privacy-preserving (PP) fashion, i.e., such that each individual's EEG signals are not revealed to anyone else. To illustrate the potential of our secure framework, we show how it allows estimating the drowsiness of drivers from their EEG signals as would be possible in the unencrypted case, and at a very reasonable computational cost. Our solution is the first application of commodity-based SMC to EEG data, as well as the largest documented experiment of secret sharing-based SMC in general, namely, with 15 players involved in all the computations.


Subjects
Brain-Computer Interfaces , Confidentiality , Adult , Algorithms , Automobile Driving , Computer Security , Factual Databases , Electroencephalography , Female , Humans , Linear Models , Male , Young Adult
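
To give a feel for secret sharing among many players, the Python sketch below lets 15 simulated players compute the sum of their private values without any single share revealing an input. It covers only the simplest building block (a shared sum), not the paper's linear regression protocols; the modulus `Q` and the function names are hypothetical.

```python
import secrets

Q = 2**61 - 1  # hypothetical modulus, chosen only for this illustration


def share(value, n):
    """Split value into n additive shares modulo Q."""
    shares = [secrets.randbelow(Q) for _ in range(n - 1)]
    shares.append((value - sum(shares)) % Q)
    return shares


def secure_sum(private_values):
    """Every player shares its value with all players; each player adds the
    shares it received, and only these partial sums are combined."""
    n = len(private_values)
    # dealt[i][j] is the share that player i sends to player j.
    dealt = [share(v, n) for v in private_values]
    partial = [sum(dealt[i][j] for i in range(n)) % Q for j in range(n)]
    return sum(partial) % Q


if __name__ == "__main__":
    values = list(range(1, 16))  # 15 players with private values 1..15
    print(secure_sum(values))
```

Sums of this kind are the sort of aggregate a regression computation needs, although a full protocol also requires secure multiplications and a fixed-point encoding of real-valued signals.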
7.
IEEE Trans Syst Man Cybern B Cybern ; 36(3): 679-84, 2006 Jun.
Article in English | MEDLINE | ID: mdl-16761820

ABSTRACT

As opposed to quantitative association rule mining, fuzzy association rule mining is said to prevent the overestimation of boundary cases, as can be shown by small examples. Rule mining, however, becomes interesting in large databases, where the problem of boundary cases is less apparent and can be further suppressed by using sensible partitioning methods. A data-driven approach is used to investigate whether there is a significant difference between quantitative and fuzzy association rules in large databases. The influence of the choice of a particular triangular norm in this respect is also examined.


Subjects
Algorithms , Artificial Intelligence , Factual Databases , Decision Support Techniques , Fuzzy Logic , Information Storage and Retrieval/methods , Statistical Models , Computer Simulation
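
The "small examples" of boundary cases mentioned in this abstract can be reproduced with a short Python sketch that compares the support of an itemset under crisp intervals with its support under fuzzy sets combined by three common triangular norms. The records, interval bounds, membership functions, and names are all hypothetical and are not taken from the paper.

```python
def t_min(a, b):
    """Minimum t-norm."""
    return min(a, b)


def t_product(a, b):
    """Product t-norm."""
    return a * b


def t_lukasiewicz(a, b):
    """Lukasiewicz t-norm."""
    return max(0.0, a + b - 1.0)


def crisp(lo, hi):
    """Quantitative (interval) membership: 1 inside [lo, hi), 0 outside."""
    return lambda x: 1.0 if lo <= x < hi else 0.0


def triangular(left, peak, right):
    """Triangular fuzzy membership rising from left to peak, then falling."""
    def mu(x):
        if x <= left or x >= right:
            return 0.0
        if x <= peak:
            return (x - left) / (peak - left)
        return (right - x) / (right - peak)
    return mu


def support(records, mu_a, mu_b, tnorm):
    """Average degree to which records belong to both sets."""
    return sum(tnorm(mu_a(a), mu_b(b)) for a, b in records) / len(records)


# Hypothetical (age, income) records clustered near the interval boundaries.
records = [(29, 48), (31, 52), (40, 60), (30, 50)]

crisp_support = support(records, crisp(30, 50), crisp(50, 70), t_min)
fuzzy = {
    t.__name__: support(records, triangular(20, 40, 60),
                        triangular(40, 60, 80), t)
    for t in (t_min, t_product, t_lukasiewicz)
}

if __name__ == "__main__":
    print("crisp support:", crisp_support)
    for name, value in fuzzy.items():
        print(name, "support:", value)
```

On crisp memberships every t-norm reduces to logical AND, so the choice only matters in the fuzzy case, where records just inside a boundary count partially instead of fully.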
8.
Int J Data Min Bioinform ; 5(2): 209-29, 2011.
Article in English | MEDLINE | ID: mdl-21544955

ABSTRACT

Recently, many approaches to model regulatory networks have been proposed in the systems biology domain. However, the task is far from being solved. In this paper, we propose an Answer Set Programming (ASP)-based approach to model interaction networks. We build a general ASP framework that describes the network semantics and allows modelling specific networks with little effort. ASP provides a rich and flexible toolbox that allows expanding the framework with desired features. In this paper, we tune our framework to mimic Boolean network behaviour and apply it to model the Budding Yeast and Fission Yeast cell cycle networks. The obtained steady states of these networks correspond to those of the Boolean networks.


Subjects
Algorithms , Gene Regulatory Networks , Biological Models , Genetic Models , Cell Cycle/genetics , Computational Biology , Computer Simulation , Data Mining , Saccharomyces cerevisiae Proteins/genetics , Saccharomycetales/cytology , Saccharomycetales/genetics , Schizosaccharomyces/cytology , Schizosaccharomyces/genetics , Schizosaccharomyces pombe Proteins/genetics , Software , Systems Biology