ABSTRACT
In the process of screening for probiotic strains, there are no clearly established bacterial phenotypic markers that could be used to predict their in vivo mechanism of action. In this work, we demonstrate for the first time that Machine Learning (ML) methods can accurately predict the in vivo immunomodulatory activity of probiotic strains from their cell surface phenotypic features, using a snail host-microbe interaction model. A broad range of snail gut presumptive probiotics, including 240 new lactic acid bacterial strains (Lactobacillus, Leuconostoc, Lactococcus, and Enterococcus), was isolated and characterized based on the strains' capacity to withstand snails' gastrointestinal defense barriers, such as the pedal mucus, gastric mucus, gastric juices, and acidic pH, in association with their cell surface hydrophobicity, autoaggregation, and biofilm formation ability. The implemented ML pipeline predicted with high accuracy (88%) the strains with a strong capacity to enhance chemotaxis and phagocytic activity of snails' hemolymph cells, while also revealing bacterial autoaggregation and cell surface hydrophobicity as the most important parameters that significantly affect host immune responses. The results show that ML approaches may be useful for deriving a predictive understanding of host-probiotic interactions, while also highlighting the use of snails as an efficient animal model for screening presumptive probiotic strains in light of their interaction with cellular innate immune responses.
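The screening pipeline described above can be illustrated with a generic supervised classifier. The sketch below is a minimal, hypothetical example: the feature columns, the synthetic data, the labeling rule, and the choice of a random forest are illustrative assumptions, not the study's actual dataset or implementation.

```python
# Hypothetical sketch: predicting a binary "strong immunomodulator" label from
# cell-surface phenotypic features. All data and feature names are placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_strains = 240
# Columns: autoaggregation (%), cell-surface hydrophobicity (%), biofilm formation (OD),
# survival at acidic pH (%), survival in gastric juice (%)
X = rng.uniform(0, 100, size=(n_strains, 5))
y = (0.5 * X[:, 0] + 0.4 * X[:, 1] + rng.normal(0, 10, n_strains) > 45).astype(int)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean().round(2))

clf.fit(X, y)
print("Feature importances:", clf.feature_importances_.round(2))
```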
Subjects
Machine Learning, Probiotics, Probiotics/pharmacology, Animals, Lactobacillales/physiology, Lactobacillales/immunology, Snails/immunology, Snails/microbiology, Helix, Snails/immunology, Helix, Snails/physiology, Immunity, Innate, Immunomodulation
ABSTRACT
In this paper, we introduce optics-informed Neural Networks and demonstrate experimentally how they can improve the performance of End-to-End deep learning models for IM/DD optical transmission links. Optics-informed or optics-inspired NNs are defined as DL models that rely on linear and/or nonlinear building blocks whose mathematical description stems directly from the respective response of photonic devices, drawing their mathematical framework from neuromorphic photonic hardware developments and properly adapting their DL training algorithms. We investigate the application of an optics-inspired activation function that can be obtained by a semiconductor-based nonlinear optical module and is a variant of the logistic sigmoid, referred to as the Photonic Sigmoid, in End-to-End Deep Learning configurations for fiber communication links. Compared to state-of-the-art ReLU-based configurations used in End-to-End DL fiber link demonstrations, optics-informed models based on the Photonic Sigmoid show improved noise and chromatic dispersion compensation properties in fiber-optic IM/DD links. An extensive simulation and experimental analysis revealed significant performance benefits for the Photonic Sigmoid NNs, which can reach a BER below the HD-FEC limit for fiber lengths up to 42 km, at an effective bit transmission rate of 48 Gb/s.
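A minimal sketch of a sigmoid-like, optics-inspired activation of the general kind described above is given below; the functional form (a shifted, rescaled logistic) and the parameter values are assumptions chosen for illustration, not the exact fitted Photonic Sigmoid.

```python
# Illustrative parametric sigmoid of the kind that can be fitted to the saturating
# transfer curve of a semiconductor-based nonlinear optical module.
import torch

class PhotonicSigmoidLike(torch.nn.Module):
    def __init__(self, A1=0.06, A2=1.0, x0=0.15, d=0.3):  # placeholder parameters
        super().__init__()
        self.A1, self.A2, self.x0, self.d = A1, A2, x0, d

    def forward(self, x):
        # Saturates at A1 for small inputs and at A2 for large inputs
        return self.A2 + (self.A1 - self.A2) / (1.0 + torch.exp((x - self.x0) / self.d))

act = PhotonicSigmoidLike()
print(act(torch.linspace(-2.0, 2.0, 5)))
```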
ABSTRACT
Limit orders allow buyers and sellers to set a "limit price" they are willing to accept in a trade. On the other hand, market orders allow for immediate execution at any price. Thus, market orders are susceptible to slippage, which is the additional cost incurred due to the unfavorable execution of a trade order. As a result, limit orders are often preferred, since they protect traders from excessive slippage costs due to larger-than-expected price fluctuations. Despite the price guarantees of limit orders, they are more complex than market orders. Orders with overly optimistic limit prices might never be executed, which increases the risk of employing limit orders in Machine Learning (ML)-based trading systems. Indeed, the current ML literature for trading almost exclusively relies on market orders. To overcome this limitation, a Deep Reinforcement Learning (DRL) approach is proposed to model trading agents that use limit orders. The proposed method (a) uses a framework that employs a continuous probability distribution to model limit prices, while (b) provides the ability to place market orders when the risk of no execution is more significant than the cost of slippage. Extensive experiments are conducted with multiple currency pairs, using hourly price intervals, validating the effectiveness of the proposed method and paving the way for introducing limit order modeling in DRL-based trading.
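A rough sketch of such a policy head is shown below: it parameterizes a continuous (here Gaussian) distribution over the limit-price offset and separately outputs the probability of falling back to a market order. The architecture, distribution family, and dimensions are illustrative assumptions, not the paper's exact design.

```python
# Sketch: a policy head that models limit prices with a continuous distribution and
# also exposes an explicit "place a market order instead" probability.
import torch
import torch.nn as nn

class LimitOrderHead(nn.Module):
    def __init__(self, in_dim=64):
        super().__init__()
        self.mu = nn.Linear(in_dim, 1)            # mean limit-price offset from the mid price
        self.log_sigma = nn.Linear(in_dim, 1)     # spread of the limit-price distribution
        self.market_logit = nn.Linear(in_dim, 1)  # probability of using a market order

    def forward(self, h):
        dist = torch.distributions.Normal(self.mu(h), self.log_sigma(h).exp())
        offset = dist.rsample()                   # sampled limit-price offset
        p_market = torch.sigmoid(self.market_logit(h))
        return offset, dist.log_prob(offset), p_market

offset, logp, p_market = LimitOrderHead()(torch.randn(8, 64))
print(offset.shape, p_market.shape)
```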
Subjects
Commerce, Neural Networks, Computer
ABSTRACT
Deep Reinforcement Learning (DRL) is increasingly used for developing financial trading agents for a wide range of tasks. However, optimizing DRL agents is notoriously difficult and unstable, especially in noisy financial environments, which significantly hinders the performance of trading agents. In this work, we present a novel method that improves the training reliability of DRL trading agents, building upon the well-known approach of neural network distillation. In the proposed approach, teacher agents are trained on different subsets of the RL environment, thus diversifying the policies they learn. Then, student agents are trained using distillation from the trained teachers to guide the training process, allowing for better exploration of the solution space while "mimicking" an existing policy/trading strategy provided by the teacher model. The boost in effectiveness of the proposed method comes from the use of diversified ensembles of teachers trained to perform trading for different currencies. This enables us to transfer the common view regarding the most profitable policy to the student, further improving the training stability in noisy financial environments. In the conducted experiments, we find that when applying distillation, constraining the teacher models to be diversified can significantly improve the performance of the final student agents. We demonstrate this by providing an extensive evaluation on various financial trading tasks. Furthermore, we also provide additional experiments in the separate domain of control in games, using the Procgen environments, in order to demonstrate the generality of the proposed method.
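The core distillation step can be sketched as below: the student policy is pulled toward an ensemble of teacher policies, each trained on a different currency/environment subset. The KL-based objective, the temperature, and the uniform averaging over teachers are illustrative choices, not the paper's exact loss.

```python
# Sketch: distilling a student trading policy from several diversified teachers.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits_list, T=2.0):
    """Average KL divergence between each teacher policy and the student policy."""
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    loss = 0.0
    for teacher_logits in teacher_logits_list:
        p_teacher = F.softmax(teacher_logits / T, dim=-1)
        loss = loss + F.kl_div(log_p_student, p_teacher, reduction="batchmean")
    return loss / len(teacher_logits_list)

student = torch.randn(32, 3, requires_grad=True)     # e.g., actions: long / short / exit
teachers = [torch.randn(32, 3) for _ in range(4)]    # teachers trained on different currencies
print(distillation_loss(student, teachers))
```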
Subjects
Deep Learning/economics, Financial Management/statistics & numerical data, Investments/statistics & numerical data
ABSTRACT
Weight imprinting (WI) was recently introduced as a way to perform gradient descent-free few-shot learning. Due to this, WI was almost immediately adapted for performing few-shot learning on embedded neural network accelerators that do not support back-propagation, e.g., edge tensor processing units. However, WI suffers from many limitations; e.g., it cannot handle novel categories with multimodal distributions, and special care must be taken to avoid overfitting the learned embeddings on the training classes, since this can have a devastating effect on classification accuracy for the novel categories. In this article, we propose a novel hypersphere-based WI approach that is capable of training neural networks in a regularized, imprinting-aware way, effectively overcoming the aforementioned limitations. The effectiveness of the proposed method is demonstrated using extensive experiments on three image data sets.
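For context, standard weight imprinting can be sketched in a few lines: the normalized mean embedding of a novel class's support examples is written directly into the classifier's weight matrix, with no gradient descent. Shapes and data below are illustrative; the hypersphere-based regularization proposed in the article is not shown.

```python
# Sketch of plain weight imprinting for one novel class.
import numpy as np

def imprint_class(W, support_embeddings):
    """W: (num_classes, dim) L2-normalized weights; returns W with one imprinted row."""
    proto = support_embeddings.mean(axis=0)
    proto /= np.linalg.norm(proto) + 1e-12          # project onto the unit hypersphere
    return np.vstack([W, proto])

W = np.random.randn(10, 128)
W /= np.linalg.norm(W, axis=1, keepdims=True)
support = np.random.randn(5, 128)                   # 5-shot novel class embeddings
print(imprint_class(W, support).shape)              # (11, 128)
```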
ABSTRACT
Knowledge-transfer (KT) methods allow for transferring the knowledge contained in a large deep learning model into a more lightweight and faster model. However, the vast majority of existing KT approaches are designed to handle mainly classification and detection tasks. This limits their performance on other tasks, such as representation/metric learning. To overcome this limitation, a novel probabilistic KT (PKT) method is proposed in this article. PKT is capable of transferring the knowledge into a smaller student model by retaining as much of the information expressed by the teacher model as possible. The ability of the proposed method to use different kernels for estimating the probability distributions of the teacher and student models, along with the different divergence metrics that can be used for transferring the knowledge, allows for easily adapting the proposed method to different applications. PKT outperforms several existing state-of-the-art KT techniques, while it is capable of providing new insights into KT by enabling several novel applications, as demonstrated through extensive experiments on several challenging data sets.
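The sketch below illustrates the general idea under specific, assumed choices: pairwise cosine similarities in the teacher and student feature spaces are turned into conditional probability distributions, and the student is trained to match the teacher's distribution via a KL divergence. Other kernels and divergences can be substituted, as noted above.

```python
# Sketch of probabilistic knowledge transfer between feature spaces of different width.
import torch
import torch.nn.functional as F

def cosine_probabilities(feats, eps=1e-8):
    feats = F.normalize(feats, dim=1)
    sim = (feats @ feats.t() + 1.0) / 2.0           # cosine similarity mapped to [0, 1]
    sim = sim - torch.diag(torch.diag(sim))         # drop self-similarities
    return sim / (sim.sum(dim=1, keepdim=True) + eps)

def pkt_loss(teacher_feats, student_feats, eps=1e-8):
    P = cosine_probabilities(teacher_feats)
    Q = cosine_probabilities(student_feats)
    return (P * torch.log((P + eps) / (Q + eps))).sum(dim=1).mean()   # KL(P || Q)

t = torch.randn(64, 512)                             # teacher batch features
s = torch.randn(64, 64, requires_grad=True)          # student batch features
print(pkt_loss(t, s))
```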
ABSTRACT
Machine learning methods have recently seen a growing number of applications in financial trading. Being able to automatically extract patterns from past price data and consistently apply them in the future has been the focus of many quantitative trading applications. However, developing machine learning-based methods for financial trading is not straightforward, requiring carefully designed targets/rewards, hyperparameter fine-tuning, and so on. Furthermore, most of the existing methods are unable to effectively exploit the information available across various financial instruments. In this article, we propose a deep reinforcement learning (RL)-based approach, which ensures that consistent rewards are provided to the trading agent, mitigating the noisy nature of the profit-and-loss rewards that are usually used. To this end, we employ a novel price trailing-based reward shaping approach, significantly improving the performance of the agent in terms of profit, Sharpe ratio, and maximum drawdown. Furthermore, we carefully designed a data preprocessing method that allows for training the agent on different FOREX currency pairs, providing a way for developing market-wide RL agents and allowing, at the same time, the exploitation of more powerful recurrent deep learning models without the risk of overfitting. The ability of the proposed methods to improve various performance metrics is demonstrated using a challenging large-scale data set, containing 28 instruments, provided by Speedlab AG.
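The flavor of a trailing-style shaped reward can be sketched as follows; the exact formulation used in the article is not reproduced here, and the distance-based reward below is only an illustrative assumption.

```python
# Sketch: a dense reward that decays with the distance between the agent's trailed
# price and the actual market price, instead of rewarding raw profit-and-loss.
import numpy as np

def trailing_reward(agent_price, market_price, scale=1000.0):
    return float(np.exp(-scale * abs(agent_price - market_price)))

prices = [1.1012, 1.1015, 1.1011, 1.1020]            # toy EUR/USD closes
agent_track = 1.1013
for p in prices:
    print(round(trailing_reward(agent_track, p), 3))
    agent_track += 0.5 * (p - agent_track)           # the agent trails the price
```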
ABSTRACT
A novel adversarial attack methodology for fooling deep neural network classifiers in image classification tasks is proposed, along with a novel defense mechanism to counter such attacks. Two concepts are introduced, namely the K-Anonymity-inspired Adversarial Attack (K-A3) and the Multiple Support Vector Data Description Defense (M-SVDD-D). The proposed K-A3 introduces novel optimization criteria to standard adversarial attack methodologies, inspired by the K-Anonymity principles. Its generated adversarial examples are not only misclassified by the neural network classifier, but are also uniformly spread along K different ranked output positions. The proposed M-SVDD-D comprises a deep neural architecture layer composed of multiple non-linear one-class classifiers based on Support Vector Data Description, which can be used to replace the final linear classification layer of a deep neural architecture, together with an additional class verification mechanism. Its application decreases the effectiveness of adversarial attacks by increasing the noise energy required to deceive the protected model, owing to the introduced non-linearity. In addition, M-SVDD-D can be used to prevent adversarial attacks in black-box attack settings.
Subjects
Support Vector Machine, Image Processing, Computer-Assisted/methods, Image Processing, Computer-Assisted/standards, Neural Networks, Computer
ABSTRACT
Photonics is among the most promising emerging technologies for providing fast and energy-efficient Deep Learning (DL) implementations. Despite their advantages, photonic DL accelerators also come with certain important limitations. For example, the majority of existing photonic accelerators do not currently support many of the activation functions that are commonly used in DL, such as the ReLU activation function. Instead, sinusoidal and sigmoidal nonlinearities are usually employed, rendering the training process unstable and difficult to tune, mainly due to vanishing gradient phenomena. Thus, photonic DL models usually require careful fine-tuning of all their training hyper-parameters in order to ensure that the training process will proceed smoothly. Despite recent advances in initialization schemes, as well as in optimization algorithms, training photonic DL models remains especially challenging. To overcome these limitations, we propose a novel adaptive initialization method that employs auxiliary tasks to estimate the optimal initialization variance for each layer of a network. The effectiveness of the proposed approach is demonstrated using two different datasets, as well as two recently proposed photonic activation functions and three different initialization methods. Apart from significantly increasing the stability of the training process, the proposed method can be directly used with any photonic activation function without requiring any other kind of fine-tuning, as also demonstrated through the conducted experiments.
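One way to picture the idea is sketched below: each candidate initialization variance is scored by briefly training a small network on a cheap auxiliary task, and the best-scoring variance is selected. The auxiliary task, the sigmoid-like stand-in activation, and all parameter values are assumptions for illustration only, not the proposed estimation procedure.

```python
# Sketch: selecting an initialization standard deviation via a short auxiliary task.
import torch
import torch.nn as nn

def sigmoid_like(x, A1=0.06, A2=1.0, x0=0.15, d=0.3):   # stand-in photonic activation
    return A2 + (A1 - A2) / (1 + torch.exp((x - x0) / d))

def auxiliary_score(sigma, steps=50):
    """Few-step training loss on a synthetic regression task; lower is better."""
    torch.manual_seed(0)
    net = nn.Sequential(nn.Linear(16, 32), nn.Linear(32, 1))
    for m in net:
        nn.init.normal_(m.weight, std=sigma)
        nn.init.zeros_(m.bias)
    opt = torch.optim.SGD(net.parameters(), lr=0.1)
    x = torch.randn(256, 16)
    y = x.sum(dim=1, keepdim=True)
    for _ in range(steps):
        loss = ((net[1](sigmoid_like(net[0](x))) - y) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

candidates = [0.01, 0.05, 0.1, 0.3, 1.0]
print("selected init std:", min(candidates, key=auxiliary_score))
```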
Subjects
Deep Learning, Photons
ABSTRACT
Deep learning (DL) models can be used to tackle time series analysis tasks with great success. However, the performance of DL models can degenerate rapidly if the data are not appropriately normalized. This issue is even more apparent when DL is used for financial time series forecasting tasks, where the nonstationary and multimodal nature of the data poses significant challenges and severely affects the performance of DL models. In this brief, a simple, yet effective, neural layer that is capable of adaptively normalizing the input time series, while taking into account the distribution of the data, is proposed. The proposed layer is trained in an end-to-end fashion using backpropagation and leads to significant performance improvements compared to other evaluated normalization schemes. The proposed method differs from traditional normalization methods since it learns how to perform normalization for a given task instead of using a fixed normalization scheme. At the same time, it can be directly applied to any new time series without requiring retraining. The effectiveness of the proposed method is demonstrated using a large-scale limit order book data set, as well as a load forecasting data set.
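A simplified sketch of such an adaptively normalizing input layer is given below: the shift and scale applied to each series are produced by small trainable transformations of the series' own summary statistics, so the normalization itself is learned end-to-end. This is an illustrative simplification, not the exact proposed layer.

```python
# Sketch: trainable, input-dependent normalization for multivariate time series.
import torch
import torch.nn as nn

class AdaptiveNorm(nn.Module):
    def __init__(self, n_features):
        super().__init__()
        self.shift = nn.Linear(n_features, n_features, bias=False)  # adaptive centering
        self.scale = nn.Linear(n_features, n_features, bias=False)  # adaptive scaling
        nn.init.eye_(self.shift.weight)
        nn.init.eye_(self.scale.weight)

    def forward(self, x):                        # x: (batch, time, features)
        mean = x.mean(dim=1)                     # per-series summary statistics
        x = x - self.shift(mean).unsqueeze(1)
        std = x.std(dim=1) + 1e-8
        x = x / self.scale(std).abs().clamp_min(1e-8).unsqueeze(1)
        return x

layer = AdaptiveNorm(n_features=4)
print(layer(torch.randn(8, 100, 4) * 50 + 10).shape)
```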
ABSTRACT
With the advent of deep neural networks, there is a growing interest in transferring the knowledge from a large and complex model to a smaller and faster one. In this brief, a method for unsupervised knowledge transfer (KT) between neural networks is proposed. To the best of our knowledge, the proposed method is the first that utilizes similarity-induced embeddings to transfer the knowledge between any two layers of neural networks, regardless of the number of neurons in each of them. In this way, the knowledge is transferred without using any lossy dimensionality reduction transformations or requiring any information about the complex model, except for the activations of the layer used for KT. This is in contrast with most existing approaches, which only generate soft targets for training the smaller neural network or directly use the weights of the larger model. The proposed method is evaluated using six image data sets, and it is demonstrated, through extensive experiments, that the knowledge of a neural network can be successfully transferred using different kinds of data (synthetic or not), ranging from cross-domain data to randomly generated data.
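The central step can be sketched as below: a pairwise similarity matrix computed from a teacher layer's activations serves as the target for the similarity matrix of a student layer, so the two layers may have arbitrary widths. The cosine similarity and the mean-squared objective are illustrative choices.

```python
# Sketch: knowledge transfer through similarity-induced embeddings.
import torch
import torch.nn.functional as F

def similarity_matrix(feats):
    feats = F.normalize(feats, dim=1)
    return feats @ feats.t()

def similarity_transfer_loss(teacher_acts, student_acts):
    return F.mse_loss(similarity_matrix(student_acts), similarity_matrix(teacher_acts))

teacher_acts = torch.randn(32, 1024)                     # any teacher layer
student_acts = torch.randn(32, 128, requires_grad=True)  # any student layer, different width
print(similarity_transfer_loss(teacher_acts, student_acts))
```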
ABSTRACT
Convolutional neural networks (CNNs) are predominantly used for several challenging computer vision tasks, achieving state-of-the-art performance. However, CNNs are complex models that require the use of powerful hardware, both for training and for deploying them. To this end, a quantization-based pooling method is proposed in this paper. The proposed method is inspired by the bag-of-features model and can be used for learning more lightweight deep neural networks. Trainable radial basis function neurons are used to quantize the activations of the final convolutional layer, reducing the number of parameters in the network and allowing for natively classifying images of various sizes. The proposed method employs differentiable quantization and aggregation layers, leading to an end-to-end trainable CNN architecture. Furthermore, a fast linear variant of the proposed method is introduced and discussed, providing new insight for understanding convolutional neural architectures. The ability of the proposed method to reduce the size of CNNs and increase the performance over other competitive methods is demonstrated using seven data sets and three different learning tasks (classification, regression, and retrieval).
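The pooling idea can be sketched as below: each spatial feature vector of the final convolutional layer is softly assigned to a small set of trainable codewords, and the assignments are averaged into a fixed-length histogram regardless of the input image size. The soft-assignment form and all dimensions are illustrative assumptions, not the paper's exact layer.

```python
# Sketch: bag-of-features style pooling over convolutional feature maps.
import torch
import torch.nn as nn

class BoFPooling(nn.Module):
    def __init__(self, feat_dim, n_codewords):
        super().__init__()
        self.codewords = nn.Parameter(torch.randn(n_codewords, feat_dim))
        self.sigma = nn.Parameter(torch.ones(1))

    def forward(self, x):                                  # x: (batch, channels, H, W)
        b, c, h, w = x.shape
        feats = x.permute(0, 2, 3, 1).reshape(b, h * w, c)
        diff = feats.unsqueeze(2) - self.codewords.view(1, 1, -1, c)
        dists = diff.norm(dim=-1)                          # (batch, H*W, n_codewords)
        assign = torch.softmax(-dists / self.sigma.abs().clamp_min(1e-6), dim=-1)
        return assign.mean(dim=1)                          # fixed-length histogram

pool = BoFPooling(feat_dim=64, n_codewords=16)
print(pool(torch.randn(2, 64, 7, 7)).shape)                # same output size ...
print(pool(torch.randn(2, 64, 11, 11)).shape)              # ... for any spatial resolution
```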
ABSTRACT
In mathematical terms, an artificial neuron computes the inner product of a $d$-dimensional input vector $\mathbf{x}$ with its weight vector $\mathbf{w}$, compares it with a bias value $w_0$, and fires based on the result of this comparison. Therefore, its decision boundary is given by the equation $\mathbf{w}^T\mathbf{x} + w_0 = 0$. In this paper, we propose replacing the linear hyperplane decision boundary of a neuron with a curved, paraboloid decision boundary. Thus, the decision boundary of the proposed paraboloid neuron is given by the equation $(\mathbf{h}^T\mathbf{x} + h_0)^2 - \|\mathbf{x} - \mathbf{p}\|_2^2 = 0$, where $\mathbf{h}$ and $h_0$ denote the parameters of the directrix and $\mathbf{p}$ denotes the coordinates of the focus. Such paraboloid neural networks are proven to have superior recognition accuracy in a number of applications.
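The decision function above translates directly into code; the toy parameters below are illustrative.

```python
# Sketch: evaluating the paraboloid neuron's decision function
# (h^T x + h0)^2 - ||x - p||_2^2 and firing on its sign.
import numpy as np

def paraboloid_neuron(x, h, h0, p):
    return (h @ x + h0) ** 2 - np.sum((x - p) ** 2)

h = np.array([0.5, -0.25])      # directrix parameters
h0 = 1.0
p = np.array([1.0, 2.0])        # focus coordinates
for x in [np.zeros(2), np.array([1.0, 2.0]), np.array([3.0, -1.0])]:
    print(x, paraboloid_neuron(x, h, h0, p) >= 0)
```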
ABSTRACT
In this paper, a manifold-based dictionary learning method for the bag-of-features (BoF) representation, optimized toward information clustering, is proposed. First, the spectral representation, which unwraps the manifolds of the data and provides better clustering solutions, is formed. Then, a new dictionary is learned in order to make the histogram space, i.e., the space where the BoF histograms exist, as similar as possible to the spectral space. The ability of the proposed method to improve the clustering solutions is demonstrated using a wide range of datasets: two image datasets, the 15-scene dataset and the Corel image dataset; one video dataset, the KTH dataset; and one text dataset, the RT-2k dataset. The proposed method improves both the internal and the external clustering criteria for two different clustering algorithms: 1) k-means and 2) spectral clustering. Also, the optimized histogram space can be used to directly assign a new object to its cluster, instead of using the spectral space (which would require reapplying the spectral clustering algorithm or using incremental spectral clustering techniques). Finally, the learned representation is also evaluated using an information retrieval setup, and it is demonstrated that it improves the retrieval precision over the baseline BoF representation.
ABSTRACT
The vast majority of dimensionality reduction (DR) techniques rely on second-order statistics to define their optimization objective. Even though this provides adequate results in most cases, it comes with several shortcomings: the methods require carefully designed regularizers, and they are usually prone to outliers. In this paper, a new DR framework that can directly model the target distribution using the notion of similarity instead of distance is introduced. The proposed framework, called the similarity embedding framework (SEF), can overcome the aforementioned limitations and provides a conceptually simpler way to express optimization targets similar to those of existing DR techniques. Deriving a new DR technique using the SEF becomes simply a matter of choosing an appropriate target similarity matrix. A variety of classical tasks, such as performing supervised DR and providing out-of-sample extensions, as well as novel techniques, such as providing fast linear embeddings for complex techniques, are demonstrated in this paper using the proposed framework. Six data sets from a diverse range of domains are used to evaluate the proposed method, and it is demonstrated that it can outperform many existing DR techniques.
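The framework's core recipe can be sketched as below: a (here linear) embedding is learned so that the pairwise similarities of the projected data match a chosen target similarity matrix, in this case a supervised target with 1 for same-class pairs and 0 otherwise. The target, kernel, and optimizer settings are illustrative choices, not the framework's exact formulation.

```python
# Sketch: learning a linear embedding by matching a target similarity matrix.
import torch
import torch.nn.functional as F

X = torch.randn(120, 30)
y = torch.randint(0, 3, (120,))
S_target = (y[:, None] == y[None, :]).float()     # supervised target similarities

P = torch.randn(30, 2, requires_grad=True)        # linear projection to 2-D
opt = torch.optim.Adam([P], lr=0.05)
for _ in range(200):
    Z = F.normalize(X @ P, dim=1)
    S = (Z @ Z.t() + 1) / 2                       # similarities mapped to [0, 1]
    loss = F.mse_loss(S, S_target)
    opt.zero_grad(); loss.backward(); opt.step()
print("final similarity-matching loss:", round(loss.item(), 4))
```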
ABSTRACT
In this paper, a modified class of support vector machines (SVMs) inspired by the optimization of Fisher's discriminant ratio is presented, the so-called minimum class variance SVMs (MCVSVMs). The MCVSVM optimization problem is solved in cases in which the training set contains fewer samples than the dimensionality of the training vectors, using dimensionality reduction through principal component analysis (PCA). Afterward, the MCVSVMs are extended in order to find nonlinear decision surfaces by solving the optimization problem in arbitrary Hilbert spaces defined by Mercer's kernels. In that case, it is shown that, under kernel PCA, the nonlinear optimization problem is transformed into an equivalent linear MCVSVM problem. The effectiveness of the proposed approach is demonstrated by comparing it with standard SVMs and other classifiers, such as kernel Fisher discriminant analysis, in facial image characterization problems such as gender determination, eyeglass detection, and neutral facial expression detection.
Subjects
Algorithms, Artificial Intelligence, Biometry/methods, Face/anatomy & histology, Image Interpretation, Computer-Assisted/methods, Pattern Recognition, Automated/methods, Discriminant Analysis, Humans, Image Enhancement/methods, Principal Component Analysis, Reproducibility of Results, Sensitivity and Specificity
ABSTRACT
A novel algorithm that can be used to boost the performance of face-verification methods that utilize Fisher's criterion is presented and evaluated. The algorithm is applied to similarity, or matching error, data and provides a general solution for overcoming the "small sample size" (SSS) problem, where the lack of sufficient training samples causes improper estimation of a linear separation hyperplane between the classes. Two independent phases constitute the proposed method. Initially, a set of weighted piecewise discriminant hyperplanes is used in order to provide a more accurate discriminant decision than the one produced by the traditional linear discriminant analysis (LDA) methodology. The expected classification ability of this method is investigated through a series of simulations. The second phase defines proper combinations of person-specific similarity scores and describes an outlier removal process that further enhances the classification ability. The proposed technique has been tested on the M2VTS and XM2VTS frontal face databases. Experimental results indicate that the proposed framework greatly improves the face-verification performance.
Subjects
Algorithms, Artificial Intelligence, Biometry/methods, Face/anatomy & histology, Image Interpretation, Computer-Assisted/methods, Information Storage and Retrieval/methods, Pattern Recognition, Automated/methods, Computer Simulation, Discriminant Analysis, Humans, Image Enhancement/methods, Linear Models, Sample Size
ABSTRACT
In this paper, two supervised methods for enhancing the classification accuracy of the Nonnegative Matrix Factorization (NMF) algorithm are presented. The idea is to extend the NMF algorithm in order to extract features that enforce not only spatial locality, but also the separability between classes in a discriminant manner. The first method employs discriminant analysis on the features derived from NMF. In this way, a two-phase discriminant feature extraction procedure is implemented, namely NMF plus Linear Discriminant Analysis (LDA). The second method incorporates the discriminant constraints inside the NMF decomposition. Thus, a decomposition of a face into its discriminant parts is obtained, and new update rules for both the weights and the basis images are derived. The introduced methods have been applied to the problem of frontal face verification using the well-known XM2VTS database. Both methods greatly enhance the performance of NMF for frontal face verification.
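The first, two-phase scheme can be sketched with off-the-shelf components, as below; the random nonnegative data stand in for face images, and the library calls show one possible realization rather than the paper's implementation.

```python
# Sketch: NMF feature extraction followed by LDA on the NMF coefficients.
import numpy as np
from sklearn.decomposition import NMF
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X = np.abs(np.random.randn(100, 256))     # 100 "images" with 256 nonnegative features
y = np.random.randint(0, 2, size=100)     # client / impostor labels

nmf = NMF(n_components=20, init="nndsvda", max_iter=500)
H = nmf.fit_transform(X)                  # per-image NMF coefficients
lda = LinearDiscriminantAnalysis().fit(H, y)
print("training accuracy:", lda.score(H, y))
```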
Subjects
Algorithms, Artificial Intelligence, Biometry/methods, Face/anatomy & histology, Image Interpretation, Computer-Assisted/methods, Information Storage and Retrieval/methods, Pattern Recognition, Automated/methods, Discriminant Analysis, Humans, Image Enhancement/methods
ABSTRACT
In this paper, we propose a novel extension of the extreme learning machine (ELM) algorithm for single-hidden-layer feedforward neural network training that is able to incorporate subspace learning (SL) criteria into the optimization process followed for the calculation of the network's output weights. The proposed graph embedded ELM (GEELM) algorithm is able to naturally exploit both intrinsic and penalty SL criteria that have been (or will be) designed under the graph embedding framework. In addition, we extend the proposed GEELM algorithm in order to be able to exploit SL criteria in arbitrary (even infinite) dimensional ELM spaces. We evaluate the proposed approach on eight standard classification problems and nine publicly available datasets designed for three problems related to human behavior analysis, i.e., the recognition of human faces, facial expressions, and activities. Experimental results demonstrate the effectiveness of the proposed approach, since it outperforms other ELM-based classification schemes in all cases.
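For reference, the base ELM that GEELM extends can be sketched as below: hidden-layer weights are random and fixed, and only the output weights are obtained in closed form via regularized least squares. The graph-embedding (SL) regularization itself is omitted; dimensions and data are illustrative.

```python
# Sketch: a basic single-hidden-layer ELM with a regularized least-squares readout.
import numpy as np

def elm_fit(X, T, n_hidden=200, reg=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))    # random, fixed input weights
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)                             # hidden-layer activations
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ T)
    return W, b, beta

X = np.random.randn(300, 20)
T = np.eye(3)[np.random.randint(0, 3, 300)]            # one-hot targets, 3 classes
W, b, beta = elm_fit(X, T)
pred = (np.tanh(X @ W + b) @ beta).argmax(axis=1)
print("training accuracy:", (pred == T.argmax(axis=1)).mean())
```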
Subjects
Face/anatomy & histology, Human Activities/classification, Image Processing, Computer-Assisted/methods, Machine Learning, Pattern Recognition, Automated/methods, Female, Humans, Male, Video Recording
ABSTRACT
Video summarization is a timely and rapidly developing research field with broad commercial interest, due to the increasing availability of massive video data. Relevant algorithms face the challenge of achieving a careful balance between summary compactness, enjoyability, and content coverage. The specific case of stereoscopic 3D theatrical films has become more important over the past years, but has not received corresponding research attention. In this paper, a multi-stage, multimodal summarization process for such stereoscopic movies is proposed that is able to extract a short, representative video skim conforming to narrative characteristics from a 3D film. At the initial stage, a novel, low-level video frame description method is introduced (the frame moments descriptor) that compactly captures informative image statistics from luminance, color, optical flow, and stereoscopic disparity video data, both at a global and at a local scale. Thus, scene texture, illumination, motion, and geometry properties may succinctly be contained within a single frame feature descriptor, which can subsequently be employed as a building block in any key-frame extraction scheme, e.g., for intra-shot frame clustering. The computed key-frames are then used to construct a movie summary in the form of a video skim, which is post-processed in a manner that also considers the audio modality. The next stage of the proposed summarization pipeline essentially performs shot pruning, controlled by a user-provided shot retention parameter, that removes segments from the skim based on the narrative prominence of movie characters in both the visual and the audio modalities. This novel process (multimodal shot pruning) is algebraically modeled as a multimodal matrix column subset selection problem, which is solved using an evolutionary computing approach. Subsequently, disorienting editing effects induced by summarization are dealt with through manipulation of the video skim. At the last step, the skim is suitably post-processed in order to reduce stereoscopic video defects that may cause visual fatigue.
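A toy sketch of a frame-level moments descriptor in the spirit described above is given below: simple statistical moments of luminance and color are computed globally and over a coarse grid, then concatenated into a compact per-frame vector. The channel choices and block layout are illustrative, and the optical-flow and disparity statistics are omitted.

```python
# Sketch: compact per-frame descriptor from global and block-wise image moments.
import numpy as np

def frame_moments(frame, grid=2):
    """frame: (H, W, 3) RGB array in [0, 1]; returns a 1-D descriptor."""
    lum = frame.mean(axis=2)
    feats = []
    for channel in (lum, frame[..., 0], frame[..., 1], frame[..., 2]):
        feats += [channel.mean(), channel.std()]             # global moments
        hgt, wid = channel.shape
        for i in range(grid):                                # local (block-wise) moments
            for j in range(grid):
                block = channel[i*hgt//grid:(i+1)*hgt//grid, j*wid//grid:(j+1)*wid//grid]
                feats += [block.mean(), block.std()]
    return np.asarray(feats)

print(frame_moments(np.random.rand(120, 160, 3)).shape)
```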