1.
Sensors (Basel) ; 22(16)2022 Aug 16.
Article in English | MEDLINE | ID: mdl-36015898

ABSTRACT

CNNs and other deep learners are now state-of-the-art in medical imaging research. However, the small sample size of many medical data sets dampens performance and results in overfitting. In some medical areas, it is simply too labor-intensive and expensive to amass images numbering in the hundreds of thousands. Building deep CNN ensembles of pre-trained CNNs is one powerful method for overcoming this problem. Ensembles combine the outputs of multiple classifiers to improve performance. This method relies on the introduction of diversity, which can be introduced on many levels in the classification workflow. A recent ensembling method that has shown promise is to vary the activation functions in a set of CNNs or within different layers of a single CNN. This study aims to examine the performance of both methods using a large set of twenty activation functions, six of which are presented here for the first time: 2D Mexican ReLU, TanELU, MeLU + GaLU, Symmetric MeLU, Symmetric GaLU, and Flexible MeLU. The proposed method was tested on fifteen medical data sets representing various classification tasks. The best performing ensemble combined two well-known CNNs (VGG16 and ResNet50) whose standard ReLU activation layers were randomly replaced with another activation function. Results demonstrate the superior performance of this approach.
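
The random replacement of ReLU layers described in this abstract can be sketched roughly as follows. This is an illustration only, not the authors' code: the layer list is a toy stand-in for a pretrained backbone, the pool holds placeholder activation names rather than the twenty functions studied, and `randomize_activations` and `build_ensemble_specs` are hypothetical helpers.

```python
import random

# Placeholder pool of activation names (an assumption for illustration;
# the abstract studies twenty functions and does not list them all).
ACTIVATION_POOL = ["leaky_relu", "elu", "prelu", "melu", "galu"]


def randomize_activations(layers, pool, rng):
    """Return a copy of `layers` with every 'relu' entry swapped for a random pool member."""
    return [rng.choice(pool) if name == "relu" else name for name in layers]


def build_ensemble_specs(layers, pool, n_members, seed=0):
    """Create `n_members` layer specifications, each with its own random replacements."""
    rng = random.Random(seed)
    return [randomize_activations(layers, pool, rng) for _ in range(n_members)]


# Toy layer list standing in for a pretrained network (not the real VGG16 layout).
toy_backbone = ["conv", "relu", "conv", "relu", "pool", "fc", "relu", "fc", "softmax"]
specs = build_ensemble_specs(toy_backbone, ACTIVATION_POOL, n_members=3)
```

Each specification would then be used to rebuild and fine-tune one ensemble member. Because members draw replacements independently, they differ from one another, which is the source of diversity the abstract refers to.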


Subjects
Diagnostic Imaging ; Neural Networks, Computer
2.
J Imaging ; 8(4)2022 Apr 01.
Article in English | MEDLINE | ID: mdl-35448223

ABSTRACT

The classification of vocal individuality for passive acoustic monitoring (PAM) and census of animals is becoming an increasingly popular area of research. Nearly all studies in this field of inquiry have relied on classic audio representations and classifiers, such as Support Vector Machines (SVMs) trained on spectrograms or Mel-Frequency Cepstral Coefficients (MFCCs). In contrast, most current bioacoustic species classification exploits the power of deep learners and more cutting-edge audio representations. A significant reason for avoiding deep learning in vocal identity classification is the tiny sample size in the collections of labeled individual vocalizations. As is well known, deep learners require large datasets to avoid overfitting. One way to handle small datasets with deep learning methods is to use transfer learning. In this work, we evaluate the performance of three pretrained CNNs (VGG16, ResNet50, and AlexNet) on a small, publicly available lion roar dataset containing approximately 150 samples taken from five male lions. Each of these networks is retrained on eight representations of the samples: MFCCs, spectrogram, and Mel spectrogram, along with several new ones, such as VGGish and Stockwell, and those based on the recently proposed LM spectrogram. The performance of these networks, both individually and in ensembles, is analyzed and corroborated using the Equal Error Rate and shown to surpass previous classification attempts on this dataset; the best single network achieved over 95% accuracy and the best ensembles over 98% accuracy. The contributions this study makes to the field of individual vocal classification include demonstrating that it is valuable and possible, with caution, to use transfer learning with single pretrained CNNs on the small datasets available for this problem domain. We also make a contribution to bioacoustics generally by offering a comparison of the performance of many state-of-the-art audio representations, including for the first time the LM spectrogram and Stockwell representations. All source code for this study is available on GitHub.
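
For readers unfamiliar with the Equal Error Rate mentioned above, here is a minimal sketch of one common way to approximate it. The abstract does not say how the authors computed it, so the threshold sweep, the `equal_error_rate` helper, and the toy scores are all assumptions.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER by trying every observed score as a threshold.

    A trial is accepted when its score is >= threshold. Returns (eer, threshold)
    for the threshold at which the false accept and false reject rates are closest.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # false accept rate
        frr = sum(s < t for s in genuine) / len(genuine)     # false reject rate
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical similarity scores: higher means "same individual".
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
```

With these toy scores the two error rates meet at a threshold of 0.6, where one impostor in four is accepted and one genuine trial in four is rejected, giving an EER of 0.25.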

3.
J Imaging ; 7(12)2021 Nov 27.
Article in English | MEDLINE | ID: mdl-34940721

ABSTRACT

Convolutional neural networks (CNNs) have gained prominence in the research literature on image classification over the last decade. One shortcoming of CNNs, however, is their lack of generalizability and tendency to overfit when presented with small training sets. Augmentation directly confronts this problem by generating new data points providing additional information. In this paper, we investigate the performance of more than ten different sets of data augmentation methods, with two novel approaches proposed here: one based on the discrete wavelet transform and the other on the constant-Q Gabor transform. Pretrained ResNet50 networks are finetuned on each augmentation method. Combinations of these networks are evaluated and compared across four benchmark data sets of images representing diverse problems and collected by instruments that capture information at different scales: a virus data set, a bark data set, a portrait data set, and a LIGO glitches data set. Experiments demonstrate the superiority of this approach. The best ensemble proposed in this work achieves state-of-the-art (or comparable) performance across all four data sets. This result shows that varying data augmentation is a feasible way to build an ensemble of classifiers for image classification.

4.
Sensors (Basel) ; 21(17)2021 Aug 29.
Article in English | MEDLINE | ID: mdl-34502700

ABSTRACT

In this paper, we examine two strategies for boosting the performance of ensembles of Siamese networks (SNNs) for image classification using two loss functions (Triplet and Binary Cross Entropy) and two methods for building the dissimilarity spaces (FULLY and DEEPER). With FULLY, the distance between a pattern and a prototype is calculated by comparing two images using the fully connected layer of the Siamese network. With DEEPER, each pattern is described using a deeper layer combined with dimensionality reduction. The basic design of the SNNs takes advantage of supervised k-means clustering for building the dissimilarity spaces that train a set of support vector machines, which are then combined by sum rule for a final decision. The robustness and versatility of this approach are demonstrated on several cross-domain image data sets, including a portrait data set, two bioimage and two animal vocalization data sets. Results show that the strategies employed in this work to increase the performance of dissimilarity image classification using SNN are closing the gap with standalone CNNs. Moreover, when our best system is combined with an ensemble of CNNs, the resulting performance is superior to an ensemble of CNNs, demonstrating that our new strategy is extracting additional information.


Subjects
Neural Networks, Computer ; Animals
5.
J Imaging ; 7(9)2021 Sep 05.
Article in English | MEDLINE | ID: mdl-34564103

ABSTRACT

Features play a crucial role in computer vision. Initially designed to detect salient elements by means of handcrafted algorithms, features now are often learned using different layers in convolutional neural networks (CNNs). This paper develops a generic computer vision system based on features extracted from trained CNNs. Multiple learned features are combined into a single structure to work on different image classification tasks. The proposed system was derived by testing several approaches for extracting features from the inner layers of CNNs and using them as inputs to support vector machines that are then combined by sum rule. Several dimensionality reduction techniques were tested for reducing the high dimensionality of the inner layers so that they can work with SVMs. The empirically derived generic vision system based on applying a discrete cosine transform (DCT) separately to each channel is shown to significantly boost the performance of standard CNNs across a large and diverse collection of image data sets. In addition, an ensemble of different topologies taking the same DCT approach and combined with global mean thresholding pooling obtained state-of-the-art results on a benchmark image virus data set.
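
The idea of applying a DCT separately to each channel to shrink inner-layer features can be sketched as below. The abstract gives no implementation details, so flattening each channel to a 1D signal, using an unnormalized DCT-II, and keeping only the first few coefficients are assumptions; `dct_ii` and `reduce_channels` are hypothetical helpers.

```python
import math


def dct_ii(signal):
    """Unnormalized 1D DCT-II of a list of floats."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(signal))
        for k in range(n)
    ]


def reduce_channels(feature_maps, keep):
    """Apply the DCT to each channel separately and keep the first `keep` coefficients.

    `feature_maps` is a list of channels, each a flattened list of activations.
    """
    descriptor = []
    for channel in feature_maps:
        descriptor.extend(dct_ii(channel)[:keep])
    return descriptor


# Two toy channels with 8 activations each (hypothetical values).
maps = [[1.0] * 8, [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]]
vec = reduce_channels(maps, keep=3)  # 2 channels x 3 coefficients = 6 values
```

The shortened vector would then be the input to an SVM. A constant channel concentrates all of its energy in the first coefficient, which is why truncating the transform loses little information for smooth activation maps.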

6.
Sensors (Basel) ; 21(5)2021 Feb 24.
Article in English | MEDLINE | ID: mdl-33668172

ABSTRACT

Traditionally, classifiers are trained to predict patterns within a feature space. The image classification system presented here trains classifiers to predict patterns within a vector space by combining the dissimilarity spaces generated by a large set of Siamese Neural Networks (SNNs). A set of centroids from the patterns in the training data sets is calculated with supervised k-means clustering. The centroids are used to generate the dissimilarity space via the Siamese networks. The vector space descriptors are extracted by projecting patterns onto the similarity spaces, and SVMs classify an image by its dissimilarity vector. The versatility of the proposed approach in image classification is demonstrated by evaluating the system on different types of images across two domains: two medical data sets and two animal audio data sets with vocalizations represented as images (spectrograms). Results show that the proposed system is competitive with the best-performing methods in the literature, obtaining state-of-the-art performance on one of the medical data sets, and does so without ad-hoc optimization of the clustering methods on the tested data sets.
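
The projection into a dissimilarity space described in this abstract can be sketched as follows. Two simplifications are assumed and should not be read as the authors' method: Euclidean distance stands in for the trained Siamese network, and a single mean per class stands in for supervised k-means. All function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Stand-in dissimilarity; the paper uses a trained Siamese network here."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def class_centroids(patterns, labels):
    """One mean vector per class, a simplification of supervised k-means."""
    sums, counts = {}, {}
    for p, y in zip(patterns, labels):
        acc = sums.setdefault(y, [0.0] * len(p))
        for i, v in enumerate(p):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return [[v / counts[y] for v in sums[y]] for y in sorted(sums)]


def to_dissimilarity_vector(pattern, centroids, dissimilarity=euclidean):
    """Describe a pattern by its dissimilarity to every centroid."""
    return [dissimilarity(pattern, c) for c in centroids]


train = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]]
labels = ["a", "a", "b", "b"]
centroids = class_centroids(train, labels)
vector = to_dissimilarity_vector([0.0, 1.0], centroids)
```

The resulting vector has one entry per centroid and is what an SVM would be trained on in place of the raw pattern.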

7.
Front Neurol ; 11: 576194, 2020.
Article in English | MEDLINE | ID: mdl-33250847

ABSTRACT

Alzheimer's Disease (AD) is the most common neurodegenerative disease, with 10% prevalence in the elderly population. Conventional Machine Learning (ML) was proven effective in supporting the diagnosis of AD, while very few studies investigated the performance of deep learning and transfer learning in this complex task. In this paper, we evaluated the potential of ensemble transfer-learning techniques, pretrained on generic images and then transferred to structural brain MRI, for the early diagnosis and prognosis of AD, with respect to a fusion of conventional-ML approaches based on Support Vector Machine directly applied to structural brain MRI. Specifically, more than 600 subjects were obtained from the ADNI repository, including AD, Mild Cognitive Impaired converting to AD (MCIc), Mild Cognitive Impaired not converting to AD (MCInc), and cognitively-normal (CN) subjects. We used T1-weighted cerebral-MRI studies to train: (1) an ensemble of five transfer-learning architectures pretrained on generic images; (2) a 3D Convolutional Neural Network (CNN) trained from scratch on MRI volumes; and (3) a fusion of two conventional-ML classifiers derived from different feature extraction/selection techniques coupled to SVM. The AD-vs-CN, MCIc-vs-CN, MCIc-vs-MCInc comparisons were investigated. The ensemble transfer-learning approach was able to effectively discriminate AD from CN with 90.2% AUC, MCIc from CN with 83.2% AUC, and MCIc from MCInc with 70.6% AUC, showing comparable or slightly lower results compared with the fusion of conventional-ML systems (AD from CN with 93.1% AUC, MCIc from CN with 89.6% AUC, and MCIc from MCInc with AUC in the range of 69.1-73.3%). The deep-learning network trained from scratch obtained lower performance than either the fusion of conventional-ML systems or the ensemble transfer-learning, due to the limited sample of images used for training. These results open new perspectives on the use of transfer learning combined with neuroimages for the automatic early diagnosis and prognosis of AD, even if pretrained on generic images.

8.
Sensors (Basel) ; 19(23)2019 Nov 28.
Article in English | MEDLINE | ID: mdl-31795280

ABSTRACT

A fundamental problem in computer vision is face detection. In this paper, an experimentally derived ensemble made by a set of six face detectors is presented that maximizes the number of true positives while simultaneously reducing the number of false positives produced by the ensemble. False positives are removed using different filtering steps based primarily on the characteristics of the depth map related to the subwindows of the whole image that contain candidate faces. A new filtering approach based on processing the image with different wavelets is also proposed here. The experimental results show that the applied filtering steps used in our best ensemble reduce the number of false positives without decreasing the detection rate. This finding is validated on a combined dataset composed of four others for a total of 549 images, including 614 upright frontal faces acquired in unconstrained environments. The dataset provides both 2D and depth data. For further validation, the proposed ensemble is tested on the well-known BioID benchmark dataset, where it obtains a 100% detection rate with an acceptable number of false positives.

9.
Artif Intell Med ; 97: 19-26, 2019 06.
Article in English | MEDLINE | ID: mdl-31202396

ABSTRACT

BACKGROUND AND OBJECTIVE: Early and accurate diagnosis of Alzheimer's Disease (AD) is critical since early treatment effectively slows the progression of the disease thereby adding productive years to those afflicted by this disease. A major problem encountered in the classification of MRI for the automatic diagnosis of AD is the so-called curse-of-dimensionality, which is a consequence of the high dimensionality of MRI feature vectors and the low number of training patterns available in most MRI datasets relevant to AD. METHODS: A method for performing early diagnosis of AD is proposed that combines a set of SVMs trained on different texture descriptors (which reduce dimensionality) extracted from slices of Magnetic Resonance Image (MRI) with a set of SVMs trained on markers built from the voxels of MRIs. The dimension of the voxel-based features is reduced by using different feature selection algorithms, each of which trains a separate SVM. These two sets of SVMs are then combined by weighted-sum rule for a final decision. RESULTS: Experimental results show that 2D texture descriptors improve the performance of state-of-the-art voxel-based methods. The evaluation of our system on the four ADNI datasets demonstrates the efficacy of the proposed ensemble and demonstrates a contribution to the accurate prediction of AD. CONCLUSIONS: Ensembles of texture descriptors combine partially uncorrelated information with respect to standard approaches based on voxels, feature selection, and classification by SVM. In other words, the fusion of a system based on voxels and an ensemble of texture descriptors enhances the performance of voxel-based approaches.
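
The weighted-sum rule used to combine the two sets of SVMs can be illustrated with a short sketch. The scores and weights below are invented for illustration, the scores are assumed to be already normalized to a comparable range, and `weighted_sum_rule` is a hypothetical helper rather than the authors' code.

```python
def weighted_sum_rule(score_sets, weights=None):
    """Fuse per-classifier class scores by a weighted sum and return (label, fused).

    `score_sets` holds one list of class scores per classifier, all in the same
    class order. With `weights=None` this reduces to the plain sum rule.
    """
    if weights is None:
        weights = [1.0] * len(score_sets)
    n_classes = len(score_sets[0])
    fused = [
        sum(w * scores[c] for w, scores in zip(weights, score_sets))
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=fused.__getitem__), fused


# Hypothetical scores for the class order [CN, AD].
texture_svm = [0.7, 0.3]
voxel_svm = [0.2, 0.8]
label, fused = weighted_sum_rule([texture_svm, voxel_svm], weights=[1.0, 2.0])
```

Here the voxel-based score counts double, so the fused decision follows it and selects the second class. Passing no weights gives the unweighted sum rule that several other abstracts in this list mention.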


Subjects
Alzheimer Disease/diagnostic imaging ; Brain/diagnostic imaging ; Early Diagnosis ; Humans ; Image Interpretation, Computer-Assisted/methods ; Magnetic Resonance Imaging/methods ; Pattern Recognition, Automated ; Support Vector Machine
10.
Bioinformatics ; 35(11): 1844-1851, 2019 06 01.
Article in English | MEDLINE | ID: mdl-30395157

ABSTRACT

MOTIVATION: Because DNA-binding proteins (DNA-BPs) play a vital role in all aspects of genetic activity, the development of reliable and efficient systems for automatic DNA-BP classification is becoming a crucial proteomic technology. Key to this technology is the discovery of powerful protein representations and feature extraction methods. The goal of this article is to develop experimentally a system for automatic DNA-BP classification by comparing and combining different descriptors taken from different types of protein representations. RESULTS: The descriptors we evaluate include those starting from the position-specific scoring matrix (PSSM) of proteins, those derived from the amino-acid sequence (AAS), various matrix representations of proteins and features taken from the three-dimensional tertiary structure of proteins. We also introduce some new variants of protein descriptors. Each descriptor is used to train a separate support vector machine (SVM), and results are combined by sum rule. Our final system obtains state-of-the-art results on three benchmark DNA-BP datasets. AVAILABILITY AND IMPLEMENTATION: The MATLAB code for replicating the experiments presented in this paper is available at https://github.com/LorisNanni.


Subjects
Proteomics ; Amino Acid Sequence ; Computational Biology ; DNA-Binding Proteins ; Position-Specific Scoring Matrices ; Support Vector Machine
11.
Article in English | MEDLINE | ID: mdl-29994096

ABSTRACT

Bioimage classification is becoming increasingly important in many biological studies, including those that require accurate cell phenotype recognition, subcellular localization, and histopathological classification. In this paper, we present a new General Purpose (GenP) bioimage classification method that can be applied to a large range of classification problems. The GenP system we propose is an ensemble that combines multiple texture features (both handcrafted and learned descriptors) for superior and generalizable discriminative power. Our ensemble boosts performance by combining local features, dense sampling features, and deep learning features. Each descriptor is used to train a different Support Vector Machine, and the resulting classifiers are then combined by sum rule. We evaluate our method on a diverse set of bioimage classification tasks each represented by a benchmark database, including some of those available in the IICBU 2008 database. Each bioimage classification task represents a typical subcellular, cellular, and tissue level classification problem. Our evaluation on these datasets demonstrates that the proposed GenP bioimage ensemble obtains state-of-the-art performance without any ad-hoc dataset tuning of the parameters (thereby avoiding any risk of overfitting/overtraining). To reproduce the experiments reported in this paper, the MATLAB code of all the descriptors is available at https://github.com/LorisNanni and https://www.dropbox.com/s/bguw035yrqz0pwp/ElencoCode.docx?dl=0.

12.
Bioinformatics ; 33(18): 2837-2841, 2017 Sep 15.
Article in English | MEDLINE | ID: mdl-28444139

ABSTRACT

MOTIVATION: Given an unknown compound, is it possible to predict its Anatomical Therapeutic Chemical class/classes? This is a challenging yet important problem since such a prediction could be used to deduce not only a compound's possible active ingredients but also its therapeutic, pharmacological and chemical properties, thereby substantially expediting the pace of drug development. The problem is challenging because some drugs and compounds belong to two or more ATC classes, making machine learning extremely difficult. RESULTS: In this article a multi-label classifier system is proposed that incorporates information about a compound's chemical-chemical interaction and its structural and fingerprint similarities to other compounds belonging to the different ATC classes. The proposed system reshapes a 1D feature vector to obtain a 2D matrix representation of the compound. This matrix is then described by a histogram of gradients that is fed into a Multi-Label Learning with Label-Specific Features classifier. Rigorous cross-validations demonstrate the superior prediction quality of this method compared with other state-of-the-art approaches developed for this problem, a superiority that is reflected particularly in the absolute true rate, the most important and harshest metric for assessing multi-label systems. AVAILABILITY AND IMPLEMENTATION: The MATLAB code for replicating the experiments presented in this article is available at https://www.dropbox.com/s/7v1mey48tl9bfgz/ToolPaperATC.rar?dl=0 . CONTACT: loris.nanni@unipd.it. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
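
The step of reshaping a 1D feature vector into a 2D matrix and describing it by gradient orientations can be sketched as below. This is a deliberately crude stand-in: a full histogram-of-gradients descriptor uses cells and block normalization, and the matrix width, zero padding, and bin count here are assumptions. The multi-label classifier itself is not shown, and both helper names are hypothetical.

```python
import math


def reshape_to_matrix(features, width):
    """Lay a 1D feature vector out row by row, zero-padding the last row."""
    padded = list(features) + [0.0] * (-len(features) % width)
    return [padded[i:i + width] for i in range(0, len(padded), width)]


def gradient_orientation_histogram(matrix, bins=4):
    """Magnitude-weighted histogram of unsigned gradient orientations over interior cells."""
    hist = [0.0] * bins
    rows, cols = len(matrix), len(matrix[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = matrix[r][c + 1] - matrix[r][c - 1]  # horizontal central difference
            gy = matrix[r + 1][c] - matrix[r - 1][c]  # vertical central difference
            magnitude = math.hypot(gx, gy)
            angle = math.atan2(gy, gx) % math.pi      # unsigned orientation in [0, pi)
            hist[min(int(angle / math.pi * bins), bins - 1)] += magnitude
    return hist


vector = [float(v) for v in range(10)]  # hypothetical 10-dimensional descriptor
matrix = reshape_to_matrix(vector, width=4)
hist = gradient_orientation_histogram(matrix)
```

The reshaping lets texture-style descriptors, originally designed for images, operate on a compound's feature vector. The histogram would then serve as input to the multi-label classifier.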


Subjects
Computational Biology/methods ; Drug Interactions ; Machine Learning ; Software
13.
Comput Biol Med ; 72: 239-47, 2016 May 01.
Article in English | MEDLINE | ID: mdl-26656952

ABSTRACT

In this paper, we propose a new method for improving the performance of 2D descriptors by building an n-layer image using different preprocessing approaches from which multilayer descriptors are extracted and used as feature vectors for training a Support Vector Machine. The different preprocessing approaches are used to build different n-layer images (n=3, n=5, etc.). We test both color and gray-level images, two well-known texture descriptors (Local Phase Quantization and Local Binary Pattern), and three of their variants suited for n-layer images (Volume Local Phase Quantization, Local Phase Quantization Three-Orthogonal-Planes, and Volume Local Binary Patterns). Our results show that multilayers and texture descriptors can be combined to outperform the standard single-layer approaches. Experiments on 10 datasets demonstrate the generalizability of the proposed descriptors. Most of these datasets are medical, but in each case the images are very different. Two datasets are completely unrelated to medicine and are included to demonstrate the discriminative power of the proposed descriptors across very different image recognition tasks. A MATLAB version of the complete system developed in this paper will be made available at https://www.dei.unipd.it/node/2357.


Subjects
Diagnostic Imaging/classification ; Diagnostic Imaging/standards ; Humans ; Support Vector Machine
14.
Comput Intell Neurosci ; 2015: 909123, 2015.
Article in English | MEDLINE | ID: mdl-26413089

ABSTRACT

We perform an extensive study of the performance of different classification approaches on twenty-five datasets (fourteen image datasets and eleven UCI data mining datasets). The aim is to find General-Purpose (GP) heterogeneous ensembles (requiring little to no parameter tuning) that perform competitively across multiple datasets. The state-of-the-art classifiers examined in this study include the support vector machine, Gaussian process classifiers, random subspace of adaboost, random subspace of rotation boosting, and deep learning classifiers. We demonstrate that a heterogeneous ensemble based on the simple fusion by sum rule of different classifiers performs consistently well across all twenty-five datasets. The most important result of our investigation is demonstrating that some very recent approaches, including the heterogeneous ensemble we propose in this paper, are capable of outperforming an SVM classifier (implemented with LibSVM), even when both kernel selection and SVM parameters are carefully tuned for each dataset.


Subjects
Algorithms ; Artificial Intelligence ; Pattern Recognition, Automated ; Computer Simulation ; Datasets as Topic ; Humans ; Learning ; Normal Distribution ; ROC Curve
15.
Stud Health Technol Inform ; 207: 74-82, 2014.
Article in English | MEDLINE | ID: mdl-25488213

ABSTRACT

In this paper we investigate a new approach for extracting features from a texture using Dijkstra's algorithm. The method maps images into graphs and gray level differences into transition costs. Texture is measured over the whole image by comparing the costs found by Dijkstra's algorithm with the geometric distance of the pixels. In addition, we compare and combine our new strategy with a previous method for describing textures based on Dijkstra's algorithm. For each set of features, a support vector machine (SVM) is trained. The set of classifiers is then combined by weighted sum rule. Combining the proposed set of features with the well-known local binary patterns and local ternary patterns boosts performance. To assess the performance of our approach, we test it using six medical datasets representing different image classification problems. Tests demonstrate that our approach outperforms standard methods presented in the literature. All source code for the approaches tested in this paper will be available at: http://www.dei.unipd.it/node/2357.
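
The mapping from image to graph described above can be sketched as follows. The abstract does not give the exact cost model, so a 4-connected grid, a step cost of one plus the absolute gray-level difference, and the cost-to-distance ratio used as a feature are all assumptions; `cheapest_path_cost` and `roughness` are hypothetical helpers.

```python
import heapq
import math


def cheapest_path_cost(image, start, goal):
    """Dijkstra over the 4-connected pixel grid.

    Moving between neighbors costs 1 plus the absolute gray-level difference
    (this cost model is an assumption, not taken from the paper).
    """
    rows, cols = len(image), len(image[0])
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        cost, (r, c) = heapq.heappop(queue)
        if (r, c) == goal:
            return cost
        if cost > best[(r, c)]:
            continue  # stale queue entry, a cheaper route was already found
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_cost = cost + 1.0 + abs(image[nr][nc] - image[r][c])
                if new_cost < best.get((nr, nc), math.inf):
                    best[(nr, nc)] = new_cost
                    heapq.heappush(queue, (new_cost, (nr, nc)))
    return math.inf


def roughness(image, start, goal):
    """Ratio of the cheapest path cost to the straight-line pixel distance."""
    return cheapest_path_cost(image, start, goal) / math.dist(start, goal)


flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
ridged = [[5, 9, 5], [5, 9, 5], [5, 9, 5]]
```

On the flat image, `roughness(flat, (0, 0), (0, 2))` is 1.0 because the path cost equals the geometric distance. On the ridged image every route must cross the bright column, so the same call returns 5.0. Collecting such ratios over many pixel pairs would give a texture feature vector for the SVM.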


Subjects
Algorithms ; Image Interpretation, Computer-Assisted/methods ; Imaging, Three-Dimensional/methods ; Microscopy/methods ; Pattern Recognition, Automated/methods ; Support Vector Machine ; Humans ; Image Enhancement/methods ; Reproducibility of Results ; Sensitivity and Specificity
16.
Stud Health Technol Inform ; 207: 83-91, 2014.
Article in English | MEDLINE | ID: mdl-25488214

ABSTRACT

In this paper we propose an ensemble of texture descriptors for analyzing virus textures in transmission electron microscopy images. Specifically, we present several novel multi-quinary (MQ) codings of local binary pattern (LBP) variants: the MQ version of the dense LBP, the MQ version of the rotation invariant co-occurrence among adjacent LBPs, and the MQ version of the LBP histogram Fourier. To reduce computation time as well as to improve performance, a feature selection approach is utilized to select the thresholds used in the MQ approaches. In addition, we propose new variants of descriptors where two histograms, instead of the standard one histogram, are produced for each descriptor. The two histograms (one for edge pixels and the other for non-edge pixels) are calculated for training two different SVMs, whose results are then combined by sum rule. We show that a bag of features approach works well with this problem. Our experiments, using a publicly available dataset of 1500 images with 15 classes and the same protocol as in previous works, demonstrate the superiority of our new proposed ensemble of texture descriptors. The MATLAB code of our approach is available at https://www.dei.unipd.it/node/2357.


Subjects
Algorithms ; Image Interpretation, Computer-Assisted/methods ; Imaging, Three-Dimensional/methods ; Microscopy, Electron, Transmission/methods ; Pattern Recognition, Automated/methods ; Viruses/ultrastructure ; Image Enhancement/methods ; Reproducibility of Results ; Sensitivity and Specificity ; Signal Processing, Computer-Assisted
17.
Stud Health Technol Inform ; 207: 153-62, 2014.
Article in English | MEDLINE | ID: mdl-25488221

ABSTRACT

Using game technologies and digital media for improving physical and mental health and for the therapeutic benefit and well-being of a wide range of people is an area of study that is rapidly expanding. Much research in this emerging field is centered at the intersection of serious games, alternative realities, and play therapy. In this paper the authors describe their transdisciplinary work at this intersection: i) an integrative system of psychotherapy technologies called MyPsySpace currently being prototyped in Second Life with the aim of offering new and virtual translations of traditional expressive therapies (virtual sandplay, virtual drama therapy, digital expressive therapy, and virtual safe spaces) and ii) a mature body of research entitled SoundScapes that is exploring the use of interactive video games and abstract creative expression (making music, digital painting, and robotic device control) as a supplement to traditional physical rehabilitation intervention. Aside from introducing our work to a broader audience, our goal is to encourage peers to investigate ideas that reach across disciplines, to both risk and reap the benefits of combining technologies, theories, and methods stemming from multiple disciplines.


Subjects
Mental Disorders/psychology ; Mental Disorders/therapy ; Play Therapy/methods ; Therapy, Computer-Assisted/methods ; Video Games ; Virtual Reality Exposure Therapy/methods ; Biomedical Technology/methods ; Evidence-Based Medicine ; Humans ; Treatment Outcome
18.
ScientificWorldJournal ; 2014: 236717, 2014.
Article in English | MEDLINE | ID: mdl-25028675

ABSTRACT

Many domains would benefit from reliable and efficient systems for automatic protein classification. An area of particular interest in recent studies on automatic protein classification is the exploration of new methods for extracting features from a protein that work well for specific problems. These methods, however, are not generalizable and have proven useful in only a few domains. Our goal is to evaluate several feature extraction approaches for representing proteins by testing them across multiple datasets. Different types of protein representations are evaluated: those starting from the position specific scoring matrix of the proteins (PSSM), those derived from the amino-acid sequence, two matrix representations, and features taken from the 3D tertiary structure of the protein. We also test new variants of protein descriptors. We develop our system experimentally by comparing and combining different descriptors taken from the protein representations. Each descriptor is used to train a separate support vector machine (SVM), and the results are combined by sum rule. Some stand-alone descriptors work well on some datasets but not on others. Through fusion, the different descriptors provide a performance that works well across all tested datasets, in some cases performing better than the state-of-the-art.


Subjects
Artificial Intelligence ; Computational Biology/methods ; Proteins/classification ; Algorithms ; Proteins/chemistry
19.
J Theor Biol ; 360: 109-116, 2014 Nov 07.
Article in English | MEDLINE | ID: mdl-25026218

ABSTRACT

Successful protein structure identification enables researchers to estimate the biological functions of proteins, yet it remains a challenging problem. The most common method for determining an unknown protein's structural class is to perform expensive and time-consuming manual experiments. Because of the availability of amino acid sequences generated in the post-genomic age, it is possible to predict an unknown protein's structural class using machine learning methods given a protein's amino-acid sequence and/or its secondary structural elements. Following recent research in this area, we propose a new machine learning system that is based on combining several protein descriptors extracted from different protein representations, such as position specific scoring matrix (PSSM), the amino-acid sequence, and secondary structural sequences. The prediction engine of our system is operated by an ensemble of support vector machines (SVMs), where each SVM is trained on a different descriptor. The results of each SVM are combined by sum rule. Our final ensemble produces a success rate that is substantially better than previously reported results on three well-established datasets. The MATLAB code and datasets used in our experiments are freely available for future comparison at http://www.dei.unipd.it/node/2357.


Subjects
Models, Genetic ; Protein Conformation ; Proteins/classification ; Proteins/genetics ; Software ; Algorithms ; Amino Acid Sequence ; Artificial Intelligence ; Support Vector Machine
20.
J Theor Biol ; 359: 120-8, 2014 Oct 21.
Article in English | MEDLINE | ID: mdl-24949993

ABSTRACT

The study of protein-drug interactions is a significant issue for drug development. Unfortunately, it is both expensive and time-consuming to perform physical experiments to determine whether a drug and a protein are interacting with each other. Some previous attempts to design an automated system to perform this task were based on the knowledge of the 3D structure of a protein, which is not always available in practice. With the availability of protein sequences generated in the post-genomic age, however, a sequence-based solution to deal with this problem is necessary. Following other works in this area, we propose a new machine learning system based on several protein descriptors extracted from several protein representations, such as variants of the position specific scoring matrix (PSSM) of proteins, the amino-acid sequence, and a matrix representation of a protein. The prediction engine is operated by an ensemble of support vector machines (SVMs), with each SVM trained on a specific descriptor and the results of each SVM combined by sum rule. The overall success rate achieved by our final ensemble is notably higher than previous results obtained on the same datasets using the same testing protocols reported in the literature. MATLAB code and the datasets used in our experiments are freely available for future comparison at http://www.dei.unipd.it/node/2357.


Subjects
Drug Interactions ; Metabolic Networks and Pathways ; Pharmaceutical Preparations/metabolism ; Proteins/chemistry ; Proteins/metabolism ; Computational Biology ; Humans ; Molecular Docking Simulation ; Molecular Targeted Therapy ; Pharmaceutical Preparations/chemistry ; Protein Binding ; Protein Conformation ; Structure-Activity Relationship