ABSTRACT
BACKGROUND AND OBJECTIVE: Lyme disease, one of the most common infectious vector-borne diseases, manifests in most cases as erythema migrans (EM) skin lesions. Recent studies show that convolutional neural networks (CNNs) perform well at identifying skin lesions from images. Lightweight CNN-based pre-scanner applications for resource-constrained mobile devices can help users with early diagnosis of Lyme disease and, through appropriate antibiotic therapy, prevent progression to a severe late form. Resource-intensive, robust CNN-based computer applications can also assist non-expert practitioners in reaching an accurate diagnosis. The main objective of this study is to analyze extensively the effectiveness of CNNs for diagnosing Lyme disease from images and to identify the best CNN architectures under resource constraints. METHODS: First, we created an EM dataset with the help of expert dermatologists from the Clermont-Ferrand University Hospital Center in France. Second, we benchmarked twenty-three CNN architectures, customized from VGG, ResNet, DenseNet, MobileNet, Xception, NASNet, and EfficientNet, on this dataset in terms of predictive performance, computational complexity, and statistical significance. Third, to improve the performance of the CNNs, we used custom transfer learning from ImageNet pre-trained models and also pre-trained the CNNs on the skin lesion dataset HAM10000. Fourth, for model explainability, we used Gradient-weighted Class Activation Mapping to visualize the regions of the input that are significant to the CNNs when making predictions. Fifth, we provided guidelines for model selection based on predictive performance and computational complexity. RESULTS: The customized ResNet50 architecture gave the best classification accuracy of 84.42%±1.36, with an AUC of 0.9189±0.0115, precision of 83.1%±2.49, sensitivity of 87.93%±1.47, and specificity of 80.65%±3.59. A lightweight model customized from EfficientNetB0 also performed well, with an accuracy of 83.13%±1.2, AUC of 0.9094±0.0129, precision of 82.83%±1.75, sensitivity of 85.21%±3.91, and specificity of 80.89%±2.95. All the trained models are publicly available at https://dappem.limos.fr/download.html and can be used by others for transfer learning and for building pre-scanners for Lyme disease. CONCLUSION: Our study confirmed that even some lightweight CNNs are effective for building Lyme disease pre-scanner mobile applications that assist people with an initial self-assessment and refer them to an expert dermatologist for further diagnosis.
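For readers less familiar with the metrics reported above, the following minimal sketch shows how accuracy, precision, sensitivity, and specificity are derived from the four cells of a binary confusion matrix. The counts are hypothetical and chosen only for illustration; they are not taken from the study, and the function name `binary_metrics` is an assumption of this sketch.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,      # all correct / all cases
        "precision": tp / (tp + fp),        # correct positives / predicted positives
        "sensitivity": tp / (tp + fn),      # correct positives / actual positives
        "specificity": tn / (tn + fp),      # correct negatives / actual negatives
    }


# Hypothetical counts for 100 EM and 100 non-EM images (not from the paper).
m = binary_metrics(tp=88, fp=19, tn=81, fn=12)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

A pre-scanner would typically favor sensitivity, since a missed EM lesion is costlier than a referral that turns out to be unnecessary.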
Subject(s)
Lyme Disease, Skin Diseases, France, Humans, Lyme Disease/diagnosis, Machine Learning, Neural Networks (Computer)
ABSTRACT
With the rapid progress in using biological networks to model biological systems, a growing volume of protein-protein interaction (PPI) data is being produced. The analysis of PPI networks aims to find regions of topological and functional (dis)similarity between the molecular networks of different species. The study of PPI networks has the potential to teach us much about life processes and diseases at the molecular level. However, few methods have been developed for multiple PPI network alignment, so new network alignment methods are needed. In this paper, we propose MAPPIN, a novel algorithm for the global alignment of multiple PPI networks. It relies on information available for the proteins in the networks, such as sequence, function, and network topology. Our algorithm is designed to exploit current multi-core CPU architectures and has been extensively tested on real data from eight species. Our experimental results show that MAPPIN significantly outperforms NetCoffee in terms of coverage. Nevertheless, MAPPIN is handicapped by the time required to load the gene annotation file. An extensive comparison with the pioneering PPI methods also shows that MAPPIN is often efficient in terms of coverage, mean entropy, or mean normalized.
Subject(s)
Computational Biology/methods, Protein Interaction Mapping/methods, Algorithms, Animals, High-Throughput Screening Assays, Humans, Protein Interaction Maps, Proteins/chemistry, Sequence Alignment
ABSTRACT
Ionizing-radiation-resistant bacteria (IRRB) are important in biotechnology. In this context, in silico methods for phenotypic prediction and for discovering genotype-phenotype relationships are limited. In this work, we analyzed the basal DNA repair proteins from most known proteome sequences of IRRB and ionizing-radiation-sensitive bacteria (IRSB) in order to learn a classifier that correctly predicts this bacterial phenotype. We formulated the prediction of bacterial ionizing radiation resistance (IRR) as a multiple-instance learning (MIL) problem and proposed a novel approach for this purpose. We provide a MIL-based prediction system that classifies a bacterium as either IRRB or IRSB. The experimental results of the proposed system are satisfactory, with 91.5% successful predictions.
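The MIL formulation can be illustrated with a short sketch in which each bacterium is a bag and each of its DNA repair proteins is an instance. The code below uses the generic standard MIL assumption (a bag is positive if at least one instance scores high), which is not necessarily the novel approach proposed by the authors. The feature vectors, the scorer `toy_score`, and the threshold are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence

Instance = Sequence[float]   # hypothetical feature vector for one protein
Bag = List[Instance]         # one bacterium = a bag of protein instances


def predict_bag(bag: Bag,
                score_instance: Callable[[Instance], float],
                threshold: float = 0.5) -> str:
    """Label a bag using the standard MIL rule: max over instance scores."""
    bag_score = max(score_instance(x) for x in bag)
    return "IRRB" if bag_score >= threshold else "IRSB"


def toy_score(x: Instance) -> float:
    """Placeholder instance scorer (mean of features); a real system would learn this."""
    return sum(x) / len(x)


# Two hypothetical bacteria, each with two protein instances.
bags: Dict[str, Bag] = {
    "bacterium_A": [[0.1, 0.2], [0.9, 0.8]],
    "bacterium_B": [[0.1, 0.3], [0.2, 0.2]],
}
predictions = {name: predict_bag(bag, toy_score) for name, bag in bags.items()}
print(predictions)
```

Only the bag-level label (resistant or sensitive) is needed for training in this setting; individual proteins carry no label of their own.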
Subject(s)
Bacteria/genetics, Bacteria/radiation effects, Algorithms, Artificial Intelligence, Bacterial Proteins/genetics, DNA Repair, Bacterial DNA/genetics, Genome, Biological Models, Phenotype, Ionizing Radiation
ABSTRACT
One of the most powerful techniques for studying proteins is to look for recurrent fragments (also called substructures) and then use them as patterns to characterize the proteins under study. Although protein sequences have been extensively studied in the literature, studying protein three-dimensional (3D) structures can reveal relevant structural and functional information that may not be derivable from protein sequences alone. An emerging trend consists of parsing protein 3D structures into graphs of amino acids. The search for recurrent substructures is then formulated as a process of frequent subgraph discovery in which each subgraph represents a 3D motif. Several efficient approaches for frequent 3D motif discovery have been proposed in the literature. However, the set of discovered 3D motifs is too large to be efficiently analyzed and explored in any further process. In this article, we propose a novel pattern selection approach that shrinks the large number of frequent 3D motifs by selecting a subset of representative ones. Existing pattern selection approaches do not exploit domain knowledge. In our approach, by contrast, we incorporate the evolutionary information of amino acids defined in substitution matrices in order to select the representative 3D motifs. We show the effectiveness of our approach on a number of real datasets. Our experimental results show that considering the substitution between amino acids allows our approach to detect many similarities between patterns that are ignored by current subgraph selection approaches, and that it considerably decreases the number of 3D motifs while enhancing their interestingness.
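The idea of substitution-aware pattern selection can be sketched as follows. This is a deliberately simplified illustration, not the authors' algorithm: motifs are reduced to equal-length strings of amino-acid labels with nodes already aligned (real 3D motifs are graphs and would need subgraph matching), and the substitution scores are a toy table rather than a published matrix. The names `TOY_SCORES`, `similarity`, and `select_representatives` and the 0.6 threshold are assumptions.

```python
from typing import Dict, List, Tuple

# Toy substitution scores (hypothetical values, not a published matrix).
TOY_SCORES: Dict[Tuple[str, str], int] = {
    ("I", "L"): 2, ("I", "V"): 3, ("L", "V"): 1, ("D", "E"): 2, ("K", "R"): 2,
}
SELF_SCORE = 4   # score of an amino acid against itself
MISMATCH = -2    # default score for pairs not listed above


def sub_score(a: str, b: str) -> int:
    """Symmetric lookup of the substitution score between two amino acids."""
    if a == b:
        return SELF_SCORE
    return TOY_SCORES.get((a, b), TOY_SCORES.get((b, a), MISMATCH))


def similarity(m1: str, m2: str) -> float:
    """Normalized similarity in [0, 1] between two aligned, equal-length motifs."""
    if len(m1) != len(m2):
        return 0.0
    total = sum(sub_score(a, b) for a, b in zip(m1, m2))
    return max(total, 0) / (SELF_SCORE * len(m1))


def select_representatives(motifs: List[str], threshold: float = 0.6) -> List[str]:
    """Greedy pass: keep a motif only if no kept motif can substitute for it."""
    reps: List[str] = []
    for m in motifs:
        if all(similarity(m, r) < threshold for r in reps):
            reps.append(m)
    return reps


motifs = ["ILD", "IVD", "LLE", "KRW", "RKW"]
reps = select_representatives(motifs)
print(reps)
```

With exact label matching, all five motifs would be treated as distinct; scoring substitutions lets the selection collapse evolutionarily close variants into a single representative.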