1.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36528802

ABSTRACT

Accurate prediction of deoxyribonucleic acid (DNA) modifications is essential to explore and discern the process of cell differentiation, gene expression and epigenetic regulation. Several computational approaches have been proposed for particular type-specific DNA modification prediction. Two recent generalized computational predictors are capable of detecting three different types of DNA modifications; however, type-specific and generalized modifications predictors produce limited performance across multiple species mainly due to the use of ineffective sequence encoding methods. The paper in hand presents a generalized computational approach "DNA-MP" that is competent to more precisely predict three different DNA modifications across multiple species. The proposed DNA-MP approach makes use of a powerful encoding method "position specific nucleotides occurrence based on modification and non-modification class densities normalized difference" (POCD-ND) to generate the statistical representations of DNA sequences and a deep forest classifier for modifications prediction. The POCD-ND encoder generates statistical representations by extracting position specific distributional information of nucleotides in the DNA sequences. We perform a comprehensive intrinsic and extrinsic evaluation of the proposed encoder and compare its performance with the 32 most widely used encoding methods on 17 benchmark DNA modifications prediction datasets of 12 different species using 10 different machine learning classifiers. Overall, with all classifiers, the proposed POCD-ND encoder outperforms the 32 existing encoders. Furthermore, combined over 5-fold cross validation benchmark datasets and independent test sets, the proposed DNA-MP predictor outperforms state-of-the-art type-specific and generalized modifications predictors by an average accuracy of 7% across 4mc datasets, 1.35% across 5hmc datasets and 10% for 6ma datasets. To facilitate the scientific community, the DNA-MP web application is available at https://sds_genetic_analysis.opendfki.de/DNA_Modifications/.
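The encoding idea described above can be illustrated with a minimal sketch. The abstract does not give the exact POCD-ND formula, so the normalized difference used here, (p - n) / (p + n), the function names, and the toy sequences are all assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict

NUCLEOTIDES = "ACGT"


def position_densities(sequences):
    """Return, per position, the fraction of sequences carrying each nucleotide."""
    length = len(sequences[0])
    counts = [defaultdict(int) for _ in range(length)]
    for seq in sequences:
        for pos, nuc in enumerate(seq):
            counts[pos][nuc] += 1
    total = len(sequences)
    return [{n: counts[pos][n] / total for n in NUCLEOTIDES} for pos in range(length)]


def normalized_difference_table(positives, negatives):
    """Assumed scoring: (p - n) / (p + n) per position and nucleotide, 0 if both are 0."""
    pos_d = position_densities(positives)
    neg_d = position_densities(negatives)
    table = []
    for p_row, n_row in zip(pos_d, neg_d):
        row = {}
        for nuc in NUCLEOTIDES:
            p, n = p_row[nuc], n_row[nuc]
            row[nuc] = (p - n) / (p + n) if (p + n) > 0 else 0.0
        table.append(row)
    return table


def encode(sequence, table):
    """Map each nucleotide to its position-specific score."""
    return [table[pos].get(nuc, 0.0) for pos, nuc in enumerate(sequence)]


# Toy data (hypothetical): modified sites favour C at position 1.
positives = ["ACGT", "ACGA", "TCGT"]
negatives = ["AGGT", "ATGA", "TAGT"]
table = normalized_difference_table(positives, negatives)
vector = encode("ACGT", table)
```

In this toy case only position 1 separates the two classes, so it is the only position that receives a non-zero score; the resulting vector could then be passed to any classifier.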


Subject(s)
Epigenesis, Genetic; Machine Learning; Software; Nucleotides; DNA/genetics
2.
Nat Methods ; 18(9): 1038-1045, 2021 09.
Article in English | MEDLINE | ID: mdl-34462594

ABSTRACT

Light microscopy combined with well-established protocols of two-dimensional cell culture facilitates high-throughput quantitative imaging to study biological phenomena. Accurate segmentation of individual cells in images enables exploration of complex biological questions, but can require sophisticated imaging processing pipelines in cases of low contrast and high object density. Deep learning-based methods are considered state-of-the-art for image segmentation but typically require vast amounts of annotated data, for which there is no suitable resource available in the field of label-free cellular imaging. Here, we present LIVECell, a large, high-quality, manually annotated and expert-validated dataset of phase-contrast images, consisting of over 1.6 million cells from a diverse set of cell morphologies and culture densities. To further demonstrate its use, we train convolutional neural network-based models using LIVECell and evaluate model segmentation accuracy with a proposed suite of benchmarks.


Subject(s)
Databases, Factual; Image Processing, Computer-Assisted/methods; Microscopy/methods; Models, Biological; Cell Culture Techniques; Humans; Neural Networks, Computer
3.
Sensors (Basel) ; 22(11)2022 May 27.
Article in English | MEDLINE | ID: mdl-35684703

ABSTRACT

Deep neural networks are one of the most successful classifiers across different domains. However, their use is limited in safety-critical areas due to their limitations concerning interpretability. The research field of explainable artificial intelligence addresses this problem. However, most interpretability methods are designed for the imaging modality. The paper introduces TimeREISE, a model-agnostic attribution method that shows success in the context of time series classification. The method applies perturbations to the input and considers different attribution map characteristics such as the granularity and density of an attribution map. The approach demonstrates superior performance compared to existing methods concerning different well-established measurements. TimeREISE shows impressive results in the deletion and insertion test, Infidelity, and Sensitivity. Concerning the continuity of an explanation, it showed superior performance while preserving the correctness of the attribution map. Additional sanity checks prove the correctness of the approach and its dependency on the model parameters. TimeREISE scales well with an increasing number of channels and timesteps. TimeREISE applies to any time series classification network and does not rely on prior data knowledge. TimeREISE is suited for any use case independent of dataset characteristics such as sequence length, channel number, and number of classes.
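To make the notion of perturbation-based, model-agnostic attribution concrete, the following is a minimal sketch of a generic random-masking scheme for a univariate series. It is not the TimeREISE algorithm itself: the function names, the toy scoring function, and the parameters `n_masks` and `keep_prob` (a rough proxy for mask density) are assumptions, and mask granularity is not modelled.

```python
import random


def perturbation_attribution(series, score_fn, n_masks=500, keep_prob=0.5, seed=0):
    """Average the model score over random masks, per time step that was kept."""
    rng = random.Random(seed)
    weighted = [0.0] * len(series)
    kept = [0] * len(series)
    for _ in range(n_masks):
        mask = [1 if rng.random() < keep_prob else 0 for _ in series]
        masked = [x * m for x, m in zip(series, mask)]  # masked steps are zeroed
        score = score_fn(masked)
        for t, m in enumerate(mask):
            if m:
                weighted[t] += score
                kept[t] += 1
    return [w / k if k else 0.0 for w, k in zip(weighted, kept)]


def toy_score(series):
    """Hypothetical stand-in for a classifier's class score: depends only on step 3."""
    return series[3]


series = [0.2, 0.1, 0.4, 1.0, 0.3, 0.2]
attribution = perturbation_attribution(series, toy_score)
most_important = max(range(len(series)), key=lambda t: attribution[t])
```

Because the scheme only queries `score_fn` on perturbed inputs, it needs no access to model internals, which is what makes this family of methods model-agnostic.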


Subject(s)
Artificial Intelligence; Neural Networks, Computer; Time Factors
4.
Int J Mol Sci ; 23(15)2022 Jul 26.
Article in English | MEDLINE | ID: mdl-35897818

ABSTRACT

Circular ribonucleic acids (circRNAs) are novel non-coding RNAs that emanate from alternative splicing of precursor mRNA in reversed order across exons. Despite the abundant presence of circRNAs in human genes and their involvement in diverse physiological processes, the functionality of most circRNAs remains a mystery. As with other non-coding RNAs, knowledge of the sub-cellular localization of circRNAs can help demystify the influence of circRNAs on protein synthesis, degradation, destination, their association with different diseases, and potential for drug development. To date, wet experimental approaches are being used to detect sub-cellular locations of circular RNAs. These approaches help to elucidate the role of circRNAs as protein scaffolds, RNA-binding protein (RBP) sponges, micro-RNA (miRNA) sponges, parental gene expression modifiers, alternative splicing regulators, and transcription regulators. To complement wet-lab experiments, considering the progress made by machine learning approaches for the determination of sub-cellular localization of other non-coding RNAs, the paper in hand develops a computational framework, Circ-LocNet, to precisely detect circRNA sub-cellular localization. Circ-LocNet performs comprehensive extrinsic evaluation of 7 residue frequency-based, residue order and frequency-based, and physicochemical property-based sequence descriptors using the five most widely used machine learning classifiers. Further, it explores the performance impact of K-order sequence descriptor fusion, where it ensembles similar as well as dissimilar genres of statistical representation learning approaches to reap the combined benefits. Considering the diversity of statistical representation learning schemes, it assesses the performance of second-order, third-order, and up to seventh-order sequence descriptor fusion. A comprehensive empirical evaluation of Circ-LocNet over a newly developed benchmark dataset using different settings reveals that standalone residue frequency-based sequence descriptors and tree-based classifiers are more suitable to predict sub-cellular localization of circular RNAs. Further, K-order heterogeneous sequence descriptor fusion in combination with tree-based classifiers most accurately predicts sub-cellular localization of circular RNAs. We anticipate this study will act as a rich baseline and push the development of robust computational methodologies for the accurate sub-cellular localization determination of novel circRNAs.
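As a rough illustration of a residue frequency-based descriptor and of descriptor fusion, the sketch below computes normalized k-mer frequencies and fuses two descriptors by concatenation. Treating fusion as plain concatenation, and the names `kmer_frequencies` and `fuse`, are assumptions; the abstract does not specify which seven descriptors Circ-LocNet evaluates or how they are combined.

```python
from itertools import product


def kmer_frequencies(sequence, k, alphabet="ACGU"):
    """Normalized k-mer counts in a fixed (lexicographic) k-mer order."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = len(sequence) - k + 1
    if windows <= 0:
        return [0.0] * len(kmers)
    for i in range(windows):
        kmer = sequence[i:i + k]
        if kmer in counts:  # skip windows containing unknown residues
            counts[kmer] += 1
    return [counts[kmer] / windows for kmer in kmers]


def fuse(*descriptors):
    """Descriptor fusion as plain concatenation of feature vectors (an assumption)."""
    fused = []
    for descriptor in descriptors:
        fused.extend(descriptor)
    return fused


rna = "ACGUACGU"
mono = kmer_frequencies(rna, 1)   # 4 features
di = kmer_frequencies(rna, 2)     # 16 features
second_order = fuse(mono, di)     # 20 features from two fused descriptors
```

Under this reading, a K-order fusion would simply pass K descriptor vectors to `fuse` before training a classifier on the result.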


Subject(s)
MicroRNAs; RNA, Circular; Alternative Splicing; Humans; MicroRNAs/genetics; RNA/genetics; RNA/metabolism; RNA, Circular/genetics; RNA-Binding Proteins/genetics; RNA-Binding Proteins/metabolism
5.
Environ Monit Assess ; 194(2): 133, 2022 Jan 28.
Article in English | MEDLINE | ID: mdl-35089424

ABSTRACT

Water is a basic and primary resource which is required for sustenance of life on the Earth. The importance of water quality is increasing with rising water pollution owing to industrialization and depletion of fresh water sources. Countries with little control over water pollution are likely to suffer poor public health. Additionally, the methods being used in most developing countries are not effective and are based more on human intervention than on technological and automated solutions. Typically, most of the water samples and related data are monitored and tested in laboratories, which consumes time and effort and produces less reliable results. In view of the above, there is an imperative need to devise a proper and systematic system to regularly monitor and manage the quality of water resources to address the related issues. Towards such ends, the Internet of Things (IoT) is a great alternative to such traditional approaches, which are complex and ineffective, and it allows taking remote measurements in real-time with minimal human involvement. The proposed system consists of various water quality measuring nodes encompassing various sensors including dissolved oxygen, turbidity, pH level, water temperature, and total dissolved solids. These sensor nodes, deployed at various sites of the study area, transmit data to the server for processing and analysis using GSM modules. The data collected over months is used for water quality classification using water quality indices and for bacterial prediction by employing machine learning algorithms. For data visualization, a Web portal is developed which consists of a dashboard of Web services to display the heat maps and other related info-graphics. The real-time water quality data is collected using IoT nodes and the historic data is acquired from the Rawal Lake Filtration Plant. Several machine learning algorithms including neural networks (NN), convolutional neural networks (CNN), ridge regression (RR), support vector machines (SVM), decision tree regression (DTR), Bayesian regression (BR), and an ensemble of all models are trained for fecal coliform bacterial prediction, where SVM and Bayesian regression models have shown the optimal performance with mean squared error (MSE) of 0.35575 and 0.39566 respectively. The proposed system provides an alternative and more convenient solution for bacterial prediction, which otherwise is done manually in labs and is an expensive and time-consuming approach. In addition to this, it offers several other advantages including remote monitoring, ease of scalability, real-time status of water quality, and portable hardware.
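For readers unfamiliar with the reported metric, the following short sketch shows how a mean squared error is computed. The observed and predicted values are hypothetical and are not taken from the study.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical values, e.g. scaled fecal coliform levels.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 5.0]
mse = mean_squared_error(observed, predicted)
```

Lower values indicate predictions closer to the observations; because errors are squared, a single large miss (here the last sample) dominates the result.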


Subject(s)
Internet of Things; Bayes Theorem; Environmental Monitoring; Humans; Machine Learning; Water Quality
6.
Sensors (Basel) ; 21(21)2021 Nov 05.
Article in English | MEDLINE | ID: mdl-34770678

ABSTRACT

With the rise in the employment of deep learning methods in safety-critical scenarios, interpretability is more essential than ever before. Although many different directions regarding interpretability have been explored for visual modalities, time series data has been neglected, with only a handful of methods tested due to their poor intelligibility. We approach the problem of interpretability in a novel way by proposing TSInsight, where we attach an auto-encoder to the classifier with a sparsity-inducing norm on its output and fine-tune it based on the gradients from the classifier and a reconstruction penalty. TSInsight learns to preserve features that are important for prediction by the classifier and suppresses those that are irrelevant, i.e., serves as a feature attribution method to boost the interpretability. In contrast to most other attribution frameworks, TSInsight is capable of generating both instance-based and model-based explanations. We evaluated TSInsight along with nine other commonly used attribution methods on eight different time series datasets to validate its efficacy. The evaluation results show that TSInsight naturally achieves output space contraction; therefore, it is an effective tool for the interpretability of deep time series models.
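The fine-tuning objective described above combines three ingredients: the classifier's loss, a reconstruction penalty, and a sparsity-inducing norm on the auto-encoder output. A minimal sketch of such a combined objective follows; the weighting factors `beta` and `gamma`, the use of an L1 norm and mean squared reconstruction error, and all function names are assumptions rather than the published formulation.

```python
import math


def cross_entropy(probabilities, label):
    """Negative log-likelihood of the true class."""
    return -math.log(probabilities[label])


def combined_objective(class_probs, label, original, reconstructed, beta=1.0, gamma=0.1):
    """Classifier loss + reconstruction penalty + L1 sparsity term on the auto-encoder output."""
    reconstruction = sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)
    sparsity = sum(abs(r) for r in reconstructed)
    return cross_entropy(class_probs, label) + beta * reconstruction + gamma * sparsity


original = [0.0, 1.0, 0.0, 2.0]
reconstructed = [0.0, 1.0, 0.0, 0.0]   # the auto-encoder suppressed the last step
loss = combined_objective([0.5, 0.5], 0, original, reconstructed)
```

The terms pull in different directions: the sparsity term rewards suppressing input steps, while the classifier and reconstruction terms penalize suppressing steps that matter, so what survives can be read as an attribution.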

7.
Sensors (Basel) ; 21(17)2021 Aug 25.
Article in English | MEDLINE | ID: mdl-34502609

ABSTRACT

The emergence of various types of commercial cameras (compact, high resolution, high angle of view, high speed, and high dynamic range, etc.) has contributed significantly to the understanding of human activities. By taking advantage of the characteristic of a high angle of view, this paper demonstrates a system that recognizes micro-behaviors in a small group discussion with a single 360 degree camera towards quantified meeting analysis. We propose a method that recognizes speaking and nodding, which have often been overlooked in existing research, from a video stream of face images using a random forest classifier. The proposed approach was evaluated on our three datasets. In order to create the first and the second datasets, we asked participants to meet physically: 16 sets of five-minute meeting data from 21 unique participants and seven sets of 10-minute meeting data from 12 unique participants. The experimental results showed that our approach could detect speaking and nodding with a macro average f1-score of 67.9% in a 10-fold random split cross-validation and a macro average f1-score of 62.5% in a leave-one-participant-out cross-validation. Considering the increased demand for online meetings due to the COVID-19 pandemic, we also recorded faces on a screen captured by web cameras as the third dataset and discuss the potential and challenges of applying our ideas to virtual video conferences.
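The macro average F1-score reported above is the unweighted mean of per-class F1 scores, so rare behaviours count as much as frequent ones. A minimal sketch of the computation follows; the class labels and predictions are hypothetical and not drawn from the paper's datasets.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical frame-level labels.
truth = ["speak", "speak", "nod", "idle", "idle", "idle"]
preds = ["speak", "nod", "nod", "idle", "idle", "speak"]
score = macro_f1(truth, preds)
```

Here the per-class scores are 0.8 (idle), 2/3 (nod), and 0.5 (speak), and the macro average is their plain mean.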


Subject(s)
Human Activities; Photography; COVID-19; Humans; Pandemics
8.
Int J Mol Sci ; 22(16)2021 Aug 13.
Article in English | MEDLINE | ID: mdl-34445436

ABSTRACT

Apart from protein-coding ribonucleic acids (RNAs), there exists a variety of non-coding RNAs (ncRNAs) which regulate complex cellular and molecular processes. High-throughput sequencing technologies and bioinformatics approaches have largely promoted the exploration of ncRNAs, which revealed their crucial roles in gene regulation, miRNA binding, protein interactions, and splicing. Furthermore, ncRNAs are involved in the development of complicated diseases like cancer. Categorization of ncRNAs is essential to understand the mechanisms of diseases and to develop effective treatments. Sub-cellular localization information of ncRNAs demystifies diverse functionalities of ncRNAs. To date, several computational methodologies have been proposed to precisely identify the class as well as sub-cellular localization patterns of RNAs. This paper discusses different types of ncRNAs, reviews computational approaches proposed in the last 10 years to distinguish coding-RNA from ncRNA, to identify sub-types of ncRNAs such as piwi-associated RNA, micro RNA, long ncRNA, and circular RNA, and to determine sub-cellular localization of distinct ncRNAs and RNAs. Furthermore, it summarizes diverse ncRNA classification and sub-cellular localization determination datasets along with benchmark performance to aid the development and evaluation of novel computational methodologies. It identifies research gaps, heterogeneity, and challenges in the development of computational approaches for RNA sequence analysis. We consider that our expert analysis will assist Artificial Intelligence researchers with knowing state-of-the-art performance, model selection for various tasks on one platform, dominantly used sequence descriptors, neural architectures, and interpreting inter-species and intra-species performance deviation.


Subject(s)
Computational Biology/methods; RNA, Untranslated/classification; RNA, Untranslated/metabolism; Animals; Artificial Intelligence; Databases, Factual; High-Throughput Nucleotide Sequencing; Humans; RNA, Untranslated/genetics; Sequence Analysis, RNA; Tissue Distribution
9.
Sensors (Basel) ; 20(9)2020 Apr 29.
Article in English | MEDLINE | ID: mdl-32365724

ABSTRACT

Mind wandering is a drift of attention away from the physical world and towards our thoughts and concerns. Mind wandering affects our cognitive state in ways that can foster creativity but hinder productivity. In the context of learning, mind wandering is primarily associated with lower performance. This study has two goals. First, we investigate the effects of text semantics and music on the frequency and type of mind wandering. Second, using eye-tracking and electrodermal features, we propose a novel technique for automatic, user-independent detection of mind wandering. We find that mind wandering was most frequent in texts for which readers had high expertise and that were combined with sad music. Furthermore, a significant increase in task-related thoughts was observed for texts for which readers had little prior knowledge. A Random Forest classification model yielded an F1-score of 0.78 when using only electrodermal features to detect mind wandering, of 0.80 when using only eye-movement features, and of 0.83 when using both. Our findings pave the way for building applications which automatically detect events of mind wandering during reading.


Subject(s)
Attention; Biosensing Techniques; Eye Movements; Eye-Tracking Technology; Female; Humans; Male; Reading
10.
BMC Med Inform Decis Mak ; 19(1): 136, 2019 07 17.
Article in English | MEDLINE | ID: mdl-31315618

ABSTRACT

BACKGROUND: With the advancement of powerful image processing and machine learning techniques, Computer Aided Diagnosis has become ever more prevalent in all fields of medicine including ophthalmology. These methods continue to provide reliable and standardized large scale screening of various image modalities to assist clinicians in identifying diseases. Since the optic disc is the most important part of a retinal fundus image for glaucoma detection, this paper proposes a two-stage framework that first detects and localizes the optic disc and then classifies it as healthy or glaucomatous. METHODS: The first stage is based on Regions with Convolutional Neural Network (RCNN) and is responsible for localizing and extracting the optic disc from a retinal fundus image, while the second stage uses a Deep Convolutional Neural Network to classify the extracted disc as healthy or glaucomatous. Unfortunately, none of the publicly available retinal fundus image datasets provides the bounding box ground truth required for disc localization. Therefore, in addition to the proposed solution, we also developed a rule-based semi-automatic ground truth generation method that provides the necessary annotations for training an RCNN based model for automated disc localization. RESULTS: The proposed method is evaluated on seven publicly available datasets for disc localization and on the ORIGA dataset, which is the largest publicly available dataset with healthy and glaucoma labels, for glaucoma classification. The results of automatic localization mark a new state-of-the-art on six datasets, with accuracy reaching 100% on four of them. For glaucoma classification we achieved an Area Under the Receiver Operating Characteristic Curve equal to 0.874, which is a 2.7% relative improvement over the state-of-the-art results previously obtained for classification on the ORIGA dataset. CONCLUSION: Once trained on carefully annotated data, Deep Learning based methods for optic disc detection and localization are not only robust, accurate and fully automated but also eliminate the need for dataset-dependent heuristic algorithms. Our empirical evaluation of glaucoma classification on ORIGA reveals that reporting only Area Under the Curve, for datasets with class imbalance and without pre-defined train and test splits, does not portray a true picture of the classifier's performance and calls for additional performance metrics to substantiate the results.
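The concluding point, that Area Under the Curve alone can hide weaknesses on imbalanced data, can be illustrated with a small sketch. The scores, labels, and the 0.5 decision threshold below are hypothetical and are not taken from ORIGA; the AUC is computed with the standard pairwise-ranking definition.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity and specificity at a fixed decision threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical imbalanced sample: 8 healthy (0), 2 glaucomatous (1).
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.4, 0.6, 0.45, 0.7]
auc = roc_auc(labels, scores)
sensitivity, specificity = sensitivity_specificity(labels, scores)
```

In this toy sample the AUC is high, yet at the chosen threshold only half of the positive cases are detected, which is the kind of gap that additional metrics would expose.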


Subject(s)
Deep Learning; Diagnosis, Computer-Assisted/methods; Fundus Oculi; Glaucoma/diagnostic imaging; Image Interpretation, Computer-Assisted/methods; Optic Disk/diagnostic imaging; Humans
12.
Sensors (Basel) ; 19(11)2019 May 29.
Article in English | MEDLINE | ID: mdl-31146357

ABSTRACT

The need for robust unsupervised anomaly detection in streaming data is increasing rapidly in the current era of smart devices, where enormous data are gathered from numerous sensors. These sensors record the internal state of a machine, the external environment, and the interaction of machines with other machines and humans. It is of prime importance to leverage this information in order to minimize downtime of machines, or even avoid downtime completely by constant monitoring. Since each device generates a different type of streaming data, it is normally the case that a specific kind of anomaly detection technique performs better than the others depending on the data type. For some types of data and use-cases, statistical anomaly detection techniques work better, whereas for others, deep learning-based techniques are preferred. In this paper, we present a novel anomaly detection technique, FuseAD, which takes advantage of both statistical and deep-learning-based approaches by fusing them together in a residual fashion. The obtained results show an increase in area under the curve (AUC) as compared to state-of-the-art anomaly detection methods when FuseAD is tested on a publicly available dataset (Yahoo Webscope benchmark). The obtained results advocate that this fusion-based technique can obtain the best of both worlds by combining their strengths and complementing their weaknesses. We also perform an ablation study to quantify the contribution of the individual components in FuseAD, i.e., the statistical ARIMA model as well as the deep-learning-based convolutional neural network (CNN) model.
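The residual fusion idea can be sketched as follows: a statistical forecast is corrected by a learned residual term, and the absolute error of the fused forecast serves as the anomaly score. This is only a schematic under strong assumptions. A moving average stands in for ARIMA, a hand-written rule stands in for the CNN branch, and all function names are hypothetical.

```python
def moving_average_forecast(history, window=3):
    """Stand-in for the statistical model (the paper uses ARIMA)."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def learned_correction(history):
    """Stand-in for the CNN branch: here, half of the last observed step change."""
    return 0.5 * (history[-1] - history[-2])


def anomaly_scores(series, window=3):
    """Score each point by the absolute error of the fused (statistical + residual) forecast."""
    scores = []
    for t in range(window, len(series)):
        history = series[:t]
        forecast = moving_average_forecast(history, window) + learned_correction(history)
        scores.append(abs(series[t] - forecast))
    return scores


series = [1.0, 1.0, 1.0, 1.0, 1.0, 5.0, 1.0, 1.0]
scores = anomaly_scores(series)
# Scores start at index `window` (3), so shift back to series coordinates.
spike_index = 3 + max(range(len(scores)), key=lambda i: scores[i])
```

In a real system both branches would be fitted to data and a threshold on the score would decide which points are flagged.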

13.
Sensors (Basel) ; 18(8)2018 Aug 01.
Article in English | MEDLINE | ID: mdl-30071586

ABSTRACT

This paper presents a simple yet effective method for improving the performance of zero-shot learning (ZSL). ZSL classifies instances of unseen classes, for which no training data is available, by utilizing the attributes of the classes. Conventional ZSL methods treat all available attributes equally, but this sometimes causes misclassification, because an attribute that is effective for classifying instances of one class is not always effective for another class. In this case, the metric for classifying the latter class can be undesirably influenced by the irrelevant attribute. This paper solves this problem by taking the importance of each attribute for each class into account when calculating the metric. In addition to proposing this new method, this paper contributes a dataset for pose classification based on wearable sensors, named HDPoseDS. It contains 22 classes of poses performed by 10 subjects with 31 IMU sensors across the full body. To the best of our knowledge, it is the richest wearable-sensor dataset, especially in terms of sensor density, and thus it is suitable for studying zero-shot pose/action recognition. The presented method was evaluated on HDPoseDS and achieved a relative improvement of 5.9% over the best baseline method.
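A minimal sketch of class-specific attribute weighting follows. The abstract does not state the actual metric, so the weighted squared distance, the attribute names, the prototype values, and the importance weights are all illustrative assumptions.

```python
def weighted_distance(features, prototype, weights):
    """Squared distance in attribute space, scaled per attribute by class-specific importance."""
    return sum(w * (f - p) ** 2 for f, p, w in zip(features, prototype, weights))


def classify(features, prototypes, importance):
    """Pick the class whose attribute prototype is closest under its own weights."""
    return min(prototypes, key=lambda c: weighted_distance(features, prototypes[c], importance[c]))


# Hypothetical attributes: [arm raised, knee bent, torso leaning]
prototypes = {"squat": [0.0, 1.0, 0.0], "wave": [1.0, 0.0, 0.0]}
uniform = {"squat": [1.0, 1.0, 1.0], "wave": [1.0, 1.0, 1.0]}
importance = {"squat": [0.0, 1.0, 1.0], "wave": [1.0, 1.0, 0.0]}

# Predicted attributes for a squat performed with an incidentally raised arm.
sample = [1.0, 0.9, 0.0]
equal_weight_label = classify(sample, prototypes, uniform)
weighted_label = classify(sample, prototypes, importance)
```

With equal weights the irrelevant arm attribute pushes the sample towards "wave"; once the arm is given zero importance for "squat", the knee attribute decides and the sample is assigned to "squat".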


Subject(s)
Machine Learning; Posture/physiology; Wearable Electronic Devices; Datasets as Topic; Humans
14.
Brief Funct Genomics ; 23(2): 163-179, 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-37248673

ABSTRACT

Post-translational modifications (PTMs) either enhance a protein's activity in various sub-cellular processes, or degrade its activity, which leads toward failure of intracellular processes. Tyrosine nitration (NT) modification degrades a protein's activity, which initiates and propagates various diseases including neurodegenerative, cardiovascular, and autoimmune diseases and carcinogenesis. Identification of NT modification supports development of novel therapies and drug discoveries for associated diseases. Identification of NT modification in biochemical labs is expensive, time consuming and error-prone. To supplement this process, several computational approaches have been proposed. However, these approaches fail to precisely identify NT modification, due to the extraction of irrelevant, redundant and less discriminative features from protein sequences. This paper presents the NTpred framework that is competent in extracting comprehensive features from raw protein sequences using four different sequence encoders. To reap the benefits of different encoders, it generates four additional feature spaces by fusing different combinations of individual encodings. Furthermore, it eradicates irrelevant and redundant features from eight different feature spaces through a Recursive Feature Elimination process. Selected features of four individual encodings and four feature fusion vectors are used to train eight different Gradient Boosted Tree classifiers. The probability scores from the trained classifiers are utilized to generate a new probabilistic feature space, which is used to train a Logistic Regression classifier. On the BD1 benchmark dataset, the proposed framework outperforms the existing best-performing predictor in 5-fold cross validation and independent test evaluation with combined improvement of 13.7% in MCC and 20.1% in AUC. Similarly, on the BD2 benchmark dataset, the proposed framework outperforms the existing best-performing predictor with combined improvement of 5.3% in MCC and 1.0% in AUC. NTpred is publicly available for further experimentation and predictive use at: https://sds_genetic_analysis.opendfki.de/PredNTS/.
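Since the improvements above are reported in terms of the Matthews correlation coefficient (MCC), a short sketch of how it is computed from a binary confusion matrix may be useful. The counts are hypothetical, and returning 0 for a degenerate confusion matrix is a common convention assumed here.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from confusion-matrix counts; returns 0 when any marginal total is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts for a nitration-site predictor.
mcc = matthews_corrcoef(tp=40, tn=45, fp=5, fn=10)
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, which makes it more informative than accuracy when positive and negative sites are unevenly represented.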


Subject(s)
Computational Biology; Proteins; Proteins/metabolism; Amino Acid Sequence; Machine Learning; Tyrosine
15.
Front Artif Intell ; 7: 1236947, 2024.
Article in English | MEDLINE | ID: mdl-39021435

ABSTRACT

Since the advent of deep learning (DL), the field has witnessed a continuous stream of innovations. However, the translation of these advancements into practical applications has not kept pace, particularly in safety-critical domains where artificial intelligence (AI) must meet stringent regulatory and ethical standards. This is underscored by the ongoing research in eXplainable AI (XAI) and privacy-preserving machine learning (PPML), which seek to address some limitations associated with these opaque and data-intensive models. Despite brisk research activity in both fields, little attention has been paid to their interaction. This work is the first to thoroughly investigate the effects of privacy-preserving techniques on explanations generated by common XAI methods for DL models. A detailed experimental analysis is conducted to quantify the impact of private training on the explanations provided by DL models, applied to six image datasets and five time series datasets across various domains. The analysis comprises three privacy techniques, nine XAI methods, and seven model architectures. The findings suggest non-negligible changes in explanations through the implementation of privacy measures. Apart from reporting individual effects of PPML on XAI, the paper gives clear recommendations for the choice of techniques in real applications. By unveiling the interdependencies of these pivotal technologies, this research marks an initial step toward resolving the challenges that hinder the deployment of AI in safety-critical settings.

16.
Front Artif Intell ; 7: 1428501, 2024.
Article in English | MEDLINE | ID: mdl-39021434

ABSTRACT

Survival prediction integrates patient-specific molecular information and clinical signatures to forecast the anticipated time of an event, such as recurrence, death, or disease progression. Survival prediction proves valuable in guiding treatment decisions, optimizing resource allocation, and planning precision medicine interventions. The wide range of diseases, the existence of various variants within the same disease, and the reliance on available data necessitate disease-specific computational survival predictors. The widespread adoption of artificial intelligence (AI) methods in crafting survival predictors has undoubtedly revolutionized this field. However, the ever-increasing demand for more sophisticated and effective prediction models necessitates continued innovation. To catalyze these advancements, it is crucial to bring knowledge and insights about existing survival predictors into a centralized platform. The paper in hand thoroughly examines 23 existing review studies and provides a concise overview of their scope and limitations. Focusing on a comprehensive set of the 90 most recent survival predictors across 44 diverse diseases, it delves into the diverse types of methods that are used in the development of disease-specific predictors. This exhaustive analysis encompasses the utilized data modalities along with a detailed analysis of subsets of clinical features, feature engineering methods, and the specific statistical, machine or deep learning approaches that have been employed. It also provides insights about survival prediction data sources, open-source predictors, and survival prediction frameworks.

17.
Sci Rep ; 14(1): 9466, 2024 04 24.
Article in English | MEDLINE | ID: mdl-38658614

ABSTRACT

Long extrachromosomal circular DNA (leccDNA) regulates several biological processes such as genomic instability, gene amplification, and oncogenesis. The identification of leccDNA holds significant importance to investigate its potential associations with cancer, autoimmune, cardiovascular, and neurological diseases. In addition, understanding these associations can provide valuable insights about disease mechanisms and potential therapeutic approaches. Conventionally, wet lab-based methods are utilized to identify leccDNA, which are hindered by the need for prior knowledge and by resource-intensive processes, potentially limiting their broader applicability. To empower the process of leccDNA identification across multiple species, the paper in hand presents the very first computational predictor. The proposed iLEC-DNA predictor makes use of an SVM classifier along with sequence-derived nucleotide distribution patterns and physicochemical properties-based features. In addition, the study introduces a set of 12 benchmark leccDNA datasets related to three species, namely Homo sapiens (HM), Arabidopsis thaliana (AT), and Saccharomyces cerevisiae (SC/YS). It performs large-scale experimentation across 12 benchmark datasets under different experimental settings using the proposed predictor, more than 140 baseline predictors, and 858 encoder ensembles. The proposed predictor outperforms baseline predictors and encoder ensembles across diverse leccDNA datasets by producing average performance values of 81.09%, 62.2% and 81.08% in terms of ACC, MCC and AUC-ROC across all the datasets. The source code of the proposed and baseline predictors is available at https://github.com/FAhtisham/Extrachrosmosomal-DNA-Prediction. To facilitate the scientific community, a web application for leccDNA identification is available at https://sds_genetic_analysis.opendfki.de/iLEC_DNA/.


Subject(s)
DNA, Circular; Saccharomyces cerevisiae; DNA, Circular/genetics; Humans; Saccharomyces cerevisiae/genetics; Arabidopsis/genetics; Computational Biology/methods; Nucleotides/genetics; Support Vector Machine
18.
Comput Biol Med ; 176: 108538, 2024 Jun.
Artículo en Inglés | MEDLINE | ID: mdl-38759585

RESUMEN

The key properties of anticancer peptides (ACPs), including bioactivity, high efficacy, low toxicity, and lack of drug resistance, make them ideal candidates for cancer therapies. To explore the potential of ACPs and accelerate the development of cancer therapies, 53 artificial intelligence-supported computational predictors have been developed for ACP and non-ACP classification, but only one predictor has been developed for annotating ACP functional types. Moreover, these predictors extract amino acid distribution patterns to transform peptide sequences into statistical vectors that are then fed to classifiers for discriminating peptide sequences and annotating peptide functional classes. Overall, these predictors fail to extract diverse types of amino acid distribution patterns from peptide sequences. The paper in hand presents a unique CARE encoder that transforms peptide sequences into statistical vectors by extracting 4 different types of distribution patterns: correlation, distribution, composition, and transition. On public benchmark data, the potential of the proposed encoder is explored under two evaluation settings, namely intrinsic and extrinsic. Extrinsic evaluation indicates that 12 different machine learning classifiers achieve superior performance with the proposed encoder compared to 55 existing encoders. Furthermore, intrinsic evaluation reveals that, unlike existing encoders, the proposed encoder generates more discriminative clusters for the ACP and non-ACP classes. Across 8 public benchmark ACP and non-ACP classification datasets, the CAPTURE predictor, which combines the proposed encoder with an AdaBoost classifier, outperforms existing predictors by an average margin of 1%, 4%, and 2% in accuracy, recall and MCC, respectively. In a generalizability case study across 7 benchmark antimicrobial peptide classification datasets, CAPTURE surpasses existing predictors by an average AU-ROC of 2%. The CAPTURE predictive pipeline combined with the label powerset method outperforms the state-of-the-art ACP functional type predictor by 5%, 5%, 5%, 6%, and 3% in terms of average accuracy, subset accuracy, precision, recall, and F1, respectively. The CAPTURE web application is available at https://sds_genetic_analysis.opendfki.de/CAPTURE.
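
The abstract above mentions the label powerset method for functional type annotation. As a rough illustration of the general technique, the sketch below maps each distinct combination of labels to one class id so that a multi-label problem can be handled by an ordinary multi-class classifier. The function names and the example label names are hypothetical, and this is not the CAPTURE implementation.

```python
def label_powerset_fit(label_sets):
    """Assign one integer class id to every distinct label combination seen in training."""
    mapping = {}
    for labels in label_sets:
        key = frozenset(labels)
        if key not in mapping:
            mapping[key] = len(mapping)
    return mapping


def label_powerset_transform(label_sets, mapping):
    """Convert each label set into its single class id (raises KeyError if unseen)."""
    return [mapping[frozenset(labels)] for labels in label_sets]


def label_powerset_inverse(class_ids, mapping):
    """Convert predicted class ids back into label sets."""
    inverse = {class_id: key for key, class_id in mapping.items()}
    return [set(inverse[class_id]) for class_id in class_ids]


if __name__ == "__main__":
    # Hypothetical functional-type annotations for three peptides.
    annotations = [{"breast", "lung"}, {"colon"}, {"lung", "breast"}]
    mapping = label_powerset_fit(annotations)
    class_ids = label_powerset_transform(annotations, mapping)
    print(class_ids)
    print(label_powerset_inverse(class_ids, mapping))
```

A known trade-off of this transformation is that label combinations absent from the training data cannot be predicted, which is why the transform step here fails loudly on unseen sets.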


Subject(s)
Antineoplastic Agents , Peptides , Humans , Antineoplastic Agents/therapeutic use , Antineoplastic Agents/chemistry , Peptides/chemistry , Machine Learning , Amino Acid Sequence , Computational Biology/methods , Neoplasms/drug therapy , Sequence Analysis, Protein/methods , Databases, Protein
19.
Cancer Med ; 13(12): e7398, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38923826

ABSTRACT

Artificial intelligence (AI) promises to be the next revolutionary step in modern society. Yet, its role in all fields of industry and science needs to be determined. One very promising field is AI-based decision-making tools in clinical oncology, which lead to more comprehensive, personalized therapy approaches. In this review, the authors provide an overview of all relevant technical applications of AI in oncology, which are required to understand the future challenges and realistic perspectives for decision-making tools. In recent years, various applications of AI in medicine have been developed, focusing on the analysis of radiological and pathological images. AI applications encompass large amounts of complex data, supporting clinical decision-making and reducing errors by objectively quantifying all aspects of the data collected. In clinical oncology, almost all patients receive a treatment recommendation in a multidisciplinary cancer conference at the beginning of and during their treatment periods. These highly complex decisions are based on a large amount of information (about the patients and the various treatment options), which needs to be analyzed and correctly classified in a short time. In this review, the authors describe the technical and medical requirements of AI to address these scientific challenges in a multidisciplinary manner. Major challenges in the use of AI in oncology and decision-making tools are data security, data representation, and explainability of AI-based outcome predictions, in particular for decision-making processes in multidisciplinary cancer conferences. Finally, limitations and potential solutions are described and compared for current and future research attempts.


Subject(s)
Artificial Intelligence , Clinical Decision-Making , Medical Oncology , Neoplasms , Humans , Medical Oncology/methods , Neoplasms/therapy , Precision Medicine/methods , Decision Support Systems, Clinical
20.
Front Bioinform ; 3: 1194993, 2023.
Article in English | MEDLINE | ID: mdl-37484865

ABSTRACT

Artificial Intelligence (AI) has achieved remarkable success in image generation, image analysis, and language modeling, making data-driven techniques increasingly relevant in practical real-world applications and promising enhanced creativity and efficiency for human users. However, the deployment of AI in high-stakes domains such as infrastructure and healthcare still raises concerns regarding algorithm accountability and safety. The emerging field of explainable AI (XAI) has made significant strides in developing interfaces that enable humans to comprehend the decisions made by data-driven models. Among these approaches, concept-based explainability stands out due to its ability to align explanations with high-level concepts familiar to users. Nonetheless, early research in adversarial machine learning has revealed that exposing model explanations can render victim models more susceptible to attacks. This is the first study to investigate and compare the impact of concept-based explanations on the privacy of deep learning-based AI models in the context of biomedical image analysis. An extensive privacy benchmark is conducted on three different state-of-the-art model architectures (ResNet50, NFNet, ConvNeXt) trained on two biomedical datasets (ISIC and EyePACS) and one synthetic dataset (SCDB). The success of membership inference attacks while exposing varying degrees of attribution-based and concept-based explanations is systematically compared. The findings indicate that, in theory, concept-based explanations can potentially increase the vulnerability of a private AI system by up to 16% compared to attributions in the baseline setting. However, it is demonstrated that, in more realistic attack scenarios, the threat posed by explanations is negligible in practice. Furthermore, actionable recommendations are provided to ensure the safe deployment of concept-based XAI systems. In addition, the impact of differential privacy (DP) on the quality of concept-based explanations is explored, revealing that while negatively influencing the explanation ability, DP can have an adverse effect on the models' privacy.
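
The abstract above compares the success of membership inference attacks. As a minimal sketch of the general idea only, the example below implements a simple confidence-threshold attack: the attacker guesses that a sample was in the training set when the model's confidence on it is high. The threshold value, the function names, and the confidence scores are assumptions made for illustration; the study's actual attacks, which also exploit explanations, are not reproduced here.

```python
def threshold_attack(confidences, threshold):
    """Guess 'member' (True) whenever the model confidence reaches the threshold."""
    return [conf >= threshold for conf in confidences]


def attack_accuracy(member_conf, nonmember_conf, threshold):
    """Fraction of correct membership guesses over known members and non-members."""
    member_guesses = threshold_attack(member_conf, threshold)
    nonmember_guesses = threshold_attack(nonmember_conf, threshold)
    correct = sum(member_guesses) + sum(not guess for guess in nonmember_guesses)
    return correct / (len(member_conf) + len(nonmember_conf))


if __name__ == "__main__":
    # Made-up confidence scores, purely for illustration.
    train_conf = [0.99, 0.95, 0.70]   # samples the model was trained on
    unseen_conf = [0.60, 0.90, 0.50]  # samples the model never saw
    print(attack_accuracy(train_conf, unseen_conf, threshold=0.8))
```

An attack accuracy near 0.5 on balanced member and non-member sets means the attacker does no better than chance, which is the sense in which a threat can be described as negligible.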
