Results 1 - 16 of 16
1.
J Arthroplasty ; 39(3): 677-682, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37770008

ABSTRACT

BACKGROUND: Patient-reported outcome measures (PROMs) are an important metric for assessing total knee arthroplasty (TKA) patients. The purpose of this study was to use a machine learning (ML) algorithm to identify patient features that impact PROMs after TKA. METHODS: Data from 636 TKA patients enrolled in our patient database between 2018 and 2022 were retrospectively reviewed. Their mean age was 68 years (range, 39 to 92), 56.7% were women, and their mean body mass index was 31.17 (range, 16 to 58). Patient demographics and the Functional Comorbidity Index were collected alongside Patient-Reported Outcomes Measurement Information System Global Health v1.2 (PROMIS GH-P) physical component scores preoperatively, at 3 months, and at 1 year after TKA. An unsupervised ML algorithm (spectral clustering) was used to identify patient features impacting PROMIS GH-P scores at the various time points. RESULTS: The algorithm identified 5 patient clusters that varied by demographics, comorbidities, and pain scores. Each cluster was associated with predictable trends in PROMIS GH-P scores across the time points. Notably, patients who had the worst preoperative PROMIS GH-P scores (cluster 5) had the most improvement after TKA, whereas patients who had higher preoperative global health ratings had more modest improvement (clusters 1, 2, and 3). Two of the five patient clusters (clusters 4 and 5) showed improvement in PROMIS GH-P scores that met the minimal clinically important difference at 1 year postoperatively. CONCLUSIONS: The unsupervised ML algorithm identified patient clusters that had predictable changes in PROMs after TKA. It is a positive step toward providing precision medical care for each of our arthroplasty patients.


Subject(s)
Knee Replacement Arthroplasty , Knee Osteoarthritis , Humans , Female , Aged , Male , Knee Joint/surgery , Retrospective Studies , Unsupervised Machine Learning , Quality of Life , Treatment Outcome , Patient Reported Outcome Measures , Knee Osteoarthritis/surgery
2.
Expert Syst Appl ; 204: 117553, 2022 Oct 15.
Article in English | MEDLINE | ID: mdl-35611122

ABSTRACT

Rapid technological advances have led more people to move from traditional ways of doing business to those that rely more heavily on electronic resources. This transition has attracted (and continues to attract) the attention of cybercriminals, referred to in this article as "attackers", who exploit the structure of the Internet to commit cybercrimes such as phishing, tricking users into revealing sensitive data (personal information, banking and credit card details, IDs, passwords, and other important information) via replicas of the legitimate websites of trusted organizations. In our digital society, the COVID-19 pandemic represents an unprecedented situation. Many individuals were left vulnerable to cyberattacks while attempting to gather credible information about this alarming situation, and attacks exploiting the pandemic increased dramatically. Regrettably, cyberattacks do not appear to be abating. For this reason, cyber-security corporations and researchers must constantly develop effective and innovative solutions to tackle this growing issue. Although several anti-phishing approaches are already in use, such as blacklists, visuals, heuristics, and other protective solutions, they cannot efficiently prevent imminent phishing attacks. In this paper, we propose machine learning models that use a limited number of features to classify COVID-19-related domain names as either malicious or legitimate. Our primary results show that a small set of carefully extracted lexical features from domain names allows models to achieve high scores; additionally, the number of subdomain levels, used as a feature, can have a large influence on the predictions.
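
Illustrative sketch (not taken from the article): the kind of lexical features described above, including the number of subdomain levels, might be extracted roughly as follows. The function name `lexical_features`, the chosen feature set, and the simplification that the last two labels form the registered domain are all assumptions made for this example.

```python
import math
from collections import Counter


def lexical_features(domain):
    """Compute a few simple lexical features of a domain name.

    Assumption: the last two labels are the registered domain
    (e.g. "example.com"); production code would consult the
    Public Suffix List instead.
    """
    name = domain.strip().lower().rstrip(".")
    labels = name.split(".")
    # Labels to the left of the assumed registered domain count as subdomain levels.
    subdomain_levels = max(len(labels) - 2, 0)

    # Shannon entropy (in bits) of the character distribution.
    total = len(name)
    entropy = 0.0
    for count in Counter(name).values():
        p = count / total
        entropy -= p * math.log2(p)

    return {
        "length": total,
        "subdomain_levels": subdomain_levels,
        "digit_count": sum(ch.isdigit() for ch in name),
        "hyphen_count": name.count("-"),
        "entropy": entropy,
    }
```

A feature dictionary like this could then be passed to any standard classifier; which features the authors actually used is not stated in the abstract.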

3.
Comput Struct Biotechnol J ; 23: 1641-1653, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38680869

ABSTRACT

Protein generation has numerous applications in designing therapeutic antibodies and creating new drugs. Still, it is a demanding task due to the inherent complexities of protein structures and the limitations of current generative models. Proteins possess intricate geometry, and sampling their conformational space is challenging due to its high dimensionality. This paper introduces novel Markovian and non-Markovian generative diffusion models based on fractional stochastic differential equations and the Lévy distribution, allowing for a more effective exploration of the conformational space. The approach is applied to a dataset of 40,000 proteins and evaluated in terms of Fréchet distance, fidelity, and diversity, outperforming the state-of-the-art by 25.4%, 35.8%, and 11.8%, respectively.
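
Illustrative sketch (not taken from the article): the heavy-tailed steps that a Lévy distribution contributes to a random walk can be seen in a toy one-dimensional example. It uses the standard identity that if Z is standard normal, then c / Z² follows a Lévy distribution with location 0 and scale c. The function names and the symmetric sign choice are assumptions for this example; the paper's fractional diffusion models are far more elaborate.

```python
import random


def sample_levy(scale, rng):
    """Draw one sample from the Lévy distribution with location 0.

    Identity used: if Z ~ N(0, 1), then scale / Z**2 ~ Levy(0, scale).
    """
    z_squared = 0.0
    while z_squared == 0.0:  # guard against a (vanishingly rare) zero draw
        z = rng.gauss(0.0, 1.0)
        z_squared = z * z
    return scale / z_squared


def levy_walk(n_steps, scale=1.0, seed=0):
    """1-D random walk with Lévy-distributed step lengths and random sign."""
    rng = random.Random(seed)
    position = 0.0
    path = [position]
    for _ in range(n_steps):
        step = sample_levy(scale, rng) * rng.choice((-1.0, 1.0))
        position += step
        path.append(position)
    return path
```

Occasional very long jumps let such a walk reach distant regions that a Gaussian walk would visit only rarely, which is the intuition behind using heavy-tailed noise to explore a high-dimensional space.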

4.
Top Spinal Cord Inj Rehabil ; 30(1): 1-44, 2024.
Article in English | MEDLINE | ID: mdl-38433735

ABSTRACT

Background: Traumatic spinal cord injuries (TSCI) greatly affect the lives of patients and their families. Prognostication may improve treatment strategies, health care resource allocation, and counseling. Multivariable clinical prediction models (CPMs) for prognosis are tools that can estimate an absolute risk or probability that an outcome will occur. Objectives: We sought to systematically review the existing literature on CPMs for TSCI and critically examine the predictor selection methods used. Methods: We searched MEDLINE, PubMed, Embase, Scopus, and IEEE for English peer-reviewed studies and relevant references that developed multivariable CPMs to prognosticate patient-centered outcomes in adults with TSCI. Using narrative synthesis, we summarized the characteristics of the included studies and their CPMs, focusing on the predictor selection process. Results: We screened 663 titles and abstracts; of these, 21 full-text studies (2009-2020) consisting of 33 distinct CPMs were included. The data analysis domain was most commonly at a high risk of bias when assessed for methodological quality. Model presentation formats were inconsistently included with published CPMs; only two studies followed established guidelines for transparent reporting of multivariable prediction models. Authors frequently cited previous literature for their initial selection of predictors, and stepwise selection was the most frequent predictor selection method during modelling. Conclusion: Prediction modelling studies for TSCI serve clinicians who counsel patients, researchers aiming to risk-stratify participants for clinical trials, and patients coping with their injury. Poor methodological rigor in data analysis, inconsistent transparent reporting, and a lack of model presentation formats are vital areas for improvement in TSCI CPM research.


Subject(s)
Spinal Cord Injuries , Humans , Theoretical Models
5.
Artif Intell Rev ; 56(3): 2057-2109, 2023.
Article in English | MEDLINE | ID: mdl-35791405

ABSTRACT

The widespread usage of machine learning in different mainstream contexts has made deep learning the technique of choice in various domains, including finance. This systematic survey explores various scenarios employing deep learning in financial markets, especially the stock market. A key requirement of our methodology is its focus on research papers involving backtesting. That is, we consider whether the experimentation mode is sufficient for market practitioners to consider the work in a real-world use case. Works meeting this requirement are distributed across seven distinct specializations. Most studies focus on trade strategy, price prediction, and portfolio management, with a limited number considering market simulation, stock selection, hedging strategy, and risk management. We also recognize that domain-specific metrics such as "returns" and "volatility" appear most important for accurately representing model performance across specializations. Our study demonstrates that, although there have been some improvements in reproducibility, substantial work remains to be done regarding model explainability. Accordingly, we suggest several future directions, such as improving trust by creating reproducible, explainable, and accountable models and emphasizing prediction over longer-term horizons (potentially via the utilization of supplementary data), which continues to represent a significant unresolved challenge.
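
Illustrative sketch (not taken from the article): the "returns" and "volatility" metrics mentioned above are commonly computed from a backtested price or equity series along these lines. The function names and the default of 252 trading periods per year are assumptions for this example.

```python
import math
import statistics


def simple_returns(prices):
    """Period-over-period simple returns: p[t] / p[t-1] - 1."""
    return [later / earlier - 1.0 for earlier, later in zip(prices, prices[1:])]


def cumulative_return(prices):
    """Total return over the whole series."""
    return prices[-1] / prices[0] - 1.0


def annualized_volatility(prices, periods_per_year=252):
    """Sample standard deviation of simple returns, scaled to one year.

    Requires at least three prices (two returns).
    """
    returns = simple_returns(prices)
    return statistics.stdev(returns) * math.sqrt(periods_per_year)
```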

6.
IEEE Trans Pattern Anal Mach Intell ; 45(1): 391-407, 2023 Jan.
Article in English | MEDLINE | ID: mdl-35085073

ABSTRACT

The classification of deformable protein shapes, based solely on their macromolecular surfaces, is a challenging problem in protein-protein interaction prediction and protein design. Shape classification is made difficult by the fact that proteins are dynamic, flexible entities with high geometrical complexity. In this paper, we introduce a novel description for such deformable shapes. This description is based on the bifractional Fokker-Planck and Dirac-Kähler equations. These equations analyse and probe protein shapes in terms of scalar, vectorial, and non-commuting quaternionic fields, allowing for a more comprehensive description of the protein shapes. An underlying non-Markovian Lévy random walk establishes geometrical relationships between distant regions while recalling previous analyses. Classification is performed with a multiobjective deep hierarchical pyramidal neural network, thus performing a multilevel analysis of the description. Our approach is applied to the SHREC'19 dataset for deformable protein shape classification and to the SHREC'16 dataset for deformable partial shape classification, demonstrating its effectiveness and generality.


Subject(s)
Algorithms , Deep Learning , Computer Neural Networks
7.
Discov Data ; 1(1): 2, 2023.
Article in English | MEDLINE | ID: mdl-37035459

ABSTRACT

Research into Intrusion and Anomaly Detectors at the Host level typically pays much attention to extracting attributes from system call traces. These include window-based, Hidden Markov Model-based, and sequence-model-based attributes. Recently, several works have focused on sequence-model-based feature extractors, specifically Word2Vec and GloVe, to extract embeddings from system call traces because of their ability to capture semantic relationships among system calls. However, due to the nature of the data, these extractors introduce inconsistencies in the extracted features, causing the Machine Learning models built on them to yield inaccurate and potentially misleading results. In this paper, we first highlight the research challenges posed by these extractors. Then, we conduct experiments with new feature sets, assessing their suitability for addressing the detected issues. Our experiments show that Word2Vec is prone to introducing more duplicated samples than GloVe. Regarding the proposed solutions, we found that concatenating the embedding vectors generated by Word2Vec and GloVe yields the best overall balanced accuracy. In addition to resolving the challenge of data leakage, this approach improves performance relative to the alternatives.
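
Illustrative sketch (not taken from the article): one simple way to detect the duplicated samples mentioned above is to remove exact duplicates before any train/test split, so that identical feature vectors cannot leak across the split, and to flag feature vectors that appear with conflicting labels. The function name and the `(features, label)` input format are assumptions for this example.

```python
def deduplicate(samples):
    """Drop exact duplicate (features, label) pairs, keeping first occurrences.

    Returns a tuple (unique, conflicting):
      unique      - list of (features_tuple, label) without exact repeats
      conflicting - set of feature tuples that occur with more than one label
    """
    seen_pairs = set()
    labels_by_features = {}
    unique = []
    for features, label in samples:
        key = tuple(features)  # lists are unhashable, tuples are not
        labels_by_features.setdefault(key, set()).add(label)
        if (key, label) not in seen_pairs:
            seen_pairs.add((key, label))
            unique.append((key, label))
    conflicting = {
        key for key, labels in labels_by_features.items() if len(labels) > 1
    }
    return unique, conflicting
```

Splitting only after this step keeps a repeated embedding from appearing in both the training and the test set.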

8.
Discov Data ; 1(1): 4, 2023.
Article in English | MEDLINE | ID: mdl-37038388

ABSTRACT

In Machine Learning, the datasets used to build models are one of the main factors limiting what these models can achieve and how good their predictive performance is. Machine Learning applications for cyber-security or computer security are numerous, including cyber threat mitigation and security infrastructure enhancement through pattern recognition, real-time attack detection, and in-depth penetration testing. For these applications in particular, the datasets used to build the models must therefore be carefully designed to be representative of real-world data. However, because of the scarcity of labelled data and the cost of manually labelling positive examples, there is a growing corpus of literature utilizing Semi-Supervised Learning with cyber-security data repositories. In this work, we provide a comprehensive overview of publicly available data repositories and datasets used for building computer security or cyber-security systems based on Semi-Supervised Learning, where only a few labels are necessary or available for building strong models. We highlight the strengths and limitations of the data repositories and datasets and provide an analysis of the performance assessment metrics used to evaluate the built models. Finally, we discuss open challenges and provide future research directions for using cyber-security datasets and evaluating models built upon them.

9.
Comput Struct Biotechnol J ; 21: 1324-1348, 2023.
Article in English | MEDLINE | ID: mdl-36817951

ABSTRACT

Proteins mainly perform their functions by interacting with other proteins. Protein-protein interactions underpin various biological activities such as metabolic cycles, signal transduction, and immune response. However, due to the sheer number of proteins, experimental methods for finding interacting and non-interacting protein pairs are time-consuming and costly. We therefore developed the ProtInteract framework to predict protein-protein interactions. ProtInteract comprises two components. The first is a novel autoencoder architecture that encodes each protein's primary structure into a lower-dimensional vector while preserving its underlying sequence attributes. This leads to faster training of the second component, a deep convolutional neural network (CNN) that receives encoded proteins and predicts their interaction under three different scenarios. In each scenario, the deep CNN predicts the class of a given encoded protein pair. Each class corresponds to a different range of confidence scores, reflecting the probability that a predicted interaction occurs. The proposed framework features low computational complexity and a relatively fast response. The contributions of this work are twofold. First, ProtInteract assimilates the protein's primary structure into a pseudo-time series. We therefore leverage the time-series nature of proteins and their physicochemical properties to encode a protein's amino acid sequence into a lower-dimensional vector space. This approach enables the extraction of highly informative sequence attributes while reducing computational complexity. Second, the ProtInteract framework utilises this information to identify a protein's interactions with other proteins based on its amino acid configuration. Our results suggest that the proposed framework performs with high accuracy and efficiency in predicting protein-protein interactions.

10.
Front Neurol ; 14: 1263291, 2023.
Article in English | MEDLINE | ID: mdl-37900603

ABSTRACT

Background: Conducting clinical trials for traumatic spinal cord injury (tSCI) presents challenges due to patient heterogeneity. Identifying clinically similar subgroups using patient demographics and baseline injury characteristics could lead to better patient-centered care and integrated care delivery. Purpose: We sought (1) to apply an unsupervised machine learning approach of cluster analysis to identify subgroups of tSCI patients using patient demographics and injury characteristics at baseline, (2) to find clinical similarity within subgroups using etiological variables and outcome variables, and (3) to create multi-dimensional labels for categorizing patients. Study design: Retrospective analysis using prospectively collected data from a large national multicenter SCI registry. Methods: Spectral clustering was used to identify patient subgroups based on the following baseline variables collected from admission to rehabilitation: location of the injury, severity of the injury, Functional Independence Measure (FIM) motor score, and demographic data (age and body mass index). The FIM motor score, the FIM motor score change, and the total length of stay were assessed on the subgroups as outcome variables at discharge to establish the clinical similarity of the patients within derived subgroups. Furthermore, we discussed the relevance of the identified subgroups based on the etiological variables (energy and mechanism of injury) and compared them with the literature. Our study also employed a qualitative approach to systematically describe the identified subgroups, crafting multi-dimensional labels to highlight distinguishing factors and patient-focused insights. Results: Data on 334 tSCI patients from the Rick Hansen Spinal Cord Injury Registry were analyzed. Five significantly different subgroups were identified (p-value ≤0.05) based on baseline variables. Outcome variables at discharge superimposed on these subgroups had statistically different values between them (p-value ≤0.05) and supported the notion of clinical similarity of patients within each subgroup. Conclusion: Utilizing cluster analysis, we identified five clinically similar subgroups of tSCI patients at baseline, yielding statistically significant inter-group differences in clinical outcomes. These subgroups offer a novel, data-driven categorization of tSCI patients which aligns with their demographics and injury characteristics. As it also correlates with traditional tSCI classifications, this categorization could lead to improved personalized patient-centered care.
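
Illustrative sketch (not taken from the article): the core idea of spectral clustering can be shown in its simplest two-cluster form, where a similarity graph is split by the sign of the Fiedler vector of its Laplacian. The study reports five subgroups; a k-way version would typically embed the data with several eigenvectors and then run k-means. The function names, the Gaussian affinity, and the power-iteration solver are assumptions chosen to keep the example dependency-free.

```python
import math


def gaussian_affinity(points, sigma=1.0):
    """Pairwise similarity exp(-d^2 / (2 sigma^2)), with zeros on the diagonal."""
    n = len(points)
    return [
        [
            0.0 if i == j
            else math.exp(-math.dist(points[i], points[j]) ** 2 / (2.0 * sigma ** 2))
            for j in range(n)
        ]
        for i in range(n)
    ]


def spectral_bipartition(weights, iterations=500):
    """Split a connected weighted graph in two by the sign of the Fiedler vector.

    The Fiedler vector (eigenvector of the second-smallest eigenvalue of the
    Laplacian L = D - W) is found by power iteration on (shift*I - L), with the
    constant eigenvector projected out at every step.
    """
    n = len(weights)
    degree = [sum(row) for row in weights]
    shift = 2.0 * max(degree)  # Gershgorin bound on the largest eigenvalue of L
    v = [i - (n - 1) / 2.0 for i in range(n)]  # zero-mean deterministic start
    for _ in range(iterations):
        # (shift*I - L) v = (shift - d_i) * v_i + sum_j W_ij * v_j
        w = [
            (shift - degree[i]) * v[i] + sum(weights[i][j] * v[j] for j in range(n))
            for i in range(n)
        ]
        mean = sum(w) / n
        w = [x - mean for x in w]  # remove the constant component
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return [1 if x >= 0.0 else 0 for x in v]
```

In practice each point would be a vector of standardized baseline variables for one patient, and the choice of `sigma` strongly affects which subgroups emerge.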

11.
Comput Struct Biotechnol J ; 20: 5316-5341, 2022.
Article in English | MEDLINE | ID: mdl-36212542

ABSTRACT

Most proteins perform their biological function by interacting with themselves or other molecules. Thus, one may obtain biological insights into protein functions, disease prevalence, and therapy development by identifying protein-protein interactions (PPI). However, finding the interacting and non-interacting protein pairs through experimental approaches is labour-intensive and time-consuming, owing to the variety of proteins. Hence, protein-protein interaction and protein-ligand binding problems have drawn attention in the fields of bioinformatics and computer-aided drug discovery. Deep learning methods have paved the way for scientists to predict the 3-D structure of proteins from genomes, predict the functions and attributes of a protein, and modify and design new proteins with desired functions. This review focuses on recent deep learning methods applied to problems including the prediction of protein functions, protein-protein interactions and their sites, protein-ligand binding, and protein design.

12.
Adv Exp Med Biol ; 680: 447-54, 2010.
Article in English | MEDLINE | ID: mdl-20865529

ABSTRACT

Consider a protein (P(X)) that has been identified, during drug design, to constitute a new breakthrough in the design of a drug for treating a terminal illness. That is, this protein has the ability to dock on active sites and mask the subsequent docking of harmful foreign proteins. Unfortunately, protein X has serious side effects and is therefore not suitable for use in drug design. Suppose another protein (P(Y)) with similar outer structure (or envelope) and functionality, but without these side effects, exists. Locating and using such an alternative protein has obvious benefits. This paper introduces an approach to locate such similar protein envelopes by considering their three-dimensional (3D) shapes. We present a system which indexes and searches a large 3D protein database and illustrate its effectiveness against a very large protein repository.


Subject(s)
Computer-Aided Design/statistics & numerical data , Drug Design , Proteins/chemistry , Proteins/metabolism , Binding Sites , Computational Biology , Computer Simulation , Protein Databases , Molecular Models , Protein Binding , Protein Conformation , Protein Folding , Proteomics
13.
Vaccine ; 33(48): 6930-7, 2015 Nov 27.
Article in English | MEDLINE | ID: mdl-26413882

ABSTRACT

MOTIVATION: The macromolecular surfaces associated with proteins and macromolecules play a key role in determining their functionality and interactions, and are also of importance in structural analysis and classification. As a result of their interaction with their environment, the macromolecular surfaces experience random conformational deformations. Consequently, a realistic description of the molecular surface must be invariant under these deformations. Further, the motion associated with disconnected regions on the molecular surface may be correlated. This property is known as the allosteric effect. In this paper, we address these two requirements. To this end, we propose an approach based on discrete differential geometry and the fractional Fokker-Planck equation which provides an isometrically invariant and allosteric-aware description of macromolecular surfaces. Our method is applied to the influenza neuraminidase.


Subject(s)
Neuraminidase/chemistry , Viral Proteins/chemistry , Allosteric Regulation , Chemical Phenomena , Protein Conformation , Surface Properties
14.
Biomed Res Int ; 2015: 183918, 2015.
Article in English | MEDLINE | ID: mdl-25785262

ABSTRACT

Macromolecular structures, such as neuraminidases, hemagglutinins, and monoclonal antibodies, are not rigid entities. Rather, they are characterised by their flexibility, which is the result of the interaction and collective motion of their constituent atoms. This conformational diversity has a significant impact on their physicochemical and biological properties. Among these are their structural stability, the transport of ions through the M2 channel, drug resistance, macromolecular docking, binding energy, and rational epitope design. To assess these properties and to calculate the associated thermodynamic observables, the conformational space must be efficiently sampled and the dynamics of the constituent atoms must be simulated. This paper presents algorithms and techniques that address these issues. To this end, a computational review of molecular dynamics, Monte Carlo simulations, Langevin dynamics, and free energy calculation is presented. The exposition is made from first principles to promote a better understanding of the potentialities, limitations, applications, and interrelations of these computational methods.
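
Illustrative sketch (not taken from the article): Monte Carlo sampling of a conformational space is often introduced through the Metropolis algorithm, shown here for a single coordinate in a user-supplied energy function. The function name, the uniform proposal, and the one-dimensional setting are assumptions for this example; real molecular simulations operate on many coupled atomic coordinates.

```python
import math
import random


def metropolis(energy, x0, n_steps, step_size=1.0, kT=1.0, seed=0):
    """Sample x from the Boltzmann distribution proportional to exp(-energy(x) / kT).

    Each step proposes a uniform random move and accepts it with
    probability min(1, exp(-(E_new - E_old) / kT)).
    """
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    samples = []
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step_size, step_size)
        e_new = energy(x_new)
        # Downhill moves are always accepted; uphill moves only sometimes.
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / kT):
            x, e = x_new, e_new
        samples.append(x)  # a rejected move repeats the current state
    return samples
```

For a harmonic energy `0.5 * x * x` at `kT = 1`, the sample average of `x * x` should approach 1, which gives a quick sanity check on the sampler.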


Subject(s)
Molecular Computers , Molecular Dynamics Simulation , Molecular Structure , Monte Carlo Method , Humans
15.
Curr Pharm Des ; 19(12): 2183-93, 2013.
Article in English | MEDLINE | ID: mdl-23016846

ABSTRACT

The comparison of macromolecular structures, in terms of functionalities, is a crucial step when aiming to identify potential docking sites. Drug designers require the identification of such docking sites for the binding of two proteins, in order to form a stable complex. This paper presents a review of current approaches to macromolecular structure comparison and docking, following an algorithmic approach. We describe techniques based on the Bayesian framework, kernel-based methods, projection-based techniques and spectral approaches. We introduce the use of quantum particle swarm optimization to help find the most appropriate docking sites. We discuss the importance of the heat and Schrödinger equations in addressing the non-rigid nature of proteins and highlight the strengths and limitations of the various methods.


Subject(s)
Computational Biology , Molecular Models , Multiprotein Complexes/chemistry , Algorithms , Animals , Bayes Theorem , Protein Databases , Humans , Molecular Docking Simulation , Multiprotein Complexes/metabolism , Protein Conformation , Protein Stability , Quantum Theory
16.
Article in English | MEDLINE | ID: mdl-18002179

ABSTRACT

Consider the scenario where, for a prescription drug designed to treat a terminal illness, a particular protein has been successfully identified as a crucial, beneficial component in the drug compound. However, this protein has contraindications and causes severe adverse effects in a certain subset of the population. If another protein from the same family, with similar structure and functionality, but without these adverse effects, can be found, the subsequent modification of the harmful drug has obvious benefits. This paper describes a new indexing and similarity search system to retrieve such protein structure family members, based on their 3D shape. Our approach is translation, scale and rotation invariant, which eliminates the need for prior structure alignment. Our experimental evaluation against seven (7) diverse protein families indicates that our system accurately and precisely locates all members of a family. We further illustrate this by showing that our system precisely retrieves the Homo sapiens hemoglobin family members from a database containing 26,000 protein structures.
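
Illustrative sketch (not taken from the article): a shape signature that is invariant to translation, rotation, and uniform scaling can be built from pairwise distances, which is why no prior alignment is needed before comparison. The sorted, normalized distance list below is an assumption chosen for simplicity and is not the descriptor used by the authors.

```python
import math
from itertools import combinations


def shape_signature(points):
    """Sorted pairwise distances between 3D points, divided by the largest one.

    Distances are unchanged by translation and rotation; dividing by the
    largest distance removes uniform scale. Requires at least two distinct points.
    """
    distances = sorted(math.dist(p, q) for p, q in combinations(points, 2))
    largest = distances[-1]
    return [d / largest for d in distances]
```

Two structures could then be compared by the distance between their signatures, for instance after binning them into fixed-length histograms so that point sets of different sizes are comparable.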


Subject(s)
Chemical Models , Molecular Models , Proteins/chemistry , Proteins/ultrastructure , Protein Sequence Analysis/methods , Amino Acid Sequence , Computer Simulation , Three-Dimensional Imaging/methods , Molecular Sequence Data , Protein Conformation , Sequence Alignment/methods