Results 1 - 20 of 74
1.
Int J Mol Sci ; 21(3)2020 Feb 05.
Article in English | MEDLINE | ID: mdl-32033398

ABSTRACT

Osteosarcoma is the most common subtype of primary bone cancer, affecting mostly adolescents. In recent years, several studies have focused on elucidating the molecular mechanisms of this sarcoma; however, its molecular etiology has still not been determined with precision. Therefore, we applied a consensus strategy with the use of several bioinformatics tools to prioritize genes involved in its pathogenesis. Subsequently, we assessed the physical interactions of the previously selected genes and applied a communality analysis to this protein-protein interaction network. The consensus strategy prioritized a total list of 553 genes. Our enrichment analysis validates several studies that describe the signaling pathways PI3K/AKT and MAPK/ERK as pathogenic. The gene ontology described TP53 as a principal signal transducer that chiefly mediates processes associated with cell cycle and DNA damage response. It is interesting to note that the communality analysis clusters several members involved in metastasis events, such as MMP2 and MMP9, and genes associated with DNA repair complexes, like ATM, ATR, CHEK1, and RAD51. In this study, we have identified well-known pathogenic genes for osteosarcoma and prioritized genes that need to be further explored.


Subjects
Bone Neoplasms/genetics , Bone Neoplasms/pathology , Osteosarcoma/genetics , Osteosarcoma/pathology , Computational Biology/methods , Consensus , DNA Repair/genetics , Gene Expression Regulation, Neoplastic/genetics , Gene Ontology , Gene Regulatory Networks/genetics , Humans , Protein Interaction Maps/genetics , Signal Transduction/genetics
2.
Int J Mol Sci ; 20(18)2019 Sep 05.
Article in English | MEDLINE | ID: mdl-31491969

ABSTRACT

In this work, we improved a previous model used for the prediction of proteomes as new B-cell epitopes in vaccine design. The predicted epitope activity of a queried peptide is based on its sequence, a known reference epitope sequence under specific experimental conditions. The peptide sequences were transformed into molecular descriptors of sequence recurrence networks and were mixed under experimental conditions. The new models were generated using 709,100 instances of pair descriptors for query and reference peptide sequences. Using perturbations of the initial descriptors under sequence or assay conditions, 10 transformed features were used as inputs for seven Machine Learning methods. The best model was obtained with random forest classifiers with an Area Under the Receiver Operating Characteristics (AUROC) of 0.981 ± 0.0005 for the external validation series (five-fold cross-validation). The database included information about 83,683 peptide sequences, 1448 epitope organisms, 323 host organisms, 15 types of in vivo processes, 28 experimental techniques, and 505 adjuvant additives. The current model could improve the in silico predictions of epitopes for vaccine design. The script and results are available as a free repository.
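
As an illustration of the AUROC metric reported in this abstract, the sketch below computes it as the fraction of positive/negative pairs that a classifier ranks correctly. The function name `auroc` and the toy scores are assumptions made for illustration only; this is not the authors' script.

```python
def auroc(scores, labels):
    """AUROC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative counts as half a correct ranking.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Hypothetical classifier scores and true labels (1 = epitope, 0 = non-epitope).
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
print(auroc(toy_scores, toy_labels))
```

The pairwise form is quadratic in the number of samples, so it suits small checks; for data sets of the size mentioned in the abstract a rank-based implementation would be the more practical choice.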


Subjects
Epitope Mapping , Machine Learning , Peptides/immunology , Amino Acid Sequence , Humans , Peptides/chemistry , ROC Curve , Structure-Activity Relationship
4.
Int J Mol Sci ; 17(8)2016 Aug 11.
Article in English | MEDLINE | ID: mdl-27529225

ABSTRACT

Over the past decade, Deep Artificial Neural Networks (DNNs) have become the state-of-the-art algorithms in Machine Learning (ML), speech recognition, computer vision, natural language processing and many other tasks. This was made possible by the advancement in Big Data, Deep Learning (DL) and drastically increased chip processing abilities, especially general-purpose graphical processing units (GPGPUs). All this has created a growing interest in making the most of the potential offered by DNNs in almost every field. An overview of the main architectures of DNNs, and their usefulness in Pharmacology and Bioinformatics are presented in this work. The featured applications are: drug design, virtual screening (VS), Quantitative Structure-Activity Relationship (QSAR) research, protein structure prediction and genomics (and other omics) data mining. The future need of neuromorphic hardware for DNNs is also discussed, and the two most advanced chips are reviewed: IBM TrueNorth and SpiNNaker. In addition, this review points out the importance of considering not only neurons, as DNNs and neuromorphic chips should also include glial cells, given the proven importance of astrocytes, a type of glial cell which contributes to information processing in the brain. The Deep Artificial Neuron-Astrocyte Networks (DANAN) could overcome the difficulties in architecture design, learning process and scalability of the current ML methods.


Subjects
Computational Biology/methods , Neural Networks, Computer , Algorithms , Animals , Humans , Quantitative Structure-Activity Relationship
5.
J Theor Biol ; 349: 12-21, 2014 May 21.
Article in English | MEDLINE | ID: mdl-24491256

ABSTRACT

Cell death (CD) is a dynamic biological function involved in physiological and pathological processes. Due to the complexity of CD, there is a demand for fast theoretical methods that can help to find new CD molecular targets. The current work presents the first classification model to predict CD-related proteins based on Markov Mean Properties. These protein descriptors have been calculated with the MInD-Prot tool using the topological information of the amino acid contact networks of the 2423 protein chains, five atom physicochemical properties and the protein 3D regions. The Machine Learning algorithms from Weka were used to find the best classification model for CD-related protein chains using all 20 attributes. The most accurate algorithm to solve this problem was K*. After several feature subset methods, the best model found is based on only 11 variables and is characterized by the Area Under the Receiver Operating Characteristic Curve (AUROC) of 0.992 and the true positive rate (TP Rate) of 88.2% (validation set). 7409 protein chains labeled with "unknown function" in the PDB Databank were analyzed with the best model in order to predict the CD-related biological activity. Thus, several proteins have been predicted to have CD-related function in Homo sapiens: 3DRX-involved in virus-host interaction biological process, protein homooligomerization; 4DWF-involved in cell differentiation, chromatin modification, DNA damage response, protein stabilization; 1IUR-involved in ATP binding, chaperone binding; 1J7D-involved in DNA double-strand break processing, histone ubiquitination, nucleotide-binding oligomerization; 1UTU-linked with DNA repair, regulation of transcription; 3EEC-participating in the cellular membrane organization, egress of virus within host cell, class mediator resulting in cell cycle arrest, negative regulation of ubiquitin-protein ligase activity involved in mitotic cell cycle and apoptotic process. 
Other proteins from bacteria predicted as CD-related are 2G3V - a CAG pathogenicity island protein 13 from Helicobacter pylori, 4G5A - a hypothetical protein in Bacteroides thetaiotaomicron, 1YLK-involved in the nitrogen metabolism of Mycobacterium tuberculosis, and 1XSV - with possible DNA/RNA binding domains. The results demonstrated the possibility to predict CD-related proteins using molecular information encoded into the protein 3D structure. Thus, the current work demonstrated the possibility to predict new molecular targets involved in cell-death processes.


Subjects
Markov Chains , Proteins/classification , Algorithms , Cell Death , Databases, Protein , Reference Standards
6.
J Chem Inf Model ; 54(1): 16-29, 2014 Jan 27.
Article in English | MEDLINE | ID: mdl-24320872

ABSTRACT

The use of numerical parameters in Complex Network analysis is expanding to new fields of application. At a molecular level, we can use them to describe the molecular structure of chemical entities, protein interactions, or metabolic networks. However, the applications are not restricted to the world of molecules and can be extended to the study of macroscopic nonliving systems, organisms, or even legal or social networks. On the other hand, the development of the field of Artificial Intelligence has led to the formulation of computational algorithms whose design is based on the structure and functioning of networks of biological neurons. These algorithms, called Artificial Neural Networks (ANNs), can be useful for the study of complex networks, since the numerical parameters that encode information of the network (for example centralities/node descriptors) can be used as inputs for the ANNs. The Wiener index (W) is a graph invariant widely used in chemoinformatics to quantify the molecular structure of drugs and to study complex networks. In this work, we explore for the first time the possibility of using Markov chains to calculate analogues of node distance numbers/W to describe complex networks from the point of view of their nodes. These parameters are called Markov-Wiener node descriptors of kth order (W(k)). Note that these descriptors are not related to Markov-Wiener stochastic processes. Here, we calculated the W(k)(i) values for a very high number of nodes (>100,000) in more than 100 different complex networks using the software MI-NODES. These networks were grouped according to the field of application. Molecular networks include the Metabolic Reaction Networks (MRNs) of 40 different organisms. In addition, we analyzed other biological and legal and social networks. These include the Interaction Web Database Biological Networks (IWDBNs), with 75 food webs or ecological systems and the Spanish Financial Law Network (SFLN). 
The calculated W(k)(i) values were used as inputs for different ANNs in order to discriminate correct node connectivity patterns from incorrect random patterns. The MIANN models obtained present good values of Sensitivity/Specificity (%): MRNs (78/78), IWDBNs (90/88), and SFLN (86/84). These preliminary results are very promising from the point of view of a first exploratory study and suggest that the use of these models could be extended to the high-throughput re-evaluation of connectivity in known complex networks (collation).
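
For readers unfamiliar with the classical Wiener index mentioned in this abstract, the following minimal sketch computes it as the sum of shortest-path distances over all unordered node pairs of a connected, unweighted graph. It illustrates only the classical index W, not the Markov-Wiener descriptors W(k) or the MI-NODES software; the function name and the toy graphs are assumptions.

```python
from collections import deque


def wiener_index(adjacency):
    """Classical Wiener index of a connected, unweighted, undirected graph.

    `adjacency` maps each node to the list of its neighbours.
    """
    total = 0
    for source in adjacency:
        # Breadth-first search gives shortest-path distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
    # Every unordered pair was counted twice (once from each endpoint).
    return total // 2


# Hypothetical toy graphs: a 4-node path and a star with three leaves.
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
star3 = {"c": ["x", "y", "z"], "x": ["c"], "y": ["c"], "z": ["c"]}
print(wiener_index(path4), wiener_index(star3))
```

The per-node sums `sum(dist.values())` are the node distance numbers that the abstract refers to; the Wiener index is half of their total.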


Subjects
Models, Biological , Neural Networks, Computer , Algorithms , Computational Biology , Databases, Factual , Ecosystem , Jurisprudence , Markov Chains , Metabolic Networks and Pathways , Models, Econometric , Models, Theoretical , Social Support , Software
7.
J Chem Inf Model ; 54(3): 744-55, 2014 Mar 24.
Article in English | MEDLINE | ID: mdl-24521170

ABSTRACT

This work is aimed at describing the workflow for a methodology that combines chemoinformatics and pharmacoepidemiology methods and at reporting the first predictive model developed with this methodology. The new model is able to predict complex networks of AIDS prevalence in the US counties, taking into consideration the social determinants and activity/structure of anti-HIV drugs in preclinical assays. We trained different Artificial Neural Networks (ANNs) using as input information indices of social networks and molecular graphs. We used a Shannon information index based on the Gini coefficient to quantify the effect of income inequality in the social network. We obtained the data on AIDS prevalence and the Gini coefficient from the AIDSVu database of Emory University. We also used the Balaban information indices to quantify changes in the chemical structure of anti-HIV drugs. We obtained the data on anti-HIV drug activity and structure (SMILES codes) from the ChEMBL database. Last, we used Box-Jenkins moving average operators to quantify information about the deviations of drugs with respect to data subsets of reference (targets, organisms, experimental parameters, protocols). The best model found was a Linear Neural Network (LNN) with values of Accuracy, Specificity, and Sensitivity above 0.76 and AUROC > 0.80 in training and external validation series. This model generates a complex network of AIDS prevalence in the US at county level with respect to the activity of anti-HIV drugs in preclinical assays. To train/validate the model and predict the complex network we needed to analyze 43,249 data points including values of AIDS prevalence in 2,310 counties in the US vs ChEMBL results for 21,582 unique drugs, 9 viral or human protein targets, 4,856 protocols, and 10 possible experimental measures.


Subjects
Acquired Immunodeficiency Syndrome/drug therapy , Acquired Immunodeficiency Syndrome/epidemiology , Anti-HIV Agents/therapeutic use , Algorithms , Animals , Anti-HIV Agents/chemistry , Databases, Factual , Drug Evaluation, Preclinical , HIV/drug effects , HIV/isolation & purification , Humans , Models, Statistical , Neural Networks, Computer , Prevalence , Social Support , United States/epidemiology
8.
Assist Technol ; 26(1): 33-44, 2014.
Article in English | MEDLINE | ID: mdl-24800452

ABSTRACT

The purpose of this study is to describe the process of assessment of three assistive devices to meet the needs of a woman with cerebral palsy (CP) in order to provide her with computer access and use. The user has quadriplegic CP, with anarthria, using a syllabic keyboard. Devices were evaluated through a three-step approach: (a) use of a questionnaire to preselect potential assistive technologies, (b) use of an eTAO tool to determine the effectiveness of each device, and (c) conducting a semi-structured interview to obtain qualitative data. Touch screen, joystick, and trackball were the preselected devices. The device that best met the user's needs and priorities was the joystick. The finding was corroborated by both the eTAO tool and the semi-structured interview. Computers are a basic form of social participation. It is important to consider the special needs and priorities of users and to try different devices when undertaking a device-selection process. Environmental and personal factors have to be considered, as well. This leads to a need to evaluate new tools in order to provide the appropriate support. The eTAO could be a suitable instrument for this purpose. Additional research is also needed to understand how to better match devices with different user populations and how to comprehensively evaluate emerging technologies relative to users with disabilities.


Subjects
Cerebral Palsy , Computer Peripherals , Equipment Design , Self-Help Devices/standards , User-Computer Interface , Adult , Ergonomics , Female , Humans , Qualitative Research , Surveys and Questionnaires
9.
Heliyon ; 10(7): e28560, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38590890

ABSTRACT

Single Sign-On (SSO) methods are the primary solution to authenticate users across multiple web systems. These mechanisms streamline the authentication procedure by avoiding duplicate developments of authentication modules for each application. Besides, these mechanisms also provide convenience to the end-user by keeping the user authenticated when switching between different contexts. To ensure this cross-application authentication, SSO relies on an Identity Provider (IdP), which is commonly set up and managed by each institution that needs to enforce SSO internally. However, the solution is not so straightforward when several institutions need to cooperate in a unique ecosystem. This could be tackled by centralizing the authentication mechanisms in one of the involved entities, a solution raising responsibilities that may be difficult for peers to accept. Moreover, this solution is not appropriate for dynamic groups, where peers may join or leave frequently. In this paper, we propose an architecture that uses a trusted third-party service to authenticate multiple entities, ensuring the isolation of the user's attributes between this service and the institutional SSO systems. This architecture was validated in the EHDEN Portal, which includes web tools and services of this European health project, to establish a Federated Authentication schema.

10.
J Cheminform ; 16(1): 27, 2024 Mar 07.
Article in English | MEDLINE | ID: mdl-38449058

ABSTRACT

For understanding a chemical compound's mechanism of action and its side effects, as well as for drug discovery, it is crucial to predict its possible protein targets. This study examines 15 developed target-centric models (TCM) employing different molecular descriptions and machine learning algorithms. They were contrasted with 17 third-party models implemented as web tools (WTCM). In both sets of models, consensus strategies were implemented as a potential improvement over individual predictions. The findings indicate that TCM reach F1-score values greater than 0.8. Comparing both approaches, the best TCM achieves values of 0.75, 0.61, 0.25 and 0.38 for true positive/negative rates (TPR, TNR) and false negative/positive rates (FNR, FPR), outperforming the best WTCM. Moreover, the consensus strategy proves to have the most relevant results in the top 20% of target profiles. The TCM consensus reaches TPR and FNR values of 0.98 and 0, while the WTCM consensus reaches values of 0.75 and 0.24. The implemented computational tool with the TCM and their consensus strategy is available at: https://bioquimio.udla.edu.ec/tidentification01/ . Scientific Contribution: We compare and discuss the performances of 17 public compound-target interaction prediction models and 15 new constructions. We also explore a compound-target interaction prioritization strategy using a consensus approach, and we analyze the challenges involved in interaction modeling.
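
The rates quoted in this abstract follow from the four cells of a confusion matrix. The sketch below shows the standard definitions; the function name and the toy counts are hypothetical and are not taken from the study.

```python
def confusion_rates(tp, fn, tn, fp):
    """Standard binary-classification rates from confusion-matrix counts."""
    positives = tp + fn  # all truly positive cases
    negatives = tn + fp  # all truly negative cases
    return {
        "TPR": tp / positives,  # sensitivity / recall
        "FNR": fn / positives,  # equals 1 - TPR
        "TNR": tn / negatives,  # specificity
        "FPR": fp / negatives,  # equals 1 - TNR
        "F1": 2 * tp / (2 * tp + fp + fn),
    }


# Hypothetical counts chosen only to make the arithmetic easy to follow.
rates = confusion_rates(tp=75, fn=25, tn=61, fp=39)
for name, value in rates.items():
    print(f"{name}: {value:.2f}")
```

Because TPR + FNR = 1 and TNR + FPR = 1 by definition, reporting one rate of each pair determines the other up to rounding.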

11.
An Pediatr (Engl Ed) ; 100(3): 195-201, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38461129

ABSTRACT

This article examines the use of artificial intelligence (AI) in the field of paediatric care within the framework of the 7P medicine model (Predictive, Preventive, Personalized, Precise, Participatory, Peripheral and Polyprofessional). It highlights various applications of AI in the diagnosis, treatment and management of paediatric diseases as well as the role of AI in prevention and in the efficient management of health care resources and the resulting impact on the sustainability of public health systems. Successful cases of the application of AI in the paediatric care setting are presented, placing emphasis on the need to move towards a 7P health care model. Artificial intelligence is revolutionizing society at large and has a great potential for significantly improving paediatric care.


Subjects
Artificial Intelligence , Humans , Child
12.
Stud Health Technol Inform ; 294: 585-586, 2022 May 25.
Article in English | MEDLINE | ID: mdl-35612156

ABSTRACT

Many clinical studies are greatly dependent on an efficient identification of relevant datasets. This selection can be performed in existing health data catalogues, by searching for available metadata. The search process can be optimised through question-answering interfaces, to help researchers explore the available data. However, when searching the distinct catalogues, the lack of metadata harmonisation imposes a few bottlenecks. This paper presents a methodology to allow semantic search over several biomedical database catalogues, by extracting the information using a shared domain knowledge. The resulting pipeline allows the converted data to be published as FAIR endpoints, and it provides an end-user interface that accepts natural language questions.


Subjects
Metadata , Semantics , Databases, Factual , Language , Natural Language Processing
13.
J Proteome Res ; 10(4): 1698-718, 2011 Apr 01.
Article in English | MEDLINE | ID: mdl-21184613

ABSTRACT

Many drugs with very different affinity to a large number of receptors are described. Thus, in this work, we selected drug-target pairs (DTPs/nDTPs) of drugs with high affinity/nonaffinity for different targets. Quantitative structure-activity relationship (QSAR) models become a very useful tool in this context because they substantially reduce time and resource-consuming experiments. Unfortunately, most QSAR models predict activity against only one protein target and/or they have not been implemented on a public Web server yet, freely available online to the scientific community. To solve this problem, we developed a multitarget QSAR (mt-QSAR) classifier combining the MARCH-INSIDE software for the calculation of the structural parameters of drug and target with the linear discriminant analysis (LDA) method in order to seek the best model. The accuracy of the best LDA model was 94.4% (3,859/4,086 cases) for training and 94.9% (1,909/2,012 cases) for the external validation series. In addition, we implemented the model into the Web portal Bio-AIMS as an online server entitled MARCH-INSIDE Nested Drug-Bank Exploration & Screening Tool (MIND-BEST), located at http://miaja.tic.udc.es/Bio-AIMS/MIND-BEST.php . This online tool is based on PHP/HTML/Python and MARCH-INSIDE routines. Finally, we illustrated two practical uses of this server with two different experiments. In experiment 1, we report for the first time a MIND-BEST prediction, synthesis, characterization, and MAO-A and MAO-B pharmacological assay of eight rasagiline derivatives, promising for anti-Parkinson drug design. In experiment 2, we report sampling, parasite culture, sample preparation, 2-DE, MALDI-TOF and -TOF/TOF MS, MASCOT search, 3D structure modeling with LOMETS, and MIND-BEST prediction for different peptides of a new protein found in the proteome of the bird parasite Trichomonas gallinae, which is promising for antiparasite drug target discovery.


Subjects
Drug Design , Drug Evaluation, Preclinical/methods , Glucosephosphate Dehydrogenase/metabolism , Internet , Monoamine Oxidase Inhibitors/chemistry , Monoamine Oxidase/metabolism , Protozoan Proteins/metabolism , Trichomonas , Animals , Antiparasitic Agents/chemistry , Antiparasitic Agents/pharmacology , Columbidae/microbiology , Drug Discovery , Glucosephosphate Dehydrogenase/chemistry , Indans/chemical synthesis , Indans/chemistry , Models, Molecular , Models, Theoretical , Molecular Sequence Data , Molecular Structure , Monoamine Oxidase/chemistry , Monoamine Oxidase Inhibitors/chemical synthesis , Peptides/chemistry , Protein Conformation , Protozoan Proteins/chemistry , Quantitative Structure-Activity Relationship , Trichomonas/chemistry , Trichomonas/drug effects , Trichomonas/enzymology
14.
J Theor Biol ; 271(1): 136-44, 2011 Feb 21.
Article in English | MEDLINE | ID: mdl-21130100

ABSTRACT

A statistical approach has been applied to analyse primary structure patterns at inner positions of α-helices in proteins. A systematic survey was carried out in a recent sample of non-redundant proteins selected from the Protein Data Bank, which were used to analyse α-helix structures for amino acid pairing patterns. Only residues more than three positions apart from both termini of the α-helix were considered as inner. Amino acid pairings i, i+k (k=1, 2, 3, 4, 5), were analysed and the corresponding 20×20 matrices of relative global propensities were constructed. An analysis of (i, i+4, i+8) and (i, i+3, i+4) triplet patterns was also performed. These analyses yielded information on a series of amino acid patterns (pairings and triplets) showing either high or low preference for α-helical motifs and suggested a novel approach to protein alphabet reduction. In addition, it has been shown that the individual amino acid propensities are not enough to define the statistical distribution of these patterns. Global pair propensities also depend on the type of pattern, its composition and orientation in the protein sequence. The data presented should prove useful to obtain and refine predictive rules which can further the development and fine-tuning of protein structure prediction algorithms and tools.
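
The i, i+k pairing counts described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the helper name `inner_pair_counts` is hypothetical, the inner region is approximated by trimming `margin` residues from each end of the helix, and the normalisation into relative global propensities is not specified in the abstract and is therefore omitted.

```python
from collections import Counter


def inner_pair_counts(helix, k, margin=3):
    """Count ordered residue pairs (i, i+k) among the inner positions of a helix.

    `helix` is a one-letter amino acid string for a single alpha-helix.
    `margin` residues are trimmed from each terminus (an assumed reading of
    the inner-position criterion).
    """
    inner = helix[margin:len(helix) - margin]
    return Counter((inner[i], inner[i + k]) for i in range(len(inner) - k))


# Hypothetical 10-residue helix; its inner region is "ELLK".
example = "AAAELLKAAA"
print(inner_pair_counts(example, k=1))
print(inner_pair_counts(example, k=3))
```

Summing such counters over all helices in a data set would fill one 20×20 count matrix per value of k, which is the raw material for any propensity calculation.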


Subjects
Amino Acids/chemistry , Protein Structure, Secondary , Proteins/chemistry , Algorithms , Databases, Protein , Protein Folding
15.
PeerJ Comput Sci ; 7: e584, 2021.
Article in English | MEDLINE | ID: mdl-34322589

ABSTRACT

In recent years, machine learning (ML) researchers have changed their focus towards biological problems that are difficult to analyse with standard approaches. Large initiatives such as The Cancer Genome Atlas (TCGA) have allowed the use of omic data for the training of these algorithms. In order to study the state of the art, this review is provided to cover the main works that have used ML with TCGA data. Firstly, the principal discoveries made by the TCGA consortium are presented. Once these bases have been established, we begin with the main objective of this study, the identification and discussion of those works that have used the TCGA data for the training of different ML approaches. After a review of more than 100 different papers, it has been possible to make a classification according to the following three pillars: the type of tumour, the type of algorithm and the predicted biological problem. One of the conclusions drawn in this work shows a high density of studies based on two major algorithms: Random Forest and Support Vector Machines. We also observe the rise in the use of deep artificial neural networks. It is worth emphasizing the increase in integrative models of multi-omic data analysis. The different biological conditions are a consequence of molecular homeostasis, driven by protein coding regions, regulatory elements and the surrounding environment. It is notable that a large number of works make use of gene expression data, which has been found to be the preferred data type among researchers when training the different models. The biological problems addressed have been classified into five types: prognosis prediction, tumour subtypes, microsatellite instability (MSI), immunological aspects and certain pathways of interest. A clear trend was detected in the prediction of these conditions according to the type of tumour. 
That is the reason for which a greater number of works have focused on the BRCA cohort, while specific works for survival, for example, were centred on the GBM cohort, due to its large number of events. Throughout this review, it will be possible to go in depth into the works and the methodologies used to study TCGA cancer data. Finally, it is intended that this work will serve as a basis for future research in this field of study.

16.
Stud Health Technol Inform ; 281: 327-331, 2021 May 27.
Article in English | MEDLINE | ID: mdl-34042759

ABSTRACT

The process of refining the research question in a medical study depends greatly on the current background of the investigated subject. The information found in prior works can directly impact several stages of the study, namely the cohort definition stage. Besides previously published methods, researchers could also leverage other materials, such as the output of cohort selection tools, to enrich and to accelerate their own work. However, this kind of information is not always captured by search engines. In this paper, we present a methodology, based on a combination of content-based retrieval and text annotation techniques, to identify relevant scientific publications related to a research question and to the selected data sources.


Subjects
Information Storage and Retrieval , Search Engine , Cohort Studies
17.
JMIR Med Inform ; 9(2): e22976, 2021 Feb 25.
Article in English | MEDLINE | ID: mdl-33629960

ABSTRACT

BACKGROUND: Currently, existing biomedical literature repositories do not commonly provide users with specific means to locate and remotely access biomedical databases. OBJECTIVE: To address this issue, we developed the Biomedical Database Inventory (BiDI), a repository linking to biomedical databases automatically extracted from the scientific literature. BiDI provides an index of data resources and a path to access them seamlessly. METHODS: We designed an ensemble of deep learning methods to extract database mentions. To train the system, we annotated a set of 1242 articles that included mentions of database publications. Such a data set was used along with transfer learning techniques to train an ensemble of deep learning natural language processing models targeted at database publication detection. RESULTS: The system obtained an F1 score of 0.929 on database detection, showing high precision and recall values. When applying this model to the PubMed and PubMed Central databases, we identified over 10,000 unique databases. The ensemble model also extracted the weblinks to the reported databases and discarded irrelevant links. For the extraction of weblinks, the model achieved a cross-validated F1 score of 0.908. We show two use cases: one related to "omics" and the other related to the COVID-19 pandemic. CONCLUSIONS: BiDI enables access to biomedical resources over the internet and facilitates data-driven research and other scientific initiatives. The repository is openly available online and will be regularly updated with an automatic text processing pipeline. The approach can be reused to create repositories of different types (ie, biomedical and others).

18.
Comput Struct Biotechnol J ; 19: 4538-4558, 2021.
Article in English | MEDLINE | ID: mdl-34471498

ABSTRACT

Drug discovery aims at finding new compounds with specific chemical properties for the treatment of diseases. In recent years, the approach used in this search has acquired an important computer science component, driven by the rapid rise of machine learning techniques following their democratization. With the objectives set by the Precision Medicine initiative and the new challenges generated, it is necessary to establish robust, standard and reproducible computational methodologies to achieve those objectives. Currently, predictive models based on Machine Learning have gained great importance in the step prior to preclinical studies. This stage manages to drastically reduce costs and research times in the discovery of new drugs. This review article focuses on how these new methodologies have been used in recent years of research. Analyzing the state of the art in this field will give us an idea of where cheminformatics will be developed in the short term, the limitations it presents and the positive results it has achieved. This review will focus mainly on the methods used to model the molecular data, as well as the biological problems addressed and the Machine Learning algorithms used for drug discovery in recent years.

19.
J Proteome Res ; 9(2): 1182-90, 2010 Feb 05.
Article in English | MEDLINE | ID: mdl-19947655

ABSTRACT

Trypanosoma brucei causes African trypanosomiasis in humans (HAT or African sleeping sickness) and Nagana in cattle. The disease threatens over 60 million people and uncounted numbers of cattle in 36 countries of sub-Saharan Africa and has a devastating impact on human health and the economy. On the other hand, Trypanosoma cruzi is responsible in South America for Chagas disease, which can cause acute illness and death, especially in young children. In this context, the discovery of novel drug targets in the Trypanosome proteome is a major focus for the scientific community. Recently, many researchers have devoted considerable effort to the study of protein-protein interactions (PPIs) in pathogenic Trypanosome species, concluding that the low sequence identities between some parasite proteins and their human host render these PPIs highly promising drug targets. To the best of our knowledge, there are no general models to predict Unique PPIs in Trypanosome (TPPIs). On the other hand, the 3D structure of an increasing number of Trypanosome proteins is reported in databases. In this regard, the introduction of a new model to predict TPPIs from the 3D structure of proteins involved in PPI is very important. For this purpose, we introduced new protein-protein complex invariants based on the Markov average electrostatic potential xi(k)(R(i)) for amino acids located in different regions (R(i)) of i-th protein and placed at a distance k from each other. We calculated more than 30 different types of parameters for 7866 pairs of proteins (1023 TPPIs and 6823 non-TPPIs) from more than 20 organisms, including parasites and human or cattle hosts. We found a very simple linear model that predicts above 90% of TPPIs and non-TPPIs both in training and independent test subsets using only two parameters. The parameters were (d)xi(k)(s) = |xi(k)(s(1)) - xi(k)(s(2))|, the absolute difference between the xi(k)(s(i)) values on the surface of the two proteins of the pairs. 
We also tested nonlinear ANN models for comparison purposes but the linear model gives the best results. We implemented this predictor in the web server named TrypanoPPI freely available to public at http://miaja.tic.udc.es/Bio-AIMS/TrypanoPPI.php. This is the first model that predicts how unique a protein-protein complex in Trypanosome proteome is with respect to other parasites and hosts, opening new opportunities for antitrypanosome drug target discovery.
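
The descriptor and the two-parameter linear classifier described in this abstract can be illustrated with a minimal Python sketch. It assumes that the surface potentials ξ_k(s) of each protein have already been computed elsewhere. The weights, the intercept, the zero decision threshold, and the choice of which two orders k enter the model are hypothetical placeholders, because the abstract does not report them; this is not the published TrypanoPPI model.

```python
from typing import List, Sequence

# Hypothetical placeholders: the abstract does not give the fitted coefficients.
HYPOTHETICAL_WEIGHTS = (1.5, -2.0)
HYPOTHETICAL_INTERCEPT = 0.25


def surface_potential_differences(xi_s1: Sequence[float],
                                  xi_s2: Sequence[float]) -> List[float]:
    """Return |xi_k(s1) - xi_k(s2)| for every order k shared by both proteins."""
    if len(xi_s1) != len(xi_s2):
        raise ValueError("both proteins need the same number of orders k")
    return [abs(a - b) for a, b in zip(xi_s1, xi_s2)]


def linear_score(features: Sequence[float],
                 weights: Sequence[float],
                 intercept: float) -> float:
    """Linear discriminant score: intercept + sum of weight * feature."""
    if len(features) != len(weights):
        raise ValueError("one weight per feature is required")
    return intercept + sum(w * f for w, f in zip(weights, features))


def predict_tppi(xi_s1: Sequence[float],
                 xi_s2: Sequence[float],
                 k_indices: Sequence[int] = (0, 1)) -> bool:
    """Classify a protein pair as TPPI when the (hypothetical) score is positive."""
    diffs = surface_potential_differences(xi_s1, xi_s2)
    selected = [diffs[k] for k in k_indices]  # only two parameters are used
    score = linear_score(selected, HYPOTHETICAL_WEIGHTS, HYPOTHETICAL_INTERCEPT)
    return score > 0.0
```

Calling `predict_tppi(xi_a, xi_b)` with two equal-length lists of precomputed surface values returns a boolean. With real data, the coefficients would come from fitting a linear discriminant on labelled TPPI and non-TPPI pairs.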


Subjects
Internet , Proteins/chemistry , Protozoan Proteins/chemistry , Trypanosoma/chemistry , Markov Chains , Molecular Models , Neural Networks (Computer) , Protein Binding , Static Electricity
20.
BMC Cancer ; 10: 528, 2010 Oct 05.
Article in English | MEDLINE | ID: mdl-20920369

ABSTRACT

BACKGROUND: Controversy exists regarding the impact that the different components of diagnostic delay may have on the degree of invasion and prognosis in patients with colorectal cancer. Follow-up strategies after treatment also vary considerably. The aims of this study are: a) to determine whether the symptoms-to-diagnosis interval and the treatment delay modify the survival of patients with colorectal cancer, and b) to determine whether different follow-up strategies are associated with a higher survival rate. METHODS/DESIGN: Multi-centre study with prospective follow-up in five regions of Spain (Galicia, Balearic Islands, Catalonia, Aragón and Valencia) during the period 2010-2012. Incident cases with anatomopathological confirmation of colorectal cancer (International Classification of Diseases 9th revision codes 153-154) that formed part of a previous study (n = 953) are included. At the time of diagnosis, each patient was given a structured interview. Their clinical records will be reviewed during the follow-up period to obtain information on the examinations and tests carried out after treatment, and on the progress of these patients. The symptoms-to-diagnosis interval is defined as the time elapsed between the first symptoms attributed to cancer and the diagnosis of cancer. Treatment delay is defined as the time elapsed between diagnosis and treatment. In non-metastatic patients treated with curative intent, information will be obtained during the follow-up period on consultations performed in the digestive, surgery and oncology departments, as well as on the endoscopies, tumour markers and imaging procedures carried out. Local recurrence, development of metastases during follow-up, appearance of a new tumour and mortality will be included as outcome variables. Actuarial survival analysis with Kaplan-Meier curves, Cox regression and competing-risks survival analysis will be performed.
DISCUSSION: This study will make it possible to verify whether the different components of delay have an impact on the survival rate in colon cancer and rectal cancer. In addition, this multi-centre study will be able to detect the variability present in the follow-up of patients with colorectal cancer, and whether this variability modifies the prognosis. Ideally, this study could determine which follow-up strategies are associated with a better prognosis in colorectal cancer.
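
The two delay intervals defined in the methods, and the Kaplan-Meier estimate named in the analysis plan, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's analysis code: the function names, the use of days as the time unit, and the example values in the usage note are hypothetical. Cox regression and competing-risks analysis are not covered here.

```python
from datetime import date
from typing import List, Sequence, Tuple


def days_between(start: date, end: date) -> int:
    """Elapsed days between two dates (negative if end precedes start)."""
    return (end - start).days


def symptoms_to_diagnosis_interval(first_symptoms: date, diagnosis: date) -> int:
    """Time from the first symptoms attributed to cancer to the diagnosis."""
    return days_between(first_symptoms, diagnosis)


def treatment_delay(diagnosis: date, treatment: date) -> int:
    """Time elapsed between diagnosis and treatment."""
    return days_between(diagnosis, treatment)


def kaplan_meier(durations: Sequence[float],
                 events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Kaplan-Meier survival estimate.

    durations: follow-up time of each patient.
    events: True if the outcome (e.g. death) was observed, False if censored.
    Returns (time, survival probability) at each time with at least one event.
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0  # events at time t
        leaving = 0   # events plus censored patients at time t
        while i < len(data) and data[i][0] == t:
            observed += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if observed:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve
```

For example, `kaplan_meier([5, 8, 8, 12], [True, False, True, True])` yields one step per distinct event time. Patients censored at a given time are counted as still at risk for events at that same time, which is the usual convention.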


Subjects
Colorectal Neoplasms/diagnosis , Colorectal Neoplasms/mortality , Colorectal Neoplasms/therapy , Disease-Free Survival , Female , Humans , Male , Medical Oncology/methods , Neoplasm Invasiveness , Neoplasm Metastasis , Prognosis , Prospective Studies , Recurrence , Spain , Survival Rate , Time Factors , Treatment Outcome