Results 1 - 20 of 30
1.
Support Care Cancer ; 31(2): 139, 2023 Jan 28.
Article in English | MEDLINE | ID: mdl-36707490

ABSTRACT

BACKGROUND: Chemotherapy-induced peripheral neuropathy (CIPN) is a common toxicity of taxanes for which there is no effective intervention. Genomic CIPN risk determination has yielded promising but inconsistent results. The present study assessed the utility of a collective SNP cluster identified using novel analytics to describe taxane-associated CIPN risk. METHODS: We analyzed GWAS data derived from ECOG-5103, first identifying SNPs that were most strongly associated with CIPN using Fisher's ratio (FR). We then rank-ordered the SNPs that discriminated CIPN-positive (CIPN+) from CIPN-negative phenotypes by their discriminatory power and developed the cluster of SNPs that provided the highest predictive accuracy using leave-one-out cross-validation (LOOCV). RESULTS: Using aggregated genotype data obtained from the previously reported ECOG-5103 clinical trial (in which two different arrays were used, HumanOmniExpress (727,227 SNPs) and HumanOmni1-Quad1 (1,131,857 SNPs)), we identified a 267-SNP cluster which was associated with a CIPN+ phenotype with an accuracy of 96.1%. CONCLUSIONS: A cluster of SNPs was identified which prospectively discriminated patients most likely to develop symptomatic CIPN following taxane exposure as part of a breast cancer chemotherapy regimen. Validation using an independent patient cohort should be performed.


Subject(s)
Antineoplastic Agents; Breast Neoplasms; Peripheral Nervous System Diseases; Taxoids; Humans; Antineoplastic Agents/adverse effects; Genome-Wide Association Study; Peripheral Nervous System Diseases/chemically induced; Peripheral Nervous System Diseases/genetics; Polymorphism, Single Nucleotide; Taxoids/adverse effects; Clinical Trials as Topic; Breast Neoplasms/drug therapy; Breast Neoplasms/genetics; Female
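
The feature-ranking and validation steps named in the abstract above (Fisher's ratio followed by leave-one-out cross-validation) can be sketched roughly as follows. This is an illustrative sketch, not the authors' pipeline: the nearest-centroid classifier, the function names, and the toy genotype matrix are all assumptions, since the abstract does not specify them. Fisher's ratio is taken here in its common form, the squared difference of class means divided by the sum of class variances.

```python
from statistics import mean, pvariance

def fisher_ratio(a, b):
    """Fisher's ratio of one feature: squared mean gap over summed variances."""
    denom = pvariance(a) + pvariance(b)
    return (mean(a) - mean(b)) ** 2 / denom if denom > 0 else 0.0

def rank_features(X, y):
    """Return feature indices sorted by decreasing Fisher's ratio."""
    n_feat = len(X[0])
    scores = []
    for j in range(n_feat):
        pos = [row[j] for row, lab in zip(X, y) if lab == 1]
        neg = [row[j] for row, lab in zip(X, y) if lab == 0]
        scores.append(fisher_ratio(pos, neg))
    return sorted(range(n_feat), key=lambda j: scores[j], reverse=True)

def nearest_centroid(X, y, feats, x):
    """Assumed classifier: pick the class whose centroid (on feats) is closest."""
    best_lab, best_dist = None, float("inf")
    for lab in (0, 1):
        rows = [r for r, l in zip(X, y) if l == lab]
        dist = sum((mean(r[j] for r in rows) - x[j]) ** 2 for j in feats)
        if dist < best_dist:
            best_lab, best_dist = lab, dist
    return best_lab

def loocv_accuracy(X, y, k):
    """Leave-one-out accuracy using the top-k features, re-ranked in each fold."""
    hits = 0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        feats = rank_features(X_tr, y_tr)[:k]
        hits += nearest_centroid(X_tr, y_tr, feats, X[i]) == y[i]
    return hits / len(X)

# Hypothetical toy data (allele counts 0/1/2); only feature 0 is informative.
X = [[0, 1, 2], [0, 2, 1], [1, 0, 1], [0, 1, 0],
     [2, 1, 1], [2, 2, 2], [1, 0, 0], [2, 1, 1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

ranking = rank_features(X, y)
accuracy = loocv_accuracy(X, y, k=1)
```

Re-ranking the features inside each fold keeps the held-out sample from influencing feature selection; whether the study did this is not stated in the abstract.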
2.
Support Care Cancer ; 31(3): 178, 2023 Feb 21.
Article in English | MEDLINE | ID: mdl-36809570

ABSTRACT

INTRODUCTION: Using GWAS data derived from a large collaborative trial (ECOG-5103), we identified a cluster of 267 SNPs which predicted CIPN in treatment-naive patients, as reported in Part 1 of this study. To assess the functional and pathological implications of this set, we identified collective gene signatures and evaluated the informational value of those signatures in defining CIPN's pathogenesis. METHODS: In Part 1, we analyzed GWAS data derived from ECOG-5103, first identifying those SNPs that were most strongly associated with CIPN using Fisher's ratio. After identifying those SNPs which differentiated CIPN-positive from CIPN-negative phenotypes, we ranked them in order of their discriminatory power to produce a cluster of SNPs which provided the highest predictive accuracy using leave-one-out cross-validation (LOOCV). An uncertainty analysis was included. Using the best predictive SNP cluster, we performed gene attribution for each SNP using NCBI Phenotype Genotype Integrator and then assessed functionality by applying GeneAnalytics, Gene Set Enrichment Analysis, and PCViz. RESULTS: Using aggregate data derived from the GWAS, we identified a 267-SNP cluster which was associated with a CIPN+ phenotype with an accuracy of 96.1%. We could attribute 173 genes to the 267-SNP cluster. Six long intergenic non-protein coding genes were excluded. Ultimately, the functional analysis was based on 138 genes. Of the 17 pathways identified by GeneAnalytics (GA) software, the irinotecan pharmacokinetic pathway had the highest score. Highly matching gene ontology attributions included flavone metabolic process, flavonoid glucuronidation, xenobiotic glucuronidation, nervous system development, UDP glycosyltransferase activity, retinoic acid binding, protein kinase C binding, and glucuronosyltransferase activity. Gene Set Enrichment Analysis (GSEA) GO terms identified neuron-associated genes as most significant (p = 5.45e-10). Consistent with the GA output, terms associated with flavone, flavonoid, and glucuronidation were noted, as were GO terms associated with neurogenesis. CONCLUSION: The application of functional analyses to phenotype-associated SNP clusters provides an independent validation step in assessing the clinical meaningfulness of GWAS-derived data. Functional analyses following gene attribution of a CIPN-predictive SNP cluster identified pathways, gene ontology terms, and a network which were consistent with a neuropathic phenotype.


Subject(s)
Neoplasms; Peripheral Nervous System Diseases; Humans; Polymorphism, Single Nucleotide; Genome-Wide Association Study; Taxoids/adverse effects; Peripheral Nervous System Diseases/chemically induced; Neoplasms/drug therapy
3.
Int J Mol Sci ; 23(21), 2022 Oct 26.
Article in English | MEDLINE | ID: mdl-36361765

ABSTRACT

Noise is a basic ingredient in data, since observed data are always contaminated by unwanted deviations, i.e., noise, which, in the case of overdetermined systems (with more data than model parameters), cause the corresponding linear system of equations to have an imperfect solution. In addition, in the case of highly underdetermined parameterization, noise can be absorbed by the model, generating spurious solutions. This is a very undesirable situation that might lead to incorrect conclusions. We present a mathematical formalism based on inverse problem theory combined with artificial intelligence methodologies to perform an enhanced sampling of noisy biomedical data and improve the finding of meaningful solutions. Random sampling methods fail for high-dimensional biomedical problems. Sampling methods such as smart model parameterizations, forward surrogates, and parallel computing are better suited for such problems. We applied these methods to several important biomedical problems, such as phenotype prediction and a problem related to predicting the effects of protein mutations, i.e., whether a given single residue mutation is neutral or deleterious, causing a disease. We also applied these methods to de novo drug discovery and drug repositioning (repurposing) through the enhanced exploration of huge chemical space. The purpose of these novel methods that address the problem of noise and uncertainty in biomedical data is to find new therapeutic solutions, perform drug repurposing, and accelerate and optimize drug discovery, thus reestablishing homeostasis. Finding the right target, the right compound, and the right patient are the three bottlenecks to running successful clinical trials from the correct analysis of preclinical models. Artificial intelligence can provide a solution to these problems, considering that the character of the data restricts the quality of the prediction, as in any modeling procedure in data analysis. The use of simple and plain methodologies is crucial to tackling these important and challenging problems, particularly drug repositioning/repurposing in rare diseases.


Subject(s)
Artificial Intelligence; Drug Repositioning; Uncertainty; Drug Repositioning/methods; Drug Discovery/methods; Phenotype
4.
Int J Mol Sci ; 23(9), 2022 Apr 22.
Article in English | MEDLINE | ID: mdl-35563034

ABSTRACT

Big data in health care is a fast-growing field and a new paradigm that is transforming case-based studies into large-scale, data-driven research. As big data is dependent on the advancement of new data standards, technology, and relevant research, the future development of big data applications holds foreseeable promise in the modern-day health care revolution. Enormously large, rapidly growing collections of biomedical omics data (genomics, proteomics, transcriptomics, metabolomics, glycomics, etc.) and clinical data create major challenges and opportunities for their analysis and interpretation and open new computational gateways to address these issues. The design of new robust algorithms suited to analyzing this big data, taking into account individual variability in genes, has enabled the creation of precision (personalized) medicine. We reviewed and highlighted the significance of big data analytics for personalized medicine and health care by focusing mostly on machine learning perspectives on personalized medicine, genomic data models with respect to personalized medicine, the application of data mining algorithms for personalized medicine, and the challenges currently faced in big data analytics.


Subject(s)
Data Science; Precision Medicine; Big Data; Delivery of Health Care; Genomics; Precision Medicine/methods
5.
BMC Bioinformatics ; 21(Suppl 2): 89, 2020 Mar 11.
Article in English | MEDLINE | ID: mdl-32164540

ABSTRACT

BACKGROUND: Phenotype prediction problems are usually considered ill-posed, as the number of samples is very limited with respect to the scrutinized genetic probes. This fact complicates the sampling of the defective genetic pathways due to the high number of possible discriminatory genetic networks involved. In this research, we outline three novel sampling algorithms utilized to identify, classify and characterize the defective pathways in phenotype prediction problems, namely the Fisher's ratio sampler, the Holdout sampler and the Random sampler, and apply each one to the analysis of genetic pathways involved in tumor behavior and outcomes of triple negative breast cancers (TNBC). Altered biological pathways are identified using the most frequently sampled genes and are compared to those obtained via Bayesian Networks (BNs). RESULTS: Random, Fisher's ratio and Holdout samplers were more accurate and robust than BNs, while providing comparable insights about disease genomics. CONCLUSIONS: The three samplers tested are good alternatives to Bayesian Networks since they are less computationally demanding algorithms. Importantly, this analysis confirms the concept of "biological invariance" since the altered pathways should be independent of the sampling methodology and the classifier used for their inference. Nevertheless, some modifications to the Bayesian networks are still needed to correctly sample the uncertainty space in phenotype prediction problems, since the probabilistic parameterization of the uncertainty space is not unique and the use of the optimum network might falsify the pathways analysis.


Subject(s)
Algorithms; Triple Negative Breast Neoplasms/pathology; Bayes Theorem; Databases, Genetic; Female; Gene Expression Regulation, Neoplastic; Gene Regulatory Networks; Humans; Neoplasm Metastasis; Phenotype; Survival Analysis; Triple Negative Breast Neoplasms/genetics; Triple Negative Breast Neoplasms/mortality
6.
Int J Mol Sci ; 21(10), 2020 May 19.
Article in English | MEDLINE | ID: mdl-32438758

ABSTRACT

We present an analysis of the defective genetic pathways of Late-Onset Alzheimer's Disease (LOAD) compared to Mild Cognitive Impairment (MCI) and Healthy Controls (HC) using different sampling methodologies. These algorithms sample the uncertainty space that is intrinsic to any kind of highly underdetermined phenotype prediction problem, by looking for the minimum-scale signatures (header genes) corresponding to different random holdouts. The biological pathways can be identified by performing a posterior analysis of these signatures established via cross-validation holdouts and plugging the set of most frequently sampled genes into different ontological platforms. That way, the effect of helper genes, whose presence might be due to the high degree of underdeterminacy of these experiments and data noise, is reduced. Our results suggest that common pathways for Alzheimer's disease and MCI are mainly related to viral mRNA translation, influenza viral RNA transcription and replication, gene expression, mitochondrial translation, and metabolism, with these results being highly consistent regardless of the comparative methods. The cross-validated predictive accuracies achieved for the LOAD and MCI discriminations were 84% and 81.5%, respectively. The difference between LOAD and MCI could not be clearly established (74% accuracy). The most discriminatory genes of the LOAD-MCI discrimination are associated with proteasome-mediated degradation and G-protein signaling. Based on these findings we have also performed drug repositioning using the Dr. Insight package, proposing the following classes of drugs: isoquinoline alkaloids, antitumor antibiotics, phosphoinositide 3-kinase (PI3K), autophagy inhibitors, antagonists of the muscarinic acetylcholine receptor and histone deacetylase inhibitors. We believe that the potential clinical relevance of these findings should be further investigated and confirmed with other independent studies.


Subject(s)
Alzheimer Disease/drug therapy; Alzheimer Disease/genetics; Drug Repositioning; Signal Transduction; Age of Onset; Case-Control Studies; Cognitive Dysfunction/genetics; Gene Regulatory Networks; Humans; Linear Models; Machine Learning; Phenotype
7.
Molecules ; 25(11), 2020 May 26.
Article in English | MEDLINE | ID: mdl-32466409

ABSTRACT

We discuss the use of regularized linear discriminant analysis (LDA) as a model reduction technique combined with particle swarm optimization (PSO) in protein tertiary structure prediction, followed by structure refinement based on singular value decomposition (SVD) and PSO. The algorithm presented in this paper belongs to the category of template-based modeling. The algorithm performs a preselection of protein templates before constructing a lower dimensional subspace via a regularized LDA. The protein coordinates in the reduced space are sampled using a highly explorative optimization algorithm, regressive-regressive PSO (RR-PSO). The obtained structure is then projected onto a reduced space via singular value decomposition and further optimized via RR-PSO to carry out a structure refinement. The final structures are similar to those predicted by the best structure prediction tools, such as the Rosetta and Zhang servers. The main advantage of our methodology is that it alleviates the ill-posed character of protein structure prediction problems related to high dimensional optimization. It is also capable of sampling a wide range of conformational space due to the application of a regularized linear discriminant analysis, which allows us to expand the differences over a reduced basis set.


Subject(s)
Proteins/chemistry; Algorithms; Discriminant Analysis; Protein Folding; Protein Structure, Tertiary
8.
Int J Mol Sci ; 20(19), 2019 Sep 21.
Article in English | MEDLINE | ID: mdl-31546608

ABSTRACT

We present an analysis of defective pathways in multiple myeloma (MM) using two recently developed sampling algorithms of the biological pathways: the Fisher's ratio sampler and the holdout sampler. We performed retrospective analyses of different gene expression datasets concerning different aspects of the disease, such as the existing difference between bone marrow stromal cells in MM and healthy controls (HC), the gene expression profiling of CD34+ cells in MM and HC, the difference between hyperdiploid and non-hyperdiploid myelomas, and the prediction of the chromosome 13 deletion, to provide a deeper insight into the molecular mechanisms involved in the disease. Our analysis has shown the importance of different altered pathways related to glycosylation, infectious disease, immune system response, different aspects of metabolism, DNA repair, protein recycling and regulation of the transcription of genes involved in the differentiation of myeloid cells. The main differences in genetic pathways between hyperdiploid and non-hyperdiploid myelomas are related to infectious disease, immune system response and protein recycling. Our work provides new insights into the genetic pathways involved in this complex disease and proposes novel targets for future therapies.


Subject(s)
Bone Marrow Cells/metabolism; Chromosomes, Human, Pair 13/genetics; Hematopoietic Stem Cells/metabolism; Multiple Myeloma/metabolism; Algorithms; Aneuploidy; Antigens, CD34/immunology; Chromosomes, Human, Pair 13/metabolism; Gene Expression Profiling; Hematopoietic Stem Cells/immunology; Humans; Multiple Myeloma/genetics; Multiple Myeloma/immunology; Retrospective Studies; Signal Transduction; Stromal Cells/metabolism
9.
Entropy (Basel) ; 20(2), 2018 Jan 30.
Article in English | MEDLINE | ID: mdl-33265187

ABSTRACT

Most inverse problems in industry (and particularly in geophysical exploration) are highly underdetermined because the number of model parameters is too high to achieve accurate data predictions and because the sampling of the data space is scarce and incomplete; it is always affected by different kinds of noise. Additionally, the physics of the forward problem is a simplification of reality. As a result, the inverse problem solution is not unique; that is, there are different inverse solutions (called equivalent), compatible with the prior information, that fit the observed data within similar error bounds. In the case of nonlinear inverse problems, these equivalent models are located in disconnected flat curvilinear valleys of the cost-function topography. The uncertainty analysis consists of obtaining a representation of this complex topography via different sampling methodologies. In this paper, we focus on the use of a particle swarm optimization (PSO) algorithm to sample the region of equivalence in nonlinear inverse problems. Although this methodology is general purpose, we show its application to the uncertainty assessment of the solution of a geophysical problem concerning gravity inversion in sedimentary basins, showing that it is possible to perform this task efficiently in a sampling-while-optimizing mode. In particular, we explain how to use and analyze the geophysical models sampled by exploratory PSO family members to infer different descriptors of nonlinear uncertainty.
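
The "sampling-while-optimizing" idea described in the abstract above can be illustrated with a small sketch: a swarm minimizes a misfit function and every visited model whose misfit falls below a tolerance is stored as a member of the equivalence region. The code below assumes a textbook global-best PSO with fixed inertia and acceleration constants; it is not one of the exploratory PSO family members the abstract refers to, and the function name `pso_sample`, its parameters, and the toy one-datum problem are hypothetical.

```python
import random

def pso_sample(misfit, bounds, tol, n_particles=20, n_iter=100, seed=0):
    """Minimize misfit with a basic PSO and keep every visited model with misfit <= tol."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_f = [misfit(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    equivalent = []                      # sampled models inside the tolerance
    w, c1, c2 = 0.7, 1.5, 1.5            # assumed inertia and acceleration constants
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            f = misfit(pos[i])
            if f <= tol:
                equivalent.append(pos[i][:])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = pos[i][:], f
    return gbest, equivalent

# Toy underdetermined problem: one datum (m1 + m2 = 1), two unknowns.
def misfit(m):
    return abs(m[0] + m[1] - 1.0)

best, samples = pso_sample(misfit, bounds=[(-2.0, 2.0), (-2.0, 2.0)], tol=0.05)
```

In this toy case the equivalent models lie along the line m1 + m2 = 1, so the spread of `samples` along that line gives a crude picture of the non-uniqueness the abstract describes.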

10.
J Gene Med ; 19(1-2), 2017 Jan.
Article in English | MEDLINE | ID: mdl-27928896

ABSTRACT

BACKGROUND: B-cell chronic lymphocytic leukemia (CLL) is a heterogeneous disease and the most common adult leukemia in western countries. IgVH mutational status distinguishes two major types of CLL, each associated with a different prognosis and survival. Sequencing identified NOTCH1 and SF3B1 as the two main recurrent mutations. We describe a novel method to clarify how these mutations affect gene expression by finding small-scale signatures that predict the IgVH, NOTCH1 and SF3B1 mutations. We subsequently defined the biological pathways and correlation networks involved in disease development, with the potential goal of identifying new druggable targets. METHODS: We modeled a microarray dataset consisting of 48807 probes derived from 163 samples. The use of Fisher's ratio and fold change combined with feature elimination allowed us to identify the minimum number of genes with the highest mutation-predictive power and, subsequently, we applied network and pathway analyses of these genes to identify their biological roles. RESULTS: The mutational status of the patients was accurately predicted (94-99%) using small-scale gene signatures: 13 genes for IgVH, 60 for NOTCH1 and 22 for SF3B1. LPL plays an important role in the case of the IgVH mutation, whereas MSI2, LTK, TFEC and CNTAP2 are involved in the NOTCH1 mutation, and RPL32 and PLAGL1 are involved in the SF3B1 mutation. Four highly discriminatory genes (IGHG1, MYBL1, NRIP1 and RGS1) are common to these three mutations. The IL-4-mediated signaling events pathway appears to be involved as a common mechanism and suggests an important role of the immune response mechanisms and antigen presentation. CONCLUSIONS: This retrospective analysis served to provide a deeper understanding of the effects of the different mutations in CLL disease progression, with the expectation that these findings will be clinically applied in the near future to the development of new drugs.


Subject(s)
Genomics; Leukemia, Lymphocytic, Chronic, B-Cell/genetics; Biomarkers, Tumor; Computational Biology/methods; Databases, Nucleic Acid; Gene Regulatory Networks; Genetic Association Studies; Genetic Predisposition to Disease; Genomics/methods; Humans; Immunoglobulin Heavy Chains/genetics; Leukemia, Lymphocytic, Chronic, B-Cell/diagnosis; Leukemia, Lymphocytic, Chronic, B-Cell/metabolism; Leukemia, Lymphocytic, Chronic, B-Cell/mortality; Models, Biological; Mutation; Oligonucleotide Array Sequence Analysis; Phosphoproteins/genetics; Principal Component Analysis; Prognosis; RNA Splicing Factors/genetics; Receptors, Notch/genetics; Reproducibility of Results; Retrospective Studies; Signal Transduction
11.
Comput Biol Med ; 149: 106029, 2022 Oct.
Article in English | MEDLINE | ID: mdl-36067633

ABSTRACT

BACKGROUND: Understanding the transcriptomic response to SARS-CoV-2 infection is of the utmost importance for designing diagnostic tools that predict the severity of the infection. METHODS: We have performed a deep sampling analysis of the viral transcriptomic data oriented towards drug repositioning. The basic principle of this methodology, which uses different samplers, is biological invariance, which means that the pathways altered by the disease should be independent of the algorithm used to unravel them. RESULTS: The transcriptomic analysis of the altered pathways reveals a distinctive inflammatory response and potential side effects of infection. The virus replication causes, in some cases, acute respiratory distress syndrome in the lungs, and affects other organs such as the heart, brain, and kidneys. Therefore, the repositioned drugs to fight COVID-19 should target not only the interferon signalling pathway and the control of the inflammation, but also the altered genetic pathways related to the side effects of infection. We also show via Principal Component Analysis that the transcriptome signatures are different from those of influenza and RSV. The gene COL1A1, which controls collagen production, seems to play a key role in the regulation of the immune system. Additionally, other small-scale signature genes appear to be involved in the development of other COVID-19 comorbidities. CONCLUSIONS: Transcriptome-based drug repositioning offers possible fast-track antiviral therapy for COVID-19 patients. It calls for additional clinical studies using FDA-approved drugs for patients with increased susceptibility to infection and with serious medical complications.


Subject(s)
COVID-19 Drug Treatment; COVID-19; SARS-CoV-2; Antiviral Agents/pharmacology; Antiviral Agents/therapeutic use; COVID-19/genetics; Drug Repositioning; Humans; Interferons; Transcriptome/genetics
12.
Comput Math Methods Med ; 2021: 5556433, 2021.
Article in English | MEDLINE | ID: mdl-34422090

ABSTRACT

The prediction of the dynamics of the COVID-19 outbreak and the corresponding needs of the health care system (COVID-19 patients' admissions, the number of critically ill patients, need for intensive care units, etc.) is based on the combination of a limited growth model (Verhulst model) and a short-term predictive model that allows predictions to be made for the following day. In both cases, the uncertainty analysis of the prediction is performed, i.e., the set of equivalent models that fit the historical data with the same accuracy is determined. This set of models provides the posterior distribution of the parameters of the predictive model that fits the historical series. It can be extrapolated to the same analyzed time series (e.g., the number of infected individuals per day) or to another time series of interest to which it is correlated and used, e.g., to predict the number of patients admitted to urgent care units, the number of critically ill patients, or the total number of admissions, which are directly related to health needs. These models can be regionalized, that is, the predictions can be made at the local level if data are disaggregated. We show that the Verhulst and the Gompertz models provide similar results and can also be used to monitor and predict new outbreaks. However, the Verhulst model seems to be easier to interpret and to use.


Subject(s)
COVID-19/epidemiology; Models, Biological; Pandemics; SARS-CoV-2; COVID-19/transmission; Computational Biology; Health Services Needs and Demand; Humans; Mathematical Concepts; Models, Statistical; Pandemics/statistics & numerical data; Spain/epidemiology; Time Factors
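
For readers unfamiliar with the two limited growth models named in the abstract above, their standard closed-form solutions can be written as a short sketch. The parameter names (`K` for the carrying capacity, `r` for the growth rate, `N0` for the initial count) and the numeric values are assumptions chosen for illustration; the abstract does not give the parameterization or the fitted values used in the study.

```python
import math

def verhulst(t, K, r, N0):
    """Logistic (Verhulst) growth: solution of dN/dt = r * N * (1 - N / K)."""
    return K / (1.0 + (K - N0) / N0 * math.exp(-r * t))

def gompertz(t, K, r, N0):
    """Gompertz growth: solution of dN/dt = r * N * ln(K / N)."""
    return K * math.exp(math.log(N0 / K) * math.exp(-r * t))

# Hypothetical parameters: 10 initial cases, plateau at 1000, rate 0.2 per day.
K, r, N0 = 1000.0, 0.2, 10.0
days = range(0, 61, 10)
logistic_curve = [verhulst(t, K, r, N0) for t in days]
gompertz_curve = [gompertz(t, K, r, N0) for t in days]
```

Both curves start at `N0` and saturate at `K`; they differ in how quickly the plateau is approached, which is one reason the two models can give similar fits over a limited observation window.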
13.
Int J Hyg Environ Health ; 234: 113723, 2021 May.
Article in English | MEDLINE | ID: mdl-33690094

ABSTRACT

An outbreak of the novel COVID-19 virus occurred from February 2020 onwards in almost all European countries, including Spain. This study covers the correlation found between weather variables (Maximum Temperature, Minimum Temperature, Mean Temperature, Atmospheric Pressure, Daily Rainfall, Daily Sun hours) and the coronavirus propagation in Spain. A strong relationship is found when correlating the virus spread with the mean temperature, minimum temperature, and atmospheric pressure in different Spanish provinces. In this analysis we have used the ratio of the PCR COVID-19 positives with respect to the population size. A linear regression model using the mean temperature is implemented. Moreover, an analysis of variance is used to confirm the influence of mean temperature on the spread of the virus. As a second measurement of the COVID-19 outbreak we have used the results of the antibody tests carried out in Spain, which provide an estimate of the herd immunity achieved. Based on this analysis, an estimation of the asymptomatic population is performed. All these results exhibit significant correlation with weather variables. The most affected provinces were Soria, Segovia and Ciudad Real, which are the coldest. On the opposite side, places such as Southern Spain, the Baleares, and the Canary Islands showed a lower rate of spread. This might be related to the warmer climate and the insularity of these islands. Besides, the coastal influence and the daily sun hours might also influence the lower rates in the east and west regions of Spain. This analysis provides a deeper insight into the influence of weather variables on the COVID-19 spread in Spain.


Subject(s)
COVID-19/epidemiology; Climate; Disease Outbreaks/statistics & numerical data; Analysis of Variance; Humans; Linear Models; SARS-CoV-2; Spain/epidemiology; Temperature; Weather
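
The kind of single-predictor linear regression mentioned in the abstract above (positives per population regressed on mean temperature) can be sketched with ordinary least squares. The helper `fit_line` and the numbers below are hypothetical and chosen only so the arithmetic is easy to follow; they are not data from the study.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares fit y = a + b*x; returns (a, b, r)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx                    # slope
    a = my - b * mx                  # intercept
    r = sxy / (sxx * syy) ** 0.5     # Pearson correlation coefficient
    return a, b, r

# Hypothetical values, for illustration only (not data from the study).
mean_temp_c = [6.0, 8.0, 10.0, 12.0, 14.0]
positives_per_1000 = [3.8, 3.4, 3.0, 2.6, 2.2]

a, b, r = fit_line(mean_temp_c, positives_per_1000)
```

A negative slope `b` in such a fit would correspond to the pattern the abstract reports, with colder provinces showing higher rates; the toy data are constructed to lie exactly on a line, so real data would give a weaker correlation.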
14.
Pharmgenomics Pers Med ; 13: 105-119, 2020.
Article in English | MEDLINE | ID: mdl-32256101

ABSTRACT

The complexity of orphan diseases, which are those that do not have an effective treatment, together with the high dimensionality of the genetic data used for their analysis and the high degree of uncertainty in the understanding of the mechanisms and genetic pathways involved in their development, motivates the use of advanced artificial intelligence techniques and in-depth knowledge of molecular biology, which are crucial for finding plausible solutions in drug design, including drug repositioning. In particular, we show that the use of robust deep sampling methodologies of the altered genetics serves to obtain meaningful results and dramatically decreases the cost of research and development in drug design, with a very positive influence on the use of precision medicine and on patient outcomes. The target-centric approach and the use of strong prior hypotheses that are not matched against reality (disease genetic data) are undoubtedly the cause of the high number of drug design failures and attrition rates. Sampling and prediction under uncertain conditions cannot be avoided in the development of precision medicine.

15.
Cancers (Basel) ; 13(1), 2020 Dec 23.
Article in English | MEDLINE | ID: mdl-33374500

ABSTRACT

Artificial intelligence methods may help in unveiling information that is hidden in high-dimensional oncological data. Flow cytometry studies of haematological malignancies provide quantitative data with the potential to be used for the construction of response biomarkers. Many computational methods from the bioinformatics toolbox can be applied to these data, but they have not been exploited to their full potential in leukaemias, specifically in the case of childhood B-cell Acute Lymphoblastic Leukaemia. In this paper, we analysed flow cytometry data that were obtained at diagnosis from 56 paediatric B-cell Acute Lymphoblastic Leukaemia patients from two local institutions. Our aim was to assess the prognostic potential of immunophenotypical marker expression intensity. We constructed classifiers based on the Fisher's Ratio to quantify differences between patients with relapsing and non-relapsing disease. We also correlated this with genetic information. The main result arising from the data was the association between underexpression of the marker CD38 and the probability of relapse.

16.
Expert Opin Drug Discov ; 14(8): 769-777, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31140873

ABSTRACT

Introduction: Drug discovery is the process through which potential new compounds are identified by means of biology, chemistry, and pharmacology. Due to the high complexity of genomic data, AI techniques are increasingly needed to help reduce this complexity and support optimal decision-making. Phenotypic prediction, in which sets of genes that predict a given phenotype are determined, is of particular use to drug discovery and precision medicine. Phenotypic prediction is an underdetermined problem given that the number of monitored genetic probes markedly exceeds the number of collected samples (from patients). This imbalance creates ambiguity in the characterization of the biological pathways that are responsible for disease development. Areas covered: In this paper, the authors present AI methodologies that perform a robust deep sampling of altered genetic pathways to locate new therapeutic targets, assist in drug repurposing, and speed up and optimize the drug selection process. Expert opinion: AI is a potential solution to a number of drug discovery problems, though one should bear in mind that the quality of the data determines the overall quality of the prediction, as in any modeling task in data science. The use of transparent methodologies is crucial, particularly in drug repositioning/repurposing in rare diseases.


Subject(s)
Artificial Intelligence; Drug Discovery/methods; Drug Repositioning; Humans; Phenotype; Precision Medicine/methods
17.
J Mol Model ; 25(3): 79, 2019 Feb 27.
Article in English | MEDLINE | ID: mdl-30810816

ABSTRACT

We discuss the relationship between the problem of protein tertiary structure prediction from the amino acid sequence and the uncertainty analysis. The algorithm presented in this paper belongs to the category of decoy-based modeling, where different known protein models are used to establish a low dimensional space via principal component analysis. The low dimensional space is utilized to perform an energy optimization via a family of very explorative particle swarm optimizers to find the global minimum. The aim of this procedure is to get a representative sample of the nonlinear equivalent region, that is, protein models that have their energy lower than a certain energy bound. The posterior analysis of this family provides very valuable information about the backbone structure of the native conformation and its possible alternate states. This methodology has the advantage of being simple and fast and can help refine the tertiary protein structure. We comprehensively illustrate the performance of our algorithm on one protein from the CASP-9 protein structure prediction experiment. We also provide a theoretical analysis of the energy landscape found in the tertiary structure protein inverse problem, explaining why model reduction techniques (principal component analysis in this case) serve to alleviate the ill-posed character of this high dimensional optimization problem. In addition, we expand the computational benchmark with a summary of other CASP-9 proteins in the Appendix.


Subject(s)
Caspase 9/chemistry , Computational Biology/methods , Algorithms , Amino Acid Sequence , Computer Simulation , Models, Molecular , Principal Component Analysis , Protein Folding , Protein Structure, Tertiary , Proteins/chemistry , Uncertainty
18.
Biomolecules ; 10(1), 2019 Dec 31.
Article in English | MEDLINE | ID: mdl-31906171

ABSTRACT

Accurate prediction of protein stability changes resulting from amino acid substitutions is of utmost importance in medicine for understanding which mutations are deleterious, leading to disease, and which are neutral. Because wet-lab experiments on protein mutations are costly and time-consuming, and because the number of possible mutations is huge, computational methods that can accurately predict the effects of amino acid mutations are greatly needed. In this research, we present a robust methodology to predict the energy changes of a protein upon mutation. The proposed prediction scheme is based on a two-step algorithm: a Holdout Random Sampler followed by a neural network regression model. The Holdout Random Sampler is used to analyze the energy change and the corresponding uncertainty, and to obtain a set of admissible energy changes, expressed as a cumulative distribution function. These values are then used to train a simple neural network model that can predict the energy changes. Results were blindly tested (validated) against experimental energy changes, giving Pearson correlation coefficients of 0.66 for Single Point Mutations and 0.77 for Multiple Point Mutations. These results support the effectiveness of our method, which outperforms the majority of previous studies in this field.


Subject(s)
Neural Networks, Computer , Protein Stability , Proteins/genetics , Amino Acids/chemistry , Amino Acids/genetics , Databases, Protein , Machine Learning , Point Mutation/genetics , Proteins/chemistry , Thermodynamics
19.
Mech Ageing Dev ; 182: 111129, 2019 Sep.
Article in English | MEDLINE | ID: mdl-31445068

ABSTRACT

Sarcopenia is an age-related multifactorial process that involves several biological mechanisms, whose specific contributions and interplay are still unknown. The present study proposes prognostic networks based on machine learning approaches to unravel the interplay among the biological mechanisms mainly involved in the development of Sarcopenia. After analyzing 114 biological and clinical variables in adults older than 70 years, and using all the biological prognostic networks detected by machine learning with accuracy higher than 82%, we designed a consensus classifier based on majority vote that improves the predictive accuracy for Sarcopenia up to 91%. Additionally, we applied logistic regression analysis to propose the interplay among the most discriminative biological variables of Sarcopenia: anthropometry, body composition, functional performance of lower limbs, systemic oxidative stress, presence of depression, and medication for the digestive system based on proton-pump inhibitors. Our data also demonstrate that, besides a loss of muscle mass, impairments in the functional performance of lower limbs are more relevant for developing Sarcopenia than those affecting muscle strength.
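A consensus classifier based on majority vote, as described in this abstract, could look roughly like the sketch below. The base classifiers, variable names, thresholds, and accuracy values are hypothetical illustrations; only the 82% accuracy cutoff comes from the abstract.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by most base classifiers.
    Ties are broken in favor of the label seen first."""
    return Counter(predictions).most_common(1)[0][0]

def consensus_predict(classifiers, sample, min_accuracy=0.82):
    # Keep only base classifiers whose precomputed accuracy exceeds the cutoff
    votes = [predict(sample) for predict, acc in classifiers if acc > min_accuracy]
    if not votes:
        raise ValueError("no classifier passes the accuracy cutoff")
    return majority_vote(votes)

# Hypothetical base classifiers: (predict function, validation accuracy).
# Each returns 1 for a predicted sarcopenia case and 0 otherwise.
classifiers = [
    (lambda s: int(s["gait_speed"] < 0.8), 0.85),
    (lambda s: int(s["muscle_mass"] < 7.0), 0.83),
    (lambda s: int(s["grip_strength"] < 27), 0.84),
    (lambda s: 1, 0.60),  # below the cutoff, so it never votes
]
sample = {"gait_speed": 0.7, "muscle_mass": 7.5, "grip_strength": 25}
label = consensus_predict(classifiers, sample)
```

For this sample, two of the three admitted classifiers vote 1, so the consensus label is 1; the fourth classifier is excluded by the accuracy cutoff before voting.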


Subject(s)
Machine Learning , Sarcopenia , Aged , Aged, 80 and over , Female , Humans , Male , Prognosis , Sarcopenia/diagnosis , Sarcopenia/metabolism , Sarcopenia/pathology
20.
J Bioinform Comput Biol ; 16(2): 1850005, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29566640

ABSTRACT

We discuss the applicability of principal component analysis (PCA) for protein tertiary structure prediction from the amino acid sequence. The algorithm presented in this paper belongs to the category of protein refinement models and involves establishing a low-dimensional space where the sampling (and optimization) is carried out via a particle swarm optimizer (PSO). The reduced space is found via PCA performed on a set of low-energy protein models previously found using different optimization techniques. A high-frequency term is added to this expansion by projecting the best decoy onto the PCA basis set and calculating the residual model. This term is aimed at providing high-frequency details in the energy optimization. The goal of this research is to analyze how the dimensionality reduction affects the prediction capability of the PSO procedure. For that purpose, different proteins from the Critical Assessment of Techniques for Protein Structure Prediction experiments were modeled. In all cases, both the energy of the best decoy and the distance to the native structure decreased. Our analysis also shows how the predicted backbone structure of the native conformation and of alternative low-energy states varies with respect to the PCA dimensionality. Generally speaking, the reconstruction can be successfully achieved with 10 principal components and the high-frequency term. We also provide a computational analysis of the protein energy landscape for the inverse problem of reconstructing the structure from a reduced number of principal components, showing that the dimensionality reduction alleviates the ill-posed character of this high-dimensional energy optimization problem. The procedure explained in this paper is very fast and allows testing different PCA expansions. Our results show that PSO improves the energy of the best decoy used in the PCA when an adequate number of PCA terms is considered.
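The residual (high-frequency) term mentioned in this abstract can be sketched as follows. The example assumes a small hand-written orthonormal basis standing in for the leading PCA vectors, and a 4-coordinate toy "decoy"; all values and helper names are hypothetical. It only shows how projecting the best decoy onto a truncated basis leaves a residual that, when added back, restores the decoy exactly.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project(model, mean, basis):
    """Coefficients of (model - mean) on each orthonormal basis vector."""
    centered = [m - mu for m, mu in zip(model, mean)]
    return [dot(centered, b) for b in basis]

def reconstruct(coeffs, mean, basis, residual=None):
    """mean + sum_k coeffs[k] * basis[k], plus an optional fixed residual."""
    out = list(mean)
    for a, b in zip(coeffs, basis):
        out = [o + a * bi for o, bi in zip(out, b)]
    if residual is not None:
        out = [o + r for o, r in zip(out, residual)]
    return out

# Hypothetical mean model, truncated orthonormal basis, and best decoy
mean = [1.0, 1.0, 1.0, 1.0]
basis = [[0.5, 0.5, 0.5, 0.5], [0.5, -0.5, 0.5, -0.5]]
best_decoy = [2.0, 0.0, 3.0, 1.0]

coeffs = project(best_decoy, mean, basis)
low_rank = reconstruct(coeffs, mean, basis)
# Residual: the part of the best decoy that the truncated basis cannot express
residual = [d - l for d, l in zip(best_decoy, low_rank)]
restored = reconstruct(coeffs, mean, basis, residual)
```

In an optimization setting, only `coeffs` would be varied while `residual` stays fixed, so every candidate model keeps the fine detail of the best decoy that the truncated basis discards.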


Subject(s)
Computational Biology/methods , Models, Molecular , Principal Component Analysis , Protein Structure, Tertiary , Proteins/chemistry , Proteins/metabolism , Uracil-DNA Glycosidase/chemistry , Uracil-DNA Glycosidase/metabolism