Results 1 - 9 of 9
1.
bioRxiv ; 2024 May 25.
Article in English | MEDLINE | ID: mdl-38826198

ABSTRACT

Computational RNA design tasks are often posed as inverse problems, where sequences are designed based on adopting a single desired secondary structure without considering 3D geometry and conformational diversity. We introduce gRNAde, a geometric RNA design pipeline operating on 3D RNA backbones to design sequences that explicitly account for structure and dynamics. Under the hood, gRNAde is a multi-state Graph Neural Network that generates candidate RNA sequences conditioned on one or more 3D backbone structures where the identities of the bases are unknown. On a single-state fixed backbone re-design benchmark of 14 RNA structures from the PDB identified by Das et al. [2010], gRNAde obtains higher native sequence recovery rates (56% on average) compared to Rosetta (45% on average), taking under a second to produce designs compared to the reported hours for Rosetta. We further demonstrate the utility of gRNAde on a new benchmark of multi-state design for structurally flexible RNAs, as well as zero-shot ranking of mutational fitness landscapes in a retrospective analysis of a recent RNA polymerase ribozyme structure.
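
The native sequence recovery figures quoted above (56% vs 45%) can be read as the share of positions at which a designed sequence matches the native one, averaged over the benchmark structures. The abstract does not spell out the exact definition, so the sketch below assumes that common one; the function names and toy sequences are hypothetical and are not taken from the gRNAde code base.

```python
def sequence_recovery(native: str, designed: str) -> float:
    """Fraction of positions where the designed base equals the native base."""
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(n == d for n, d in zip(native.upper(), designed.upper()))
    return matches / len(native)


def mean_recovery(pairs):
    """Average per-structure recovery over (native, designed) pairs."""
    scores = [sequence_recovery(n, d) for n, d in pairs]
    return sum(scores) / len(scores)


# Toy sequences, invented for illustration (not from the 14-structure benchmark).
native = "GGCAUCGA"
designed = "GGCUUCGG"
print(f"recovery = {sequence_recovery(native, designed):.2f}")  # 6 of 8 match -> 0.75
```

Averaging per structure, rather than pooling all positions, keeps long RNAs from dominating the score; whether the benchmark averages this way is an assumption here.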

2.
ArXiv ; 2024 May 25.
Article in English | MEDLINE | ID: mdl-38827456

ABSTRACT

Computational RNA design tasks are often posed as inverse problems, where sequences are designed based on adopting a single desired secondary structure without considering 3D geometry and conformational diversity. We introduce gRNAde, a geometric RNA design pipeline operating on 3D RNA backbones to design sequences that explicitly account for structure and dynamics. Under the hood, gRNAde is a multi-state Graph Neural Network that generates candidate RNA sequences conditioned on one or more 3D backbone structures where the identities of the bases are unknown. On a single-state fixed backbone re-design benchmark of 14 RNA structures from the PDB identified by Das et al. [2010], gRNAde obtains higher native sequence recovery rates (56% on average) compared to Rosetta (45% on average), taking under a second to produce designs compared to the reported hours for Rosetta. We further demonstrate the utility of gRNAde on a new benchmark of multi-state design for structurally flexible RNAs, as well as zero-shot ranking of mutational fitness landscapes in a retrospective analysis of a recent RNA polymerase ribozyme structure. Open source code: https://github.com/chaitjo/geometric-rna-design.

3.
Nat Mach Intell ; 5(7): 739-753, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37771758

ABSTRACT

Integrating gene expression across tissues and cell types is crucial for understanding the coordinated biological mechanisms that drive disease and characterise homeostasis. However, traditional multitissue integration methods cannot handle uncollected tissues or rely on genotype information, which is often unavailable and subject to privacy concerns. Here we present HYFA (Hypergraph Factorisation), a parameter-efficient graph representation learning approach for joint imputation of multi-tissue and cell-type gene expression. HYFA is genotype-agnostic, supports a variable number of collected tissues per individual, and imposes strong inductive biases to leverage the shared regulatory architecture of tissues and genes. In performance comparison on Genotype-Tissue Expression project data, HYFA achieves superior performance over existing methods, especially when multiple reference tissues are available. The HYFA-imputed dataset can be used to identify replicable regulatory genetic variations (eQTLs), with substantial gains over the original incomplete dataset. HYFA can accelerate the effective and scalable integration of tissue and cell-type transcriptome biorepositories.

4.
Bioinformatics ; 38(3): 730-737, 2022 01 12.
Article in English | MEDLINE | ID: mdl-33471074

ABSTRACT

MOTIVATION: High-throughput gene expression can be used to address a wide range of fundamental biological problems, but datasets of an appropriate size are often unavailable. Moreover, existing transcriptomics simulators have been criticized because they fail to emulate key properties of gene expression data. In this article, we develop a method based on a conditional generative adversarial network to generate realistic transcriptomics data for Escherichia coli and humans. We assess the performance of our approach across several tissues and cancer types. RESULTS: We show that our model preserves several gene expression properties significantly better than widely used simulators, such as SynTReN or GeneNetWeaver. The synthetic data preserve tissue- and cancer-specific properties of transcriptomics data. Moreover, they exhibit real gene clusters and ontologies at both local and global scales, suggesting that the model learns to approximate the gene expression manifold in a biologically meaningful way. AVAILABILITY AND IMPLEMENTATION: Code is available at: https://github.com/rvinas/adversarial-gene-expression. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Escherichia coli , Gene Expression Profiling , Humans , Gene Expression Profiling/methods , Gene Expression
5.
Bioinformatics ; 38(5): 1320-1327, 2022 02 07.
Article in English | MEDLINE | ID: mdl-34888618

ABSTRACT

MOTIVATION: Gene expression data are commonly used at the intersection of cancer research and machine learning for better understanding of the molecular status of tumour tissue. Deep learning predictive models have been employed for gene expression data due to their ability to scale and remove the need for manual feature engineering. However, gene expression data are often very high dimensional, noisy and presented with a low number of samples. This poses significant problems for learning algorithms: models often overfit, learn noise and struggle to capture biologically relevant information. In this article, we utilize external biological knowledge embedded within structures of gene interaction graphs such as protein-protein interaction (PPI) networks to guide the construction of predictive models. RESULTS: We present Gene Interaction Network Constrained Construction (GINCCo), an unsupervised method for automated construction of computational graph models for gene expression data that are structurally constrained by prior knowledge of gene interaction networks. We employ this methodology in a case study on incorporating a PPI network in cancer phenotype prediction tasks. Our computational graphs are structurally constructed using topological clustering algorithms on the PPI networks which incorporate inductive biases stemming from network biology research on protein complex discovery. Each entity in the GINCCo computational graph represents a biological entity such as a gene, a candidate protein complex or a phenotype, instead of an arbitrary hidden node of a neural network. This provides a biologically relevant mechanism for model regularization, yielding strong predictive performance while drastically reducing the number of model parameters and enabling guided post-hoc enrichment analyses of influential gene sets with respect to target phenotypes. Our experiments analysing a variety of cancer phenotypes show that GINCCo often outperforms support vector machines, Fully Connected Multi-layer Perceptrons (MLPs) and Randomly Connected MLPs despite greatly reduced model complexity. AVAILABILITY AND IMPLEMENTATION: https://github.com/paulmorio/gincco contains the source code for our approach. We also release a library with algorithms for protein complex discovery within PPI networks at https://github.com/paulmorio/protclus. This repository contains implementations of the clustering algorithms used in this article. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , Neoplasms , Humans , Neural Networks, Computer , Software , Neoplasms/genetics , Bias , Gene Expression , Computational Biology/methods
6.
Front Genet ; 12: 624128, 2021.
Article in English | MEDLINE | ID: mdl-33927746

ABSTRACT

A question of fundamental biological significance is to what extent the expression of a subset of genes can be used to recover the full transcriptome, with important implications for biological discovery and clinical application. To address this challenge, we propose two novel deep learning methods, PMI and GAIN-GTEx, for gene expression imputation. To increase the applicability of our approach, we leverage data from GTEx v8, a reference resource that has generated a comprehensive collection of transcriptomes from a diverse set of human tissues. We show that our approaches compare favorably to several standard and state-of-the-art imputation methods in terms of predictive performance and runtime in two case studies and two imputation scenarios. In a comparison conducted on the protein-coding genes, PMI attains the highest performance in inductive imputation whereas GAIN-GTEx outperforms the other methods in in-place imputation. Furthermore, our results indicate strong generalization on RNA-Seq data from 3 cancer types across varying levels of missingness. Our work can facilitate a cost-effective integration of large-scale RNA biorepositories into genomic studies of disease, with high applicability across diverse tissue types.

7.
Euro Surveill ; 26(9)2021 03.
Article in English | MEDLINE | ID: mdl-33663646

ABSTRACT

BACKGROUND: Several clinical trials have assessed the protective potential of chloroquine and hydroxychloroquine. Chronic exposure to such drugs might lower the risk of infection with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) or severe coronavirus disease (COVID-19). AIM: To assess COVID-19 incidence and risk of hospitalisation in a cohort of patients chronically taking chloroquine/hydroxychloroquine. METHODS: We used linked health administration databases to follow a cohort of patients with chronic prescription of hydroxychloroquine/chloroquine and a control cohort matched by age, sex and primary care service area, between 1 January and 30 April 2020. COVID-19 cases were identified using International Classification of Diseases 10 codes. RESULTS: We analysed a cohort of 6,746 patients (80% female) with active prescriptions for hydroxychloroquine/chloroquine, and 13,492 controls. During follow-up, there were 97 (1.4%) COVID-19 cases in the exposed cohort and 183 (1.4%) among controls. The incidence rate was very similar between the two groups (12.05 vs 11.35 cases/100,000 person-days). The exposed cohort was not at lower risk of infection compared with controls (hazard ratio (HR): 1.08; 95% confidence interval (CI): 0.83-1.44; p = 0.50). Forty cases (0.6%) were admitted to hospital in the exposed cohort and 50 (0.4%) in the control cohort, suggesting a higher hospitalisation rate in the former, though differences were not confirmed after adjustment (HR: 1.46; 95% CI: 0.91-2.34; p = 0.10). CONCLUSIONS: Patients chronically exposed to chloroquine/hydroxychloroquine did not differ in risk of COVID-19 nor hospitalisation, compared with controls. As controls were mainly female, findings might not be generalisable to a male population.


Subject(s)
COVID-19 Drug Treatment , COVID-19 , Antiviral Agents/therapeutic use , COVID-19/epidemiology , Chloroquine/adverse effects , Female , Humans , Hydroxychloroquine/adverse effects , Incidence , Male , Prospective Studies , SARS-CoV-2 , Spain/epidemiology
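
The crude percentages reported in the cohort abstract above can be reproduced from the raw counts, and the quoted incidence rates imply a follow-up time that can be checked against the study window. The sketch below is only a consistency check on the published figures; the helper names are hypothetical, and the hazard ratios are not recomputed because they depend on individual-level data.

```python
def percent(cases: int, cohort_size: int) -> float:
    """Crude cumulative incidence as a percentage of the cohort."""
    return 100.0 * cases / cohort_size


def implied_person_days(cases: int, rate_per_100k_person_days: float) -> float:
    """Person-days of follow-up implied by a case count and an incidence rate."""
    return cases / (rate_per_100k_person_days / 100_000)


exposed_n, control_n = 6746, 13492  # cohort sizes from the abstract

infection_pct = (percent(97, exposed_n), percent(183, control_n))
hospital_pct = (percent(40, exposed_n), percent(50, control_n))
print([round(p, 1) for p in infection_pct])  # both round to 1.4
print([round(p, 1) for p in hospital_pct])   # 0.6 and 0.4

# Mean follow-up per person implied by the quoted rates (12.05 and 11.35).
exposed_days = implied_person_days(97, 12.05) / exposed_n
control_days = implied_person_days(183, 11.35) / control_n
print(round(exposed_days), round(control_days))
```

Both implied follow-up times come out at roughly 119 days per person, close to the 121 days between 1 January and 30 April 2020, so the counts, percentages and rates are mutually consistent.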
8.
Artif Intell Med ; 85: 43-49, 2018 04.
Article in English | MEDLINE | ID: mdl-28943335

ABSTRACT

OBJECTIVE: The use of artificial intelligence techniques to find out which Single Nucleotide Polymorphisms (SNPs) promote the development of a disease is one of the features of medical research, as such techniques may potentially aid early diagnosis and help in the prescription of preventive measures. In particular, the aim is to help physicians to identify the relevant SNPs related to Type 2 diabetes, and to build a decision-support tool for risk prediction. METHODS: We use the Random Forest (RF) technique to search for the most important attributes (SNPs) related to diabetes, giving each attribute a weight (degree of importance) ranging between 0 and 1. Support Vector Machines and Logistic Regression have also been used, since they are two other machine learning techniques that are well established in the health community. Their performance has been compared to that achieved by RF. Furthermore, the relevance of the attributes obtained with RF has then been used to perform predictions with the k-Nearest Neighbour method, weighting attributes in the similarity measure according to their RF relevance. RESULTS: Testing is performed on a set of 677 subjects. RF is able to handle the complexity of features' interactions, overfitting, and unknown attribute values, providing the SNPs' relevance and achieving an area under the ROC curve of up to 0.89 for risk prediction. RF outperforms all the other tested machine learning techniques in terms of prediction accuracy, and in terms of the stability of the estimated relevance of the attributes. CONCLUSIONS: The Random Forest is a useful method for learning predictive models and the relevance of SNPs without any underlying assumption.


Subject(s)
Decision Support Systems, Clinical , Decision Support Techniques , Diabetes Mellitus, Type 2/genetics , Polymorphism, Single Nucleotide , Support Vector Machine , Clinical Decision-Making , Decision Trees , Diabetes Mellitus, Type 2/diagnosis , Diabetes Mellitus, Type 2/therapy , Genetic Predisposition to Disease , Humans , Logistic Models , Phenotype , Prognosis , Risk Assessment , Risk Factors
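
The METHODS of the abstract above describe a k-Nearest Neighbour step in which each attribute's contribution to the similarity measure is scaled by its Random Forest relevance (a weight between 0 and 1). A minimal sketch of that idea follows. The weighted Euclidean distance, the 0/1/2 genotype coding, the toy data and the weight values are all assumptions made for illustration; the abstract does not state which distance or coding the authors used.

```python
import math
from collections import Counter


def weighted_distance(a, b, weights):
    """Euclidean distance with each attribute scaled by its relevance weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def knn_predict(train_X, train_y, query, weights, k=3):
    """Majority vote among the k training rows closest to the query."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda row: weighted_distance(row[0], query, weights),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical data: three SNPs coded as minor-allele counts (0, 1 or 2).
train_X = [(0, 2, 1), (0, 0, 2), (1, 1, 0), (2, 0, 1), (2, 2, 0), (2, 1, 2)]
train_y = [0, 0, 0, 1, 1, 1]   # 0 = control, 1 = case
relevance = [0.9, 0.1, 0.0]    # stand-in for RF importances, not real values

print(knn_predict(train_X, train_y, (2, 2, 1), relevance))
print(knn_predict(train_X, train_y, (0, 1, 1), relevance))
```

With these weights the third SNP is ignored entirely and the first dominates the neighbour ranking, which is the intended effect of importing the relevance estimates into the similarity measure.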
9.
Med Clin (Barc) ; 134 Suppl 1: 39-44, 2010 Feb.
Article in Spanish | MEDLINE | ID: mdl-20211352

ABSTRACT

This work analyzes some of the fundamentals of change management techniques for ensuring the introduction of information and communication technologies (ICT) in health organizations. Change management aims to redirect the impact of any transformation process in an organization towards a positive attitude and enthusiasm among those involved. That is, this paper analyzes the most important of all the factors that must be managed in any project for change: the human factor. If proper change management is a critical success factor in implementing new ICT processes and systems within an organization, then when we are faced with introducing new processes and interoperability systems between different organizations, the cooperation, leadership and motivation of individuals focused on a common goal become absolutely imperative. This is the case of the new ICT systems being introduced in the Catalan Health System. Indeed, by the definition of the model itself, in Catalonia, continuity of care, increased efficiency and effectiveness, and quality improvement in projects such as the shared clinical history, electronic prescriptions, or medical image scanning necessarily require the definition of processes involving a large number of health organizations, which differ in legal status and whose own interests should converge on the ICT systems and health care processes so that the contributions of all parties form a whole. The success of these projects, a reality today, is due largely to the management of the human factor, carried out continuously since their inception.


Subject(s)
Communication , Delivery of Health Care , Information Systems , Humans , Organizational Innovation