Results 1 - 20 of 167
1.
Sci Rep ; 14(1): 13188, 2024 06 08.
Article in English | MEDLINE | ID: mdl-38851759

ABSTRACT

Genome interpretation (GI) encompasses the computational attempts to model the relationship between genotype and phenotype with the goal of understanding how the first leads to the second. While traditional approaches have focused on sub-problems such as predicting the effect of single nucleotide variants or finding genetic associations, recent advances in neural networks (NNs) have made it possible to develop end-to-end GI models that take genomic data as input and predict phenotypes as output. However, technical and modeling issues still need to be fixed for these models to be effective, including the widespread underdetermination of genomic datasets, which makes them unsuitable for training large, overfitting-prone NNs. Here we propose novel GI models to address this issue, exploring the use of two types of transfer learning approaches and proposing a novel Biologically Meaningful Sparse NN layer specifically designed for end-to-end GI. Our models predict the leaf and seed ionome in A. thaliana, obtaining results comparable to our previous over-parameterized model while reducing the number of parameters 8.8-fold. We also investigate how population stratification influences the evaluation of performance, highlighting how it leads to (1) an instance of Simpson's paradox, and (2) model generalization limitations.


Subjects
Arabidopsis; Genome, Plant; Plant Leaves; Seeds; Arabidopsis/genetics; Plant Leaves/genetics; Plant Leaves/metabolism; Seeds/genetics; Seeds/metabolism; Neural Networks, Computer; Genomics/methods; Phenotype; Models, Genetic; Genotype
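
The Simpson's paradox effect mentioned in the abstract above can be illustrated with a toy example. The sketch below uses invented numbers for two hypothetical subpopulations, not data from the paper: within each subpopulation the prediction is anticorrelated with the observed phenotype, yet pooling the two gives a strongly positive correlation because the subpopulation means differ.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical (predicted, observed) phenotype values for two subpopulations.
pop_a_pred, pop_a_obs = [1.0, 2.0, 3.0], [3.0, 2.0, 1.0]
pop_b_pred, pop_b_obs = [11.0, 12.0, 13.0], [13.0, 12.0, 11.0]

r_a = pearson(pop_a_pred, pop_a_obs)  # within subpopulation A
r_b = pearson(pop_b_pred, pop_b_obs)  # within subpopulation B
r_pooled = pearson(pop_a_pred + pop_b_pred, pop_a_obs + pop_b_obs)
```

A pooled score can therefore look good even when the model only separates subpopulations, which is one reason to report per-subpopulation performance alongside the pooled figure.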
2.
Comput Struct Biotechnol J ; 23: 1773-1785, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38689715

ABSTRACT

Magnesium (Mg)-based implants have emerged as a promising alternative for orthopedic applications, owing to their bioactive properties and biodegradability. As the implants degrade, Mg2+ ions are released, influencing all surrounding cell types, especially mesenchymal stem cells (MSCs). MSCs are vital for bone tissue regeneration; it is therefore essential to understand their molecular response to Mg2+ ions in order to maximize the potential of Mg-based biomaterials. In this study, we conducted a gene regulatory network (GRN) analysis to examine the molecular responses of MSCs to Mg2+ ions. We used time-series proteomics data collected at 11 time points across a 21-day period for the GRN construction. We studied the impact of Mg2+ ions on the resulting networks and identified the key proteins and protein interactions affected by the application of Mg2+ ions. Our analysis highlights MYL1, MDH2, GLS, and TRIM28 as the primary targets of Mg2+ ions in the response of MSCs during the 1-21 day phase. Our results also identify MDH2-MYL1, MDH2-RPS26, TRIM28-AK1, TRIM28-SOD2, and GLS-AK1 as the critical protein relationships affected by Mg2+ ions. By offering a comprehensive understanding of the regulatory role of Mg2+ ions on MSCs, our study contributes valuable insights into the molecular response of MSCs to Mg-based materials, thereby facilitating the development of innovative therapeutic strategies for orthopedic applications.

3.
J Chem Inf Model ; 64(7): 2331-2344, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-37642660

ABSTRACT

Federated multipartner machine learning has been touted as an appealing and efficient method to increase the effective training data volume and thereby the predictivity of models, particularly when the generation of training data is resource-intensive. In the landmark MELLODDY project, indeed, each of ten pharmaceutical companies realized aggregated improvements on its own classification or regression models through federated learning. To this end, they leveraged a novel implementation extending multitask learning across partners, on a platform audited for privacy and security. The experiments involved an unprecedented cross-pharma data set of 2.6+ billion confidential experimental activity data points, documenting 21+ million physical small molecules and 40+ thousand assays in on-target and secondary pharmacodynamics and pharmacokinetics. Appropriate complementary metrics were developed to evaluate the predictive performance in the federated setting. In addition to predictive performance increases in labeled space, the results point toward an extended applicability domain in federated learning. Increases in collective training data volume, including by means of auxiliary data resulting from single concentration high-throughput and imaging assays, continued to boost predictive performance, albeit with a saturating return. Markedly higher improvements were observed for the pharmacokinetics and safety panel assay-based task subsets.


Subjects
Benchmarking; Quantitative Structure-Activity Relationship; Biological Assay; Machine Learning
4.
JMIR Med Inform ; 11: e48030, 2023 Nov 09.
Article in English | MEDLINE | ID: mdl-37943585

ABSTRACT

BACKGROUND: Investigating low-prevalence diseases such as multiple sclerosis is challenging because of the rather small number of individuals affected by this disease and the scattering of real-world data across numerous data sources. These obstacles impair data integration, standardization, and analysis, which negatively impact the generation of significant meaningful clinical evidence. OBJECTIVE: This study aims to present a comprehensive, research question-agnostic, multistakeholder-driven end-to-end data analysis pipeline that accommodates 3 prevalent data-sharing streams: individual data sharing, core data set sharing, and federated model sharing. METHODS: A demand-driven methodology is employed for standardization, followed by 3 streams of data acquisition, a data quality enhancement process, a data integration procedure, and a concluding analysis stage to fulfill real-world data-sharing requirements. This pipeline's effectiveness was demonstrated through its successful implementation in the COVID-19 and multiple sclerosis global data sharing initiative. RESULTS: The global data sharing initiative yielded multiple scientific publications and provided extensive worldwide guidance for the community with multiple sclerosis. The pipeline facilitated gathering pertinent data from various sources, accommodating distinct sharing streams and assimilating them into a unified data set for subsequent statistical analysis or secure data examination. This pipeline contributed to the assembly of the largest data set of people with multiple sclerosis infected with COVID-19. CONCLUSIONS: The proposed data analysis pipeline exemplifies the potential of global stakeholder collaboration and underlines the significance of evidence-based decision-making. It serves as a paradigm for how data sharing initiatives can propel advancements in health care, emphasizing its adaptability and capacity to address diverse research inquiries.

5.
Sci Rep ; 13(1): 19449, 2023 11 09.
Article in English | MEDLINE | ID: mdl-37945674

ABSTRACT

High-throughput sequencing allowed the discovery of many disease variants, but it is becoming clear that the abundance of genomics data has mostly moved the bottleneck in Genetics and Precision Medicine from a data availability issue to a data interpretation issue. To solve this impasse, it would be beneficial to apply the latest Deep Learning (DL) methods to the Genome Interpretation (GI) problem, similarly to what AlphaFold did for Structural Biology. Unfortunately, DL requires large datasets to be viable, and aggregating genomics datasets poses several legal, ethical and infrastructural complications. Federated Learning (FL) is a Machine Learning (ML) paradigm designed to tackle these issues. It allows ML methods to be collaboratively trained and tested on collections of physically separate datasets, without requiring the actual centralization of sensitive data. FL could thus be key to enabling DL applications to GI on sufficiently large genomics data. We propose FedCrohn, an FL GI Neural Network model for exome-based Crohn's Disease risk prediction, providing a proof-of-concept that FL is a viable paradigm to build novel ML GI approaches. We benchmark it in several realistic scenarios, showing that FL can indeed provide performance similar to conventional ML on centralized data, and that collaborating in FL initiatives is likely beneficial for most of the medical centers participating in them.


Subjects
Crohn Disease; Exome; Humans; Exome/genetics; Crohn Disease/genetics; Genomics; Benchmarking; High-Throughput Nucleotide Sequencing
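
To make the idea of training on physically separate datasets concrete, here is a minimal sketch of federated averaging (FedAvg), a widely used FL scheme. The abstract above does not state which aggregation rule FedCrohn uses, so this is an assumption for illustration only; the one-parameter model, the three centers, and their data are all hypothetical.

```python
def local_update(w, data, lr=0.01, steps=20):
    """Run a few gradient-descent steps on one center's private data.

    The model is y = w * x with squared-error loss; only w leaves the center.
    """
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def federated_average(weights, sizes):
    """Combine local parameters, weighting each center by its sample count."""
    total = sum(sizes)
    return sum(w * n for w, n in zip(weights, sizes)) / total

# Hypothetical private datasets held by three centers (true slope is 2).
centers = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0), (1.5, 3.0), (2.5, 5.0)],
    [(0.5, 1.0)],
]

global_w = 0.0
for _ in range(50):  # communication rounds
    local_ws = [local_update(global_w, data) for data in centers]
    global_w = federated_average(local_ws, [len(d) for d in centers])
```

Only the parameter `w` is exchanged in each round; the raw `(x, y)` pairs never leave the center that holds them.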
6.
Genome Biol ; 24(1): 224, 2023 10 05.
Article in English | MEDLINE | ID: mdl-37798735

ABSTRACT

BACKGROUND: Despite clear evidence of nonlinear interactions in the molecular architecture of polygenic diseases, linear models have so far appeared optimal in genotype-to-phenotype modeling. A key bottleneck for such modeling is that genetic data intrinsically suffers from underdetermination ([Formula: see text]). Millions of variants are present in each individual while the collection of large, homogeneous cohorts is hindered by phenotype incidence, sequencing cost, and batch effects. RESULTS: We demonstrate that when we provide enough training data and control the complexity of nonlinear models, a neural network outperforms additive approaches in whole exome sequencing-based inflammatory bowel disease case-control prediction. To do so, we propose a biologically meaningful sparsified neural network architecture, providing empirical evidence for positive and negative epistatic effects present in the inflammatory bowel disease pathogenesis. CONCLUSIONS: In this paper, we show that underdetermination is likely a major driver for the apparent optimality of additive modeling in clinical genetics today.


Subjects
Inflammatory Bowel Diseases; Nonlinear Dynamics; Humans; Sample Size; Inflammatory Bowel Diseases/genetics; Neural Networks, Computer; Phenotype
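
The abstract above does not describe the sparsified architecture in detail. One common way to control the complexity of a network on genomic input is a masked layer in which each variant connects only to the gene it lies in. The sketch below illustrates that general idea under that assumption; the variant-to-gene annotation, weights, and genotype are hypothetical, and this is not necessarily the authors' design.

```python
# Hypothetical annotation: index of the gene each input variant belongs to.
variant_to_gene = [0, 0, 1, 2, 2, 2]
n_genes = 3

# Binary mask: mask[g][v] is 1 only where variant v lies in gene g.
mask = [[1 if variant_to_gene[v] == g else 0
         for v in range(len(variant_to_gene))]
        for g in range(n_genes)]

def sparse_layer(x, weights, mask):
    """Gene-level activations using only the connections allowed by mask."""
    return [sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
            for w_row, m_row in zip(weights, mask)]

# A dense layer needs n_genes * n_variants weights; the mask keeps one per variant.
n_dense = n_genes * len(variant_to_gene)
n_sparse = sum(sum(row) for row in mask)

weights = [[0.5] * len(variant_to_gene) for _ in range(n_genes)]
genotype = [1, 0, 2, 0, 1, 1]  # e.g. alternate-allele counts per variant
gene_scores = sparse_layer(genotype, weights, mask)
```

In this toy case the mask cuts the trainable weights from 18 to 6, which shows how a sparsity prior can reduce the parameter count when variants far outnumber samples.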
7.
Bioinformatics ; 39(6), 2023 06 01.
Article in English | MEDLINE | ID: mdl-37255310

ABSTRACT

MOTIVATION: The prediction of reliable Drug-Target Interactions (DTIs) is a key task in computer-aided drug design and repurposing. Here, we present a new approach based on data fusion for DTI prediction built on top of the NXTfusion library, which generalizes the Matrix Factorization paradigm by extending it to the nonlinear inference over Entity-Relation graphs. RESULTS: We benchmarked our approach on five datasets and we compared our models against state-of-the-art methods. Our models outperform most of the existing methods and, simultaneously, retain the flexibility to predict both DTIs as binary classification and regression of the real-valued drug-target affinity, competing with models built explicitly for each task. Moreover, our findings suggest that the validation of DTI methods should be stricter than what has been proposed in some previous studies, focusing more on mimicking real-life DTI settings where predictions for previously unseen drugs, proteins, and drug-protein pairs are needed. These settings are exactly the context in which the benefit of integrating heterogeneous information with our Entity-Relation data fusion approach is the most evident. AVAILABILITY AND IMPLEMENTATION: All software and data are available at https://github.com/eugeniomazzone/CPI-NXTFusion and https://pypi.org/project/NXTfusion/.


Subjects
Drug Development; Software; Proteins; Drug Interactions; Drug Design
8.
Pharmaceutics ; 15(3), 2023 Mar 10.
Article in English | MEDLINE | ID: mdl-36986760

ABSTRACT

In vitro non-cellular permeability models such as the parallel artificial membrane permeability assay (PAMPA) are widely applied tools for early-phase drug candidate screening. In addition to the commonly used porcine brain polar lipid extract for modeling the blood-brain barrier's permeability, the total and polar fractions of bovine heart and liver lipid extracts were investigated in the PAMPA model by measuring the permeability of 32 diverse drugs. The zeta potential of the lipid extracts and the net charge of their glycerophospholipid components were also determined. Physicochemical parameters of the 32 compounds were calculated using three independent software packages (Marvin Sketch, RDKit, and ACD/Percepta). The relationship between the lipid-specific permeabilities and the physicochemical descriptors of the compounds was investigated using linear correlation, Spearman correlation, and principal component analysis (PCA). While the results showed only subtle differences between total and polar lipids, permeability through liver lipids differed markedly from that of the heart or brain lipid-based models. Correlations between the in silico descriptors (e.g., number of amide bonds, heteroatoms, and aromatic heterocycles, accessible surface area, and H-bond acceptor-donor balance) of drug molecules and permeability values were also found, which supports the understanding of tissue-specific permeability.

13.
Mult Scler Relat Disord ; 66: 104072, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35917745

ABSTRACT

BACKGROUND: Interferon-β, a disease-modifying therapy (DMT) for MS, may be associated with less severe COVID-19 in people with MS. RESULTS: Among 5,568 patients (83.4% confirmed COVID-19), interferon-treated patients had lower risk of severe COVID-19 compared to untreated patients, but not compared to those on glatiramer acetate, dimethyl fumarate, or pooled other DMTs. CONCLUSIONS: In comparison to other DMTs, we did not find evidence of protective effects of interferon-β on the severity of COVID-19, though compared to the untreated, the course of COVID-19 was milder among those on interferon-β. This study does not support the use of interferon-β as a treatment to reduce COVID-19 severity in MS.


Subjects
COVID-19; Multiple Sclerosis, Relapsing-Remitting; Multiple Sclerosis; Acetates; Dimethyl Fumarate/therapeutic use; Glatiramer Acetate/therapeutic use; Humans; Immunosuppressive Agents/adverse effects; Interferon-beta/therapeutic use; Multiple Sclerosis/chemically induced; Multiple Sclerosis/complications; Multiple Sclerosis/drug therapy; Multiple Sclerosis, Relapsing-Remitting/chemically induced
14.
Article in English | MEDLINE | ID: mdl-36038263

ABSTRACT

BACKGROUND AND OBJECTIVES: Certain demographic and clinical characteristics, including the use of some disease-modifying therapies (DMTs), are associated with severe acute respiratory syndrome coronavirus 2 infection severity in people with multiple sclerosis (MS). Comprehensive exploration of these relationships in large international samples is needed. METHODS: Clinician-reported demographic/clinical data from 27 countries were aggregated into a data set of 5,648 patients with suspected/confirmed coronavirus disease 2019 (COVID-19). COVID-19 severity outcomes (hospitalization, admission to intensive care unit [ICU], requiring artificial ventilation, and death) were assessed using multilevel mixed-effects ordered probit and logistic regression, adjusted for age, sex, disability, and MS phenotype. DMTs were individually compared with glatiramer acetate, and anti-CD20 DMTs with pooled other DMTs and with natalizumab. RESULTS: Of 5,648 patients, 922 (16.6%) with suspected and 4,646 (83.4%) with confirmed COVID-19 were included. Male sex, older age, progressive MS, and higher disability were associated with more severe COVID-19. Compared with glatiramer acetate, ocrelizumab and rituximab were associated with higher probabilities of hospitalization (4% [95% CI 1-7] and 7% [95% CI 4-11]), ICU/artificial ventilation (2% [95% CI 0-4] and 4% [95% CI 2-6]), and death (1% [95% CI 0-2] and 2% [95% CI 1-4]) (predicted marginal effects). Untreated patients had 5% (95% CI 2-8), 3% (95% CI 1-5), and 1% (95% CI 0-3) higher probabilities of the 3 respective levels of COVID-19 severity than glatiramer acetate. Compared with pooled other DMTs and with natalizumab, the associations of ocrelizumab and rituximab with COVID-19 severity were also more pronounced. All associations persisted/enhanced on restriction to confirmed COVID-19. 
DISCUSSION: Analyzing the largest international real-world data set of people with MS with suspected/confirmed COVID-19 confirms that the use of anti-CD20 medication (both ocrelizumab and rituximab), as well as male sex, older age, progressive MS, and higher disability are associated with more severe course of COVID-19.


Subjects
COVID-19; Multiple Sclerosis, Chronic Progressive; Multiple Sclerosis; Antigens, CD20; Glatiramer Acetate/therapeutic use; Humans; Immunosuppressive Agents/therapeutic use; Information Dissemination; Male; Multiple Sclerosis/drug therapy; Multiple Sclerosis/epidemiology; Multiple Sclerosis, Chronic Progressive/drug therapy; Natalizumab/therapeutic use; Risk Factors; Rituximab/therapeutic use
15.
Curr Res Struct Biol ; 4: 167-174, 2022.
Article in English | MEDLINE | ID: mdl-35669450

ABSTRACT

Current human Single Amino acid Variant (SAV) databases provide a link between SAVs and their effect on the carrier individual's phenotype, often dividing them into Deleterious/Neutral variants. This is a very coarse-grained description of the genotype-to-phenotype relationship because it relies on unrealistic assumptions, such as the perfect Mendelian behavior of each SAV, and considers only dichotomous phenotypes. Moreover, the link between the effect of a SAV on a protein (its molecular phenotype) and the individual phenotype is often very complex, because multiple levels of biological abstraction connect the protein-level and individual-level phenotypes. Here we present HPMPdb, a manually curated database containing human SAVs associated with a detailed description of the molecular phenotype they cause on the affected proteins. With particular regard to machine learning (ML), this database lets researchers go beyond the existing Deleterious/Neutral prediction paradigm and build molecular phenotype predictors instead. Our class labels succinctly describe the effects that each SAV has on 15 protein molecular phenotypes, such as protein-protein interaction, small-molecule binding, function, post-translational modifications (PTMs), sub-cellular localization, mimetic PTM, folding and protein expression. Moreover, we provide researchers with all necessary means to reproducibly train and test their models on our database. The webserver and the data described in this paper are available at hpmp.esat.kuleuven.be.

16.
Stud Health Technol Inform ; 294: 829-833, 2022 May 25.
Article in English | MEDLINE | ID: mdl-35612220

ABSTRACT

The complexity and heterogeneity of cancers lead to variable responses of patients to treatments and interventions. Developing models that accurately predict patients' care pathways using prognostic and predictive biomarkers is increasingly important in both clinical practice and scientific research. The main objectives of the ATHENA project are to: (1) accelerate data-driven precision medicine for two use cases, bladder cancer and multiple myeloma, (2) apply distributed and privacy-preserving analytical methods/algorithms to stratify patients (decision support), (3) help healthcare professionals deliver earlier and better targeted treatments, and (4) explore care pathway automations and improve outcomes for each patient. Challenges associated with data sharing and integration will be addressed and an appropriate federated data ecosystem will be created, enabling an interoperable foundation for data exchange, analysis and interpretation. By combining multidisciplinary expertise and tackling knowledge gaps in ATHENA, we propose a novel federated privacy-preserving platform for oncology research.


Subjects
Ecosystem; Privacy; Algorithms; Government; Humans; Precision Medicine
17.
Bioinformatics ; 38(10): 2802-2809, 2022 05 13.
Article in English | MEDLINE | ID: mdl-35561176

ABSTRACT

MOTIVATION: Transcriptional regulation mechanisms allow cells to adapt and respond to external stimuli by altering gene expression. The possible cell transcriptional states are determined by the underlying gene regulatory network (GRN), and reliably inferring such a network would be invaluable for understanding biological processes and disease progression. RESULTS: In this article, we present a novel method for the inference of GRNs, called PORTIA, which is based on robust precision matrix estimation, and we show that it compares favorably with state-of-the-art methods while being orders of magnitude faster. We extensively validated PORTIA using the DREAM and MERLIN+P datasets as benchmarks. In addition, we propose a novel scoring metric that builds on graph-theoretical concepts. AVAILABILITY AND IMPLEMENTATION: The code and instructions for data acquisition and full reproduction of our results are available at https://github.com/AntoinePassemiers/PORTIA-Manuscript. PORTIA is available on PyPI as a Python package (portia-grn). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms; Gene Regulatory Networks; Gene Expression Regulation
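
The abstract above gives no algorithmic detail beyond "precision matrix estimation", so the following is only a sketch of why precision-matrix approaches are attractive for network inference, not PORTIA's actual procedure. Off-diagonal entries of a precision matrix are proportional to partial correlations, which discount indirect associations. For three variables the partial correlation has a closed form, used here with hypothetical correlation values for a regulatory chain A -> B -> C.

```python
from math import sqrt

def partial_corr(r_xy, r_xz, r_yz):
    """Correlation of x and y after conditioning on a third variable z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical expression correlations for a regulatory chain A -> B -> C.
r_ab, r_bc = 0.8, 0.8
r_ac = r_ab * r_bc  # A and C correlate only through B

direct_ab = partial_corr(r_ab, r_ac, r_bc)    # A-B given C
indirect_ac = partial_corr(r_ac, r_ab, r_bc)  # A-C given B
```

The raw A-C correlation is 0.64, yet its partial correlation given B vanishes, whereas the direct A-B link stays clearly nonzero. Plain correlation networks would report a spurious A-C edge here.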
19.
Nat Commun ; 13(1): 961, 2022 02 18.
Article in English | MEDLINE | ID: mdl-35181656

ABSTRACT

Structural bioinformatics suffers from the lack of interfaces connecting biological structures and machine learning methods, making the application of modern neural network architectures impractical. This negatively affects the development of structure-based bioinformatics methods, causing a bottleneck in biological research. Here we present PyUUL (https://pyuul.readthedocs.io/), a library to translate biological structures into 3D tensors, allowing an out-of-the-box application of state-of-the-art deep learning algorithms. The library converts biological macromolecules to data structures typical of computer vision, such as voxels and point clouds, for which extensive machine learning research has been performed. Moreover, PyUUL allows out-of-the-box GPU and sparse calculation. Finally, we demonstrate how PyUUL can be used by researchers to address some typical bioinformatics problems, such as structure recognition and docking.


Subjects
Computational Biology/methods; Deep Learning; Imaging, Three-Dimensional/methods; Neural Networks, Computer; Algorithms; Humans; Protein Structural Elements/physiology
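
The conversion of a structure into a voxel grid, as described in the abstract above, can be sketched in a few lines. This example deliberately does not use the PyUUL API, which is not reproduced here; the function name, grid parameters, and atom coordinates are hypothetical and serve only to show the general idea of binning atomic positions into a 3D tensor.

```python
def voxelize(coords, origin, resolution, shape):
    """Count atoms per voxel in a regular 3D grid.

    coords: list of (x, y, z) positions; origin: grid corner;
    resolution: voxel edge length; shape: (nx, ny, nz).
    """
    nx, ny, nz = shape
    grid = [[[0] * nz for _ in range(ny)] for _ in range(nx)]
    for point in coords:
        i, j, k = (int((c - o) // resolution) for c, o in zip(point, origin))
        if 0 <= i < nx and 0 <= j < ny and 0 <= k < nz:
            grid[i][j][k] += 1  # atoms outside the grid are ignored
    return grid

# Hypothetical atom coordinates in angstroms.
atoms = [(0.2, 0.3, 0.1), (0.9, 0.4, 0.8), (2.5, 1.1, 0.0), (9.0, 9.0, 9.0)]
grid = voxelize(atoms, origin=(0.0, 0.0, 0.0), resolution=1.0, shape=(3, 3, 3))
```

The resulting nested list has the layout of a dense 3D tensor, so it can be handed to a tensor library and then to a 3D convolutional model. A real pipeline would typically add one channel per atom type and smooth each atom over neighbouring voxels.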