Results 1 - 20 of 171
1.
Bioinformatics ; 40(9)2024 09 02.
Article in English | MEDLINE | ID: mdl-39177091

ABSTRACT

MOTIVATION: Circulating cell-free DNA (cfDNA) is widely explored as a noninvasive biomarker for cancer screening and diagnosis. The ability to decode the cells of origin in cfDNA would provide biological insights into pathophysiological mechanisms, aiding in cancer characterization and directing clinical management and follow-up. RESULTS: We developed a DNA methylation signature-based deconvolution algorithm, MetDecode, for cancer tissue origin identification. We built a reference atlas exploiting de novo and published whole-genome methylation sequencing data for colorectal, breast, ovarian, and cervical cancer, and blood-cell-derived entities. MetDecode models the contributors absent in the atlas with methylation patterns learnt on-the-fly from the input cfDNA methylation profiles. In addition, our model accounts for the coverage of each marker region to alleviate potential sources of noise. In-silico experiments showed a limit of detection down to 2.88% of tumor tissue contribution in cfDNA. MetDecode produced Pearson correlation coefficients above 0.95 and outperformed other methods in simulations (P < 0.001; one-sided t-test). In plasma cfDNA profiles from cancer patients, MetDecode assigned the correct tissue-of-origin in 84.2% of cases. In conclusion, MetDecode can unravel alterations in the cfDNA pool components by accurately estimating the contribution of multiple tissues, even when supplied with an imperfect reference atlas. AVAILABILITY AND IMPLEMENTATION: MetDecode is available at https://github.com/JorisVermeeschLab/MetDecode.
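
To make the idea of reference-based deconvolution concrete, the sketch below estimates tissue proportions from a methylation profile by coverage-weighted non-negative least squares, solved with projected gradient descent. It is not MetDecode's implementation: the function name `deconvolve`, the toy atlas, and the solver are illustrative assumptions, and the modelling of contributors missing from the atlas is not attempted here.

```python
def deconvolve(atlas, observed, coverage, n_iter=2000):
    """Estimate tissue proportions from a methylation profile.

    atlas[i][j] : methylation ratio (0..1) of marker i in tissue j (hypothetical reference)
    observed[i] : methylation ratio of marker i in the mixed sample
    coverage[i] : read depth of marker i, used to down-weight noisy markers
    """
    n_markers, n_tissues = len(atlas), len(atlas[0])
    total_cov = float(sum(coverage))
    weights = [c / total_cov for c in coverage]  # weights sum to 1

    # With ratios in [0, 1] and weights summing to 1, the Hessian of the
    # weighted squared error has eigenvalues of at most 2 * n_tissues,
    # so this step size keeps gradient descent stable.
    step = 1.0 / (2.0 * n_tissues)

    alpha = [1.0 / n_tissues] * n_tissues  # start from a uniform mixture
    for _ in range(n_iter):
        residuals = [
            sum(atlas[i][j] * alpha[j] for j in range(n_tissues)) - observed[i]
            for i in range(n_markers)
        ]
        grad = [
            2.0 * sum(weights[i] * residuals[i] * atlas[i][j] for i in range(n_markers))
            for j in range(n_tissues)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        alpha = [max(0.0, a - step * g) for a, g in zip(alpha, grad)]

    total = sum(alpha)
    if total == 0.0:
        raise ValueError("all estimated contributions are zero")
    return [a / total for a in alpha]  # normalise to proportions


# Toy example: 4 marker regions, 3 tissues, noise-free mixture.
toy_atlas = [
    [0.9, 0.1, 0.1],
    [0.1, 0.9, 0.1],
    [0.1, 0.1, 0.9],
    [0.8, 0.8, 0.1],
]
true_mix = [0.7, 0.2, 0.1]
toy_observed = [sum(r * p for r, p in zip(row, true_mix)) for row in toy_atlas]
toy_coverage = [30, 25, 40, 20]
print(deconvolve(toy_atlas, toy_observed, toy_coverage))
```

Weighting each marker by its read depth is one simple way to reflect the abstract's point that coverage should be taken into account; other weightings are equally plausible.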


Subjects
Algorithms , Biomarkers, Tumor , Cell-Free Nucleic Acids , DNA Methylation , Neoplasms , Humans , Neoplasms/genetics , Cell-Free Nucleic Acids/blood , Biomarkers, Tumor/blood
2.
Sci Rep ; 14(1): 18243, 2024 08 06.
Article in English | MEDLINE | ID: mdl-39107347

ABSTRACT

Individual Specific Networks (ISNs) are a tool used in computational biology to infer individual-specific relationships between biological entities from omics data. ISNs provide insights into how the interactions among these entities affect their respective functions. To address the scarcity of solutions for efficiently computing ISNs on large biological datasets, we present ISN-tractor, a data-agnostic, highly optimized Python library to build and analyse ISNs. ISN-tractor demonstrates superior scalability and efficiency in generating ISNs when compared to existing methods such as LionessR, both in terms of time and memory usage, allowing ISNs to be used on large datasets. We show how ISN-tractor can be applied to real-life datasets, including The Cancer Genome Atlas (TCGA) and HapMap, showcasing its versatility. ISN-tractor can be used to build ISNs from various -omics data types, including transcriptomics, proteomics, and genotype arrays, and can detect distinct patterns of gene interactions within and across cancer types. We also show how Filtration Curves provide valuable insights into ISN characteristics, revealing topological distinctions among individuals with different clinical outcomes. Additionally, ISN-tractor can effectively cluster populations based on genetic relationships, as demonstrated with Principal Component Analysis on HapMap data.
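
As an illustration of what an individual-specific edge weight can look like, the sketch below uses the LIONESS-style leave-one-out estimate with Pearson correlation as the network measure: the edge for sample q is N * (w_all - w_without_q) + w_without_q. Whether ISN-tractor computes edges exactly this way is an assumption, and the function names `pearson` and `individual_edge_weights` are made up for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def individual_edge_weights(x, y):
    """Leave-one-out edge weight between two features for every sample.

    x[q], y[q] are the measurements of the two features in sample q.
    Returns one edge weight per sample.
    """
    n = len(x)
    w_all = pearson(x, y)  # edge weight in the network built on all samples
    weights = []
    for q in range(n):
        # Edge weight in the network built without sample q.
        w_without_q = pearson(x[:q] + x[q + 1:], y[:q] + y[q + 1:])
        # Linear extrapolation of sample q's own contribution.
        weights.append(n * (w_all - w_without_q) + w_without_q)
    return weights


# Toy example: expression of two genes across five individuals.
gene_a = [1.0, 2.0, 3.0, 4.0, 10.0]
gene_b = [1.1, 1.9, 3.2, 3.8, 2.0]
print(individual_edge_weights(gene_a, gene_b))
```

This naive version recomputes the correlation once per sample and per edge, which is the kind of cost that becomes prohibitive on large datasets and that optimized libraries aim to avoid.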


Subjects
Computational Biology , Humans , Computational Biology/methods , Gene Regulatory Networks , Neoplasms/genetics , Software , Proteomics/methods , Algorithms
3.
Stud Health Technol Inform ; 316: 1582-1583, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176510

ABSTRACT

Real-world data (RWD) has the potential to revolutionize healthcare by offering valuable insights into patient outcomes and treatment efficacy. However, leveraging RWD effectively presents challenges, including its inherent limitations, diverse stakeholders, and insufficient data management pipelines. A proposed framework advocates three essential elements: adherence to FAIR principles (Findable, Accessible, Interoperable, and Reusable), stakeholder engagement and education, and inclusive, pragmatic federated hybrid pipelines. By employing these strategies, healthcare organizations can overcome obstacles to RWD utilization and foster sustainable progress in patient care.


Subjects
Delivery of Health Care , Humans , Electronic Health Records , Data Management
4.
JMIR Form Res ; 8: e55496, 2024 Jul 17.
Article in English | MEDLINE | ID: mdl-39018557

ABSTRACT

BACKGROUND: The integrity and reliability of clinical research outcomes rely heavily on access to vast amounts of data. However, the fragmented distribution of these data across multiple institutions, along with ethical and regulatory barriers, presents significant challenges to accessing relevant data. While federated learning offers a promising solution to leverage insights from fragmented data sets, its adoption faces hurdles due to implementation complexities, scalability issues, and inclusivity challenges. OBJECTIVE: This paper introduces Federated Learning for Everyone (FL4E), an accessible framework facilitating multistakeholder collaboration in clinical research. It focuses on simplifying federated learning through an innovative ecosystem-based approach. METHODS: The "degree of federation" is a fundamental concept of FL4E, allowing for flexible integration of federated and centralized learning models. This feature provides a customizable solution by enabling users to choose the level of data decentralization based on specific health care settings or project needs, making federated learning more adaptable and efficient. By using an ecosystem-based collaborative learning strategy, FL4E encourages a comprehensive platform for managing real-world data, enhancing collaboration and knowledge sharing among its stakeholders. RESULTS: Evaluating FL4E's effectiveness using real-world health care data sets has highlighted its ecosystem-oriented and inclusive design. By applying hybrid models to 2 distinct analytical tasks (classification and survival analysis) within real-world settings, we have effectively measured the "degree of federation" across various contexts. These evaluations show that FL4E's hybrid models not only match the performance of fully federated models but also avoid the substantial overhead usually linked with these models. Achieving this balance greatly enhances collaborative initiatives and broadens the scope of analytical possibilities within the ecosystem. CONCLUSIONS: FL4E represents a significant step forward in collaborative clinical research by merging the benefits of centralized and federated learning. Its modular ecosystem-based design and the "degree of federation" feature make it an inclusive, customizable framework suitable for a wide array of clinical research scenarios, promising to revolutionize the field through improved collaboration and data use. Detailed implementation and analyses are available on the associated GitHub repository.

5.
Sci Rep ; 14(1): 13188, 2024 06 08.
Article in English | MEDLINE | ID: mdl-38851759

ABSTRACT

Genome interpretation (GI) encompasses the computational attempts to model the relationship between genotype and phenotype with the goal of understanding how the first leads to the second. While traditional approaches have focused on sub-problems such as predicting the effect of single nucleotide variants or finding genetic associations, recent advances in neural networks (NNs) have made it possible to develop end-to-end GI models that take genomic data as input and predict phenotypes as output. However, technical and modeling issues still need to be fixed for these models to be effective, including the widespread underdetermination of genomic datasets, making them unsuitable for training large, overfitting-prone NNs. Here we propose novel GI models to address this issue, exploring the use of two types of transfer learning approaches and proposing a novel Biologically Meaningful Sparse NN layer specifically designed for end-to-end GI. Our models predict the leaf and seed ionome in A. thaliana, obtaining comparable results to our previous over-parameterized model while reducing the number of parameters 8.8-fold. We also investigate how population stratification influences the evaluation of performance, highlighting how it leads to (1) an instance of Simpson's paradox, and (2) model generalization limitations.


Subjects
Arabidopsis , Genome, Plant , Plant Leaves , Seeds , Arabidopsis/genetics , Plant Leaves/genetics , Plant Leaves/metabolism , Seeds/genetics , Seeds/metabolism , Neural Networks, Computer , Genomics/methods , Phenotype , Models, Genetic , Genotype
6.
Comput Struct Biotechnol J ; 23: 1773-1785, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38689715

ABSTRACT

Magnesium (Mg)-based implants have emerged as a promising alternative for orthopedic applications, owing to their bioactive properties and biodegradability. As the implants degrade, Mg2+ ions are released, influencing all surrounding cell types, especially mesenchymal stem cells (MSCs). MSCs are vital for bone tissue regeneration; therefore, it is essential to understand their molecular response to Mg2+ ions in order to maximize the potential of Mg-based biomaterials. In this study, we conducted a gene regulatory network (GRN) analysis to examine the molecular responses of MSCs to Mg2+ ions. We used time-series proteomics data collected at 11 time points across a 21-day period for the GRN construction. We studied the impact of Mg2+ ions on the resulting networks and identified the key proteins and protein interactions affected by the application of Mg2+ ions. Our analysis highlights MYL1, MDH2, GLS, and TRIM28 as the primary targets of Mg2+ ions in the response of MSCs during the 1-21 day phase. Our results also identify MDH2-MYL1, MDH2-RPS26, TRIM28-AK1, TRIM28-SOD2, and GLS-AK1 as the critical protein relationships affected by Mg2+ ions. By offering a comprehensive understanding of the regulatory role of Mg2+ ions on MSCs, our study contributes valuable insights into the molecular response of MSCs to Mg-based materials, thereby facilitating the development of innovative therapeutic strategies for orthopedic applications.

7.
J Chem Inf Model ; 64(7): 2331-2344, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-37642660

ABSTRACT

Federated multipartner machine learning has been touted as an appealing and efficient method to increase the effective training data volume and thereby the predictivity of models, particularly when the generation of training data is resource-intensive. In the landmark MELLODDY project, indeed, each of ten pharmaceutical companies realized aggregated improvements on its own classification or regression models through federated learning. To this end, they leveraged a novel implementation extending multitask learning across partners, on a platform audited for privacy and security. The experiments involved an unprecedented cross-pharma data set of 2.6+ billion confidential experimental activity data points, documenting 21+ million physical small molecules and 40+ thousand assays in on-target and secondary pharmacodynamics and pharmacokinetics. Appropriate complementary metrics were developed to evaluate the predictive performance in the federated setting. In addition to predictive performance increases in labeled space, the results point toward an extended applicability domain in federated learning. Increases in collective training data volume, including by means of auxiliary data resulting from single concentration high-throughput and imaging assays, continued to boost predictive performance, albeit with a saturating return. Markedly higher improvements were observed for the pharmacokinetics and safety panel assay-based task subsets.


Subjects
Benchmarking , Quantitative Structure-Activity Relationship , Biological Assay , Machine Learning
8.
JMIR Med Inform ; 11: e48030, 2023 Nov 09.
Article in English | MEDLINE | ID: mdl-37943585

ABSTRACT

BACKGROUND: Investigating low-prevalence diseases such as multiple sclerosis is challenging because of the rather small number of individuals affected by this disease and the scattering of real-world data across numerous data sources. These obstacles impair data integration, standardization, and analysis, which negatively impact the generation of significant meaningful clinical evidence. OBJECTIVE: This study aims to present a comprehensive, research question-agnostic, multistakeholder-driven end-to-end data analysis pipeline that accommodates 3 prevalent data-sharing streams: individual data sharing, core data set sharing, and federated model sharing. METHODS: A demand-driven methodology is employed for standardization, followed by 3 streams of data acquisition, a data quality enhancement process, a data integration procedure, and a concluding analysis stage to fulfill real-world data-sharing requirements. This pipeline's effectiveness was demonstrated through its successful implementation in the COVID-19 and multiple sclerosis global data sharing initiative. RESULTS: The global data sharing initiative yielded multiple scientific publications and provided extensive worldwide guidance for the community with multiple sclerosis. The pipeline facilitated gathering pertinent data from various sources, accommodating distinct sharing streams and assimilating them into a unified data set for subsequent statistical analysis or secure data examination. This pipeline contributed to the assembly of the largest data set of people with multiple sclerosis infected with COVID-19. CONCLUSIONS: The proposed data analysis pipeline exemplifies the potential of global stakeholder collaboration and underlines the significance of evidence-based decision-making. It serves as a paradigm for how data sharing initiatives can propel advancements in health care, emphasizing its adaptability and capacity to address diverse research inquiries.

9.
Sci Rep ; 13(1): 19449, 2023 11 09.
Article in English | MEDLINE | ID: mdl-37945674

ABSTRACT

High-throughput sequencing allowed the discovery of many disease variants, but nowadays it is becoming clear that the abundance of genomics data mostly just moved the bottleneck in Genetics and Precision Medicine from a data availability issue to a data interpretation issue. To solve this impasse it would be beneficial to apply the latest Deep Learning (DL) methods to the Genome Interpretation (GI) problem, similarly to what AlphaFold did for Structural Biology. Unfortunately, DL requires large datasets to be viable, and aggregating genomics datasets poses several legal, ethical and infrastructural complications. Federated Learning (FL) is a Machine Learning (ML) paradigm designed to tackle these issues. It allows ML methods to be collaboratively trained and tested on collections of physically separate datasets, without requiring the actual centralization of sensitive data. FL could thus be key to enabling DL applications to GI on sufficiently large genomics data. We propose FedCrohn, an FL GI Neural Network model for exome-based Crohn's Disease risk prediction, providing a proof-of-concept that FL is a viable paradigm to build novel ML GI approaches. We benchmark it in several realistic scenarios, showing that FL can indeed provide performance similar to conventional ML on centralized data, and that collaborating in FL initiatives is likely beneficial for most of the medical centers participating in them.
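
The training scheme described above, in which sites keep their data and share only model parameters, can be sketched with federated averaging (FedAvg) on a toy linear model. This is a generic illustration and not FedCrohn's code: the function names, the linear model, and the two toy "sites" are assumptions chosen to keep the example short.

```python
def local_update(weights, features, labels, lr=0.1, epochs=5):
    """Run a few gradient-descent epochs on one site's private data.

    Minimises the mean squared error of a linear model y = w . x.
    Only the updated weights leave the site, never the data.
    """
    w = list(weights)
    n = len(features)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(features, labels):
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for k, xk in enumerate(x):
                grad[k] += 2.0 * err * xk / n
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


def federated_average(client_weights, client_sizes):
    """Average client models, weighting each by its number of samples."""
    total = float(sum(client_sizes))
    n_params = len(client_weights[0])
    return [
        sum(w[k] * size for w, size in zip(client_weights, client_sizes)) / total
        for k in range(n_params)
    ]


def train_federated(sites, n_params, rounds=100):
    """sites is a list of (features, labels) pairs, one per participating centre."""
    global_w = [0.0] * n_params
    sizes = [len(features) for features, _ in sites]
    for _ in range(rounds):
        # Each site starts from the current global model and trains locally.
        updates = [local_update(global_w, X, y) for X, y in sites]
        # The coordinator only ever sees the parameter vectors.
        global_w = federated_average(updates, sizes)
    return global_w


# Two toy sites whose data follow y = 2 * x + 1 (second feature is a bias term).
site_a = ([[1.0, 1.0], [2.0, 1.0]], [3.0, 5.0])
site_b = ([[0.0, 1.0], [3.0, 1.0], [1.0, 1.0]], [1.0, 7.0, 3.0])
print(train_federated([site_a, site_b], n_params=2))
```

Weighting each site by its sample count is the usual FedAvg choice; real deployments add secure aggregation, handling of heterogeneous data, and communication constraints that this sketch leaves out.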


Subjects
Crohn Disease , Exome , Humans , Exome/genetics , Crohn Disease/genetics , Genomics , Benchmarking , High-Throughput Nucleotide Sequencing
10.
Genome Biol ; 24(1): 224, 2023 10 05.
Article in English | MEDLINE | ID: mdl-37798735

ABSTRACT

BACKGROUND: Despite clear evidence of nonlinear interactions in the molecular architecture of polygenic diseases, linear models have so far appeared optimal in genotype-to-phenotype modeling. A key bottleneck for such modeling is that genetic data intrinsically suffers from underdetermination ([Formula: see text]). Millions of variants are present in each individual while the collection of large, homogeneous cohorts is hindered by phenotype incidence, sequencing cost, and batch effects. RESULTS: We demonstrate that when we provide enough training data and control the complexity of nonlinear models, a neural network outperforms additive approaches in whole exome sequencing-based inflammatory bowel disease case-control prediction. To do so, we propose a biologically meaningful sparsified neural network architecture, providing empirical evidence for positive and negative epistatic effects present in the inflammatory bowel disease pathogenesis. CONCLUSIONS: In this paper, we show that underdetermination is likely a major driver for the apparent optimality of additive modeling in clinical genetics today.


Subjects
Inflammatory Bowel Diseases , Nonlinear Dynamics , Humans , Sample Size , Inflammatory Bowel Diseases/genetics , Neural Networks, Computer , Phenotype
11.
Bioinformatics ; 39(6)2023 06 01.
Article in English | MEDLINE | ID: mdl-37255310

ABSTRACT

MOTIVATION: The prediction of reliable Drug-Target Interactions (DTIs) is a key task in computer-aided drug design and repurposing. Here, we present a new approach based on data fusion for DTI prediction built on top of the NXTfusion library, which generalizes the Matrix Factorization paradigm by extending it to the nonlinear inference over Entity-Relation graphs. RESULTS: We benchmarked our approach on five datasets and we compared our models against state-of-the-art methods. Our models outperform most of the existing methods and, simultaneously, retain the flexibility to predict both DTIs as binary classification and regression of the real-valued drug-target affinity, competing with models built explicitly for each task. Moreover, our findings suggest that the validation of DTI methods should be stricter than what has been proposed in some previous studies, focusing more on mimicking real-life DTI settings where predictions for previously unseen drugs, proteins, and drug-protein pairs are needed. These settings are exactly the context in which the benefit of integrating heterogeneous information with our Entity-Relation data fusion approach is the most evident. AVAILABILITY AND IMPLEMENTATION: All software and data are available at https://github.com/eugeniomazzone/CPI-NXTFusion and https://pypi.org/project/NXTfusion/.


Subjects
Drug Development , Software , Proteins , Drug Interactions , Drug Design
12.
Pharmaceutics ; 15(3)2023 Mar 10.
Article in English | MEDLINE | ID: mdl-36986760

ABSTRACT

In vitro non-cellular permeability models such as the parallel artificial membrane permeability assay (PAMPA) are widely applied tools for early-phase drug candidate screening. In addition to the commonly used porcine brain polar lipid extract for modeling the blood-brain barrier's permeability, the total and polar fractions of bovine heart and liver lipid extracts were investigated in the PAMPA model by measuring the permeability of 32 diverse drugs. The zeta potential of the lipid extracts and the net charge of their glycerophospholipid components were also determined. Physicochemical parameters of the 32 compounds were calculated using three independent software packages (Marvin Sketch, RDKit, and ACD/Percepta). The relationship between the lipid-specific permeabilities and the physicochemical descriptors of the compounds was investigated using linear correlation, Spearman correlation, and principal component analysis (PCA). While the results showed only subtle differences between total and polar lipids, permeability through liver lipids differed markedly from that of the heart or brain lipid-based models. Correlations between the in silico descriptors (e.g., number of amide bonds, heteroatoms, and aromatic heterocycles, accessible surface area, and H-bond acceptor-donor balance) of drug molecules and permeability values were also found, which provides support for understanding tissue-specific permeability.

17.
Mult Scler Relat Disord ; 66: 104072, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35917745

ABSTRACT

BACKGROUND: Interferon-β, a disease-modifying therapy (DMT) for MS, may be associated with less severe COVID-19 in people with MS. RESULTS: Among 5,568 patients (83.4% confirmed COVID-19), interferon-treated patients had lower risk of severe COVID-19 compared to untreated, but not to glatiramer-acetate, dimethyl-fumarate, or pooled other DMTs. CONCLUSIONS: In comparison to other DMTs, we did not find evidence of protective effects of interferon-β on the severity of COVID-19, though compared to the untreated, the course of COVID-19 was milder among those on interferon-β. This study does not support the use of interferon-β as a treatment to reduce COVID-19 severity in MS.


Subjects
COVID-19 , Multiple Sclerosis, Relapsing-Remitting , Multiple Sclerosis , Acetates , Dimethyl Fumarate/therapeutic use , Glatiramer Acetate/therapeutic use , Humans , Immunosuppressive Agents/adverse effects , Interferon-beta/therapeutic use , Multiple Sclerosis/chemically induced , Multiple Sclerosis/complications , Multiple Sclerosis/drug therapy , Multiple Sclerosis, Relapsing-Remitting/chemically induced
18.
Article in English | MEDLINE | ID: mdl-36038263

ABSTRACT

BACKGROUND AND OBJECTIVES: Certain demographic and clinical characteristics, including the use of some disease-modifying therapies (DMTs), are associated with severe acute respiratory syndrome coronavirus 2 infection severity in people with multiple sclerosis (MS). Comprehensive exploration of these relationships in large international samples is needed. METHODS: Clinician-reported demographic/clinical data from 27 countries were aggregated into a data set of 5,648 patients with suspected/confirmed coronavirus disease 2019 (COVID-19). COVID-19 severity outcomes (hospitalization, admission to intensive care unit [ICU], requiring artificial ventilation, and death) were assessed using multilevel mixed-effects ordered probit and logistic regression, adjusted for age, sex, disability, and MS phenotype. DMTs were individually compared with glatiramer acetate, and anti-CD20 DMTs with pooled other DMTs and with natalizumab. RESULTS: Of 5,648 patients, 922 (16.6%) with suspected and 4,646 (83.4%) with confirmed COVID-19 were included. Male sex, older age, progressive MS, and higher disability were associated with more severe COVID-19. Compared with glatiramer acetate, ocrelizumab and rituximab were associated with higher probabilities of hospitalization (4% [95% CI 1-7] and 7% [95% CI 4-11]), ICU/artificial ventilation (2% [95% CI 0-4] and 4% [95% CI 2-6]), and death (1% [95% CI 0-2] and 2% [95% CI 1-4]) (predicted marginal effects). Untreated patients had 5% (95% CI 2-8), 3% (95% CI 1-5), and 1% (95% CI 0-3) higher probabilities of the 3 respective levels of COVID-19 severity than glatiramer acetate. Compared with pooled other DMTs and with natalizumab, the associations of ocrelizumab and rituximab with COVID-19 severity were also more pronounced. All associations persisted/enhanced on restriction to confirmed COVID-19. DISCUSSION: Analyzing the largest international real-world data set of people with MS with suspected/confirmed COVID-19 confirms that the use of anti-CD20 medication (both ocrelizumab and rituximab), as well as male sex, older age, progressive MS, and higher disability are associated with more severe course of COVID-19.


Subjects
COVID-19 , Multiple Sclerosis, Chronic Progressive , Multiple Sclerosis , Antigens, CD20 , Glatiramer Acetate/therapeutic use , Humans , Immunosuppressive Agents/therapeutic use , Information Dissemination , Male , Multiple Sclerosis/drug therapy , Multiple Sclerosis/epidemiology , Multiple Sclerosis, Chronic Progressive/drug therapy , Natalizumab/therapeutic use , Risk Factors , Rituximab/therapeutic use
19.
Curr Res Struct Biol ; 4: 167-174, 2022.
Article in English | MEDLINE | ID: mdl-35669450

ABSTRACT

Current human Single Amino acid Variants (SAVs) databases provide a link between SAVs and their effect on the carrier individual's phenotype, often dividing them into Deleterious/Neutral variants. This is a very coarse-grained description of the genotype-to-phenotype relationship because it relies on unrealistic assumptions such as the perfect Mendelian behavior of each SAV and considers only dichotomous phenotypes. Moreover, the link between the effect of a SAV on a protein (its molecular phenotype) and the individual phenotype is often very complex, because multiple levels of biological abstraction connect the protein and individual level phenotypes. Here we present HPMPdb, a manually curated database containing human SAVs associated with the detailed description of the molecular phenotype they cause on the affected proteins. With particular regard to machine learning (ML), this database can be used to let researchers go beyond the existing Deleterious/Neutral prediction paradigm, allowing them to build molecular phenotype predictors instead. Our class labels describe in a succinct way the effects that each SAV has on 15 protein molecular phenotypes, such as protein-protein interaction, small-molecule binding, function, post-translational modifications (PTMs), sub-cellular localization, mimetic PTM, folding and protein expression. Moreover, we provide researchers with all necessary means to reproducibly train and test their models on our database. The webserver and the data described in this paper are available at hpmp.esat.kuleuven.be.

20.
Stud Health Technol Inform ; 294: 829-833, 2022 May 25.
Article in English | MEDLINE | ID: mdl-35612220

ABSTRACT

The complexity and heterogeneity of cancers lead to variable responses of patients to treatments and interventions. Developing models that accurately predict patients' care pathways using prognostic and predictive biomarkers is increasingly important in both clinical practice and scientific research. The main objectives of the ATHENA project are to: (1) accelerate data-driven precision medicine for two use cases, bladder cancer and multiple myeloma, (2) apply distributed and privacy-preserving analytical methods/algorithms to stratify patients (decision support), (3) help healthcare professionals deliver earlier and better targeted treatments, and (4) explore care pathway automations and improve outcomes for each patient. Challenges associated with data sharing and integration will be addressed and an appropriate federated data ecosystem will be created, enabling an interoperable foundation for data exchange, analysis and interpretation. By combining multidisciplinary expertise and tackling knowledge gaps in ATHENA, we propose a novel federated privacy-preserving platform for oncology research.


Subjects
Ecosystem , Privacy , Algorithms , Government , Humans , Precision Medicine