Results 1 - 9 of 9
1.
CPT Pharmacometrics Syst Pharmacol ; 13(2): 257-269, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37950385

ABSTRACT

High drug development costs and the limited number of new annual drug approvals increase the need for innovative approaches for drug effect prediction. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the cause of coronavirus disease 2019 (COVID-19), led to a global pandemic with high morbidity and mortality. Although effective preventive measures exist, there are few effective treatments for hospitalized patients with SARS-CoV-2 infection. Drug repurposing and drug effect prediction are promising strategies that could shorten development time and reduce costs compared with de novo drug discovery. In this work, we present a machine learning framework to integrate a variety of target network features and physicochemical properties of compounds, and analyze their influence on the therapeutic effects for SARS-CoV-2 infection and on host cell cytotoxic effects. Random forest models trained on compounds with known experimental effects on SARS-CoV-2 infection and subsequent feature importance analysis based on Shapley values provided insights into the determinants of drug efficacy and cytotoxicity, which can be incorporated into novel drug discovery approaches. Given the complexity of molecular mechanisms of drug action and limited sample sizes, our models achieve a reasonable mean area under the receiver operating characteristic curve (ROC-AUC) of 0.73 on an unseen validation set. To our knowledge, this is the first work to incorporate a combination of network and physicochemical features of compounds into a machine learning model to predict drug effects on SARS-CoV-2 infection. Our systems pharmacology-based machine learning framework can be used to classify other existing drugs for SARS-CoV-2 infection and can easily be adapted to drug effect prediction for future viral outbreaks.


Subjects
COVID-19 , SARS-CoV-2 , Humans , Drug Discovery , Drug Development , Machine Learning
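
The abstract above reports model quality as a mean ROC-AUC of 0.73. As a reminder of what that number measures, here is a minimal sketch that computes ROC-AUC as the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one (ties count as half). The labels and scores are made-up illustrative values, not data from the study, and this is not the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """ROC-AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = active compound, 0 = inactive compound.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported 0.73 in context.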
2.
J Med Internet Res ; 25: e42621, 2023 07 12.
Article in English | MEDLINE | ID: mdl-37436815

ABSTRACT

BACKGROUND: Machine learning and artificial intelligence have shown promising results in many areas and are driven by the increasing amount of available data. However, these data are often distributed across different institutions and cannot be easily shared owing to strict privacy regulations. Federated learning (FL) allows the training of distributed machine learning models without sharing sensitive data. However, implementing FL is time-consuming and requires advanced programming skills and complex technical infrastructure. OBJECTIVE: Various tools and frameworks have been developed to simplify the development of FL algorithms and provide the necessary technical infrastructure. Although there are many high-quality frameworks, most focus only on a single application case or method. To our knowledge, there are no generic frameworks, meaning that the existing solutions are restricted to a particular type of algorithm or application field. Furthermore, most of these frameworks provide an application programming interface that requires programming knowledge. There is no collection of ready-to-use FL algorithms that are extendable and allow users (eg, researchers) without programming knowledge to apply FL. A central FL platform for both FL algorithm developers and users does not exist. This study aimed to address this gap and make FL available to everyone by developing FeatureCloud, an all-in-one platform for FL in biomedicine and beyond. METHODS: The FeatureCloud platform consists of 3 main components: a global frontend, a global backend, and a local controller. Our platform uses Docker to separate the locally acting components of the platform from the sensitive data systems. We evaluated our platform using 4 different algorithms on 5 data sets for both accuracy and runtime.
RESULTS: FeatureCloud removes the complexity of distributed systems for developers and end users by providing a comprehensive platform for executing multi-institutional FL analyses and implementing FL algorithms. Through its integrated artificial intelligence store, federated algorithms can easily be published and reused by the community. To secure sensitive raw data, FeatureCloud supports privacy-enhancing technologies to secure the shared local models and assures high standards in data privacy to comply with the strict General Data Protection Regulation. Our evaluation shows that applications developed in FeatureCloud can produce highly similar results compared with centralized approaches and scale well for an increasing number of participating sites. CONCLUSIONS: FeatureCloud provides a ready-to-use platform that integrates the development and execution of FL algorithms while reducing the complexity to a minimum and removing the hurdles of federated infrastructure. Thus, we believe that it has the potential to greatly increase the accessibility of privacy-preserving and distributed data analyses in biomedicine and beyond.


Subjects
Algorithms , Artificial Intelligence , Humans , Health Occupations , Software , Computer Communication Networks , Privacy
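
The core idea in the abstract above, training a shared model without moving raw data, can be illustrated with a generic federated-averaging step. This is a minimal sketch under stated assumptions: each site is assumed to have already trained a local parameter vector, and the coordinator combines them weighted by local sample count. The function name and the example numbers are hypothetical; this is not the FeatureCloud API.

```python
def federated_average(site_weights, site_sizes):
    """Combine per-site parameter vectors into one global vector.

    site_weights: list of equal-length lists of floats (one per site).
    site_sizes:   number of local samples at each site (used as weights).
    Only parameters and counts are exchanged, never raw records.
    """
    if len(site_weights) != len(site_sizes) or not site_weights:
        raise ValueError("need one sample count per site")
    total = sum(site_sizes)
    dim = len(site_weights[0])
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(dim)
    ]


# Hypothetical example: two sites with 1 and 3 samples respectively.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Weighting by sample count means a site with more data pulls the global model further toward its local estimate, which is one common design choice among several.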
3.
Bioinformatics ; 38(21): 4919-4926, 2022 10 31.
Article in English | MEDLINE | ID: mdl-36073911

ABSTRACT

MOTIVATION: In multi-cohort machine learning studies, it is critical to differentiate between effects that are reproducible across cohorts and those that are cohort-specific. Multi-task learning (MTL) is a machine learning approach that facilitates this differentiation through the simultaneous learning of prediction tasks across cohorts. Since multi-cohort data often cannot be combined into a single storage solution, an MTL application for geographically distributed data sources would be of substantial utility. RESULTS: Here, we describe the development of 'dsMTL', a computational framework for privacy-preserving, distributed multi-task machine learning that includes three supervised algorithms and one unsupervised algorithm. First, we derive the theoretical properties of these methods and the relevant machine learning workflows to ensure the validity of the software implementation. Second, we implement dsMTL as a library for the R programming language, building on the DataSHIELD platform that supports the federated analysis of sensitive individual-level data. Third, we demonstrate the applicability of dsMTL for comorbidity modeling in distributed data. We show that comorbidity modeling using dsMTL outperformed conventional federated machine learning, as well as the aggregation of multiple models built on the distributed datasets individually. The application of dsMTL was computationally efficient and highly scalable when applied to moderate-size (n < 500), real expression data given the actual network latency. AVAILABILITY AND IMPLEMENTATION: dsMTL is freely available at https://github.com/transbioZI/dsMTLBase (server-side package) and https://github.com/transbioZI/dsMTLClient (client-side package). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Machine Learning , Privacy , Humans , Software , Programming Languages , Algorithms
4.
NPJ Syst Biol Appl ; 8(1): 5, 2022 02 07.
Article in English | MEDLINE | ID: mdl-35132075

ABSTRACT

High-grade serous ovarian carcinoma (HGSC) is the most lethal gynecologic malignancy due to chemoresistance and the lack of reliable biomarkers and effective treatments. Improved diagnosis and the development of targeted therapies are still needed. The molecular pathomechanisms driving HGSC progression are not fully understood, though they are crucial for effective diagnosis and the identification of novel targeted therapy options. The oncogene CTCFL (BORIS), the paralog of CTCF, is a transcription factor highly expressed in ovarian cancer (but rarely in any other tissue in females) with cancer-specific characteristics and therapeutic potential. In this work, we seek to understand the regulatory functions of CTCFL to unravel new target genes with clinical relevance. We used in vitro models to evaluate the transcriptional changes due to the presence of CTCFL, followed by a selection of gene candidates using de novo network enrichment analysis. The resulting mechanistic candidates were further assessed regarding their prognostic potential and druggability. We show that CTCFL-driven genes are involved in cytoplasmic membrane functions; in particular, the PI3K-Akt initiators EGFR1 and VEGFA, as well as ITGB3 and ITGB6, are potential drug targets. In addition, we identified the CTCFL targets ACTBL2, MALT1 and PCDH7 as mechanistic biomarkers to predict survival in HGSC. Finally, we elucidated the value of CTCFL in combination with its targets as a prognostic marker profile for HGSC progression and as putative drug targets.


Subjects
DNA-Binding Proteins , Ovarian Neoplasms , DNA-Binding Proteins/genetics , Female , Humans , Ovarian Neoplasms/drug therapy , Ovarian Neoplasms/genetics , Ovarian Neoplasms/metabolism , Phosphatidylinositol 3-Kinases/genetics , Proto-Oncogene Proteins c-akt/genetics , Signal Transduction , Transcription Factors
5.
Genome Biol ; 23(1): 32, 2022 01 24.
Article in English | MEDLINE | ID: mdl-35073941

ABSTRACT

Meta-analysis has been established as an effective approach to combining summary statistics of several genome-wide association studies (GWAS). However, the accuracy of meta-analysis can be attenuated in the presence of cross-study heterogeneity. We present sPLINK, a hybrid federated and user-friendly tool, which performs privacy-aware GWAS on distributed datasets while preserving the accuracy of the results. sPLINK is robust against heterogeneous distributions of data across cohorts while meta-analysis considerably loses accuracy in such scenarios. sPLINK achieves practical runtime and acceptable network usage for chi-square and linear/logistic regression tests. sPLINK is available at https://exbio.wzw.tum.de/splink .


Subjects
Genome-Wide Association Study , Privacy , Genome-Wide Association Study/methods , Linear Models , Logistic Models , Meta-Analysis as Topic
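
The abstract above contrasts federated analysis with meta-analysis for tests such as chi-square. One way to see why a federated test can match the pooled result exactly: a 2x2 chi-square statistic depends only on the cell counts, so summing per-site counts reproduces the pooled table. The sketch below assumes each site shares only its four counts for a given variant (rows: case/control, columns: minor/major allele); the function names and numbers are hypothetical, and this is not sPLINK's actual protocol.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]] (no continuity correction)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("degenerate table: a row or column sums to zero")
    return n * (a * d - b * c) ** 2 / denom


def federated_chi_square(site_tables):
    """Sum per-site (a, b, c, d) counts, then test the aggregated table."""
    a, b, c, d = (sum(t[i] for t in site_tables) for i in range(4))
    return chi_square_2x2(a, b, c, d)


# Hypothetical allele counts from two cohorts with very different case/control mixes.
site_tables = [(10, 20, 30, 40), (5, 5, 10, 0)]
stat = federated_chi_square(site_tables)
```

Note that the second site alone has a zero cell, so a per-site test there would be unstable, while the aggregated table is well defined.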
6.
PLOS Digit Health ; 1(9): e0000101, 2022 Sep.
Article in English | MEDLINE | ID: mdl-36812603

ABSTRACT

Clinical time-to-event studies depend on large sample sizes, which are often not available at a single institution. At the same time, particularly in the medical field, individual institutions are often legally unable to share their data, as medical data is subject to strong privacy protection due to its particular sensitivity. The collection of such data, and especially its aggregation into centralized datasets, is also fraught with substantial legal risks and is often outright unlawful. Existing solutions using federated learning have already demonstrated considerable potential as an alternative to central data collection. Unfortunately, current approaches are incomplete or not easily applicable in clinical studies owing to the complexity of federated infrastructures. This work presents privacy-aware and federated implementations of the most used time-to-event algorithms in clinical trials (survival curve, cumulative hazard rate, log-rank test, and Cox proportional hazards model), based on a hybrid approach of federated learning, additive secret sharing, and differential privacy. On several benchmark datasets, we show that all algorithms produce highly similar, or in some cases even identical, results compared to traditional centralized time-to-event algorithms. Furthermore, we were able to reproduce the results of a previous clinical time-to-event study in various federated scenarios. All algorithms are accessible through the intuitive web-app Partea (https://partea.zbh.uni-hamburg.de), offering a graphical user interface for clinicians and non-computational researchers without programming knowledge. Partea removes the high infrastructural hurdles of existing federated learning approaches and removes the complexity of execution. Therefore, it is an easy-to-use alternative to central data collection, reducing both bureaucratic effort and the legal risks associated with the processing of personal data to a minimum.
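
The abstract above names additive secret sharing as one building block. A minimal sketch of the general technique follows: each site splits its private count into random shares that sum to the count modulo a large prime, so no single share reveals anything, while the sum of all shares reveals only the total. The modulus, function names, and example counts are assumptions for illustration; this is not Partea's implementation.

```python
import secrets

PRIME = 2**61 - 1  # assumed modulus; private values must be non-negative and smaller


def make_shares(value, n_parties):
    """Split value into n_parties random shares that sum to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def reconstruct(shares):
    """Recover the shared value by adding all shares mod PRIME."""
    return sum(shares) % PRIME


def secure_sum(private_values):
    """Simulate all parties in one process: only the total is revealed."""
    n = len(private_values)
    # Site i creates n shares and would send share j to party j.
    all_shares = [make_shares(v, n) for v in private_values]
    # Party j adds up the shares it received; each partial sum looks random.
    partial = [sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)]
    return reconstruct(partial)


# Hypothetical event counts at one time point from three hospitals.
total_events = secure_sum([12, 7, 30])
```

Aggregated counts of this kind (events and patients at risk per time point) are the inputs that survival-curve estimators need, which is why a secure sum is a natural primitive here.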

7.
Genome Biol ; 22(1): 338, 2021 12 14.
Article in English | MEDLINE | ID: mdl-34906207

ABSTRACT

Aggregating transcriptomics data across hospitals can increase sensitivity and robustness of differential expression analyses, yielding deeper clinical insights. As data exchange is often restricted by privacy legislation, meta-analyses are frequently employed to pool local results. However, the accuracy might drop if class labels are inhomogeneously distributed among cohorts. Flimma ( https://exbio.wzw.tum.de/flimma/ ) addresses this issue by implementing the state-of-the-art workflow limma voom in a federated manner, i.e., patient data never leaves its source site. Flimma results are identical to those generated by limma voom on aggregated datasets even in imbalanced scenarios where meta-analysis approaches fail.


Subjects
Gene Expression , Privacy , Biomedical Research , Computer Communication Networks , Computer Security/legislation & jurisprudence , Computer Security/standards , Databases, Factual/legislation & jurisprudence , Databases, Factual/standards , Gene Expression/ethics , Genes , Government Regulation , Humans , Machine Learning
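
The abstract above states that the federated results are identical to those from aggregated data. For linear models, one general reason this is possible is that a least-squares fit depends on the data only through a few sums, which sites can compute locally and a coordinator can add up. The sketch below shows this for simple one-variable regression; it is an illustration of the principle with hypothetical names and toy numbers, not Flimma's actual workflow, which involves further steps of limma voom.

```python
def local_stats(xs, ys):
    """Computed at one site: sufficient statistics for least squares."""
    return (
        len(xs),
        sum(xs),
        sum(ys),
        sum(x * x for x in xs),
        sum(x * y for x, y in zip(xs, ys)),
    )


def fit_from_stats(stats_list):
    """Coordinator: add the per-site statistics and solve for slope and intercept."""
    n, sx, sy, sxx, sxy = (sum(s[i] for s in stats_list) for i in range(5))
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("x values are constant; slope is undefined")
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Hypothetical toy data split across two sites; all points lie on y = 2x + 1.
site_a = local_stats([0, 1], [1, 3])
site_b = local_stats([2, 3], [5, 7])
slope, intercept = fit_from_stats([site_a, site_b])
```

Because the sums are exact, the federated fit equals the fit on the pooled data, whereas averaging two separately fitted models would generally not.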
8.
Nat Comput Sci ; 1(1): 33-41, 2021 Jan.
Article in English | MEDLINE | ID: mdl-38217166

ABSTRACT

Responding quickly to unknown pathogens is crucial to stop uncontrolled spread of diseases that lead to epidemics, such as the novel coronavirus, and to keep protective measures at a level that causes as little social and economic harm as possible. This can be achieved through computational approaches that significantly speed up drug discovery. A powerful approach is to restrict the search to existing drugs through drug repurposing, which can vastly accelerate the usually long approval process. In this Review, we examine a representative set of currently used computational approaches to identify repurposable drugs for COVID-19, as well as their underlying data resources. Furthermore, we compare drug candidates predicted by computational methods to drugs being assessed by clinical trials. Finally, we discuss lessons learned from the reviewed research efforts, including how to successfully connect computational approaches with experimental studies, and propose a unified drug repurposing strategy for better preparedness in the case of future outbreaks.

9.
Nat Commun ; 11(1): 3518, 2020 07 14.
Article in English | MEDLINE | ID: mdl-32665542

ABSTRACT

Coronavirus Disease-2019 (COVID-19) is an infectious disease caused by the SARS-CoV-2 virus. Various studies exist about the molecular mechanisms of viral infection. However, such information is spread across many publications, and it is very time-consuming to integrate and exploit. We develop CoVex, an interactive online platform for SARS-CoV-2 host interactome exploration and drug (target) identification. CoVex integrates virus-human protein interactions, human protein-protein interactions, and drug-target interactions. It allows visual exploration of the virus-host interactome and implements systems medicine algorithms for network-based prediction of drug candidates. Thus, CoVex is a resource to understand molecular mechanisms of pathogenicity and to prioritize candidate therapeutics. We investigate recent hypotheses on a systems biology level to explore mechanistic virus life cycle drivers, and to extract drug repurposing candidates. CoVex renders COVID-19 drug research systems-medicine-ready by giving the scientific community direct access to network medicine algorithms. It is available at https://exbio.wzw.tum.de/covex/.


Subjects
Antiviral Agents/therapeutic use , Betacoronavirus/drug effects , Coronavirus Infections/drug therapy , Drug Repositioning/methods , Host Microbial Interactions/physiology , Pneumonia, Viral/drug therapy , Algorithms , COVID-19 , Computer Simulation , Humans , Internet , Pandemics , Protein Interaction Maps , SARS-CoV-2 , Virus Attachment/drug effects
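
The abstract above describes network-based prediction of drug candidates over an integrated interactome. As a generic illustration of that family of methods, the sketch below ranks drugs by the shortest-path distance from any of their targets to a set of seed proteins (for example, host proteins that interact with the virus). This proximity heuristic, the function names, and the toy network with placeholder node labels are all assumptions for illustration; the abstract does not specify which algorithms CoVex implements.

```python
from collections import deque


def bfs_distances(graph, sources):
    """Multi-source breadth-first search: hops from the nearest source node."""
    dist = {s: 0 for s in sources if s in graph}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def rank_drugs(graph, seeds, drug_targets):
    """Rank drugs by their closest target's distance to the seed proteins."""
    dist = bfs_distances(graph, seeds)
    scores = {}
    for drug, targets in drug_targets.items():
        reachable = [dist[t] for t in targets if t in dist]
        if reachable:  # drugs with no reachable target are left unranked
            scores[drug] = min(reachable)
    return sorted(scores.items(), key=lambda kv: (kv[1], kv[0]))


# Hypothetical toy interactome: a chain P1 - P2 - P3 - P4 (undirected).
graph = {"P1": ["P2"], "P2": ["P1", "P3"], "P3": ["P2", "P4"], "P4": ["P3"]}
seeds = ["P1"]
drug_targets = {"drugA": ["P4"], "drugB": ["P2"], "drugC": ["P9"]}
ranking = rank_drugs(graph, seeds, drug_targets)
```

Real network medicine methods typically also correct for node degree and use random-walk or centrality scores, so this should be read only as the simplest possible baseline.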