Results 1 - 12 of 12
1.
Bioinformatics ; 38(8): 2278-2286, 2022 04 12.
Article in English | MEDLINE | ID: mdl-35139148

ABSTRACT

MOTIVATION: Limited data access has hindered the field of precision medicine from exploring its full potential, e.g. concerning machine learning and privacy and data protection rules. Our study evaluates the efficacy of federated Random Forest (FRF) models, focusing particularly on the heterogeneity within and between datasets. We addressed three common challenges: (i) number of parties, (ii) sizes of datasets and (iii) imbalanced phenotypes, evaluated on five biomedical datasets. RESULTS: The FRF outperformed the average local models and performed comparably to the data-centralized models trained on the entire data. With an increasing number of models and decreasing dataset size, the performance of local models decreases drastically. The performance of the FRF, however, does not decrease significantly. When combining datasets of different sizes, the FRF vastly improve compared to the average local models. By analyzing different class imbalances, we demonstrate that the FRF remain more robust and outperform the local models. Our results support that FRF overcome boundaries of clinical research and enable collaborations across institutes without violating privacy or legal regulations. Clinicians benefit from a vast collection of unbiased data aggregated from different geographic locations, demographics and other varying factors. They can build more generalizable models to make better clinical decisions, which will have relevance, especially for patients in rural areas and rare or geographically uncommon diseases, enabling personalized treatment. In combination with secure multi-party computation, federated learning has the power to revolutionize clinical practice by increasing the accuracy and robustness of healthcare AI and thus paving the way for precision medicine. AVAILABILITY AND IMPLEMENTATION: The implementation of the federated random forests can be found at https://featurecloud.ai/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
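The idea of a federated forest described in the abstract above can be illustrated with a minimal sketch. It assumes that each site trains trees on its own data and that the global model is simply the union of all local trees with majority voting; the abstract does not specify the aggregation scheme, so this is an illustration rather than the FeatureCloud implementation. One-split trees (stumps) stand in for full decision trees, and all function names and data are hypothetical.

```python
import random
from collections import Counter


def train_stump(rows, rng):
    """Fit a one-split tree on a bootstrap sample of (features, label) rows."""
    sample = [rng.choice(rows) for _ in rows]
    best = None
    for feature in range(len(sample[0][0])):
        for x, _ in sample:
            threshold = x[feature]
            left = [y for xs, y in sample if xs[feature] <= threshold]
            right = [y for xs, y in sample if xs[feature] > threshold]
            if not left or not right:
                continue
            # Misclassified rows if each side predicts its own majority label.
            errors = sum(len(side) - Counter(side).most_common(1)[0][1]
                         for side in (left, right))
            if best is None or errors < best[0]:
                best = (errors, feature, threshold,
                        Counter(left).most_common(1)[0][0],
                        Counter(right).most_common(1)[0][0])
    if best is None:  # no valid split: always predict the majority label
        majority = Counter(y for _, y in sample).most_common(1)[0][0]
        return (0, float("inf"), majority, majority)
    return best[1:]


def train_local_forest(rows, n_trees, seed):
    """Runs inside one institution; only the fitted trees leave the site."""
    rng = random.Random(seed)
    return [train_stump(rows, rng) for _ in range(n_trees)]


def federate(local_forests):
    """Assumed aggregation: the global model is the union of all local trees."""
    return [tree for forest in local_forests for tree in forest]


def predict(forest, x):
    """Majority vote over all trees in the forest."""
    votes = Counter(left if x[f] <= t else right for f, t, left, right in forest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: the label depends on the first feature only,
# and the second site has an imbalanced class distribution.
site_a = [((0.1, 5.0), 0), ((0.2, 1.0), 0), ((0.3, 4.0), 0),
          ((0.7, 2.0), 1), ((0.8, 5.0), 1), ((0.9, 1.0), 1)]
site_b = [((0.15, 3.0), 0), ((0.25, 2.0), 0), ((0.6, 4.0), 1),
          ((0.65, 1.0), 1), ((0.75, 3.0), 1), ((0.85, 2.0), 1)]

global_forest = federate([train_local_forest(site_a, 25, seed=1),
                          train_local_forest(site_b, 25, seed=2)])
print(predict(global_forest, (0.05, 3.0)), predict(global_forest, (0.95, 3.0)))
```

In this sketch the raw rows never leave `train_local_forest`; only split parameters are exchanged, which is the property that makes the approach privacy-aware.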


Subjects
Privacy , Random Forest , Machine Learning , Precision Medicine , Delivery of Health Care
2.
J Med Internet Res ; 25: e42621, 2023 07 12.
Article in English | MEDLINE | ID: mdl-37436815

ABSTRACT

BACKGROUND: Machine learning and artificial intelligence have shown promising results in many areas and are driven by the increasing amount of available data. However, these data are often distributed across different institutions and cannot be easily shared owing to strict privacy regulations. Federated learning (FL) allows the training of distributed machine learning models without sharing sensitive data. Its implementation, however, is time-consuming and requires advanced programming skills and complex technical infrastructures. OBJECTIVE: Various tools and frameworks have been developed to simplify the development of FL algorithms and provide the necessary technical infrastructure. Although there are many high-quality frameworks, most focus only on a single application case or method. To our knowledge, there are no generic frameworks, meaning that the existing solutions are restricted to a particular type of algorithm or application field. Furthermore, most of these frameworks provide an application programming interface that needs programming knowledge. There is no collection of ready-to-use FL algorithms that are extendable and allow users (eg, researchers) without programming knowledge to apply FL. A central FL platform for both FL algorithm developers and users does not exist. This study aimed to address this gap and make FL available to everyone by developing FeatureCloud, an all-in-one platform for FL in biomedicine and beyond. METHODS: The FeatureCloud platform consists of 3 main components: a global frontend, a global backend, and a local controller. Our platform uses Docker to separate the locally acting components of the platform from the sensitive data systems. We evaluated our platform using 4 different algorithms on 5 data sets for both accuracy and runtime.
RESULTS: FeatureCloud removes the complexity of distributed systems for developers and end users by providing a comprehensive platform for executing multi-institutional FL analyses and implementing FL algorithms. Through its integrated artificial intelligence store, federated algorithms can easily be published and reused by the community. To secure sensitive raw data, FeatureCloud supports privacy-enhancing technologies to secure the shared local models and assures high standards in data privacy to comply with the strict General Data Protection Regulation. Our evaluation shows that applications developed in FeatureCloud can produce highly similar results compared with centralized approaches and scale well for an increasing number of participating sites. CONCLUSIONS: FeatureCloud provides a ready-to-use platform that integrates the development and execution of FL algorithms while reducing the complexity to a minimum and removing the hurdles of federated infrastructure. Thus, we believe that it has the potential to greatly increase the accessibility of privacy-preserving and distributed data analyses in biomedicine and beyond.


Subjects
Algorithms , Artificial Intelligence , Humans , Health Occupations , Software , Computer Communication Networks , Privacy
3.
Mult Scler ; 27(12): 1829-1837, 2021 10.
Article in English | MEDLINE | ID: mdl-33464158

ABSTRACT

BACKGROUND: Human endogenous retrovirus (HERV) expression in multiple sclerosis (MS) brain lesions may contribute to chronic inflammation, but expression of genome-wide HERVs in different MS lesions is unknown. OBJECTIVE: We examined the HERV expression landscape in different MS lesions compared to control brains. METHODS: Transcripts from 71 MS brain samples and 25 control white matter (WM) samples were obtained by next-generation RNA sequencing and mapped against HERV transcripts across the human genome. Differential expression of mapped HERV-W and HERV-H reads between MS lesion types and controls was analysed. RESULTS: Out of 6.38 billion high-quality paired-end reads, 174 million reads (2.73%) mapped to HERV transcripts. There was no difference in HERV expression levels between MS and control brains, but HERV-W transcripts were significantly reduced in chronic active lesions. Of the four HERV-W transcripts exclusively present in MS, ERV3633503, located on chromosome 7q21.13 close to the MS genetic risk locus, had the highest number of reads. In the HERV-H family, 75% of transcripts located at the nearby 7q21-22 region were overrepresented in MS, and ERV3643914 was expressed more than 16 times higher in MS than in control brains. CONCLUSION: Novel HERV-W and HERV-H transcripts located at chromosome 7 regions were uniquely expressed in MS lesions, indicating their potential role in brain lesion evolution.


Subjects
Endogenous Retroviruses , Multiple Sclerosis , Brain , Endogenous Retroviruses/genetics , Genome, Human , High-Throughput Nucleotide Sequencing , Humans , Multiple Sclerosis/genetics
4.
Front Immunol ; 13: 1043579, 2022.
Article in English | MEDLINE | ID: mdl-36532064

ABSTRACT

Infectious agents have long been considered to play a role in the pathogenesis of neurological diseases as part of the interaction between genetic susceptibility and the environment. The role of bacteria in CNS autoimmunity has also been highlighted by changes in the diversity of gut microbiota in patients with neurological diseases such as Parkinson's disease, Alzheimer's disease and multiple sclerosis, emphasizing the role of the gut-brain axis. We discuss the hypothesis of a brain microbiota, the BrainBiota: bacteria living in symbiosis with brain cells. The existence of various bacteria in the human brain is suggested by morphological evidence and by the presence of bacterial proteins, metabolites, transcripts and mucosal-associated invariant T cells. Based on our data, we discuss the hypothesis that these bacteria are an integral part of brain development and immune tolerance as well as directly linked to the gut microbiome. We further suggest that changes of the BrainBiota during brain diseases may be the consequence or cause of chronic inflammation, similarly to the gut microbiota.


Subjects
Gastrointestinal Microbiome , Microbiota , Multiple Sclerosis , Humans , Inflammation , Autoimmunity , Bacteria
5.
Genome Biol ; 23(1): 32, 2022 01 24.
Article in English | MEDLINE | ID: mdl-35073941

ABSTRACT

Meta-analysis has been established as an effective approach to combining summary statistics of several genome-wide association studies (GWAS). However, the accuracy of meta-analysis can be attenuated in the presence of cross-study heterogeneity. We present sPLINK, a hybrid federated and user-friendly tool, which performs privacy-aware GWAS on distributed datasets while preserving the accuracy of the results. sPLINK is robust against heterogeneous distributions of data across cohorts while meta-analysis considerably loses accuracy in such scenarios. sPLINK achieves practical runtime and acceptable network usage for chi-square and linear/logistic regression tests. sPLINK is available at https://exbio.wzw.tum.de/splink.
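Why a federated chi-square test can preserve accuracy is easy to see in a small sketch: if each cohort shares only its allele-count table, summing the tables reproduces exactly the statistic that would be computed on pooled individual-level data. The code below assumes a basic 2x2 allelic test and hypothetical function names and records; it does not reproduce sPLINK's actual protocol or its privacy mechanisms.

```python
def allele_counts(records):
    """Runs at one site. Each record is (is_case, minor_allele_count in 0..2).

    Returns (case_minor, case_major, control_minor, control_major);
    only these four integers would leave the site.
    """
    counts = [0, 0, 0, 0]
    for is_case, minor in records:
        offset = 0 if is_case else 2
        counts[offset] += minor
        counts[offset + 1] += 2 - minor
    return tuple(counts)


def aggregate(site_counts):
    """Element-wise sum of the per-site count tables."""
    return tuple(sum(column) for column in zip(*site_counts))


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


# Hypothetical cohorts with different case/control ratios.
cohort_1 = [(True, 2), (True, 1), (True, 1), (False, 0), (False, 1)]
cohort_2 = [(True, 1), (False, 0), (False, 0), (False, 1), (False, 0)]

federated = chi_square_2x2(*aggregate([allele_counts(cohort_1),
                                       allele_counts(cohort_2)]))
pooled = chi_square_2x2(*allele_counts(cohort_1 + cohort_2))
print(federated == pooled)
```

Because the test statistic depends on the data only through the count table, heterogeneous cohort compositions do not affect the federated result in this sketch.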


Subjects
Genome-Wide Association Study , Privacy , Genome-Wide Association Study/methods , Linear Models , Logistic Models , Meta-Analysis as Topic
6.
Genome Biol ; 22(1): 338, 2021 12 14.
Article in English | MEDLINE | ID: mdl-34906207

ABSTRACT

Aggregating transcriptomics data across hospitals can increase sensitivity and robustness of differential expression analyses, yielding deeper clinical insights. As data exchange is often restricted by privacy legislation, meta-analyses are frequently employed to pool local results. However, the accuracy might drop if class labels are inhomogeneously distributed among cohorts. Flimma (https://exbio.wzw.tum.de/flimma/) addresses this issue by implementing the state-of-the-art workflow limma voom in a federated manner, i.e., patient data never leaves its source site. Flimma results are identical to those generated by limma voom on aggregated datasets even in imbalanced scenarios where meta-analysis approaches fail.
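The claim that federated results can be identical to those on aggregated data follows, for least-squares fitting, from the fact that the fit depends only on sums that can be computed per site and then added. The sketch below shows this for a single-predictor linear model; it is an assumption-laden illustration of the principle, not the limma voom workflow or Flimma's implementation, and all names and values are hypothetical.

```python
def local_statistics(xs, ys):
    """Runs at one site: sufficient statistics for simple linear regression.

    Returns (n, sum_x, sum_y, sum_xx, sum_xy); no individual values are shared.
    """
    return (len(xs), sum(xs), sum(ys),
            sum(x * x for x in xs), sum(x * y for x, y in zip(xs, ys)))


def fit_from_statistics(site_statistics):
    """Coordinator: add up the per-site sums and solve the normal equations."""
    n, sx, sy, sxx, sxy = (sum(column) for column in zip(*site_statistics))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Hypothetical data lying on y = 2x + 1, split unevenly across two sites.
site_a = ([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])
site_b = ([3.0, 4.0], [7.0, 9.0])

federated_fit = fit_from_statistics([local_statistics(*site_a),
                                     local_statistics(*site_b)])
pooled_fit = fit_from_statistics([local_statistics(site_a[0] + site_b[0],
                                                   site_a[1] + site_b[1])])
print(federated_fit, pooled_fit)
```

Unlike a meta-analysis of per-site estimates, this aggregation does not depend on how samples or class labels are distributed among the sites.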


Subjects
Gene Expression , Privacy , Biomedical Research , Computer Communication Networks , Computer Security/legislation & jurisprudence , Computer Security/standards , Databases, Factual/legislation & jurisprudence , Databases, Factual/standards , Gene Expression/ethics , Genes , Government Regulation , Humans , Machine Learning
7.
Netw Syst Med ; 3(1): 122-129, 2020.
Article in English | MEDLINE | ID: mdl-32954379

ABSTRACT

Introduction: Multiple sclerosis (MS) is a chronic disorder of the central nervous system with an untreatable late progressive phase. Molecular maps of different stages of brain lesion evolution in patients with progressive multiple sclerosis (PMS) are missing but are critical for understanding disease development and identifying novel targets to halt progression. Materials and Methods: The MS Atlas database comprises comprehensive high-quality transcriptomic profiles of 98 white matter (WM) brain samples of different lesion types (normal-appearing WM [NAWM], active, chronic active, inactive, remyelinating) from ten progressive MS patients and 25 WM areas from five non-neurological diseased cases. Results: We introduce the first MS brain lesion atlas (msatlas.dk), developed to address the current challenges of understanding mechanisms driving the fate on a lesion basis. The MS Atlas provides a means of testing research hypotheses and validating biomarkers and drug targets. It comes with a user-friendly web interface, and it fosters bioinformatic methods for de novo network enrichment to extract mechanistic markers for specific lesion types and pathway-based lesion type comparison. We describe examples of how the MS Atlas can be used to extract systems medicine signatures and demonstrate the interface of MS Atlas. Conclusion: This compendium of mechanistic PMS WM lesion profiles is an invaluable resource to fuel future MS research and a new basis for treatment development.

8.
Acta Neuropathol Commun ; 7(1): 205, 2019 12 11.
Article in English | MEDLINE | ID: mdl-31829262

ABSTRACT

To identify pathogenetic markers and potential drivers of different lesion types in the white matter (WM) of patients with progressive multiple sclerosis (PMS), we sequenced RNA from 73 different WM areas. Compared to 25 WM controls, 6713 out of 18,609 genes were significantly differentially expressed in MS tissues (FDR < 0.05). A computational systems medicine analysis was performed to describe the MS lesion endophenotypes. The cellular source of specific molecules was examined by RNAscope, immunohistochemistry, and immunofluorescence. To examine common lesion-specific mechanisms, we performed de novo network enrichment based on shared differentially expressed genes (DEGs), and found TGFβ-R2 as a central hub. RNAscope revealed astrocytes as the cellular source of TGFβ-R2 in remyelinating lesions. Since lesion-specific unique DEGs were more common than shared signatures, we examined lesion-specific pathways and de novo networks enriched with unique DEGs. Such network analysis indicated classic inflammatory responses in active lesions; catabolic and heat shock protein responses in inactive lesions; neuronal/axonal specific processes in chronic active lesions. In remyelinating lesions, de novo analyses identified axonal transport responses and adaptive immune markers, which was also supported by the most heterogeneous immunoglobulin gene expression. The signature of the normal-appearing white matter (NAWM) was more similar to control WM than to lesions: only 465 DEGs differentiated NAWM from controls, and 16 were unique. The upregulated marker CD26/DPP4 was expressed by microglia in the NAWM but by mononuclear cells in active lesions, which may indicate a special subset of microglia before the lesion develops, but also emphasizes that omics related to MS lesions should be interpreted in the context of different lesion types.
While chronic active lesions were the most distinct from control WM based on the highest number of unique DEGs (n = 2213), remyelinating lesions had the highest gene expression levels, and the most different molecular map from chronic active lesions. This may suggest that these two lesion types represent two ends of the spectrum of lesion evolution in PMS. The profound changes in chronic active lesions, the predominance of synaptic/neural/axonal signatures coupled with minor inflammation may indicate end-stage irreversible molecular events responsible for this less treatable phase.


Subjects
Brain/pathology , High-Throughput Nucleotide Sequencing/methods , Multiple Sclerosis, Chronic Progressive/genetics , Multiple Sclerosis, Chronic Progressive/pathology , Sequence Analysis, RNA/methods , White Matter/pathology , Gene Expression Profiling/methods , Humans , Receptor, Transforming Growth Factor-beta Type II/genetics
9.
Acta Neuropathol Commun ; 7(1): 136, 2019 08 21.
Article in English | MEDLINE | ID: mdl-31434573

ABSTRACT

The authors have retracted this article [1] because a line was omitted from the data sheet; this was due to a bug in the analysis scripts.

10.
Acta Neuropathol Commun ; 7(1): 58, 2019 04 25.
Article in English | MEDLINE | ID: mdl-31023379

ABSTRACT

The heterogeneity of multiple sclerosis is reflected by dynamic changes of different lesion types in the brain white matter (WM). To identify potential drivers of this process, we RNA-sequenced 73 WM areas from patients with progressive MS (PMS) and 25 control WM. Lesion endophenotypes were described by a computational systems medicine analysis combined with RNAscope, immunohistochemistry, and immunofluorescence. The signature of the normal-appearing WM (NAWM) was more similar to control WM than to lesions: one of the six upregulated genes in NAWM was CD26/DPP4 expressed by microglia. Chronic active lesions that become prominent in PMS had a signature that was different from all other lesion types, and were differentiated from them by two clusters of 62 differentially expressed genes (DEGs). An upcoming MS biomarker, CHI3L1, was among the top ten upregulated genes in chronic active lesions expressed by astrocytes in the rim. TGFβ-R2 was the central hub in a remyelination-related protein interaction network, and was expressed there by astrocytes. We used de novo networks enriched by unique DEGs to determine lesion-specific pathway regulation, i.e. cellular trafficking and activation in active lesions; healing and immune responses in remyelinating lesions characterized by the most heterogeneous immunoglobulin gene expression; coagulation and ion balance in inactive lesions; and metabolic changes in chronic active lesions. Because we found inverse differential regulation of particular genes among different lesion types, our data emphasize that omics related to MS lesions should be interpreted in the context of lesion pathology. Our data indicate that the impact of molecular pathways is substantially changing as different lesions develop. This was also reflected by the high number of unique DEGs that were more common than shared signatures.
A special microglia subset characterized by CD26 may play a role in early lesion development, while astrocyte-derived TGFβ-R2 and TGFβ pathways may be drivers of repair in contrast to chronic tissue damage. The highly specific mechanistic signature of chronic active lesions indicates that as these lesions develop in PMS, the molecular changes are substantially skewed: the unique mitochondrial/metabolic changes and specific downregulation of molecules involved in tissue repair may reflect a stage of exhaustion.

11.
Methods Mol Biol ; 1807: 51-62, 2018.
Article in English | MEDLINE | ID: mdl-30030803

ABSTRACT

DNA methylation has a strong influence on gene expression such that differences in methylation are associated with a wide range of diseases. Array-based approaches like the Illumina 450K or 850K EPIC chips have been used in a wide range of studies, mostly comparing a disease group with healthy controls, but also to correlate methylation with survival times, for instance. Processing, normalization, and analysis of raw data require extensive knowledge of statistics and programming languages such as R. Here we introduce DiMmer, an easy-to-use Java tool for the analysis of epigenome-wide association studies (EWAS). A graphical user interface guides the user through preprocessing, normalization, testing for differentially methylated CpGs, and finally the discovery of differentially methylated regions (DMRs). The software performs randomization tests to compute empirical P-values, corrects for multiple testing, and requires no prior knowledge of programming. All computed results are provided as plots or tables and can be easily exported. DiMmer is thus a powerful one-stop-shop for EWAS data analysis.
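The randomization tests mentioned above can be sketched for a single CpG site as follows. The example assumes a difference in mean methylation between two groups as the test statistic and a label-shuffling scheme with the common (hits + 1) / (permutations + 1) estimator; the abstract does not state which statistic or scheme DiMmer uses, and the function names and beta values are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def empirical_p_value(group_a, group_b, n_permutations=999, seed=0):
    """Two-sided permutation test on the absolute difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign group labels
        perm_a = pooled[:len(group_a)]
        perm_b = pooled[len(group_a):]
        if abs(mean(perm_a) - mean(perm_b)) >= observed:
            hits += 1
    # Adding 1 counts the observed labelling itself, so p is never zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical methylation beta values for one CpG in cases and controls.
cases = [0.80, 0.82, 0.79, 0.85, 0.81, 0.83]
controls = [0.10, 0.12, 0.11, 0.13, 0.09, 0.10]
print(empirical_p_value(cases, controls))
```

In a real EWAS this test would be repeated for every CpG, which is why the resulting P-values then need a multiple-testing correction.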


Subjects
DNA Methylation/genetics , Epigenesis, Genetic , Genome-Wide Association Study/methods , Software , CpG Islands/genetics , Humans , Molecular Sequence Annotation , User-Computer Interface
12.
Metabolites ; 5(2): 344-63, 2015 Jun 10.
Article in English | MEDLINE | ID: mdl-26065494

ABSTRACT

Computational breath analysis is a growing research area aiming at identifying volatile organic compounds (VOCs) in human breath to assist medical diagnostics of the next generation. While inexpensive and non-invasive bioanalytical technologies for metabolite detection in exhaled air and bacterial/fungal vapor exist and the first studies on the power of supervised machine learning methods for profiling of the resulting data have been conducted, we lack methods to extract hidden data features emerging from confounding factors. Here, we present Carotta, a new cluster analysis framework dedicated to uncovering such hidden substructures by sophisticated unsupervised statistical learning methods. We study the power of transitivity clustering and hierarchical clustering to identify groups of VOCs with similar expression behavior over most patient breath samples and/or groups of patients with a similar VOC intensity pattern. This enables the discovery of dependencies between metabolites. On the one hand, this allows us to eliminate the effect of potential confounding factors hindering disease classification, such as smoking. On the other hand, we may also identify VOCs associated with disease subtypes or concomitant diseases. Carotta is open-source software with an intuitive graphical user interface promoting data handling, analysis and visualization. The back-end is designed to be modular, allowing for easy extensions with plugins in the future, such as new clustering methods and statistics. It does not require much prior knowledge or technical skills to operate. We demonstrate its power and applicability by means of one artificial dataset. We also apply Carotta exemplarily to a real-world example dataset on chronic obstructive pulmonary disease (COPD).
While the artificial data are utilized as a proof of concept, we will demonstrate how Carotta finds candidate markers in our real dataset associated with confounders rather than the primary disease (COPD) and bronchial carcinoma (BC). Carotta is publicly available at http://carotta.compbio.sdu.dk [1].
