Search | BVS CLAP/SMR-OPAS/OMS

1.

Data Sanitization to Reduce Private Information Leakage from Functional Genomics.

Gürsoy, Gamze; Emani, Prashant; Brannon, Charlotte M; Jolanki, Otto A; Harmanci, Arif; Strattan, J Seth; Cherry, J Michael; Miranker, Andrew D; Gerstein, Mark.

Cell ; 183(4): 905-917.e16, 2020 11 12.

Article in English | MEDLINE | ID: mdl-33186529

ABSTRACT

The generation of functional genomics datasets is surging, because they provide insight into gene regulation and organismal phenotypes (e.g., genes upregulated in cancer). The intent behind functional genomics experiments is not necessarily to study genetic variants, yet they pose privacy concerns due to their use of next-generation sequencing. Moreover, there is a great incentive to broadly share raw reads for better statistical power and general research reproducibility. Thus, we need new modes of sharing beyond traditional controlled-access models. Here, we develop a data-sanitization procedure allowing raw functional genomics reads to be shared while minimizing privacy leakage, enabling principled privacy-utility trade-offs. Our protocol works with traditional Illumina-based assays and newer technologies such as 10x single-cell RNA sequencing. It involves quantifying the privacy leakage in reads by statistically linking study participants to known individuals. We carried out these linkages using data from highly accurate reference genomes and more realistic environmental samples.

Subjects

Computer Security , Genomics , Privacy , Human Genome , Genotype , High-Throughput Nucleotide Sequencing , Humans , Phenotype , Phylogeny , Reproducibility of Results , RNA Sequence Analysis , Single-Cell Analysis

2.

Are Data Sharing and Privacy Protection Mutually Exclusive?

Joly, Yann; Dyke, Stephanie O M; Knoppers, Bartha M; Pastinen, Tomi.

Cell ; 167(5): 1150-1154, 2016 11 17.

Article in English | MEDLINE | ID: mdl-27863233

ABSTRACT

We review emerging strategies to protect the privacy of research participants in international epigenome research: open consent, genome donation, registered access, automated procedures, and privacy-enhancing technologies.

Subjects

Genomics/ethics , Genomics/legislation & jurisprudence , Information Dissemination , Privacy , High-Throughput Nucleotide Sequencing , Human Genome Project/ethics , Human Genome Project/legislation & jurisprudence , Humans , DNA Sequence Analysis

3.

Sociotechnical safeguards for genomic data privacy.

Wan, Zhiyu; Hazel, James W; Clayton, Ellen Wright; Vorobeychik, Yevgeniy; Kantarcioglu, Murat; Malin, Bradley A.

Nat Rev Genet ; 23(7): 429-445, 2022 07.

Article in English | MEDLINE | ID: mdl-35246669

ABSTRACT

Recent developments in a variety of sectors, including health care, research and the direct-to-consumer industry, have led to a dramatic increase in the amount of genomic data that are collected, used and shared. This state of affairs raises new and challenging concerns for personal privacy, both legally and technically. This Review appraises existing and emerging threats to genomic data privacy and discusses how well current legal frameworks and technical safeguards mitigate these concerns. It concludes with a discussion of remaining and emerging challenges and illustrates possible solutions that can balance protecting privacy and realizing the benefits that result from the sharing of genetic information.

Subjects

Genomics , Privacy , Genome

4.

Functional genomics data: privacy risk assessment and technological mitigation.

Gürsoy, Gamze; Li, Tianxiao; Liu, Susanna; Ni, Eric; Brannon, Charlotte M; Gerstein, Mark B.

Nat Rev Genet ; 23(4): 245-258, 2022 04.

Article in English | MEDLINE | ID: mdl-34759381

ABSTRACT

The generation of functional genomics data by next-generation sequencing has increased greatly in the past decade. Broad sharing of these data is essential for research advancement but poses notable privacy challenges, some of which are analogous to those that occur when sharing genetic variant data. However, there are also unique privacy challenges that arise from cryptic information leakage during the processing and summarization of functional genomics data from raw reads to derived quantities, such as gene expression values. Here, we review these challenges and present potential solutions for mitigating privacy risks while allowing broad data dissemination and analysis.

Subjects

Genetic Privacy , Privacy , Genomics , High-Throughput Nucleotide Sequencing , Risk Assessment

5.

An open science study of ageing in companion dogs.

Creevy, Kate E; Akey, Joshua M; Kaeberlein, Matt; Promislow, Daniel E L.

Nature ; 602(7895): 51-57, 2022 02.

Article in English | MEDLINE | ID: mdl-35110758

ABSTRACT

The Dog Aging Project is a long-term longitudinal study of ageing in tens of thousands of companion dogs. The domestic dog is among the most variable mammal species in terms of morphology, behaviour, risk of age-related disease and life expectancy. Given that dogs share the human environment and have a sophisticated healthcare system but are much shorter-lived than people, they offer a unique opportunity to identify the genetic, environmental and lifestyle factors associated with healthy lifespan. To take advantage of this opportunity, the Dog Aging Project will collect extensive survey data, environmental information, electronic veterinary medical records, genome-wide sequence information, clinicopathology and molecular phenotypes derived from blood cells, plasma and faecal samples. Here, we describe the specific goals and design of the Dog Aging Project and discuss the potential for this open-data, community science study to greatly enhance understanding of ageing in a genetically variable, socially relevant species living in a complex environment.

Subjects

Aging/physiology , Dogs/physiology , Information Dissemination , Pets/physiology , Aging/drug effects , Aging/genetics , Animals , Biomarkers , Built Environment , Veterinary Clinical Trials as Topic , Cross-Sectional Studies , Data Collection , Dogs/genetics , Female , Frailty/veterinary , Gene-Environment Interaction , Genome-Wide Association Study , Goals , Healthy Aging/drug effects , Humans , Inflammation/veterinary , Informed Consent , Life Style , Longevity/drug effects , Longevity/genetics , Longevity/physiology , Longitudinal Studies , Male , Animal Models , Multimorbidity , Pets/genetics , Privacy , Sirolimus/pharmacology

6.

Patient privacy in AI-driven omics methods.

Zhou, Juexiao; Huang, Chao; Gao, Xin.

Trends Genet ; 40(5): 383-386, 2024 May.

Article in English | MEDLINE | ID: mdl-38637270

ABSTRACT

Artificial intelligence (AI) in omics analysis raises privacy threats to patients. Here, we briefly discuss risk factors to patient privacy in data sharing, model training, and release, as well as methods to safeguard and evaluate patient privacy in AI-driven omics methods.

Subjects

Artificial Intelligence , Genomics , Humans , Genomics/methods , Privacy , Information Dissemination

7.

Privacy-preserving biological age prediction over federated human methylation data using fully homomorphic encryption.

Goldenberg, Meir; Mualem, Loay; Shahar, Amit; Snir, Sagi; Akavia, Adi.

Genome Res ; 34(9): 1324-1333, 2024 Oct 11.

Article in English | MEDLINE | ID: mdl-39237299

ABSTRACT

DNA methylation data play a crucial role in estimating chronological age in mammals, offering real-time insights into an individual's aging process. The epigenetic pacemaker (EPM) model allows inference of the biological age as deviations from the population trend. Given the sensitivity of this data, it is essential to safeguard both inputs and outputs of the EPM model. A privacy-preserving approach for EPM computation utilizing fully homomorphic encryption was recently introduced. However, this method has limitations, including having high communication complexity and being impractical for large data sets. The current work presents a new privacy-preserving protocol for EPM computation, analytically improving both privacy and complexity. Notably, we employ a single server for the secure computation phase while ensuring privacy even in the event of server corruption (compared to requiring two noncolluding servers in prior work). Using techniques from symbolic algebra and number theory, the new protocol eliminates the need for communication during secure computation, significantly improves asymptotic runtime, and offers better compatibility to parallel computing for further time complexity reduction. We implemented our protocol, demonstrating its ability to produce results similar to the standard (insecure) EPM model with substantial performance improvement compared to prior work. These findings hold promise for enhancing data security in medical applications where personal privacy is paramount. The generality of both the new approach and the EPM suggests that this protocol may be useful in other applications employing similar expectation-maximization techniques.

Subjects

Aging , Computer Security , DNA Methylation , Humans , Aging/genetics , Genetic Epigenesis , Privacy , Algorithms

8.

Searching across-cohort relatives in 54,092 GWAS samples via encrypted genotype regression.

Zhang, Qi-Xin; Liu, Tianzi; Guo, Xinxin; Zhen, Jianxin; Yang, Meng-Yuan; Khederzadeh, Saber; Zhou, Fang; Han, Xiaotong; Zheng, Qiwen; Jia, Peilin; Ding, Xiaohu; He, Mingguang; Zou, Xin; Liao, Jia-Kai; Zhang, Hongxin; He, Ji; Zhu, Xiaofeng; Lu, Daru; Chen, Hongyan; Zeng, Changqing; Liu, Fan; Zheng, Hou-Feng; Liu, Siyang; Xu, Hai-Ming; Chen, Guo-Bo.

PLoS Genet ; 20(1): e1011037, 2024 Jan.

Article in English | MEDLINE | ID: mdl-38206971

ABSTRACT

Explicitly sharing individual-level data in genomics studies has many merits compared to sharing summary statistics, including stricter quality control (QC), common statistical analyses, relative identification and improved statistical power in GWAS, but it is hampered by privacy or ethical constraints. In this study, we developed encG-reg, a regression approach that can detect relatives of various degrees based on encrypted genomic data, which is immune to ethical constraints. The encryption properties of encG-reg are based on random matrix theory: the original genotypic matrix is masked without sacrificing the precision of individual-level genotype data. We established a connection between the dimension of the random matrix that masks the genotype matrices and the required precision of a study for encrypted genotype data. encG-reg has false positive and false negative rates equivalent to those of sharing original individual-level data, and is computationally efficient when searching for relatives. We split the UK Biobank into its respective centers and then encrypted the genotype data. We observed that the relatives estimated using encG-reg were as accurate as those estimated by KING, which is widely used software but requires the original genotype data. In a more complex application, we launched a finely devised multi-center collaboration across 5 research institutes in China, covering 9 cohorts of 54,092 GWAS samples. encG-reg again identified true relatives across the cohorts, even with different ethnic backgrounds and genotypic qualities. Our study clearly demonstrates that encrypted genomic data can be used for data sharing without loss of information or data sharing barriers.

Subjects

Genome-Wide Association Study , Privacy , Humans , Genome-Wide Association Study/methods , Genotype , Software , Genomics
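
The masking idea in the encG-reg abstract above (projecting genotype matrices through a shared random matrix and comparing only the masked data) can be illustrated with a small sketch. This is not the authors' implementation: the standard-normal entries of `S`, the dimensions `m` and `k`, the simulated genotypes, and the use of known allele frequencies for standardization are all assumptions made for the example.

```python
import random

def simulate_genotype(freqs, rng):
    """Draw one genotype (0/1/2 allele counts) per SNP under Hardy-Weinberg."""
    return [(rng.random() < p) + (rng.random() < p) for p in freqs]

def standardize(genotype, freqs):
    """Center and scale each SNP by its (assumed known) allele frequency."""
    return [(g - 2 * p) / (2 * p * (1 - p)) ** 0.5 for g, p in zip(genotype, freqs)]

def mask(x, S):
    """Project a standardized genotype vector (length m) through S (m x k)."""
    k = len(S[0])
    return [sum(x[i] * S[i][c] for i in range(len(x))) for c in range(k)]

def relatedness(masked_a, masked_b, m):
    """Estimate x_a . x_b / m using only the masked vectors."""
    k = len(masked_a)
    return sum(a * b for a, b in zip(masked_a, masked_b)) / (m * k)

rng = random.Random(0)
m, k = 500, 200  # hypothetical: number of SNPs and masking dimension
freqs = [rng.uniform(0.2, 0.8) for _ in range(m)]
# Shared random matrix; each cohort applies it locally and shares only the result.
S = [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(m)]

person = simulate_genotype(freqs, rng)
stranger = simulate_genotype(freqs, rng)

# Cohort 1 holds `person`; cohort 2 holds the same person plus an unrelated one.
cohort1 = mask(standardize(person, freqs), S)
cohort2_same = mask(standardize(person, freqs), S)
cohort2_other = mask(standardize(stranger, freqs), S)

score_same = relatedness(cohort1, cohort2_same, m)
score_other = relatedness(cohort1, cohort2_other, m)
print(f"same individual: {score_same:.2f}, unrelated: {score_other:.2f}")
```

Because the expected value of S times its transpose is k times the identity, the masked inner product divided by m*k approximates the unmasked relatedness score; a larger `k` reduces the noise added by masking, which is the dimension-versus-precision trade-off the abstract refers to.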

9.

People with disability and privacy in precision medicine research: what's at stake?

Kapur, Supriya Lal; Sabatello, Maya.

Trends Genet ; 39(5): 335-337, 2023 05.

Article in English | MEDLINE | ID: mdl-36707316

ABSTRACT

Re-identification from data used in precision medicine research is presumed to create minimal risk but may disproportionately impact health disparity populations. We consider plausible privacy risks and the negative ramifications thereof for people with disabilities, the largest health disparity population in the USA, and suggest measures to address these concerns.

Subjects

Persons with Disabilities , Precision Medicine , Humans , Privacy

10.

Federated Analysis for Privacy-Preserving Data Sharing: A Technical and Legal Primer.

Casaletto, James; Bernier, Alexander; McDougall, Robyn; Cline, Melissa S.

Annu Rev Genomics Hum Genet ; 24: 347-368, 2023 08 25.

Article in English | MEDLINE | ID: mdl-37253596

ABSTRACT

Continued advances in precision medicine rely on the widespread sharing of data that relate human genetic variation to disease. However, data sharing is severely limited by legal, regulatory, and ethical restrictions that safeguard patient privacy. Federated analysis addresses this problem by transferring the code to the data, providing the technical and legal capability to analyze the data within their secure home environment rather than transferring the data to another institution for analysis. This allows researchers to gain new insights from data that cannot be moved, while respecting patient privacy and the data stewards' legal obligations. Because federated analysis is a technical solution to the legal challenges inherent in data sharing, the technology and policy implications must be evaluated together. Here, we summarize the technical approaches to federated analysis and provide a legal analysis of their policy implications.

Subjects

Fenbendazole , Privacy , Humans , Health Facilities , Information Dissemination , Policy
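
The "code travels to the data" pattern described in the federated analysis abstract above can be sketched in a few lines. The `Site` class, the site names, and the allele-count analysis are hypothetical illustrations, not part of any system described in the article; a real deployment would add authentication, review of the shipped code, and output checks.

```python
from dataclasses import dataclass

@dataclass
class Site:
    """A data steward; `genotypes` (0/1/2 per participant) never leaves the site."""
    name: str
    genotypes: list

    def run(self, analysis):
        # Execute code shipped by the coordinator and return only its output.
        return analysis(self.genotypes)

def allele_counts(genotypes):
    """The code sent to the data: returns aggregates, no individual rows."""
    return sum(genotypes), 2 * len(genotypes)

def federated_allele_frequency(sites):
    """Combine per-site aggregates into a pooled allele frequency."""
    results = [site.run(allele_counts) for site in sites]
    alt = sum(a for a, _ in results)
    total = sum(t for _, t in results)
    return alt / total

sites = [
    Site("hospital_a", [0, 1, 2, 0]),
    Site("hospital_b", [1, 1, 0, 0, 0, 2]),
]
freq = federated_allele_frequency(sites)
print(freq)
```

Only the two counts per site cross institutional boundaries, which is the property that lets the data remain in its home environment.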

11.

Enabling tradeoffs in privacy and utility in genomic data Beacons and summary statistics.

Venkatesaramani, Rajagopal; Wan, Zhiyu; Malin, Bradley A; Vorobeychik, Yevgeniy.

Genome Res ; 33(7): 1113-1123, 2023 07.

Article in English | MEDLINE | ID: mdl-37217251

ABSTRACT

The collection and sharing of genomic data are becoming increasingly commonplace in research, clinical, and direct-to-consumer settings. The computational protocols typically adopted to protect individual privacy include sharing summary statistics, such as allele frequencies, or limiting query responses to the presence/absence of alleles of interest using web services called Beacons. However, even such limited releases are susceptible to likelihood ratio-based membership-inference attacks. Several approaches have been proposed to preserve privacy, which either suppress a subset of genomic variants or modify query responses for specific variants (e.g., adding noise, as in differential privacy). However, many of these approaches result in a significant utility loss, either suppressing many variants or adding a substantial amount of noise. In this paper, we introduce optimization-based approaches to explicitly trade off the utility of summary data or Beacon responses and privacy with respect to membership-inference attacks based on likelihood ratios, combining variant suppression and modification. We consider two attack models. In the first, an attacker applies a likelihood ratio test to make membership-inference claims. In the second model, an attacker uses a threshold that accounts for the effect of the data release on the separation in scores between individuals in the data set and those who are not. We further introduce highly scalable approaches for approximately solving the privacy-utility tradeoff problem when information is in the form of either summary statistics or presence/absence queries. Finally, we show that the proposed approaches outperform the state of the art in both utility and privacy through an extensive evaluation with public data sets.

Subjects

Information Dissemination , Privacy , Humans , Information Dissemination/methods , Genomics , Gene Frequency , Alleles
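
The likelihood ratio-based membership-inference attack mentioned in the abstract above can be illustrated on released allele frequencies. The scoring formula below is a simplified textbook form for binary variants, and the pool size, number of variants, and frequency range are assumptions chosen for the example; it does not reproduce the attack models or defenses evaluated in the paper.

```python
import math
import random

def lr_score(genotype, pool_freqs, ref_freqs, eps=1e-3):
    """Log-likelihood ratio: released pool frequencies vs. a reference population.

    Higher scores suggest the genotype is more consistent with the pool.
    """
    score = 0.0
    for d, x, p in zip(genotype, pool_freqs, ref_freqs):
        x = min(max(x, eps), 1 - eps)  # avoid log(0) at frequencies of 0 or 1
        score += math.log(x / p) if d else math.log((1 - x) / (1 - p))
    return score

rng = random.Random(1)
m, n = 2000, 20  # hypothetical: number of variants and pool size
ref_freqs = [rng.uniform(0.2, 0.8) for _ in range(m)]

def draw_individual():
    """Simulate one individual's binary variant indicators."""
    return [int(rng.random() < p) for p in ref_freqs]

pool = [draw_individual() for _ in range(n)]
# The "summary statistics" release: per-variant frequency within the pool.
pool_freqs = [sum(person[j] for person in pool) / n for j in range(m)]

member_score = lr_score(pool[0], pool_freqs, ref_freqs)
nonmember_score = lr_score(draw_individual(), pool_freqs, ref_freqs)
print(member_score, nonmember_score)
```

An attacker would claim membership when the score exceeds a threshold. Suppressing variants or perturbing `pool_freqs` narrows the gap between the two scores at some cost in utility, which is the trade-off the abstract describes.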

12.

Responsible, practical genomic data sharing that accelerates research.

Byrd, James Brian; Greene, Anna C; Prasad, Deepashree Venkatesh; Jiang, Xiaoqian; Greene, Casey S.

Nat Rev Genet ; 21(10): 615-629, 2020 10.

Article in English | MEDLINE | ID: mdl-32694666

ABSTRACT

Data sharing anchors reproducible science, but expectations and best practices are often nebulous. Communities of funders, researchers and publishers continue to grapple with what should be required or encouraged. To illuminate the rationales for sharing data, the technical challenges and the social and cultural challenges, we consider the stakeholders in the scientific enterprise. In biomedical research, participants are key among those stakeholders. Ethical sharing requires considering both the value of research efforts and the privacy costs for participants. We discuss current best practices for various types of genomic data, as well as opportunities to promote ethical data sharing that accelerates science by aligning incentives.

Subjects

Biomedical Research/methods , Biomedical Research/trends , Genomics/ethics , Information Dissemination/ethics , Research Personnel/trends , Cooperative Behavior , Humans , Privacy

13.

Illuminating the dark spaces of healthcare with ambient intelligence.

Haque, Albert; Milstein, Arnold; Fei-Fei, Li.

Nature ; 585(7824): 193-202, 2020 09.

Article in English | MEDLINE | ID: mdl-32908264

ABSTRACT

Advances in machine learning and contactless sensors have given rise to ambient intelligence: physical spaces that are sensitive and responsive to the presence of humans. Here we review how this technology could improve our understanding of the metaphorically dark, unobserved spaces of healthcare. In hospital spaces, early applications could soon enable more efficient clinical workflows and improved patient safety in intensive care units and operating rooms. In daily living spaces, ambient intelligence could prolong the independence of older individuals and improve the management of individuals with a chronic disease by understanding everyday behaviour. Similar to other technologies, transformation into clinical applications at scale must overcome challenges such as rigorous clinical validation, appropriate data privacy and model transparency. Thoughtful use of this technology would enable us to understand the complex interplay between the physical environment and health-critical human behaviours.

Subjects

Ambient Intelligence , Delivery of Health Care/methods , Environmental Monitoring/methods , Algorithms , Chronic Disease/therapy , Delivery of Health Care/standards , Hospital Units , Humans , Mental Health , Patient Safety , Privacy

14.

The Importance, Challenges, and Possible Solutions for Sharing Proteomics Data While Safeguarding Individuals' Privacy.

Shome, Mahasish; MacKenzie, Tim M G; Subbareddy, Smitha R; Snyder, Michael P.

Mol Cell Proteomics ; 23(3): 100731, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38331191

ABSTRACT

Proteomics data sharing has profound benefits at the individual level as well as at the community level. While data sharing has increased over the years, mostly due to journal and funding agency requirements, the reluctance of researchers with regard to data sharing is evident, as many share only the bare minimum dataset required to publish an article. In many cases, proper metadata is missing, essentially making the dataset useless. This behavior can be explained by a lack of incentives, insufficient awareness, or a lack of clarity surrounding ethical issues. Through adequate training at research institutes, researchers can realize the benefits associated with data sharing and can accelerate the norm of data sharing for the field of proteomics, as has been the standard in genomics for decades. In this article, we have put together various repository options available for proteomics data. We have also added pros and cons of those repositories to help researchers select the repository most suitable for their data submission. It is also important to note that a few types of proteomics data have the potential to re-identify an individual in certain scenarios. In such cases, extra caution should be taken to remove any personal identifiers before sharing on public repositories. Data sets that will be useless without personal identifiers need to be shared in a controlled access repository so that only authorized researchers can access the data and personal identifiers are kept safe.

Subjects

Privacy , Proteomics , Humans , Genomics , Metadata , Information Dissemination

15.

An in-depth examination of requirements for disclosure risk assessment.

Jarmin, Ron S; Abowd, John M; Ashmead, Robert; Cumings-Menon, Ryan; Goldschlag, Nathan; Hawes, Michael B; Keller, Sallie Ann; Kifer, Daniel; Leclerc, Philip; Reiter, Jerome P; Rodríguez, Rolando A; Schmutte, Ian; Velkoff, Victoria A; Zhuravlev, Pavel.

Proc Natl Acad Sci U S A ; 120(43): e2220558120, 2023 Oct 24.

Article in English | MEDLINE | ID: mdl-37831744

ABSTRACT

The use of formal privacy to protect the confidentiality of responses in the 2020 Decennial Census of Population and Housing has triggered renewed interest and debate over how to measure the disclosure risks and societal benefits of the published data products. We argue that any proposal for quantifying disclosure risk should be based on prespecified, objective criteria. We illustrate this approach to evaluate the absolute disclosure risk framework, the counterfactual framework underlying differential privacy, and prior-to-posterior comparisons. We conclude that satisfying all the desiderata is impossible, but counterfactual comparisons satisfy the most while absolute disclosure risk satisfies the fewest. Furthermore, we explain that many of the criticisms levied against differential privacy would be levied against any technology that is not equivalent to direct, unrestricted access to confidential data. More research is needed, but in the near term, the counterfactual approach appears best-suited for privacy versus utility analysis.

Subjects

Confidentiality , Disclosure , Privacy , Risk Assessment , Censuses
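
The counterfactual comparison underlying differential privacy, as named in the abstract above, asks how much a released output could differ between a database that contains one person and one that does not. A minimal sketch follows, using the Laplace mechanism on a count as a standard textbook example; the mechanism, the value of `epsilon`, and the counts are assumptions for illustration and are not taken from the article or from any Census Bureau system.

```python
import math
import random

def laplace_noise(scale, rng):
    """Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def private_count(true_count, epsilon, rng):
    """Release a count with noise; one person changes a count by at most 1."""
    return true_count + laplace_noise(1 / epsilon, rng)

def laplace_pdf(x, mu, scale):
    """Density of the released value when the true count is `mu`."""
    return math.exp(-abs(x - mu) / scale) / (2 * scale)

epsilon = 0.5
with_person, without_person = 101, 100  # real vs. counterfactual database

rng = random.Random(42)
release = private_count(with_person, epsilon, rng)

# For any possible output, compare how likely it is under the two databases.
outputs = [97.0, 100.0, 100.5, 104.0]
ratios = [
    laplace_pdf(o, with_person, 1 / epsilon)
    / laplace_pdf(o, without_person, 1 / epsilon)
    for o in outputs
]
print(release, ratios)
```

Every ratio stays between exp(-epsilon) and exp(epsilon), so no single output shifts an observer's belief about one person's presence by more than that factor; a smaller `epsilon` tightens the bound and adds more noise.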

16.

Collaborative privacy-preserving analysis of oncological data using multiparty homomorphic encryption.

Geva, Ravit; Gusev, Alexander; Polyakov, Yuriy; Liram, Lior; Rosolio, Oded; Alexandru, Andreea; Genise, Nicholas; Blatt, Marcelo; Duchin, Zohar; Waissengrin, Barliz; Mirelman, Dan; Bukstein, Felix; Blumenthal, Deborah T; Wolf, Ido; Pelles-Avraham, Sharon; Schaffer, Tali; Lavi, Lee A; Micciancio, Daniele; Vaikuntanathan, Vinod; Badawi, Ahmad Al; Goldwasser, Shafi.

Proc Natl Acad Sci U S A ; 120(33): e2304415120, 2023 08 15.

Article in English | MEDLINE | ID: mdl-37549296

ABSTRACT

Real-world healthcare data sharing is instrumental in constructing broader-based and larger clinical datasets that may improve clinical decision-making research and outcomes. Stakeholders are frequently reluctant to share their data without guaranteed patient privacy, proper protection of their datasets, and control over the usage of their data. Fully homomorphic encryption (FHE) is a cryptographic capability that can address these issues by enabling computation on encrypted data without intermediate decryptions, so the analytics results are obtained without revealing the raw data. This work presents a toolset for collaborative privacy-preserving analysis of oncological data using multiparty FHE. Our toolset supports survival analysis, logistic regression training, and several common descriptive statistics. We demonstrate using oncological datasets that the toolset achieves high accuracy and practical performance, which scales well to larger datasets. As part of this work, we propose a cryptographic protocol for interactive bootstrapping in multiparty FHE, which is of independent interest. The toolset we develop is general-purpose and can be applied to other collaborative medical and healthcare application domains.

Subjects

Computer Security , Privacy , Humans , Logistic Models , Clinical Decision-Making

17.

AFEI: adaptive optimized vertical federated learning for heterogeneous multi-omics data integration.

Wang, Qingyong; He, Minfan; Guo, Longyi; Chai, Hua.

Brief Bioinform ; 24(5), 2023 09 20.

Article in English | MEDLINE | ID: mdl-37497720

ABSTRACT

Vertical federated learning has gained popularity as a means of enabling collaboration and information sharing between different entities while maintaining data privacy and security. This approach has potential applications in healthcare, cancer prognosis prediction, and other fields where data privacy is a major concern. Although using multi-omics data for cancer prognosis prediction provides more information for treatment selection, collecting different types of omics data can be challenging because they are produced in various medical institutions. Data owners must comply with strict data protection regulations such as the European Union (EU) General Data Protection Regulation. To share patient data across multiple institutions, privacy and security issues must be addressed. Therefore, we propose AFEI (adaptive optimized vertical federated learning for heterogeneous multi-omics data integration), a framework that integrates multi-omics data collected from multiple institutions for cancer prognosis prediction. AFEI enables participating parties to build an accurate joint evaluation model for learning more information related to cancer patients from different perspectives, based on the distributed and encrypted multi-omics features shared by multiple institutions. The experimental results demonstrate that AFEI achieves higher prediction accuracy (6.5% on average) than using single omics data by utilizing the encrypted multi-omics data from different institutions, and it performs almost as well as prognosis prediction by directly integrating multi-omics data. Overall, AFEI can be seen as an efficient solution for breaking down barriers to multi-institutional collaboration and promoting the development of cancer prognosis prediction.

Subjects

Learning , Multiomics , Humans , Information Dissemination , Privacy

18.

Opportunities and challenges for ChatGPT and large language models in biomedicine and health.

Tian, Shubo; Jin, Qiao; Yeganova, Lana; Lai, Po-Ting; Zhu, Qingqing; Chen, Xiuying; Yang, Yifan; Chen, Qingyu; Kim, Won; Comeau, Donald C; Islamaj, Rezarta; Kapoor, Aadit; Gao, Xin; Lu, Zhiyong.

Brief Bioinform ; 25(1), 2023 11 22.

Article in English | MEDLINE | ID: mdl-38168838

ABSTRACT

ChatGPT has drawn considerable attention from both the general public and domain experts with its remarkable text generation capabilities. This has subsequently led to the emergence of diverse applications in the field of biomedicine and health. In this work, we examine the diverse applications of large language models (LLMs), such as ChatGPT, in biomedicine and health. Specifically, we explore the areas of biomedical information retrieval, question answering, medical text summarization, information extraction and medical education, and investigate whether LLMs possess the transformative power to revolutionize these tasks or whether the distinct complexities of the biomedical domain present unique challenges. Following an extensive literature survey, we find that significant advances have been made in the field of text generation tasks, surpassing the previous state-of-the-art methods. For other applications, the advances have been modest. Overall, LLMs have not yet revolutionized biomedicine, but recent rapid progress indicates that such methods hold great potential to provide valuable means for accelerating discovery and improving health. We also find that the use of LLMs, like ChatGPT, in the fields of biomedicine and health entails various risks and challenges, including fabricated information in their generated responses, as well as legal and privacy concerns associated with sensitive patient data. We believe this survey can provide a comprehensive and timely overview to biomedical researchers and healthcare practitioners on the opportunities and challenges associated with using ChatGPT and other LLMs for transforming biomedicine and health.

Subjects

Information Storage and Retrieval , Language , Humans , Privacy , Research Personnel

19.

Federated unsupervised random forest for privacy-preserving patient stratification.

Pfeifer, Bastian; Sirocchi, Christel; Bloice, Marcus D; Kreuzthaler, Markus; Urschler, Martin.

Bioinformatics ; 40(Suppl 2): ii198-ii207, 2024 09 01.

Article in English | MEDLINE | ID: mdl-39230698

ABSTRACT

MOTIVATION: In the realm of precision medicine, effective patient stratification and disease subtyping demand innovative methodologies tailored for multi-omics data. Clustering techniques applied to multi-omics data have become instrumental in identifying distinct subgroups of patients, enabling a finer-grained understanding of disease variability. Meanwhile, clinical datasets are often small and must be aggregated from multiple hospitals. Online data sharing, however, is seen as a significant challenge due to privacy concerns, potentially impeding big data's role in medical advancements using machine learning. This work establishes a powerful framework for advancing precision medicine through unsupervised random forest-based clustering in combination with federated computing.

RESULTS: We introduce a novel multi-omics clustering approach utilizing unsupervised random forests. The unsupervised nature of the random forest enables the determination of cluster-specific feature importance, unraveling key molecular contributors to distinct patient groups. Our methodology is designed for federated execution, a crucial aspect in the medical domain where privacy concerns are paramount. We have validated our approach on machine learning benchmark datasets as well as on cancer data from The Cancer Genome Atlas. Our method is competitive with the state-of-the-art in terms of disease subtyping, but at the same time substantially improves the cluster interpretability. Experiments indicate that local clustering performance can be improved through federated computing.

AVAILABILITY AND IMPLEMENTATION: The proposed methods are available as an R-package (https://github.com/pievos101/uRF).

Subjects

Precision Medicine , Humans , Cluster Analysis , Precision Medicine/methods , Unsupervised Machine Learning , Machine Learning , Neoplasms , Privacy , Algorithms , Random Forest

20.

sfkit: a web-based toolkit for secure and federated genomic analysis.

Mendelsohn, Simon; Froelicher, David; Loginov, Denis; Bernick, David; Berger, Bonnie; Cho, Hyunghoon.

Nucleic Acids Res ; 51(W1): W535-W541, 2023 07 05.

Article in English | MEDLINE | ID: mdl-37246709

ABSTRACT

Advances in genomics are increasingly depending upon the ability to analyze large and diverse genomic data collections, which are often difficult to amass due to privacy concerns. Recent works have shown that it is possible to jointly analyze datasets held by multiple parties, while provably preserving the privacy of each party's dataset using cryptographic techniques. However, these tools have been challenging to use in practice due to the complexities of the required setup and coordination among the parties. We present sfkit, a secure and federated toolkit for collaborative genomic studies, to allow groups of collaborators to easily perform joint analyses of their datasets without compromising privacy. sfkit consists of a web server and a command-line interface, which together support a range of use cases including both auto-configured and user-supplied computational environments. sfkit provides collaborative workflows for the essential tasks of genome-wide association study (GWAS) and principal component analysis (PCA). We envision sfkit becoming a one-stop server for secure collaborative tools for a broad range of genomic analyses. sfkit is open-source and available at: https://sfkit.org.

Subjects

Genome-Wide Association Study , Genomics , Software , Genome-Wide Association Study/methods , Genomics/methods , Internet , Privacy , Workflow
