Results 1 - 20 of 8,760
1.
Cell ; 183(4): 905-917.e16, 2020 11 12.
Article in English | MEDLINE | ID: mdl-33186529

ABSTRACT

The generation of functional genomics datasets is surging, because they provide insight into gene regulation and organismal phenotypes (e.g., genes upregulated in cancer). The intent behind functional genomics experiments is not necessarily to study genetic variants, yet they pose privacy concerns due to their use of next-generation sequencing. Moreover, there is a great incentive to broadly share raw reads for better statistical power and general research reproducibility. Thus, we need new modes of sharing beyond traditional controlled-access models. Here, we develop a data-sanitization procedure allowing raw functional genomics reads to be shared while minimizing privacy leakage, enabling principled privacy-utility trade-offs. Our protocol works with traditional Illumina-based assays and newer technologies such as 10x single-cell RNA sequencing. It involves quantifying the privacy leakage in reads by statistically linking study participants to known individuals. We carried out these linkages using data from highly accurate reference genomes and more realistic environmental samples.


Subject(s)
Computer Security , Genomics , Privacy , Human Genome , Genotype , High-Throughput Nucleotide Sequencing , Humans , Phenotype , Phylogeny , Reproducibility of Results , RNA Sequence Analysis , Single-Cell Analysis
2.
Cell ; 167(5): 1150-1154, 2016 11 17.
Article in English | MEDLINE | ID: mdl-27863233

ABSTRACT

We review emerging strategies to protect the privacy of research participants in international epigenome research: open consent, genome donation, registered access, automated procedures, and privacy-enhancing technologies.


Subject(s)
Genomics/ethics , Genomics/legislation & jurisprudence , Information Dissemination , Privacy , High-Throughput Nucleotide Sequencing , Human Genome Project/ethics , Human Genome Project/legislation & jurisprudence , Humans , DNA Sequence Analysis
3.
Nat Rev Genet ; 23(7): 429-445, 2022 07.
Article in English | MEDLINE | ID: mdl-35246669

ABSTRACT

Recent developments in a variety of sectors, including health care, research and the direct-to-consumer industry, have led to a dramatic increase in the amount of genomic data that are collected, used and shared. This state of affairs raises new and challenging concerns for personal privacy, both legally and technically. This Review appraises existing and emerging threats to genomic data privacy and discusses how well current legal frameworks and technical safeguards mitigate these concerns. It concludes with a discussion of remaining and emerging challenges and illustrates possible solutions that can balance protecting privacy and realizing the benefits that result from the sharing of genetic information.


Subject(s)
Genomics , Privacy , Genome
4.
Nat Rev Genet ; 23(4): 245-258, 2022 04.
Article in English | MEDLINE | ID: mdl-34759381

ABSTRACT

The generation of functional genomics data by next-generation sequencing has increased greatly in the past decade. Broad sharing of these data is essential for research advancement but poses notable privacy challenges, some of which are analogous to those that occur when sharing genetic variant data. However, there are also unique privacy challenges that arise from cryptic information leakage during the processing and summarization of functional genomics data from raw reads to derived quantities, such as gene expression values. Here, we review these challenges and present potential solutions for mitigating privacy risks while allowing broad data dissemination and analysis.


Subject(s)
Genetic Privacy , Privacy , Genomics , High-Throughput Nucleotide Sequencing , Risk Assessment
5.
Nature ; 602(7895): 51-57, 2022 02.
Article in English | MEDLINE | ID: mdl-35110758

ABSTRACT

The Dog Aging Project is a long-term longitudinal study of ageing in tens of thousands of companion dogs. The domestic dog is among the most variable mammal species in terms of morphology, behaviour, risk of age-related disease and life expectancy. Given that dogs share the human environment and have a sophisticated healthcare system but are much shorter-lived than people, they offer a unique opportunity to identify the genetic, environmental and lifestyle factors associated with healthy lifespan. To take advantage of this opportunity, the Dog Aging Project will collect extensive survey data, environmental information, electronic veterinary medical records, genome-wide sequence information, clinicopathology and molecular phenotypes derived from blood cells, plasma and faecal samples. Here, we describe the specific goals and design of the Dog Aging Project and discuss the potential for this open-data, community science study to greatly enhance understanding of ageing in a genetically variable, socially relevant species living in a complex environment.


Subject(s)
Aging/physiology , Dogs/physiology , Information Dissemination , Pets/physiology , Aging/drug effects , Aging/genetics , Animals , Biomarkers , Built Environment , Veterinary Clinical Trials as Topic , Cross-Sectional Studies , Data Collection , Dogs/genetics , Female , Frailty/veterinary , Gene-Environment Interaction , Genome-Wide Association Study , Goals , Healthy Aging/drug effects , Humans , Inflammation/veterinary , Informed Consent , Life Style , Longevity/drug effects , Longevity/genetics , Longevity/physiology , Longitudinal Studies , Male , Animal Models , Multimorbidity , Pets/genetics , Privacy , Sirolimus/pharmacology
6.
Trends Genet ; 40(5): 383-386, 2024 May.
Article in English | MEDLINE | ID: mdl-38637270

ABSTRACT

Artificial intelligence (AI) in omics analysis raises privacy threats to patients. Here, we briefly discuss risk factors to patient privacy in data sharing, model training, and release, as well as methods to safeguard and evaluate patient privacy in AI-driven omics methods.


Subject(s)
Artificial Intelligence , Genomics , Humans , Genomics/methods , Privacy , Information Dissemination
7.
PLoS Genet ; 20(1): e1011037, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38206971

ABSTRACT

Explicitly sharing individual-level data in genomics studies has many merits compared with sharing summary statistics, including stricter quality control (QC), common statistical analyses, relative identification, and improved statistical power in GWAS, but it is hampered by privacy or ethical constraints. In this study, we developed encG-reg, a regression approach that can detect relatives of various degrees based on encrypted genomic data, which is immune to ethical constraints. The encryption properties of encG-reg are based on random matrix theory: the original genotype matrix is masked without sacrificing the precision of individual-level genotype data. We established a connection between the dimension of the random matrix that masks the genotype matrices and the required precision of a study for encrypted genotype data. encG-reg has false positive and false negative rates equivalent to those obtained by sharing original individual-level data, and it is computationally efficient when searching for relatives. We split the UK Biobank into its respective centers and then encrypted the genotype data. We observed that the relatives estimated using encG-reg were as accurate as those estimated by KING, a widely used software package that requires original genotype data. In a more complex application, we launched a finely devised multi-center collaboration across 5 research institutes in China, covering 9 cohorts of 54,092 GWAS samples. encG-reg again identified true relatives across the cohorts, even with different ethnic backgrounds and genotypic qualities. Our study clearly demonstrates that encrypted genomic data can be used for data sharing without loss of information or data-sharing barriers.
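
The masking idea in this abstract can be illustrated with a generic Gaussian random projection. The sketch below is not the published encG-reg algorithm; it only assumes that standardized genotype rows are multiplied by a secret random matrix and that relatedness is then estimated from inner products of the masked rows. All function names, the simulated allele frequencies, and the sizes `m` and `k` are hypothetical.

```python
import random

def standardize(genotypes, freqs):
    """Scale 0/1/2 allele counts to mean 0, variance 1 using allele frequencies."""
    return [[(g - 2 * p) / (2 * p * (1 - p)) ** 0.5 for g, p in zip(row, freqs)]
            for row in genotypes]

def random_mask(m, k, rng):
    """Secret m x k Gaussian matrix with entries drawn from N(0, 1/k)."""
    scale = k ** -0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(k)] for _ in range(m)]

def apply_mask(x, s):
    """Project each standardized genotype row (length m) onto k masked values."""
    k = len(s[0])
    return [[sum(v * s_row[c] for v, s_row in zip(row, s)) for c in range(k)]
            for row in x]

def relatedness(a, b, m):
    """Inner product of two rows divided by the number of SNPs m."""
    return sum(p * q for p, q in zip(a, b)) / m

def demo(m=200, k=1000, seed=0):
    """Simulate a duplicated sample and an unrelated one (illustrative data only)."""
    rng = random.Random(seed)
    freqs = [rng.uniform(0.1, 0.9) for _ in range(m)]
    draw = lambda: [(rng.random() < p) + (rng.random() < p) for p in freqs]
    person = draw()
    genotypes = [person, list(person), draw()]  # sample 1 duplicates sample 0
    x = standardize(genotypes, freqs)
    masked = apply_mask(x, random_mask(m, k, rng))
    plain = (relatedness(x[0], x[1], m), relatedness(x[0], x[2], m))
    hidden = (relatedness(masked[0], masked[1], m),
              relatedness(masked[0], masked[2], m))
    return plain, hidden
```

Calling `demo()` returns the relatedness of the duplicated pair and of the unrelated pair, first from the raw rows and then from the masked rows. For Gaussian masks of this kind the error of the masked estimate shrinks roughly as 1/sqrt(k), which is one way to read the abstract's link between the random matrix dimension and the required precision.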


Subject(s)
Genome-Wide Association Study , Privacy , Humans , Genome-Wide Association Study/methods , Genotype , Software , Genomics
8.
Trends Genet ; 39(5): 335-337, 2023 05.
Article in English | MEDLINE | ID: mdl-36707316

ABSTRACT

Re-identification from data used in precision medicine research is presumed to create minimal risk but may disproportionately impact health disparity populations. We consider plausible privacy risks and the negative ramifications thereof for people with disabilities, the largest health disparity population in the USA, and suggest measures to address these concerns.


Subject(s)
Persons with Disabilities , Precision Medicine , Humans , Privacy
9.
Annu Rev Genomics Hum Genet ; 24: 347-368, 2023 08 25.
Article in English | MEDLINE | ID: mdl-37253596

ABSTRACT

Continued advances in precision medicine rely on the widespread sharing of data that relate human genetic variation to disease. However, data sharing is severely limited by legal, regulatory, and ethical restrictions that safeguard patient privacy. Federated analysis addresses this problem by transferring the code to the data, providing the technical and legal capability to analyze the data within their secure home environment rather than transferring the data to another institution for analysis. This allows researchers to gain new insights from data that cannot be moved, while respecting patient privacy and the data stewards' legal obligations. Because federated analysis is a technical solution to the legal challenges inherent in data sharing, the technology and policy implications must be evaluated together. Here, we summarize the technical approaches to federated analysis and provide a legal analysis of their policy implications.
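
The "code goes to the data" pattern described here can be sketched in a few lines. This is a minimal illustration under assumed conditions, not any specific federated platform: the record fields `disease` and `carrier` and both function names are hypothetical, and each site is assumed to return only aggregate counts.

```python
def local_summary(records):
    """Runs inside a site's secure environment; only aggregate counts leave it."""
    cases = [r for r in records if r["disease"]]
    return {
        "n_cases": len(cases),
        "n_carrier_cases": sum(1 for r in cases if r["carrier"]),
    }

def pooled_carrier_rate(summaries):
    """Runs at the coordinator; it sees per-site counts, never patient rows."""
    n_cases = sum(s["n_cases"] for s in summaries)
    n_carriers = sum(s["n_carrier_cases"] for s in summaries)
    return n_carriers / n_cases if n_cases else float("nan")

# Hypothetical data held by two institutions that cannot exchange records.
site_a = [
    {"disease": True, "carrier": True},
    {"disease": True, "carrier": False},
    {"disease": False, "carrier": False},
]
site_b = [
    {"disease": True, "carrier": True},
    {"disease": True, "carrier": True},
]

rate = pooled_carrier_rate([local_summary(site_a), local_summary(site_b)])
```

The same function is shipped to every site, so the analysis is identical everywhere while the rows stay put. Counts from very small sites can still be revealing, which is one reason the abstract argues that technical and policy questions should be assessed together.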


Subject(s)
Fenbendazole , Privacy , Humans , Health Facilities , Information Dissemination , Policy
10.
Genome Res ; 33(7): 1113-1123, 2023 07.
Article in English | MEDLINE | ID: mdl-37217251

ABSTRACT

The collection and sharing of genomic data are becoming increasingly commonplace in research, clinical, and direct-to-consumer settings. The computational protocols typically adopted to protect individual privacy include sharing summary statistics, such as allele frequencies, or limiting query responses to the presence/absence of alleles of interest using web services called Beacons. However, even such limited releases are susceptible to likelihood ratio-based membership-inference attacks. Several approaches have been proposed to preserve privacy, which either suppress a subset of genomic variants or modify query responses for specific variants (e.g., adding noise, as in differential privacy). However, many of these approaches result in a significant utility loss, either suppressing many variants or adding a substantial amount of noise. In this paper, we introduce optimization-based approaches to explicitly trade off the utility of summary data or Beacon responses and privacy with respect to membership-inference attacks based on likelihood ratios, combining variant suppression and modification. We consider two attack models. In the first, an attacker applies a likelihood ratio test to make membership-inference claims. In the second model, an attacker uses a threshold that accounts for the effect of the data release on the separation in scores between individuals in the data set and those who are not. We further introduce highly scalable approaches for approximately solving the privacy-utility tradeoff problem when information is in the form of either summary statistics or presence/absence queries. Finally, we show that the proposed approaches outperform the state of the art in both utility and privacy through an extensive evaluation with public data sets.
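
A likelihood-ratio membership test against Beacon responses can be sketched as follows. This is a simplified form commonly used in the Beacon re-identification literature, not the optimization method this paper proposes. It assumes the attacker queries only variants the target carries, knows the alternate allele frequency `f` of each variant and the Beacon size `n_beacon`, and models a small sequencing error rate `delta`; all names and numbers are illustrative.

```python
import math

def lrt_score(responses, freqs, n_beacon, delta=1e-6):
    """Log-likelihood ratio log(L_H0 / L_H1) over queried variants.

    H0: the target is not in the Beacon; H1: the target is in the Beacon.
    responses[i] is True when the Beacon reports the allele as present.
    Lower (more negative) scores point toward membership.
    """
    score = 0.0
    for present, f in zip(responses, freqs):
        # Probability that none of n_beacon (or n_beacon - 1) genomes carries the allele.
        d_n = (1 - f) ** (2 * n_beacon)
        d_n1 = (1 - f) ** (2 * n_beacon - 2)
        if present:
            score += math.log((1 - d_n) / (1 - delta * d_n1))
        else:
            score += math.log(d_n / (delta * d_n1))
    return score

# Hypothetical example: 50 rare variants carried by the target, Beacon of 100 people.
freqs = [0.001] * 50
member_score = lrt_score([True] * 50, freqs, n_beacon=100)
nonmember_score = lrt_score([True] * 10 + [False] * 40, freqs, n_beacon=100)
```

A "no" answer for a variant the target carries is strong evidence against membership, so it adds a large positive term; a run of "yes" answers on rare variants pushes the score down. Suppressing or flipping a few responses, as the abstract describes, works by narrowing the gap between these two kinds of scores.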


Subject(s)
Information Dissemination , Privacy , Humans , Information Dissemination/methods , Genomics , Gene Frequency , Alleles
11.
Nat Rev Genet ; 21(10): 615-629, 2020 10.
Article in English | MEDLINE | ID: mdl-32694666

ABSTRACT

Data sharing anchors reproducible science, but expectations and best practices are often nebulous. Communities of funders, researchers and publishers continue to grapple with what should be required or encouraged. To illuminate the rationales for sharing data, the technical challenges and the social and cultural challenges, we consider the stakeholders in the scientific enterprise. In biomedical research, participants are key among those stakeholders. Ethical sharing requires considering both the value of research efforts and the privacy costs for participants. We discuss current best practices for various types of genomic data, as well as opportunities to promote ethical data sharing that accelerates science by aligning incentives.


Subject(s)
Biomedical Research/methods , Biomedical Research/trends , Genomics/ethics , Information Dissemination/ethics , Research Personnel/trends , Cooperative Behavior , Humans , Privacy
12.
Nature ; 585(7824): 193-202, 2020 09.
Article in English | MEDLINE | ID: mdl-32908264

ABSTRACT

Advances in machine learning and contactless sensors have given rise to ambient intelligence: physical spaces that are sensitive and responsive to the presence of humans. Here we review how this technology could improve our understanding of the metaphorically dark, unobserved spaces of healthcare. In hospital spaces, early applications could soon enable more efficient clinical workflows and improved patient safety in intensive care units and operating rooms. In daily living spaces, ambient intelligence could prolong the independence of older individuals and improve the management of individuals with a chronic disease by understanding everyday behaviour. Similar to other technologies, transformation into clinical applications at scale must overcome challenges such as rigorous clinical validation, appropriate data privacy and model transparency. Thoughtful use of this technology would enable us to understand the complex interplay between the physical environment and health-critical human behaviours.


Subject(s)
Ambient Intelligence , Delivery of Health Care/methods , Environmental Monitoring/methods , Algorithms , Chronic Disease/therapy , Delivery of Health Care/standards , Hospital Units , Humans , Mental Health , Patient Safety , Privacy
13.
Mol Cell Proteomics ; 23(3): 100731, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38331191

ABSTRACT

Proteomics data sharing has profound benefits at the individual level as well as at the community level. While data sharing has increased over the years, mostly due to journal and funding agency requirements, the reluctance of researchers with regard to data sharing is evident, as many share only the bare minimum dataset required to publish an article. In many cases, proper metadata is missing, essentially making the dataset useless. This behavior can be explained by a lack of incentives, insufficient awareness, or a lack of clarity surrounding ethical issues. Through adequate training at research institutes, researchers can realize the benefits associated with data sharing and can accelerate the norm of data sharing for the field of proteomics, as has been the standard in genomics for decades. In this article, we have put together various repository options available for proteomics data. We have also added pros and cons of those repositories to help researchers select the repository most suitable for their data submission. It is also important to note that a few types of proteomics data have the potential to re-identify an individual in certain scenarios. In such cases, extra caution should be taken to remove any personal identifiers before sharing on public repositories. Data sets that will be useless without personal identifiers need to be shared in a controlled-access repository so that only authorized researchers can access the data and personal identifiers are kept safe.


Subject(s)
Privacy , Proteomics , Humans , Genomics , Metadata , Information Dissemination
14.
Proc Natl Acad Sci U S A ; 120(43): e2220558120, 2023 Oct 24.
Article in English | MEDLINE | ID: mdl-37831744

ABSTRACT

The use of formal privacy to protect the confidentiality of responses in the 2020 Decennial Census of Population and Housing has triggered renewed interest and debate over how to measure the disclosure risks and societal benefits of the published data products. We argue that any proposal for quantifying disclosure risk should be based on prespecified, objective criteria. We illustrate this approach to evaluate the absolute disclosure risk framework, the counterfactual framework underlying differential privacy, and prior-to-posterior comparisons. We conclude that satisfying all the desiderata is impossible, but counterfactual comparisons satisfy the most while absolute disclosure risk satisfies the fewest. Furthermore, we explain that many of the criticisms levied against differential privacy would be levied against any technology that is not equivalent to direct, unrestricted access to confidential data. More research is needed, but in the near term, the counterfactual approach appears best-suited for privacy versus utility analysis.


Subject(s)
Confidentiality , Disclosure , Privacy , Risk Assessment , Censuses
15.
Proc Natl Acad Sci U S A ; 120(33): e2304415120, 2023 08 15.
Article in English | MEDLINE | ID: mdl-37549296

ABSTRACT

Real-world healthcare data sharing is instrumental in constructing broader-based and larger clinical datasets that may improve clinical decision-making research and outcomes. Stakeholders are frequently reluctant to share their data without guaranteed patient privacy, proper protection of their datasets, and control over the usage of their data. Fully homomorphic encryption (FHE) is a cryptographic capability that can address these issues by enabling computation on encrypted data without intermediate decryptions, so the analytics results are obtained without revealing the raw data. This work presents a toolset for collaborative privacy-preserving analysis of oncological data using multiparty FHE. Our toolset supports survival analysis, logistic regression training, and several common descriptive statistics. We demonstrate using oncological datasets that the toolset achieves high accuracy and practical performance, which scales well to larger datasets. As part of this work, we propose a cryptographic protocol for interactive bootstrapping in multiparty FHE, which is of independent interest. The toolset we develop is general-purpose and can be applied to other collaborative medical and healthcare application domains.
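
The core property named in this abstract, computing on ciphertexts without decrypting them along the way, can be shown with a toy example. The sketch below swaps in the Paillier cryptosystem, which is only additively homomorphic; it is not fully homomorphic, not multiparty, and not the toolset the paper presents. The primes are far too small for real use, the function names are hypothetical, and `pow(x, -1, n)` requires Python 3.8 or later.

```python
import math
import secrets

# Two Mersenne primes chosen for readability; toy-sized, never use in practice.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p - 1, q - 1)
MU = pow(LAM, -1, N)  # this shortcut is valid because the generator is g = N + 1

def encrypt(m):
    """Encrypt an integer 0 <= m < N under the public key (N, g = N + 1)."""
    r = secrets.randbelow(N - 1) + 1
    while math.gcd(r, N) != 1:
        r = secrets.randbelow(N - 1) + 1
    return (pow(N + 1, m, N2) * pow(r, N, N2)) % N2

def decrypt(c):
    """Recover the plaintext with the private values LAM and MU."""
    return ((pow(c, LAM, N2) - 1) // N * MU) % N

def add_encrypted(c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % N2

# Hypothetical use: two hospitals contribute encrypted event counts.
total = decrypt(add_encrypted(encrypt(12), encrypt(30)))
```

The party that multiplies the two ciphertexts learns neither 12 nor 30, yet the key holder decrypts their sum. Fully homomorphic schemes extend this idea to both addition and multiplication, which is what makes workloads such as logistic regression training feasible on encrypted data.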


Subject(s)
Computer Security , Privacy , Humans , Logistic Models , Clinical Decision-Making
16.
Brief Bioinform ; 24(5), 2023 09 20.
Article in English | MEDLINE | ID: mdl-37497720

ABSTRACT

Vertical federated learning has gained popularity as a means of enabling collaboration and information sharing between different entities while maintaining data privacy and security. This approach has potential applications in healthcare, cancer prognosis prediction, and other fields where data privacy is a major concern. Although using multi-omics data for cancer prognosis prediction provides more information for treatment selection, collecting different types of omics data can be challenging because they are produced in various medical institutions. Data owners must comply with strict data protection regulations such as the European Union (EU) General Data Protection Regulation. To share patient data across multiple institutions, privacy and security issues must be addressed. Therefore, we propose adaptive optimized vertical federated learning for heterogeneous multi-omics data integration (AFEI), a framework that integrates multi-omics data collected from multiple institutions for cancer prognosis prediction. AFEI enables participating parties to build an accurate joint evaluation model for learning more information related to cancer patients from different perspectives, based on the distributed and encrypted multi-omics features shared by multiple institutions. The experimental results demonstrate that AFEI achieves higher prediction accuracy (6.5% on average) than using single omics data by utilizing the encrypted multi-omics data from different institutions, and it performs almost as well as prognosis prediction by directly integrating multi-omics data. Overall, AFEI can be seen as an efficient solution for breaking down barriers to multi-institutional collaboration and promoting the development of cancer prognosis prediction.


Subject(s)
Learning , Multiomics , Humans , Information Dissemination , Privacy
17.
Brief Bioinform ; 25(1), 2023 11 22.
Article in English | MEDLINE | ID: mdl-38168838

ABSTRACT

ChatGPT has drawn considerable attention from both the general public and domain experts with its remarkable text generation capabilities. This has subsequently led to the emergence of diverse applications in the field of biomedicine and health. In this work, we examine the diverse applications of large language models (LLMs), such as ChatGPT, in biomedicine and health. Specifically, we explore the areas of biomedical information retrieval, question answering, medical text summarization, information extraction and medical education, and investigate whether LLMs possess the transformative power to revolutionize these tasks or whether the distinct complexities of the biomedical domain present unique challenges. Following an extensive literature survey, we find that significant advances have been made in the field of text generation tasks, surpassing the previous state-of-the-art methods. For other applications, the advances have been modest. Overall, LLMs have not yet revolutionized biomedicine, but recent rapid progress indicates that such methods hold great potential to provide valuable means for accelerating discovery and improving health. We also find that the use of LLMs, like ChatGPT, in the fields of biomedicine and health entails various risks and challenges, including fabricated information in their generated responses, as well as legal and privacy concerns associated with sensitive patient data. We believe this survey can provide a comprehensive and timely overview to biomedical researchers and healthcare practitioners on the opportunities and challenges associated with using ChatGPT and other LLMs for transforming biomedicine and health.


Subject(s)
Information Storage and Retrieval , Language , Humans , Privacy , Research Personnel
18.
Nucleic Acids Res ; 51(W1): W535-W541, 2023 07 05.
Article in English | MEDLINE | ID: mdl-37246709

ABSTRACT

Advances in genomics are increasingly depending upon the ability to analyze large and diverse genomic data collections, which are often difficult to amass due to privacy concerns. Recent works have shown that it is possible to jointly analyze datasets held by multiple parties, while provably preserving the privacy of each party's dataset using cryptographic techniques. However, these tools have been challenging to use in practice due to the complexities of the required setup and coordination among the parties. We present sfkit, a secure and federated toolkit for collaborative genomic studies, to allow groups of collaborators to easily perform joint analyses of their datasets without compromising privacy. sfkit consists of a web server and a command-line interface, which together support a range of use cases including both auto-configured and user-supplied computational environments. sfkit provides collaborative workflows for the essential tasks of genome-wide association study (GWAS) and principal component analysis (PCA). We envision sfkit becoming a one-stop server for secure collaborative tools for a broad range of genomic analyses. sfkit is open-source and available at: https://sfkit.org.
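
One cryptographic building block behind joint analyses of this kind is additive secret sharing, sketched below for a sum of private counts. This is a generic illustration and does not use or describe sfkit's interface or its actual protocols; the function names, the modulus, and the example counts are assumptions, and the parties are assumed to follow the protocol honestly.

```python
import secrets

MODULUS = 2**61 - 1  # all arithmetic is done modulo this public value

def share(value, n_parties):
    """Split value into n_parties random shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares

def secure_sum(private_values):
    """Each party shares its value; only sums of received shares are published."""
    n = len(private_values)
    # dealt[i][j] is the share that party i sends to party j.
    dealt = [share(v, n) for v in private_values]
    # Party j adds up what it received and reveals just that partial sum.
    partials = [sum(dealt[i][j] for i in range(n)) % MODULUS for j in range(n)]
    return sum(partials) % MODULUS

# Hypothetical minor-allele counts held by three collaborating cohorts.
joint_count = secure_sum([120, 45, 310])
```

Any single share is a uniformly random number, so no party learns another cohort's count from what it receives, while the published partial sums still add up to the joint total.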


Subject(s)
Genome-Wide Association Study , Genomics , Software , Genome-Wide Association Study/methods , Genomics/methods , Internet , Privacy , Workflow
19.
Proc Natl Acad Sci U S A ; 119(31): e2104906119, 2022 08 02.
Article in English | MEDLINE | ID: mdl-35878030

ABSTRACT

The federal statistical system is experiencing competing pressures for change. On the one hand, for confidentiality reasons, much socially valuable data currently held by federal agencies is either not made available to researchers at all or only made available under onerous conditions. On the other hand, agencies which release public databases face new challenges in protecting the privacy of the subjects in those databases, which leads them to consider releasing fewer data or masking the data in ways that will reduce their accuracy. In this essay, we argue that the discussion has not given proper consideration to the reduced social benefits of data availability and their usability relative to the value of increased levels of privacy protection. A more balanced benefit-cost framework should be used to assess these trade-offs. We express concerns both with synthetic data methods for disclosure limitation, which will reduce the types of research that can be reliably conducted in unknown ways, and with differential privacy criteria that use what we argue is an inappropriate measure of disclosure risk. We recommend that the measure of disclosure risk used to assess all disclosure protection methods focus on what we believe is the risk that individuals should care about, that more study of the impact of differential privacy criteria and synthetic data methods on data usability for research be conducted before either is put into widespread use, and that more research be conducted on alternative methods of disclosure risk reduction that better balance benefits and costs.


Subject(s)
Computer Security , Confidentiality , Privacy , Data Collection , Disclosure , Federal Government , Government Agencies
20.
Proc Natl Acad Sci U S A ; 119(40): e2121024119, 2022 10 04.
Article in English | MEDLINE | ID: mdl-36166477

ABSTRACT

A set of 20 short tandem repeats (STRs) is used by the US criminal justice system to identify suspects and to maintain a database of genetic profiles for individuals who have been previously convicted or arrested. Some of these STRs were identified in the 1990s, with a preference for markers in putative gene deserts to avoid forensic profiles revealing protected medical information. We revisit that assumption, investigating whether forensic genetic profiles reveal information about gene-expression variation or potential medical information. We find six significant correlations (false discovery rate = 0.23) between the forensic STRs and the expression levels of neighboring genes in lymphoblastoid cell lines. We explore possible mechanisms for these associations, showing evidence compatible with forensic STRs causing expression variation or being in linkage disequilibrium with a causal locus in three cases and weaker or potentially spurious associations in the other three cases. Together, these results suggest that forensic genetic loci may reveal expression levels and, perhaps, medical information.


Subject(s)
Forensic Genetics , Genetic Loci , Microsatellite Repeats , Privacy , Forensic Genetics/legislation & jurisprudence , Forensic Genetics/methods , Gene Frequency , Population Genetics , Humans , Linkage Disequilibrium