1.
J Med Internet Res ; 19(4): e104, 2017 04 06.
Article in English | MEDLINE | ID: mdl-28385682

ABSTRACT

BACKGROUND: The Internet and social media offer promising ways to improve the reach, efficiency, and effectiveness of recruitment efforts at a reasonable cost, but raise unique ethical dilemmas. We describe how we used social media to recruit cancer patients and family caregivers for a research study, the ethical issues we encountered, and the strategies we developed to address them. OBJECTIVE: Drawing on the principles of Privacy by Design (PbD), a globally recognized standard for privacy protection, we aimed to develop a PbD framework for online health research recruitment. METHODS: We proposed a focus group study on the dietary behaviors of cancer patients and their families, and the role of Web-based dietary self-management tools. Using an established blog on our hospital website, we proposed publishing a recruitment post and sharing the link on our Twitter and Facebook pages. The Research Ethics Board (REB) raised concern about the privacy risks associated with our recruitment strategy; by clicking on a recruitment post, an individual could inadvertently disclose personal health information to third-party companies engaged in tracking online behavior. The REB asked us to revise our social media recruitment strategy with the following questions in mind: (1) How will you inform users about the potential for privacy breaches and their implications? and (2) How will you protect users from privacy breaches or inadvertently sharing potentially identifying information about themselves? RESULTS: Ethical guidelines recommend a proportionate approach to ethics assessment, which advocates for risk mitigation strategies that are proportional to the magnitude and probability of risks. We revised our social media recruitment strategy to inform users about privacy risks and to protect their privacy, while at the same time meeting our recruitment objectives. 
We provide a critical reflection of the perceived privacy risks associated with our social media recruitment strategy and the appropriateness of the risk mitigation strategies that we employed by assessing their alignment with PbD and by discussing the following: (1) What are the potential risks and who is at risk? (2) Is cancer considered "sensitive" personal information? (3) What is the probability of online disclosure of a cancer diagnosis in everyday life? and (4) What are the public's expectations for privacy online and their views about online tracking, profiling, and targeting? We conclude with a PbD framework for online health research recruitment. CONCLUSIONS: Researchers, REBs, ethicists, students, and potential study participants are often unaware of the privacy risks of social media research recruitment and there is no official guidance. Our PbD framework for online health research recruitment is a resource for these wide audiences.


Subjects
Computer Security , Internet , Patient Selection/ethics , Social Media/ethics , Ethics, Research , Humans , Privacy
2.
J Biomed Inform ; 63: 174-183, 2016 10.
Article in English | MEDLINE | ID: mdl-27426236

ABSTRACT

OBJECTIVES: It has become regular practice to de-identify unstructured medical text for use in research using automatic methods, the goal of which is to remove patient identifying information to minimize re-identification risk. The metrics commonly used to determine if these systems are performing well do not accurately reflect the risk of a patient being re-identified. We therefore developed a framework for measuring the risk of re-identification associated with textual data releases. METHODS: We apply the proposed evaluation framework to a data set from the University of Michigan Medical School. Our risk assessment results are then compared with those that would be obtained using a typical contemporary micro-average evaluation of recall in order to illustrate the difference between the proposed evaluation framework and the current baseline method. RESULTS: We demonstrate how this framework compares against common measures of the re-identification risk associated with an automated text de-identification process. For the probability of re-identification using our evaluation framework we obtained a mean value for direct identifiers of 0.0074 and a mean value for quasi-identifiers of 0.0022. The 95% confidence intervals for these estimates were below the relevant thresholds. The threshold for direct identifier risk was based on previously used approaches in the literature. The threshold for quasi-identifiers was determined based on the context of the data release following commonly used de-identification criteria for structured data. DISCUSSION: Our framework attempts to correct for poorly distributed evaluation corpora, accounts for the data release context, and avoids the often optimistic assumptions that are made using the more traditional evaluation approach. It therefore provides a more realistic estimate of the true probability of re-identification. 
CONCLUSIONS: This framework should be used as a basis for computing re-identification risk in order to more realistically evaluate future text de-identification tools.
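
The contrast drawn above between micro-averaged recall and document-level risk can be illustrated with a small sketch. It is not the authors' framework, which the abstract does not spell out; the function names and the per-document counts are hypothetical. It only shows how a high micro-averaged recall can coexist with a large share of documents that still leak at least one identifier.

```python
from typing import List, Tuple

# Each tuple is (identifiers_present, identifiers_detected) for one document.
# The counts used below are hypothetical, chosen only to show the effect.
DocCounts = List[Tuple[int, int]]

def micro_recall(docs: DocCounts) -> float:
    """Pool all identifiers across documents, then compute recall once."""
    total = sum(present for present, _ in docs)
    found = sum(detected for _, detected in docs)
    return found / total if total else 1.0

def share_of_docs_with_leak(docs: DocCounts) -> float:
    """Fraction of documents in which at least one identifier was missed."""
    if not docs:
        return 0.0
    leaking = sum(1 for present, detected in docs if detected < present)
    return leaking / len(docs)

# Two long documents are fully cleaned; two short ones each miss one identifier.
docs = [(50, 50), (40, 40), (5, 4), (5, 4)]
print(micro_recall(docs), share_of_docs_with_leak(docs))
```

Because the long documents dominate the pooled count, the micro-average stays close to 1 even though half of the documents in this made-up corpus still contain a missed identifier.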


Subjects
Confidentiality , Data Anonymization , Electronic Health Records , Humans , Risk
3.
J Med Internet Res ; 14(5): e144, 2012 Oct 18.
Article in English | MEDLINE | ID: mdl-23079075

ABSTRACT

BACKGROUND: The h-index is a commonly used metric for evaluating the publication performance of researchers. However, in a multidisciplinary field such as medical informatics, interpreting the h-index is a challenge because researchers tend to have diverse home disciplines, ranging from clinical areas to computer science, basic science, and the social sciences, each with different publication performance profiles. OBJECTIVE: To construct a reference standard for interpreting the h-index of medical informatics researchers based on the performance of their peers. METHODS: Using a sample of authors with articles published over the 5-year period 2006-2011 in the 2 top journals in medical informatics (as determined by impact factor), we computed their h-index using the Scopus database. Percentiles were computed to create a 6-level benchmark, similar in scheme to one used by the US National Science Foundation, and a 10-level benchmark. RESULTS: The 2 benchmarks can be used to place medical informatics researchers in an ordered category based on the performance of their peers. A validation exercise mapped the benchmark levels to the ranks of medical informatics academic faculty in the United States. The 10-level benchmark tracked academic rank better (with no ties) and is therefore more suitable for practical use. CONCLUSIONS: Our 10-level benchmark provides an objective basis to evaluate and compare the publication performance of medical informatics researchers with that of their peers using the h-index.
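
As a rough illustration of the two steps described above, the sketch below computes an h-index from a list of citation counts and places it on a 10-level scale built from decile cut points of a peer sample. The function names, the peer sample, and the use of plain deciles are assumptions made for illustration; the abstract does not give the percentile cut points actually used.

```python
import bisect
import statistics
from typing import List

def h_index(citations: List[int]) -> int:
    """Largest h such that h papers each have at least h citations."""
    h = 0
    for rank, cites in enumerate(sorted(citations, reverse=True), start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

def decile_cutoffs(peer_h_indices: List[int]) -> List[float]:
    """Nine cut points that split the peer sample into ten groups."""
    return statistics.quantiles(peer_h_indices, n=10)

def benchmark_level(h: int, cutoffs: List[float]) -> int:
    """Level 1 (lowest) to len(cutoffs) + 1 (highest)."""
    return bisect.bisect_right(cutoffs, h) + 1

# Hypothetical peer sample of h-index values, not data from the study.
peers = [1, 2, 2, 3, 4, 5, 6, 8, 10, 12, 15, 20]
cutoffs = decile_cutoffs(peers)
print(h_index([10, 8, 5, 4, 3]), benchmark_level(4, cutoffs))
```

The level returned is only meaningful relative to the peer sample supplied, which mirrors the point of the benchmark: the same h-index can rank differently in fields with different publication profiles.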


Subjects
Benchmarking , Medical Informatics , Publishing , Research , Sample Size
4.
J Med Internet Res ; 14(4): e95, 2012 Jul 09.
Article in English | MEDLINE | ID: mdl-22776692

ABSTRACT

BACKGROUND: Users of peer-to-peer (P2P) file-sharing networks risk the inadvertent disclosure of personal health information (PHI). In addition to potentially causing harm to the affected individuals, this can heighten the risk of data breaches for health information custodians. Automated PHI detection tools that crawl the P2P networks can identify PHI and alert custodians. While there has been previous work on the detection of personal information in electronic health records, there has been a dearth of research on the automated detection of PHI in heterogeneous user files. OBJECTIVE: To build a system that accurately detects PHI in files sent through P2P file-sharing networks. The system, which we call P2P Watch, uses a pipeline of text processing techniques to automatically detect PHI in files exchanged through P2P networks. P2P Watch processes unstructured texts regardless of the file format, document type, and content. METHODS: We developed P2P Watch to extract and analyze PHI in text files exchanged on P2P networks. We labeled texts as PHI if they contained identifiable information about a person (eg, name and date of birth) and specifics of the person's health (eg, diagnosis, prescriptions, and medical procedures). We evaluated the system's performance through its efficiency and effectiveness on 3924 files gathered from three P2P networks. RESULTS: P2P Watch successfully processed 3924 P2P files of unknown content. A manual examination of 1578 randomly selected files marked by the system as non-PHI confirmed that these files indeed did not contain PHI, making the false-negative detection rate equal to zero. Of 57 files marked by the system as PHI, all contained both personally identifiable information and health information: 11 files were PHI disclosures, and 46 files contained organizational materials such as unfilled insurance forms, job applications by medical professionals, and essays. 
CONCLUSIONS: PHI can be successfully detected in free-form textual files exchanged through P2P networks. Once the files with PHI are detected, affected individuals or data custodians can be alerted to take remedial action.
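
The labeling rule stated in the methods (identifiable information about a person together with specifics of that person's health) can be sketched as a simple conjunction of text checks. This is not P2P Watch's pipeline, which the abstract does not detail; the regular expressions, the keyword list, and the function name are hypothetical and far cruder than a real detector would need to be.

```python
import re

# Hypothetical patterns for illustration only.
NAME_PATTERN = re.compile(
    r"\b(?i:name|patient)\s*:\s*[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+"
)
DOB_PATTERN = re.compile(
    r"\b(?:date of birth|dob)\b\s*[:\-]?\s*\d{1,2}[/-]\d{1,2}[/-]\d{2,4}",
    re.IGNORECASE,
)
HEALTH_PATTERN = re.compile(
    r"\b(?:diagnosis|diagnosed|prescription|prescribed|surgery|procedure)\b",
    re.IGNORECASE,
)

def looks_like_phi(text: str) -> bool:
    """True only when identity details AND health details both appear."""
    has_identity = bool(NAME_PATTERN.search(text)) and bool(DOB_PATTERN.search(text))
    has_health = bool(HEALTH_PATTERN.search(text))
    return has_identity and has_health

# Placeholder records written for this example, not real data.
filled_record = "Patient: Jane Doe\nDOB: 01/02/1970\nDiagnosis: asthma"
blank_form = "Name: ________  Date of birth: ________  Diagnosis: ________"
print(looks_like_phi(filled_record), looks_like_phi(blank_form))
```

Requiring both conditions is what separates a filled-in record from health vocabulary alone. A rule this simple would still need the kind of manual review reported in the results to tell genuine disclosures apart from organizational materials.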


Subjects
Computer Communication Networks , Confidentiality , Electronic Health Records , Health Records, Personal , Computer Security , Disclosure , Humans , Information Dissemination , Information Storage and Retrieval , Software
5.
J Med Internet Res ; 14(1): e33, 2012 Feb 27.
Article in English | MEDLINE | ID: mdl-22370452

ABSTRACT

BACKGROUND: There are many benefits to open datasets. However, privacy concerns have hampered the widespread creation of open health data. There is a dearth of documented methods and case studies for the creation of public-use health data. We describe a new methodology for creating a longitudinal public health dataset in the context of the Heritage Health Prize (HHP). The HHP is a global data mining competition to predict, by using claims data, the number of days patients will be hospitalized in a subsequent year. The winner will be the team or individual with the most accurate model past a threshold accuracy, and will receive a US $3 million cash prize. HHP began on April 4, 2011, and ends on April 3, 2013. OBJECTIVE: To de-identify the claims data used in the HHP competition and ensure that it meets the requirements in the US Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule. METHODS: We defined a threshold risk consistent with the HIPAA Privacy Rule Safe Harbor standard for disclosing the competition dataset. Three plausible re-identification attacks that can be executed on these data were identified. For each attack the re-identification probability was evaluated. If it was deemed too high then a new de-identification algorithm was applied to reduce the risk to an acceptable level. We performed an actual evaluation of re-identification risk using simulated attacks and matching experiments to confirm the results of the de-identification and to test sensitivity to assumptions. The main metric used to evaluate re-identification risk was the probability that a record in the HHP data can be re-identified given an attempted attack. RESULTS: An evaluation of the de-identified dataset estimated that the probability of re-identifying an individual was .0084, below the .05 probability threshold specified for the competition. The risk was robust to violations of our initial assumptions. 
CONCLUSIONS: It was possible to ensure that the probability of re-identification for a large longitudinal dataset was acceptably low when it was released for a global user community in support of an analytics competition. This is an example of, and methodology for, achieving open data principles for longitudinal health data.
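
One common way to turn "probability that a record can be re-identified" into a number is to group records by their quasi-identifiers and take the reciprocal of each group's size. The sketch below uses that equivalence-class view as an assumption; it is not the de-identification algorithm or the attack models used for the HHP dataset, and the field names and records are hypothetical. Only the .05 threshold comes from the abstract.

```python
from collections import Counter
from typing import Dict, List, Sequence

THRESHOLD = 0.05  # probability threshold stated in the abstract

def average_reid_risk(records: List[Dict[str, str]],
                      quasi_identifiers: Sequence[str]) -> float:
    """Mean over records of 1 / (number of records sharing the same
    quasi-identifier values)."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    class_sizes = Counter(keys)
    return sum(1 / class_sizes[k] for k in keys) / len(keys)

def max_reid_risk(records: List[Dict[str, str]],
                  quasi_identifiers: Sequence[str]) -> float:
    """Risk of the most exposed record: 1 / smallest class size."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    return 1 / min(Counter(keys).values())

# Hypothetical records; age_band and region stand in for quasi-identifiers.
records = [
    {"age_band": "40-49", "region": "A"},
    {"age_band": "40-49", "region": "A"},
    {"age_band": "40-49", "region": "A"},
    {"age_band": "80+", "region": "B"},
]
risk = average_reid_risk(records, ["age_band", "region"])
print(risk, risk <= THRESHOLD)
```

In this toy dataset the single record in its own class drives the risk far above the threshold, which is the situation where generalizing or suppressing values would be applied before release.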


Subjects
Medical Records Systems, Computerized , Patient Identification Systems , Health Insurance Portability and Accountability Act , United States
7.
J Clin Epidemiol ; 89: 168-172, 2017 Sep.
Article in English | MEDLINE | ID: mdl-28433677

ABSTRACT

BACKGROUND: Patient-reported outcomes (PROs) are collected with consent for care; however, using the data for any other purpose requires consent for that additional purpose, or the anonymization of the data. Collecting explicit consent to use this data for secondary purposes, before the patient completes a PRO, can also bias the responses. OBJECTIVE: We consider the ethical and security issues related to the collection of data at the point of care or in the population and the aggregation and integration of PRO data with administrative databases to facilitate decision making and comparative effectiveness research. DISCUSSION: In this article, we describe risk-based anonymization, taking the context of the data release into account, so that we may consider the degree to which the release is considered anonymized. We also consider the ethical use of anonymized data, the anonymization of free-form text, and the secure linking of data sets without sharing any personal information. Many good standards and best practices exist for the sharing of health data and could be used as a baseline in the development of a national PRO initiative.


Subjects
Data Anonymization/ethics , Patient Reported Outcome Measures , Canada , Congresses as Topic , Humans , Information Dissemination/ethics
8.
PLoS One ; 9(4): e93285, 2014.
Article in English | MEDLINE | ID: mdl-24714643

ABSTRACT

BACKGROUND: There is stigma attached to the identification of residents carrying antimicrobial resistant organisms (ARO) in long term care homes, yet there is a need to collect data about their prevalence for public health surveillance and intervention purposes. OBJECTIVE: We conducted a point prevalence study to assess ARO rates in long term care homes in Ontario using a secure data collection system. METHODS: All long term care homes in the province were asked to provide colonization or infection counts for methicillin-resistant Staphylococcus aureus (MRSA), vancomycin-resistant enterococci (VRE), and extended-spectrum beta-lactamase (ESBL) as recorded in their electronic medical records, and the number of current residents. Data was collected online during the October-November 2011 period using a Paillier cryptosystem that allows computation on encrypted data. RESULTS: A provably secure data collection system was implemented. Overall, 82% of the homes in the province responded. MRSA was the most frequent ARO identified at 3 cases per 100 residents, followed by ESBL at 0.83 per 100 residents, and VRE at 0.56 per 100 residents. The microbiological findings and their distribution were consistent with available provincial laboratory data reporting test results for AROs in hospitals. CONCLUSIONS: We describe an ARO point prevalence study which demonstrated the feasibility of collecting data from long term care homes securely across the province and providing strong privacy and confidentiality assurances, while obtaining high response rates.
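
The property of the Paillier cryptosystem that makes this kind of collection possible is that multiplying ciphertexts adds the underlying plaintexts, so a collector can total encrypted counts without seeing any single home's figure. The toy sketch below demonstrates only that property. The primes are tiny and insecure, the function names and counts are hypothetical, and nothing here reflects how the study's system was actually built.

```python
import math
import secrets

def keygen(p: int, q: int):
    """Toy key generation from two primes, using the common choice g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse (Python 3.8+); valid when g = n + 1
    return (n, n + 1), (lam, mu)

def encrypt(public, m: int) -> int:
    n, g = public
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(public, private, c: int) -> int:
    n, _ = public
    lam, mu = private
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

def add_encrypted(public, c1: int, c2: int) -> int:
    """Homomorphic addition: the product of ciphertexts encrypts the sum."""
    n, _ = public
    return (c1 * c2) % (n * n)

# Tiny primes for demonstration only; real keys use primes of 1024+ bits.
public, private = keygen(1009, 1013)
counts = [12, 7, 30]  # hypothetical per-home case counts
total_cipher = encrypt(public, 0)
for count in counts:
    total_cipher = add_encrypted(public, total_cipher, encrypt(public, count))
print(decrypt(public, private, total_cipher))
```

Only the holder of the private key can decrypt the running total, and encryption is randomized, so two homes reporting the same count produce different ciphertexts.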


Subjects
Enterococcus/isolation & purification , Long-Term Care , Methicillin-Resistant Staphylococcus aureus/isolation & purification , Staphylococcal Infections/epidemiology , Vancomycin Resistance , beta-Lactam Resistance , Enterococcus/drug effects , Humans , Infection Control , Methicillin-Resistant Staphylococcus aureus/drug effects , Nursing Homes , Ontario
9.
J Am Med Inform Assoc ; 20(3): 453-61, 2013 May 01.
Article in English | MEDLINE | ID: mdl-22871397

ABSTRACT

BACKGROUND: There is limited capacity to assess the comparative risks of medications after they enter the market. For rare adverse events, the pooling of data from multiple sources is necessary to have the power and sufficient population heterogeneity to detect differences in safety and effectiveness in genetic, ethnic and clinically defined subpopulations. However, combining datasets from different data custodians or jurisdictions to perform an analysis on the pooled data creates significant privacy concerns that would need to be addressed. Existing protocols for addressing these concerns can result in reduced analysis accuracy and can allow sensitive information to leak. OBJECTIVE: To develop a secure distributed multi-party computation protocol for logistic regression that provides strong privacy guarantees. METHODS: We developed a secure distributed logistic regression protocol using a single analysis center with multiple sites providing data. A theoretical security analysis demonstrates that the protocol is robust to plausible collusion attacks and does not allow the parties to gain new information from the data that are exchanged among them. The computational performance and accuracy of the protocol were evaluated on simulated datasets. RESULTS: The computational performance scales linearly as the dataset sizes increase. The addition of sites results in an exponential growth in computation time. However, for up to five sites, the time is still short and would not affect practical applications. The model parameters are the same as the results on pooled raw data analyzed in SAS, demonstrating high model accuracy. CONCLUSION: The proposed protocol and prototype system would allow the development of logistic regression models in a secure manner without requiring the sharing of personal health information. This can alleviate one of the key barriers to the establishment of large-scale post-marketing surveillance programs. 
We extended the secure protocol to account for correlations among patients within sites through generalized estimating equations, and to accommodate other link functions by extending it to generalized linear models.
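
The result that the distributed model parameters match those from pooled raw data follows from the fact that the Newton-Raphson update for logistic regression depends only on sums over records, which can be computed site by site. The sketch below shows that arithmetic for a model with an intercept and one covariate. It is not the secure protocol: the site-level sums are passed to the center in the clear, whereas the paper's contribution is protecting that exchange, and the abstract does not give the cryptographic details. All names and data are hypothetical.

```python
import math
from typing import List, Tuple

Site = List[Tuple[float, int]]  # (covariate x, binary outcome y) per patient

def local_stats(data: Site, b0: float, b1: float):
    """Gradient and information-matrix sums computed from one site's data."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in data:
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        w = p * (1.0 - p)
        g0 += y - p
        g1 += (y - p) * x
        h00 += w
        h01 += w * x
        h11 += w * x * x
    return g0, g1, h00, h01, h11

def fit(sites: List[Site], iterations: int = 25) -> Tuple[float, float]:
    """Newton-Raphson in which the center only ever adds up site-level sums."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        totals = [0.0] * 5
        for data in sites:
            for i, value in enumerate(local_stats(data, b0, b1)):
                totals[i] += value
        g0, g1, h00, h01, h11 = totals
        det = h00 * h11 - h01 * h01
        # beta <- beta + H^{-1} g, with the 2x2 inverse written out.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical, non-separable data split across two sites.
site_a = [(0.5, 0), (1.0, 0), (1.5, 1), (2.0, 0)]
site_b = [(2.5, 1), (3.0, 0), (3.5, 1), (4.0, 1)]
distributed = fit([site_a, site_b])
pooled = fit([site_a + site_b])
print(distributed, pooled)
```

The two fits agree up to floating-point rounding, which is the sense in which a distributed analysis can reproduce a pooled one without the sites exchanging patient-level records.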


Subjects
Drug-Related Side Effects and Adverse Reactions/diagnosis , Logistic Models , Product Surveillance, Postmarketing/methods , Biostatistics , Computer Communication Networks , Humans
10.
PLoS One ; 6(12): e28071, 2011.
Article in English | MEDLINE | ID: mdl-22164229

ABSTRACT

BACKGROUND: Privacy legislation in most jurisdictions allows the disclosure of health data for secondary purposes without patient consent if it is de-identified. Some recent articles in the medical, legal, and computer science literature have argued that de-identification methods do not provide sufficient protection because they are easy to reverse. Should this be the case, it would have significant and important implications on how health information is disclosed, including: (a) potentially limiting its availability for secondary purposes such as research, and (b) resulting in more identifiable health information being disclosed. Our objectives in this systematic review were to: (a) characterize known re-identification attacks on health data and contrast that to re-identification attacks on other kinds of data, (b) compute the overall proportion of records that have been correctly re-identified in these attacks, and (c) assess whether these demonstrate weaknesses in current de-identification methods. METHODS AND FINDINGS: Searches were conducted in IEEE Xplore, ACM Digital Library, and PubMed. After screening, fourteen eligible articles representing distinct attacks were identified. On average, approximately a quarter of the records were re-identified across all studies (0.26 with 95% CI 0.046-0.478) and 0.34 for attacks on health data (95% CI 0-0.744). There was considerable uncertainty around the proportions as evidenced by the wide confidence intervals, and the mean proportion of records re-identified was sensitive to unpublished studies. Two of fourteen attacks were performed with data that was de-identified using existing standards. Only one of these attacks was on health data, which resulted in a success rate of 0.00013. CONCLUSIONS: The current evidence shows a high re-identification rate but is dominated by small-scale studies on data that was not de-identified according to existing standards. 
This evidence is insufficient to draw conclusions about the efficacy of de-identification methods.


Subjects
Computer Security , Confidentiality/legislation & jurisprudence , Privacy , Databases, Factual , Health Insurance Portability and Accountability Act , Humans , Medical Records Systems, Computerized , Models, Statistical , Reproducibility of Results , Software , United States