Results 1 - 14 of 14
1.
BMC Med Res Methodol ; 17(1): 95, 2017 Jul 10.
Article in English | MEDLINE | ID: mdl-28693507

ABSTRACT

BACKGROUND: Probabilistic record linkage is a process used to bring together person-based records from within the same dataset (de-duplication) or from disparate datasets using pairwise comparisons and matching probabilities. The linkage strategy and associated match probabilities are often estimated through investigations into data quality and manual inspection. However, as privacy-preserved datasets comprise encrypted data, such methods are not possible. In this paper, we present a method for estimating the probabilities and threshold values for probabilistic privacy-preserved record linkage using Bloom filters. METHODS: Our method was tested through a simulation study using synthetic data, followed by an application using real-world administrative data. Synthetic datasets were generated with error rates from zero to 20%. Our method was used to estimate parameters (probabilities and thresholds) for de-duplication linkages. Linkage quality was determined by F-measure. Each dataset was privacy-preserved using separate Bloom filters for each field. Match probabilities were estimated using the expectation-maximisation (EM) algorithm on the privacy-preserved data. Threshold cut-off values were determined by an extension to the EM algorithm allowing linkage quality to be estimated for each possible threshold. De-duplication linkages of each privacy-preserved dataset were performed using both estimated and calculated probabilities. Linkage quality using the F-measure at the estimated threshold values was also compared to the highest F-measure. Three large administrative datasets were used to demonstrate the applicability of the probability and threshold estimation technique on real-world data. RESULTS: Linkage of the synthetic datasets using the estimated probabilities produced an F-measure that was comparable to the F-measure using calculated probabilities, even with up to 20% error. Linkage of the administrative datasets using estimated probabilities produced an F-measure that was higher than the F-measure using calculated probabilities. Further, the threshold estimation yielded results for F-measure that were only slightly below the highest possible for those probabilities. CONCLUSIONS: The method appears highly accurate across a spectrum of datasets with varying degrees of error. As there are few alternatives for parameter estimation, the approach is a major step towards providing a complete operational approach for probabilistic linkage of privacy-preserved datasets.
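
The abstract above says match probabilities were estimated with the EM algorithm but does not give details. As a rough illustration only, the sketch below runs EM on binary field-agreement patterns under the usual conditional-independence assumption of probabilistic linkage. The function name, the starting values (0.9, 0.1) and the binary agree/disagree coding are assumptions made for this example, not the authors' implementation, and the threshold-estimation extension mentioned in the abstract is not covered.

```python
from typing import List, Sequence, Tuple


def em_match_probabilities(
    patterns: Sequence[Sequence[int]],
    iterations: int = 100,
    p: float = 0.1,
    eps: float = 1e-6,
) -> Tuple[List[float], List[float], float]:
    """Estimate per-field m and u probabilities from agreement patterns.

    patterns: one tuple per record pair, 1 = field agrees, 0 = disagrees.
    Returns (m, u, p): P(agree | match), P(agree | non-match) per field,
    and the estimated proportion of pairs that are matches.
    The starting values below are illustrative assumptions.
    """
    n_fields = len(patterns[0])
    m = [0.9] * n_fields
    u = [0.1] * n_fields

    def clamp(x: float) -> float:
        # Keep probabilities strictly inside (0, 1) to avoid division by zero.
        return min(max(x, eps), 1.0 - eps)

    for _ in range(iterations):
        # E-step: posterior probability that each pair is a match.
        weights = []
        for gamma in patterns:
            like_m, like_u = p, 1.0 - p
            for k, agree in enumerate(gamma):
                like_m *= m[k] if agree else 1.0 - m[k]
                like_u *= u[k] if agree else 1.0 - u[k]
            weights.append(like_m / (like_m + like_u))

        # M-step: re-estimate p, m and u from the weighted agreements.
        total = sum(weights)
        p = clamp(total / len(patterns))
        for k in range(n_fields):
            agree_m = sum(w for w, g in zip(weights, patterns) if g[k])
            agree_u = sum(1.0 - w for w, g in zip(weights, patterns) if g[k])
            m[k] = clamp(agree_m / total)
            u[k] = clamp(agree_u / (len(patterns) - total))

    return m, u, p
```

With Bloom-filter-encoded fields, the agree/disagree value for each field would typically come from comparing a similarity score against a cut-off; that step is assumed to happen before this function is called.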


Subjects
Data Accuracy , Medical Record Linkage/methods , Privacy , Probability , Computer Security , Datasets as Topic , Humans , Reproducibility of Results
2.
Front Public Health ; 5: 34, 2017.
Article in English | MEDLINE | ID: mdl-28303240

ABSTRACT

In an era where the volume of structured and unstructured digital data has exploded, there has been an enormous growth in the creation of data about individuals that can be used for understanding and treating disease. Joining these records together at an individual level provides a complete picture of a patient's interaction with health services and allows better assessment of patient outcomes and effectiveness of treatment and services. Record linkage techniques provide an efficient and cost-effective method to bring individual records together as patient profiles. These linkage procedures bring their own challenges, especially relating to the protection of privacy. The development and implementation of record linkage systems that do not require the release of personal information can reduce the risks associated with record linkage and overcome legal barriers to data sharing. Current conceptual and experimental privacy-preserving record linkage (PPRL) models show promise in addressing data integration challenges. Enhancing and operationalizing PPRL protocols can help address the dilemma faced by some custodians between using data to improve quality of life and dealing with the ethical, legal, and administrative issues associated with protecting an individual's privacy. These methods can reduce the risk to privacy, as they do not require personally identifying information to be shared. PPRL methods can improve the delivery of record linkage services to the health and broader research community.

3.
Front Public Health ; 5: 13, 2017.
Article in English | MEDLINE | ID: mdl-28229070

ABSTRACT

BACKGROUND: Hospitals and death registries in Australia are operated under individual state government jurisdictions. Some state borders are located in heavily populated areas or near major capital cities. Mortality indicators for hospitals located near state borders may not be estimated accurately if patients are lost as they cross state borders. The aim of this study was to evaluate how cross-jurisdictional linkage of state hospital and death records across state borders may improve estimation of the hospital standardized mortality ratio (HSMR), a tool used in Australia as a hospital performance indicator. METHOD: Retrospective cohort study of 7.7 million hospital patients from July 2004 to June 2009. In-hospital deaths and deaths within 30 days of hospital discharge from four state jurisdictions were used to estimate the standardized mortality ratio of hospital groups defined by geography and type of hospital (grouped HSMR) under three record linkage scenarios, as follows: (1) cross-jurisdictional person-level linkage, (2) within-jurisdictional (state-based) person-level linkage, and (3) unlinked records. All public and private hospitals in New South Wales, Queensland, Western Australia, and public hospitals in South Australia were included in this study. Death registrations from all four states were obtained from state-based registries of births, deaths, and marriages. RESULTS: Cross-jurisdictional linkage identified 11,116 cross-border hospital transfers of which 170 resulted in a cross-border in-hospital death. An additional 496 cross-border deaths occurred within 30 days of hospital discharge. The inclusion of cross-jurisdictional person-level links to unlinked hospital records reduced the coefficient of variation among the grouped HSMRs from 0.19 to 0.15; the inclusion of 30-day deaths reduced the coefficient of variation further to 0.11. There were minor changes in grouped HSMRs between cross-jurisdictional and within-jurisdictional linkages, although the impact of cross-jurisdictional linkage increased when restricted to regions with high cross-border hospital use. CONCLUSION: Cross-jurisdictional linkage modified estimates of grouped HSMRs in hospital groups likely to receive a high proportion of cross-border users. Hospital identifiers will be required to confirm whether individual hospital performance indicators change.

5.
Health Inf Manag ; 45(2): 71-9, 2016 Aug.
Article in English | MEDLINE | ID: mdl-27178751

ABSTRACT

BACKGROUND: The statistical linkage key (SLK-581) is a common tool for record linkage in Australia, due to its ability to provide some privacy protection. However, newer privacy-preserving approaches may provide greater privacy protection, while allowing high-quality linkage. OBJECTIVE: To evaluate the standard SLK-581, encrypted SLK-581 and a newer privacy-preserving approach using Bloom filters, in terms of both privacy and linkage quality. METHOD: Linkage quality was compared by conducting linkages on Australian health datasets using these three techniques and examining results. Privacy was compared qualitatively in relation to a series of scenarios where privacy breaches may occur. RESULTS: The Bloom filter technique offered greater privacy protection and linkage quality compared to the SLK-based method commonly used in Australia. CONCLUSION: The adoption of new privacy-preserving methods would allow greater confidence in research results while significantly improving privacy protection.


Subjects
Confidentiality/standards , Medical Record Linkage/standards , Computerized Medical Records Systems/organization & administration , Software , Algorithms , Australia , Humans
6.
Methods Inf Med ; 55(3): 276-83, 2016 May 17.
Article in English | MEDLINE | ID: mdl-27096424

ABSTRACT

BACKGROUND: Record linkage techniques allow different data collections to be brought together to provide a wider picture of the health status of individuals. Ensuring high linkage quality is important to guarantee the quality and integrity of research. Current methods for measuring linkage quality typically focus on precision (the proportion of links that are correct), given the difficulty of measuring the proportion of false negatives. OBJECTIVES: The aim of this work is to introduce and evaluate a sampling based method to estimate both precision and recall following record linkage. METHODS: In the sampling based method, record-pairs from each threshold (including those below the identified cut-off for acceptance) are sampled and clerically reviewed. These results are then applied to the entire set of record-pairs, providing estimates of false positives and false negatives. This method was evaluated on a synthetically generated dataset, where the true match status (which records belonged to the same person) was known. RESULTS: The sampled estimates of linkage quality were relatively close to actual linkage quality metrics calculated for the whole synthetic dataset. The precision and recall measures for seven reviewers were very consistent with little variation in the clerical assessment results (overall agreement using the Fleiss' kappa statistic was 0.601). CONCLUSIONS: This method presents as a possible means of accurately estimating matching quality and refining linkages in population level linkage studies. The sampling approach is especially important for large project linkages where the number of record pairs produced may be very large, often running into millions.
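
The scaling step described in the METHODS above (review a sample from each score band, then apply the observed match rate to the whole band) can be sketched roughly as follows. The `Band` structure, its field names and the simple proportional scaling are assumptions made for illustration; the paper's exact sampling design is not given in the abstract.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class Band:
    """One score band of record pairs (hypothetical structure)."""
    total_pairs: int       # all record pairs whose score falls in this band
    sampled: int           # pairs sent for clerical review
    sampled_matches: int   # reviewed pairs judged to be true matches
    accepted: bool         # True if the band is at or above the cut-off


def estimate_precision_recall(bands: Iterable[Band]) -> Tuple[float, float]:
    """Scale reviewed samples up to band totals and derive precision/recall."""
    true_pos = 0.0    # estimated true matches among accepted pairs
    accepted = 0.0    # all accepted pairs
    false_neg = 0.0   # estimated true matches among rejected pairs
    for band in bands:
        est_matches = band.total_pairs * band.sampled_matches / band.sampled
        if band.accepted:
            true_pos += est_matches
            accepted += band.total_pairs
        else:
            false_neg += est_matches
    precision = true_pos / accepted if accepted else 0.0
    found_or_missed = true_pos + false_neg
    recall = true_pos / found_or_missed if found_or_missed else 0.0
    return precision, recall
```

Sampling below the cut-off is what makes the recall estimate possible: without reviewed pairs from rejected bands there is no estimate of missed matches.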


Subjects
Medical Record Linkage/methods , Automation , Reproducibility of Results , Sample Size
7.
BMC Health Serv Res ; 15: 312, 2015 Aug 08.
Article in English | MEDLINE | ID: mdl-26253452

ABSTRACT

BACKGROUND: The technical challenges associated with national data linkage, and the extent of cross-border population movements, are explored as part of a pioneering research project. The project involved linking state-based hospital admission records and death registrations across Australia for a national study of hospital-related deaths. METHODS: The project linked over 44 million morbidity and mortality records from four Australian states between 1st July 1999 and 31st December 2009 using probabilistic methods. The accuracy of the linkage was measured through a comparison with jurisdictional keys sourced from individual states. The extent of cross-border population movement between these states was also assessed. RESULTS: Data matching identified almost twelve million individuals across the four Australian states. The percentage of individuals from one state with records found in another ranged from 3% to 5%. Using jurisdictional keys to measure linkage quality, results indicate a high matching efficiency (F-measure 97 to 99%), with linkage processing taking only a matter of days. CONCLUSIONS: The results demonstrate the feasibility and accuracy of undertaking cross-jurisdictional linkage for national research. The benefits are substantial, particularly in relation to capturing the full complement of records in patient pathways as a result of cross-border population movements. The project identified a sizeable 'mobile' population with hospital records in more than one state. Research studies that focus on a single jurisdiction will under-enumerate the extent of hospital usage by individuals in the population. It is important that researchers understand and are aware of the impact of this missing hospital activity on their studies. The project highlights the need for an efficient and accurate data linkage system to support national research across Australia.


Subjects
Critical Pathways/standards , Information Storage and Retrieval , Travel , Australia , Hospital Records , Hospitalization , Humans , Information Systems , Medical Record Linkage/methods , Morbidity
8.
Med J Aust ; 202(11): 582-6, 2015 Jun 15.
Article in English | MEDLINE | ID: mdl-26068690

ABSTRACT

OBJECTIVE: To determine the quality and effectiveness of national data linkage capacity by performing a proof-of-concept project investigating cross-border hospital use and hospital-related deaths. DESIGN, PARTICIPANTS AND SETTING: Analysis of person-level linked hospital separation and death registration data of all public and private hospital patients in New South Wales, Queensland and Western Australia and of public hospital patients in South Australia, totalling 7.7 million hospital patients from 1 July 2004 to 30 June 2009. MAIN OUTCOME MEASURES: Counts and proportions of hospital stays and patient movement patterns. RESULTS: 223 262 patients (3.0%) travelled across a state border to attend hospitals, in particular, far northern and western NSW patients travelling to Queensland and SA hospitals, respectively. A further 48 575 patients (0.6%) moved their place of residence interstate between hospital visits, particularly to and from areas associated with major mining and tourism industries. Over 11 000 cross-border hospital transfers were also identified. Of patients who travelled across a state border to hospital, 2800 (1.3%) died in that hospital. An additional 496 deaths recorded in one jurisdiction occurred within 30 days of hospital separation from another jurisdiction. CONCLUSIONS: Access to person-level data linked across jurisdictions identified geographical hot spots of cross-border hospital use and hospital-related deaths in Australia. This has implications for planning of health service delivery and for longitudinal follow-up studies, particularly those involving mobile populations.


Subjects
Emigration and Immigration , Hospital Mortality , Hospitalization/statistics & numerical data , Hospitals/statistics & numerical data , Australia , Cohort Studies , Data Collection , Follow-Up Studies , Humans , Retrospective Studies , Travel
9.
BMC Med Inform Decis Mak ; 14: 23, 2014 Mar 31.
Article in English | MEDLINE | ID: mdl-24678656

ABSTRACT

BACKGROUND: Record linkage techniques are widely used to enable health researchers to gain event-based longitudinal information for entire populations. The task of record linkage is increasingly being undertaken by specialised linkage units (SLUs). In addition to the complexity of undertaking probabilistic record linkage, these units face additional technical challenges in providing record linkage 'as a service' for research. The extent of this functionality, and approaches to solving these issues, have had little focus in the record linkage literature. Few, if any, of the record linkage packages or systems currently used by SLUs include the full range of functions required. METHODS: This paper identifies and discusses some of the functions that are required or undertaken by SLUs in the provision of record linkage services. These include managing routine, on-going linkage; storing and handling changing data; handling different linkage scenarios; and accommodating ever-increasing datasets. Automated linkage processes are one way of ensuring consistency of results and scalability of service. RESULTS: Alternative solutions to some of these challenges are presented. By maintaining a full history of links, and storing pairwise information, many of the challenges around handling 'open' records, and providing automated managed extractions are solved. A number of these solutions were implemented as part of the development of the National Linkage System (NLS) by the Centre for Data Linkage (part of the Population Health Research Network) in Australia. CONCLUSIONS: The demand for, and complexity of, linkage services is growing. This presents as a challenge to SLUs as they seek to service the varying needs of dozens of research projects annually. Linkage units need to be both flexible and scalable to meet this demand. It is hoped the solutions presented here can help mitigate these difficulties.


Subjects
Data Collection/standards , Electronic Data Processing/standards , Electronic Health Records/standards , Health Information Management/standards , Information Storage and Retrieval/standards , Australia , Humans
10.
Comput Methods Programs Biomed ; 115(2): 55-63, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24768079

ABSTRACT

Ensuring high linkage quality is important in many record linkage applications. Current methods for ensuring quality are manual and resource intensive. This paper seeks to determine the effectiveness of graph theory techniques in identifying record linkage errors. A range of graph theory techniques was applied to two linked datasets, with known truth sets. The ability of graph theory techniques to identify groups containing errors was compared to a widely used threshold setting technique. This methodology shows promise; however, further investigations into graph theory techniques are required. The development of more efficient and effective methods of improving linkage quality will result in higher quality datasets that can be delivered to researchers in shorter timeframes.


Subjects
Medical Record Linkage/methods , Factual Databases/statistics & numerical data , Humans , Medical Record Linkage/standards , Statistical Models , New South Wales , Software , Western Australia
11.
J Biomed Inform ; 50: 205-12, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24333482

ABSTRACT

Record linkage typically involves the use of dedicated linkage units who are supplied with personally identifying information to determine individuals from within and across datasets. The personally identifying information supplied to linkage units is separated from clinical information prior to release by data custodians. While this substantially reduces the risk of disclosure of sensitive information, some residual risks still exist and remain a concern for some custodians. In this paper we trial a method of record linkage which reduces privacy risk still further on large real-world administrative data. The method uses encrypted personal identifying information (Bloom filters) in a probability-based linkage framework. The privacy-preserving linkage method was tested on ten years of New South Wales (NSW) and Western Australian (WA) hospital admissions data, comprising in total over 26 million records. No difference in linkage quality was found when the results were compared to traditional probabilistic methods using full unencrypted personal identifiers. This presents as a possible means of reducing privacy risks related to record linkage in population-level research studies. It is hoped that through adaptations of this method or similar privacy-preserving methods, risks related to information disclosure can be reduced so that the benefits of linked research taking place can be fully realised.
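
The abstract does not describe how the Bloom filters were built. A common construction in the privacy-preserving linkage literature hashes the character bigrams of a field into a fixed-length bit array and compares two filters with the Dice coefficient; the sketch below follows that general idea. The filter length (1000), number of hash functions (10), the keyed SHA-256/SHA-512 double hashing and the secret key are all assumptions chosen for illustration, not the parameters used in the study.

```python
import hashlib
import hmac
from typing import Set


def bigrams(value: str) -> Set[str]:
    """Split a field value into padded, lower-cased character bigrams."""
    padded = f"_{value.lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(
    value: str,
    secret: bytes = b"example-shared-secret",  # hypothetical key
    size: int = 1000,                          # assumed filter length
    num_hashes: int = 10,                      # assumed hash count
) -> Set[int]:
    """Return the set of bit positions switched on for this field value."""
    positions: Set[int] = set()
    for gram in bigrams(value):
        data = gram.encode("utf-8")
        h1 = int.from_bytes(hmac.new(secret, data, hashlib.sha256).digest(), "big")
        h2 = int.from_bytes(hmac.new(secret, data, hashlib.sha512).digest(), "big")
        # Double hashing: derive num_hashes positions from two keyed digests.
        for i in range(num_hashes):
            positions.add((h1 + i * h2) % size)
    return positions


def dice_similarity(a: Set[int], b: Set[int]) -> float:
    """Dice coefficient of two filters given as sets of set-bit positions."""
    if not a and not b:
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))
```

Because similar strings share most of their bigrams, `dice_similarity(bloom_encode("smith"), bloom_encode("smyth"))` comes out well above the score for unrelated names, which is what lets approximate matching work on the encoded values.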


Subjects
Computer Security , Datasets as Topic , Medical Record Linkage , Privacy , Western Australia
12.
Int J Health Geogr ; 12: 50, 2013 Nov 08.
Article in English | MEDLINE | ID: mdl-24207169

ABSTRACT

BACKGROUND: Geocoding, the process of converting textual information describing a location into one or more digital geographic representations, is a routine task performed at large organizations and government agencies across the globe. In a health context, this task is often a fundamental first step performed prior to all operations that take place in a spatially-based health study. As such, the quality of the geocoding system used within these agencies is of paramount concern to the agency (the producer) and researchers or policy-makers who wish to use these data (consumers). However, geocoding systems are continually evolving with new products coming on the market continuously. Agencies must develop and use criteria across a number of axes when faced with decisions about building, buying, or maintaining any particular geocoding system. To date, published criteria have focused on one or more aspects of geocode quality without taking a holistic view of a geocoding system's role within a large organization. The primary purpose of this study is to develop and test an evaluation framework to assist a large organization in determining which geocoding systems will meet its operational needs. METHODS: A geocoding platform evaluation framework is derived through an examination of prior literature on geocoding accuracy. The framework developed extends commonly used geocoding metrics to take into account the specific concerns of large organizations for which geocoding is a fundamental operational capability tightly-knit into its core mission of processing health data records. A case study is performed to evaluate the strengths and weaknesses of five geocoding platforms currently available in the Australian geospatial marketplace. RESULTS: The evaluation framework developed in this research proved successful in differentiating between key capabilities of geocoding systems that are important in the context of a large organization with significant investments in geocoding resources. Results from the proposed methodology highlight important differences across all axes of geocoding system comparisons including spatial data output accuracy, reference data coverage, system flexibility, the potential for tight integration, and the need for specialized staff and/or development time and funding. Such results can empower decision-makers within large organizations as they make decisions and investments in geocoding systems.


Subjects
Factual Databases/standards , Geographic Information Systems/standards , Geographic Mapping , Humans , Western Australia/epidemiology
13.
BMC Med Inform Decis Mak ; 13: 64, 2013 Jun 05.
Article in English | MEDLINE | ID: mdl-23739011

ABSTRACT

BACKGROUND: Within the field of record linkage, numerous data cleaning and standardisation techniques are employed to ensure the highest quality of links. While these facilities are common in record linkage software packages and are regularly deployed across record linkage units, little work has been published demonstrating the impact of data cleaning on linkage quality. METHODS: A range of cleaning techniques was applied to both a synthetically generated dataset and a large administrative dataset previously linked to a high standard. The effect of these changes on linkage quality was investigated using pairwise F-measure to determine quality. RESULTS: Data cleaning made little difference to the overall linkage quality, with heavy cleaning leading to a decrease in quality. Further examination showed that decreases in linkage quality were due to cleaning techniques typically reducing the variability - although correct records were now more likely to match, incorrect records were also more likely to match, and these incorrect matches outweighed the correct matches, reducing quality overall. CONCLUSIONS: Data cleaning techniques have minimal effect on linkage quality. Care should be taken during the data cleaning process.
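
The pairwise F-measure used above as the quality metric is the harmonic mean of pairwise precision and recall. A minimal sketch, assuming links are represented as unordered pairs of record identifiers (a representation chosen here for illustration):

```python
from typing import Hashable, Iterable, Tuple

Pair = Tuple[Hashable, Hashable]


def pairwise_f_measure(found_links: Iterable[Pair], true_links: Iterable[Pair]) -> float:
    """F-measure over record pairs; (a, b) and (b, a) count as the same link."""
    found = {frozenset(pair) for pair in found_links}
    truth = {frozenset(pair) for pair in true_links}
    true_pos = len(found & truth)
    precision = true_pos / len(found) if found else 0.0
    recall = true_pos / len(truth) if truth else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)
```

This makes the abstract's finding concrete: cleaning that adds more incorrect pairs to `found_links` than correct ones lowers precision faster than it raises recall, so the F-measure falls.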


Subjects
Electronic Data Processing/methods , Medical Record Linkage/standards , Quality Control , Observer Variation
14.
BMC Health Serv Res ; 12: 480, 2012 Dec 29.
Article in English | MEDLINE | ID: mdl-23272652

ABSTRACT

BACKGROUND: The Centre for Data Linkage (CDL) has been established to enable national and cross-jurisdictional health-related research in Australia. It has been funded through the Population Health Research Network (PHRN), a national initiative established under the National Collaborative Research Infrastructure Strategy (NCRIS). This paper describes the development of the processes and methodology required to create cross-jurisdictional research infrastructure and enable aggregation of State and Territory linkages into a single linkage "map". METHODS: The CDL has implemented a linkage model which incorporates best practice in data linkage and adheres to data integration principles set down by the Australian Government. Working closely with data custodians and State-based data linkage facilities, the CDL has designed and implemented a linkage system to enable research at national or cross-jurisdictional level. A secure operational environment has also been established with strong governance arrangements to maximise privacy and the confidentiality of data. RESULTS: The development and implementation of a cross-jurisdictional linkage model overcomes a number of challenges associated with the federated nature of health data collections in Australia. The infrastructure expands Australia's data linkage capability and provides opportunities for population-level research. The CDL linkage model, infrastructure architecture and governance arrangements are presented. The quality and capability of the new infrastructure are demonstrated through the conduct of data linkage for the first PHRN Proof of Concept Collaboration project, where more than 25 million records were successfully linked to a very high quality. CONCLUSIONS: This infrastructure provides researchers and policy-makers with the ability to undertake linkage-based research that extends across jurisdictional boundaries. It represents an advance in Australia's national data linkage capabilities and sets the scene for stronger government-research collaboration.


Subjects
Benchmarking , Health Services Research , Medical Record Linkage , Policy Making , State Medicine/legislation & jurisprudence , Australia , Computer Security , Confidentiality , Health Services Research/ethics , Health Services Research/methods , Humans , Software