Results 1 - 10 of 10
1.
Int J Popul Data Sci ; 8(1): 2115, 2023.
Article in English | MEDLINE | ID: mdl-37636835

ABSTRACT

Databases covering all individuals of a population are increasingly used for research and decision-making. The massive size of such databases is often mistaken as a guarantee for valid inferences. However, population data have characteristics that make them challenging to use. Various assumptions on population coverage and data quality are commonly made, including how such data were captured and what types of processing have been applied to them. Furthermore, the full potential of population data can often only be unlocked when such data are linked to other databases. Record linkage often implies subtle technical problems, which are easily missed. We discuss a diverse range of myths and misconceptions relevant for anybody capturing, processing, linking, or analysing population data. Remarkably, many of these myths and misconceptions are due to the social nature of data collections and are therefore missed by purely technical accounts of data processing. Many are also not well documented in scientific publications. We conclude with a set of recommendations for using population data.


Subjects
Data Accuracy , Medical Record Linkage , Humans , Data Collection , Databases, Factual , Information Storage and Retrieval , Population Health
2.
Mach Learn ; 110(3): 451-456, 2021.
Article in English | MEDLINE | ID: mdl-33746357

ABSTRACT

The F-measure, also known as the F1-score, is widely used to assess the performance of classification algorithms. However, some researchers find it lacking in intuitive interpretation, questioning the appropriateness of combining two aspects of performance as conceptually distinct as precision and recall, and also questioning whether the harmonic mean is the best way to combine them. To ease this concern, we describe a simple transformation of the F-measure, which we call F* (F-star), which has an immediate practical interpretation.
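
The abstract does not state the transformation itself. As a minimal sketch, assuming the definition F* = F / (2 - F) (which reduces to TP / (TP + FP + FN), the share of correct positives among all records that are either predicted or truly positive), the relationship can be checked numerically. The function names below are illustrative and not taken from the paper.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """F1-score written directly in terms of confusion-matrix counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f_star(tp: int, fp: int, fn: int) -> float:
    """Assumed transformation: F* = F / (2 - F)."""
    f = f_measure(tp, fp, fn)
    return f / (2 - f)


def f_star_from_counts(tp: int, fp: int, fn: int) -> float:
    """Equivalent count-based form: TP / (TP + FP + FN)."""
    denom = tp + fp + fn
    return tp / denom if denom else 0.0


if __name__ == "__main__":
    # Illustrative counts only, not data from the paper.
    tp, fp, fn = 60, 20, 20
    print(f_measure(tp, fp, fn), f_star(tp, fp, fn), f_star_from_counts(tp, fp, fn))
```

Because the transformation is monotonic, ranking classifiers by F* gives the same order as ranking them by F.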

3.
Int J Popul Data Sci ; 3(1): 420, 2018 Feb 20.
Article in English | MEDLINE | ID: mdl-32935001

ABSTRACT

Data linkage, the process of identifying records that refer to the same entities across databases, is a crucial component of Population Data Science. Data linkage has a history going back over fifty years with many different methods and techniques being developed in various disciplines including computer science, statistics, and health informatics. Data linkage researchers and practitioners are commonly only familiar with methods and techniques that have been developed or are used in their own discipline, and they often only follow research that is being published at venues in their own discipline. There is currently no single online resource that allows data linkage researchers and practitioners across different disciplines to exchange ideas, post questions, or advertise new publications, software, open positions, or upcoming conferences and workshops. This leads to a communication gap in the multi-disciplinary field of data linkage. We aim to address this gap with the DLforum, a public online discussion forum for data linkage. DLforum contains several discussion areas, including publication announcements, resources (software and datasets), information about upcoming conferences and workshops, job opportunities, and general questions related to data linkage. The forum includes a moderation process where all registered users can post content and reply to posts by other users. We anticipate that the number of users registered and the amount of content posted in the forum will show that such an online forum is of value to data linkage researchers and practitioners from different disciplines to effectively communicate and exchange their knowledge, and thus form an online community of practice. In this paper we describe the methods of developing the DLforum, its structure and content, and our plan on how to evaluate the forum. The DLforum is freely available at: https://dmm.anu.edu.au/DLforum/.

4.
Crime Sci ; 7(1): 12, 2018.
Article in English | MEDLINE | ID: mdl-30931232

ABSTRACT

BACKGROUND: The Manning Cost-Benefit Tool (MCBT) was developed to assist criminal justice policymakers, policing organisations and crime prevention practitioners to assess the benefits of different interventions for reducing crime and to select those strategies that represent the greatest economic return on investment. DISCUSSION: A challenge with the MCBT and other cost-benefit tools is that users need to input, manually, a considerable amount of point-in-time data, a process that is time consuming, relies on subjective expert opinion, and introduces the potential for data-input error. In this paper, we present and discuss a conceptual model for a 'smart' MCBT that utilises machine learning techniques. SUMMARY: We argue that the Smart MCBT outlined in this paper will overcome the shortcomings of existing cost-benefit tools. It does this by reintegrating individual cost-benefit analysis (CBA) projects using a database system that securely stores and de-identifies project data, and redeploys it using a range of machine learning and data science techniques. In addition, the question of what works is respecified by the Smart MCBT tool as a data science pipeline, which serves to enhance CBA and reconfigure the policy making process in the paradigm of open data and data analytics.

5.
J Biomed Inform ; 59: 285-98, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26707453

ABSTRACT

The identification of similar entities represented by records in different databases has drawn considerable attention in many application areas, including in the health domain. One important type of entity matching application that is vital for quality healthcare analytics is the identification of similar patients, known as similar patient matching. A key component of identifying similar records is the calculation of similarity of the values in attributes (fields) between these records. Due to increasing privacy and confidentiality concerns, using the actual attribute values of patient records to identify similar records across different organizations is becoming non-trivial because the attributes in such records often contain highly sensitive information such as personal and medical details of patients. Therefore, the matching needs to be based on masked (encoded) values while being effective and efficient to allow matching of large databases. Bloom filter encoding has widely been used as an efficient masking technique for privacy-preserving matching of string and categorical values. However, no work on Bloom filter-based masking of numerical data, such as integer (e.g. age), floating point (e.g. body mass index), and modulus (numbers wrap around upon reaching a certain value, e.g. date and time), which are commonly required in the health domain, has been presented in the literature. We propose a framework with novel methods for masking numerical data using Bloom filters, thereby facilitating the calculation of similarities between records. We conduct an empirical study on publicly available real-world datasets which shows that our framework provides efficient masking and achieves similar matching accuracy compared to the matching of actual unencoded patient records.
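
The abstract does not describe how the framework encodes numbers. One plausible approach, shown here purely as a sketch, is to hash a value together with its neighbouring values into a Bloom filter, so that numerically close values share most of their set bits. The `spread`, `num_bits`, and `num_hashes` parameters, the salted SHA-256 double hashing, and the Dice comparison are all assumptions made for illustration, not the paper's method.

```python
import hashlib


def bloom_positions(token: str, num_bits: int, num_hashes: int) -> set:
    """Map a token to bit positions using double hashing over salted SHA-256."""
    h1 = int(hashlib.sha256(b"a:" + token.encode("utf-8")).hexdigest(), 16)
    h2 = int(hashlib.sha256(b"b:" + token.encode("utf-8")).hexdigest(), 16)
    return {(h1 + i * h2) % num_bits for i in range(num_hashes)}


def encode_number(value: int, spread: int = 3,
                  num_bits: int = 1000, num_hashes: int = 10) -> set:
    """Encode an integer as the bit positions set by it and its neighbours."""
    bits = set()
    for neighbour in range(value - spread, value + spread + 1):
        bits |= bloom_positions(str(neighbour), num_bits, num_hashes)
    return bits


def dice(a: set, b: set) -> float:
    """Dice coefficient of two Bloom filters represented as sets of 1-bits."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    # Hypothetical ages: 40 and 41 share six of seven neighbour tokens,
    # whereas 40 and 70 share none.
    print(dice(encode_number(40), encode_number(41)))
    print(dice(encode_number(40), encode_number(70)))
```

Under these assumptions, only the bit sets would be exchanged between organizations, and the similarity falls as the numeric distance grows. Modulus values such as times of day would additionally need the neighbours to wrap around at the modulus.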


Subjects
Confidentiality , Electronic Health Records/standards , Privacy , Algorithms , Computer Security , Humans , Medical Informatics
6.
Anesth Analg ; 108(4): 1069-75, 2009 Apr.
Article in English | MEDLINE | ID: mdl-19299763

ABSTRACT

BACKGROUND: Associations between preoperative elevation of brain natriuretic peptide (BNP) or postoperative elevation of cardiac troponins (cTn) with major adverse cardiac events (MACE) after major surgery have been shown previously. In this study, we evaluated the added value of preoperative BNP with postoperative cTn levels for the prediction of MACE in patients undergoing major vascular surgery. METHODS: This is a prospectively prespecified, secondary analysis of data from a cohort of 133 clinically stable patients undergoing major vascular surgery enrolled in a clinical trial evaluating the effectiveness of the sympathetic nervous system-inhibiting drug moxonidine on reducing MACE. Concentrations of BNP and cTn were determined before surgery, and concentrations of cTn were measured immediately after surgery and on postoperative days 1, 2, 3, and 7. The primary end point was the occurrence of MACE (defined as any hospitalization for myocardial revascularization, acute coronary syndrome, acute congestive heart failure, or death by any cause) within 1 yr after surgery. Patients were evaluated for MACE by hospital chart review during hospitalization and by telephone interviews 12 mo after surgery. RESULTS: Within 1 yr after surgery, 19 patients (14%) had a MACE, including 14 patients (11%) who died. After adjustment for age, gender, and the revised cardiac risk index, preoperative BNP elevation ≥50 pg/mL was associated with MACE (adjusted hazard ratio [HR]: 6.5, 95% confidence interval [CI]: 1.4-29.5) regardless of the subsequent cTn I concentrations. The combination of preoperative BNP elevation ≥50 pg/mL and postoperative cTn I elevation ≥2 ng/mL was associated with MACE (adjusted HR: 25.2, 95% CI: 5.0-128.4) and all-cause mortality (adjusted HR: 18.7, 95% CI: 3.1-112.5). The negative predictive value of a normal preoperative BNP value for subsequent adverse events was 0.965 (95% CI: 0.879-0.996).
CONCLUSION: These data suggest that measurement of preoperative BNP concentrations in addition to postoperative cTn concentrations provides additive prognostic information for MACE and mortality after major vascular surgery.
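
For readers unfamiliar with the negative predictive value reported above, it is the fraction of patients with a normal (negative) test who go on to have no event. The sketch below shows the arithmetic only. The counts are made up for illustration and are not the study's data, which the abstract does not report at that level of detail.

```python
def negative_predictive_value(true_negatives: int, false_negatives: int) -> float:
    """NPV = TN / (TN + FN): share of negative tests that were truly event-free."""
    total = true_negatives + false_negatives
    if total == 0:
        raise ValueError("no negative test results to evaluate")
    return true_negatives / total


if __name__ == "__main__":
    # Hypothetical counts: 90 event-free patients and 10 patients with an
    # event among 100 patients who tested negative.
    print(negative_predictive_value(true_negatives=90, false_negatives=10))
```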


Subjects
Cardiovascular Diseases/etiology , Natriuretic Peptide, Brain/blood , Troponin I/blood , Vascular Surgical Procedures/adverse effects , Aged , Biomarkers/blood , Cardiovascular Diseases/blood , Cardiovascular Diseases/mortality , Female , Humans , Kaplan-Meier Estimate , Male , Middle Aged , Pilot Projects , Postoperative Care , Predictive Value of Tests , Preoperative Care , Proportional Hazards Models , Prospective Studies , Risk Assessment , Switzerland/epidemiology , Time Factors , Treatment Outcome , Up-Regulation , Vascular Surgical Procedures/mortality
8.
BMC Med Inform Decis Mak ; 4: 9, 2004 Jun 28.
Article in English | MEDLINE | ID: mdl-15222890

ABSTRACT

BACKGROUND: The linkage of records which refer to the same entity in separate data collections is a common requirement in public health and biomedical research. Traditionally, record linkage techniques have required that all the identifying data in which links are sought be revealed to at least one party, often a third party. This necessarily invades personal privacy and requires complete trust in the intentions of that party and their ability to maintain security and confidentiality. Dusserre, Quantin, Bouzelat and colleagues have demonstrated that it is possible to use secure one-way hash transformations to carry out follow-up epidemiological studies without any party having to reveal identifying information about any of the subjects - a technique which we refer to as "blindfolded record linkage". A limitation of their method is that only exact comparisons of values are possible, although phonetic encoding of names and other strings can be used to allow for some types of typographical variation and data errors. METHODS: A method is described which permits the calculation of a general similarity measure, the n-gram score, without having to reveal the data being compared, albeit at some cost in computation and data communication. This method can be combined with public key cryptography and automatic estimation of linkage model parameters to create an overall system for blindfolded record linkage. RESULTS: The system described offers good protection against misdeeds or security failures by any one party, but remains vulnerable to collusion between or simultaneous compromise of two or more parties involved in the linkage operation. In order to reduce the likelihood of this, the use of last-minute allocation of tasks to substitutable servers is proposed. Proof-of-concept computer programmes written in the Python programming language are provided to illustrate the similarity comparison protocol. 
CONCLUSION: Although the protocols described in this paper are not unconditionally secure, they do suggest the feasibility, with the aid of modern cryptographic techniques and high speed communication networks, of a general purpose probabilistic record linkage system which permits record linkage studies to be carried out with negligible risk of invasion of personal privacy.
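
Two building blocks mentioned in this abstract can be sketched in a few lines: exact comparison of one-way hashed values, and an n-gram similarity score. The sketch assumes the score is a Dice coefficient over character bigrams and uses a keyed hash (HMAC-SHA-256) with a made-up shared key. It computes the score in the clear and does not reproduce the paper's blindfolded protocol, which is designed to avoid revealing the compared values.

```python
import hashlib
import hmac


def keyed_hash(value: str, key: bytes) -> str:
    """One-way keyed hash of a normalised value; supports exact matching only."""
    return hmac.new(key, value.lower().encode("utf-8"), hashlib.sha256).hexdigest()


def bigrams(text: str) -> set:
    """Set of character bigrams of a lower-cased string."""
    text = text.lower()
    return {text[i:i + 2] for i in range(len(text) - 1)}


def ngram_score(a: str, b: str) -> float:
    """Dice coefficient over bigram sets: 2 * |common| / (|A| + |B|)."""
    set_a, set_b = bigrams(a), bigrams(b)
    if not set_a and not set_b:
        return 1.0
    return 2 * len(set_a & set_b) / (len(set_a) + len(set_b))


if __name__ == "__main__":
    key = b"hypothetical-shared-secret"  # illustrative key, not from the paper
    # Hashing tolerates case differences after normalisation but not typos.
    print(keyed_hash("Smith", key) == keyed_hash("smith", key))
    print(keyed_hash("smith", key) == keyed_hash("smyth", key))
    # The bigram score still registers partial agreement.
    print(ngram_score("smith", "smyth"))
```

The contrast illustrates the limitation the abstract describes: a single typographical difference changes the hash completely, while the n-gram score degrades gradually.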


Subjects
Medical Record Linkage/methods , Computer Communication Networks/standards , Computer Security/standards , Confidentiality/legislation & jurisprudence , Confidentiality/standards , Efficiency, Organizational/standards , Humans , Medical Informatics/methods , Medical Informatics/standards , Medical Records Systems, Computerized/standards , Privacy/legislation & jurisprudence , Programming Languages , Public Health , Software/standards
9.
Mutat Res ; 537(2): 151-68, 2003 Jun 06.
Article in English | MEDLINE | ID: mdl-12787820

ABSTRACT

Different variants of the comet assay were used to study the genotoxic and cytotoxic properties of the following eight compounds: chloral hydrate, colchicine, hydroquinone, DL-menthol, mitomycin C, sodium iodoacetate, thimerosal and valinomycin. Colchicine, mitomycin C, sodium iodoacetate and thimerosal induced genotoxic effects. The other compounds were found to be inactive. The compounds were tested in the standard comet assay as well as in the all cell comet assay (recovery of floating cells after treatment), designed in our laboratory for adherently-growing cells. This latter procedure proved to be more adequate for the assessment of the cytotoxicity for some of the compounds tested (hydroquinone, DL-menthol, thimerosal, valinomycin). Colchicine was positive in the standard comet assay (3h treatment) and in the all cell comet assay (24h treatment). Sodium iodoacetate and thimerosal were positive in the standard and/or the all cell comet assay. Chloral hydrate, hydroquinone, sodium iodoacetate, mitomycin C and thimerosal were also tested in the modified comet assay using lysed cells. Mitomycin C and thimerosal showed effects in this assay, whereas sodium iodoacetate was inactive. This indicates that it does not induce direct DNA damage. Compounds that are known or suspected to form DNA-DNA cross-links or DNA-protein cross-links (chloral hydrate, hydroquinone, mitomycin C and thimerosal) were checked for their ability to reduce ethyl methanesulfonate (EMS)-induced DNA damage. This mode of action could be demonstrated for mitomycin C only.


Subjects
CHO Cells/drug effects , Comet Assay , Mutagens/toxicity , Xenobiotics/toxicity , Animals , CHO Cells/pathology , Cell Adhesion/drug effects , Cell Survival/drug effects , Cricetinae , DNA/drug effects , DNA Damage , Dose-Response Relationship, Drug , Mesocricetus , Mutagens/classification , Xenobiotics/classification
10.
BMC Med Inform Decis Mak ; 2: 9, 2002 Dec 13.
Article in English | MEDLINE | ID: mdl-12482326

ABSTRACT

BACKGROUND: Record linkage refers to the process of joining records that relate to the same entity or event in one or more data collections. In the absence of a shared, unique key, record linkage involves the comparison of ensembles of partially-identifying, non-unique data items between pairs of records. Data items with variable formats, such as names and addresses, need to be transformed and normalised in order to validly carry out these comparisons. Traditionally, deterministic rule-based data processing systems have been used to carry out this pre-processing, which is commonly referred to as "standardisation". This paper describes an alternative approach to standardisation, using a combination of lexicon-based tokenisation and probabilistic hidden Markov models (HMMs). METHODS: HMMs were trained to standardise typical Australian name and address data drawn from a range of health data collections. The accuracy of the results was compared to that produced by rule-based systems. RESULTS: Training of HMMs was found to be quick and did not require any specialised skills. For addresses, HMMs produced equal or better standardisation accuracy than a widely-used rule-based system. However, accuracy was worse when used with simpler name data. Possible reasons for this poorer performance are discussed. CONCLUSION: Lexicon-based tokenisation and HMMs provide a viable and effort-effective alternative to rule-based systems for pre-processing more complex variably formatted data such as addresses. Further work is required to improve the performance of this approach with simpler data such as names. Software which implements the methods described in this paper is freely available under an open source license for other researchers to use and improve.
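
The approach described here, lexicon-based tokenisation followed by an HMM, can be sketched as a toy example: each token is tagged from a small lexicon, and Viterbi decoding assigns each tag to an output field. Everything below is hypothetical: the tag set, the field names, the lexicon, and above all the probabilities, which are hand-set for illustration, whereas the paper trains them from data.

```python
import math

STATES = ["number", "street_name", "street_type", "locality"]
LEXICON = {"street": "TYPE", "st": "TYPE", "road": "TYPE", "rd": "TYPE"}

# Hand-set, hypothetical probabilities (a trained model would estimate these).
START = {"number": 0.7, "street_name": 0.2, "street_type": 0.05, "locality": 0.05}
TRANS = {
    "number": {"number": 0.05, "street_name": 0.85, "street_type": 0.05, "locality": 0.05},
    "street_name": {"number": 0.05, "street_name": 0.25, "street_type": 0.6, "locality": 0.1},
    "street_type": {"number": 0.05, "street_name": 0.05, "street_type": 0.05, "locality": 0.85},
    "locality": {"number": 0.05, "street_name": 0.05, "street_type": 0.05, "locality": 0.85},
}
EMIT = {
    "number": {"NUM": 0.9, "TYPE": 0.05, "WORD": 0.05},
    "street_name": {"NUM": 0.05, "TYPE": 0.1, "WORD": 0.85},
    "street_type": {"NUM": 0.05, "TYPE": 0.9, "WORD": 0.05},
    "locality": {"NUM": 0.05, "TYPE": 0.05, "WORD": 0.9},
}


def tag(token: str) -> str:
    """Lexicon-based tagging: digits, known street types, or a generic word."""
    if token.isdigit():
        return "NUM"
    return LEXICON.get(token, "WORD")


def viterbi(observations: list) -> list:
    """Most likely hidden-state sequence for a tag sequence (log space)."""
    if not observations:
        return []
    first = observations[0]
    best = [{s: (math.log(START[s]) + math.log(EMIT[s][first]), None) for s in STATES}]
    for obs in observations[1:]:
        column = {}
        for s in STATES:
            prev = max(STATES, key=lambda p: best[-1][p][0] + math.log(TRANS[p][s]))
            score = best[-1][prev][0] + math.log(TRANS[prev][s]) + math.log(EMIT[s][obs])
            column[s] = (score, prev)
        best.append(column)
    state = max(STATES, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):  # follow back-pointers to the start
        state = column[state][1]
        path.append(state)
    return path[::-1]


def standardise(address: str) -> dict:
    """Split an address into fields by decoding its tag sequence."""
    tokens = address.lower().replace(",", " ").split()
    fields = {}
    for token, state in zip(tokens, viterbi([tag(t) for t in tokens])):
        fields.setdefault(state, []).append(token)
    return {name: " ".join(parts) for name, parts in fields.items()}


if __name__ == "__main__":
    print(standardise("42 Main Street Canberra"))
```

The design point this illustrates is that the lexicon absorbs vocabulary, while the HMM captures field order, so new address layouts are handled by retraining probabilities rather than rewriting rules.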


Subjects
Data Collection/methods , Markov Chains , Medical Record Linkage/methods , Data Collection/statistics & numerical data , Decision Support Techniques , Medical Informatics/trends , Software/statistics & numerical data , Software Validation