2.
PLoS One ; 19(10): e0311720, 2024.
Article in English | MEDLINE | ID: mdl-39388418

ABSTRACT

The malicious use of deepfake videos seriously affects information security and causes great harm to society. Deepfake videos are now mainly generated with deep learning methods and are difficult to recognize with the naked eye, so accurate and efficient detection techniques are of great significance. Most existing detection methods analyze discriminative information in a single feature domain, from either a local or a global perspective; methods based on a single feature type have clear limitations in practice. In this paper, we propose a deepfake detection method that analyzes forged face features comprehensively: it integrates features from the space, noise, and frequency domains and uses the Inception Transformer to dynamically learn a mix of global and local information. We evaluate the proposed method on the DFDC, Celeb-DF, and FaceForensics++ benchmark datasets. Extensive experiments verify its effectiveness and good generalization. Compared with the best-performing models, the proposed method has a small number of parameters and uses no pre-training, distillation, or ensembling, yet still achieves competitive performance. Ablation experiments evaluate the contribution of each component.
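A minimal sketch of the multi-domain idea, assuming a grayscale face crop: the SRM-style high-pass kernel and the DCT band are illustrative stand-ins, not the paper's exact operators, and the fused stack would feed a backbone such as the Inception Transformer.

```python
# Hypothetical sketch: multi-domain feature extraction for one grayscale face crop.
import numpy as np
from scipy.fft import dctn
from scipy.ndimage import convolve

def extract_multidomain_features(face: np.ndarray) -> np.ndarray:
    """face: 2-D float array in [0, 1], e.g. a 224x224 grayscale crop."""
    # Space domain: raw pixels (a CNN backbone would normally embed these).
    spatial = face

    # Noise domain: residual of a high-pass (SRM-like) filter, which
    # suppresses image content and emphasizes manipulation artifacts.
    srm_kernel = np.array([[-1,  2, -1],
                           [ 2, -4,  2],
                           [-1,  2, -1]], dtype=float) / 4.0
    noise = convolve(face, srm_kernel, mode="reflect")

    # Frequency domain: 2-D DCT; forgeries often leave traces in high bands.
    freq = dctn(face, norm="ortho")

    # Fuse by stacking the three domains as channels for a downstream backbone.
    return np.stack([spatial, noise, freq], axis=0)

features = extract_multidomain_features(np.random.rand(224, 224))
print(features.shape)  # (3, 224, 224)
```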


Subject(s)
Video Recording , Humans , Face , Deep Learning , Algorithms , Image Processing, Computer-Assisted/methods , Computer Security
3.
Sci Rep ; 14(1): 23433, 2024 10 08.
Article in English | MEDLINE | ID: mdl-39379443

ABSTRACT

The expansion of smart contracts on the Ethereum blockchain has created a diverse ecosystem of decentralized applications. This growth, however, poses challenges in classifying and securing these contracts. Existing research often separately addresses either classification or vulnerability detection, without a comprehensive analysis of how contract types are related to security risks. Our study addresses this gap by developing a taxonomy of smart contracts and examining the potential vulnerabilities associated with each category. We use the Latent Dirichlet Allocation (LDA) model to analyze a dataset of over 100,040 Ethereum smart contracts, which is notably larger than those used in previous studies. Our analysis categorizes these contracts into eleven groups, with five primary categories: Notary, Token, Game, Financial, and Blockchain interaction. This categorization sheds light on the various functions and applications of smart contracts in today's blockchain environment. In response to the growing need for better security in smart contract development, we also investigate the link between these categories and common vulnerabilities. Our results identify specific vulnerabilities associated with different contract types, providing valuable insights for developers and auditors. This relationship between contract categories and vulnerabilities is a new contribution to the field, as it has not been thoroughly explored in previous research. Our findings offer a detailed taxonomy of smart contracts and practical recommendations for enhancing security. By understanding how contract categories correlate with vulnerabilities, developers can implement more effective security measures, and auditors can better prioritize their reviews. This study advances both academic knowledge of smart contracts and practical strategies for securing decentralized applications on the Ethereum platform.
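A minimal sketch of the LDA step with scikit-learn; the toy contract corpus and the topic count are assumptions for illustration, not the study's 100,040-contract dataset.

```python
# Minimal sketch: LDA-based grouping of smart contract sources with scikit-learn.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

contracts = [
    "function transfer(address to, uint256 amount) public returns (bool)",
    "function mint(address player, uint256 itemId) external onlyOwner",
    "function notarize(bytes32 documentHash) public",
    "function deposit() public payable",
]

vectorizer = CountVectorizer(token_pattern=r"[A-Za-z_]\w+")
doc_term = vectorizer.fit_transform(contracts)

# Five topics mirror the study's five primary categories (toy corpus here).
lda = LatentDirichletAllocation(n_components=5, random_state=0)
doc_topics = lda.fit_transform(doc_term)  # per-contract topic distribution

# Assign each contract to its dominant topic (category).
print(doc_topics.argmax(axis=1))
```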


Subject(s)
Blockchain , Computer Security , Contracts , Humans
4.
Health Res Policy Syst ; 22(1): 145, 2024 Oct 15.
Article in English | MEDLINE | ID: mdl-39407232

ABSTRACT

BACKGROUND: The increasing availability of large volumes of personal data from diverse sources such as electronic health records, research programmes, commercial genetic testing, national health surveys and wearable devices presents significant opportunities for advancing public health, disease surveillance, personalized medicine and scientific research and innovation. However, this potential is hampered by a lack of clarity about the processing and sharing of personal health data, particularly across varying national regulatory frameworks. This often leaves research stakeholders uncertain about how to navigate secondary data use, repurposing of data for different research objectives and cross-border data sharing. METHOD: We analysed 37 data protection laws across Africa to identify key principles and requirements for the processing and sharing of personal health and genetic data in scientific research. On the basis of this analysis, we propose strategies that data science research initiatives in Africa can implement to ensure compliance with data protection laws while effectively reusing and sharing personal data for health research and scientific innovation. RESULTS: In many African countries, health and genetic data are categorized as sensitive and subject to stricter protection. Key principles guiding the processing of personal data include confidentiality, non-discrimination, transparency, storage limitation, legitimacy, purpose specification, integrity, fairness, non-excessiveness, accountability and data minimality. The rights of data subjects include the right to be informed, the right of access, the right to rectification, the right to erasure/deletion of data, the right to restrict processing, the right to data portability and the right to seek compensation. Consent and adequacy assessments were the most common legal grounds for cross-border data transfers. However, legal requirements for data transfer vary considerably across countries, potentially creating barriers to collaborative health research across Africa. CONCLUSIONS: We propose several strategies that data science research initiatives can adopt to align with data protection laws. These include developing a standardized module for safe data flows, using trusted data environments to minimize cross-border transfers, implementing dynamic consent mechanisms to comply with consent specificity and data subject rights, and establishing codes of conduct to govern the secondary use of personal data for health research and innovation.


Subject(s)
Big Data , Computer Security , Confidentiality , Information Dissemination , Humans , Africa , Confidentiality/legislation & jurisprudence , Computer Security/legislation & jurisprudence , Information Dissemination/legislation & jurisprudence , Biomedical Research/legislation & jurisprudence , Electronic Health Records , Data Science
5.
PLoS One ; 19(10): e0311215, 2024.
Article in English | MEDLINE | ID: mdl-39361603

ABSTRACT

This article explores finite-time dissipative control for a class of nonlinear distributed parameter cyber-physical systems (DP-CPS). By using a Takagi-Sugeno (T-S) fuzzy model to represent the system's nonlinearities, the studied system is formulated as a class of fuzzy parabolic partial differential equations (PDEs). To save network resources, both the system state and the input signal are quantized with dynamic quantizers. A dynamic state control strategy is then proposed that accounts for potential DoS attacks. The finite-time boundedness of the fuzzy parabolic PDE under quantization is analyzed by constructing an appropriate Lyapunov functional. The article then presents conditions for finite-time dissipative control design, along with the adjustment parameters of the dynamic quantizers in the fuzzy closed-loop system. Interlinked nonlinear terms in the control design conditions are decoupled by introducing an arbitrary matrix. Finally, an example is provided, and the simulation results indicate the effectiveness of the proposed dissipative control method.
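For orientation, a generic form of the system class and the dissipativity criterion, assuming the standard (Q,S,R)-gamma-dissipativity definition; this is a sketch in standard notation, not the paper's exact formulation.

```latex
% T-S fuzzy parabolic PDE with quantized input (generic sketch):
\[
\frac{\partial z(x,t)}{\partial t}
  = \Theta\,\frac{\partial^{2} z(x,t)}{\partial x^{2}}
  + \sum_{i=1}^{r} h_i\bigl(\theta(x,t)\bigr)
    \bigl[ A_i z(x,t) + B_i\, q\bigl(u(x,t)\bigr) + D_i w(x,t) \bigr],
\]
% where q(.) is the dynamic quantizer and w the disturbance. The closed loop is
% finite-time bounded and (Q,S,R)-gamma-dissipative on [0,T] if, under zero
% initial conditions,
\[
\int_{0}^{T} \bigl( y^{\top} Q\, y + 2\, y^{\top} S\, w + w^{\top} R\, w \bigr)\,dt
  \;\ge\; \gamma \int_{0}^{T} w^{\top} w \,dt .
\]
```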


Asunto(s)
Lógica Difusa , Algoritmos , Modelos Teóricos , Dinámicas no Lineales , Simulación por Computador , Seguridad Computacional
6.
PLoS One ; 19(10): e0309682, 2024.
Article in English | MEDLINE | ID: mdl-39418269

ABSTRACT

The Internet of Things (IoT) enables a variety of heterogeneous devices to gather and exchange real-time information over various network architectures. On the other hand, the rise of IoT brings security threats such as Distributed Denial of Service (DDoS) attacks. The recent Software Defined-Internet of Things (SDIoT) architecture can provide better security solutions than conventional networking approaches, but limited computing resources and heterogeneous network protocols remain major challenges in the SDIoT ecosystem. Given these circumstances, it is essential to design a low-cost DDoS attack classifier. This study employs an improved feature selection (FS) technique that determines the most relevant features, improving the detection rate and reducing training time. First, Edited Nearest Neighbor-based Synthetic Minority Oversampling (SMOTE-ENN) is applied to overcome the data imbalance problem. The study proposes SFMI, an FS method that combines Sequential Feature Selection (SFE) and Mutual Information (MI): the top k features common to those nominated by SFE and MI are retained. Principal component analysis (PCA) is then employed to address multicollinearity in the dataset. Comprehensive experiments were conducted on two benchmark datasets, KDDCup99 and CIC IoT-2023. For classification, Decision Tree, K-Nearest Neighbor, Gaussian Naive Bayes, Random Forest (RF), and Multilayer Perceptron classifiers were employed. The experimental results demonstrate that the proposed SMOTE-ENN+SFMI+PCA pipeline with an RF classifier achieves 99.97% accuracy and 99.39% precision using only 10 features.
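A condensed sketch of the pipeline with scikit-learn and imbalanced-learn; the synthetic dataset stands in for KDDCup99/CIC IoT-2023, and taking the intersection of the SFE- and MI-nominated features is my reading of SFMI.

```python
# Sketch: SMOTE-ENN balancing, SFMI-style feature subset (intersection of
# sequential selection and mutual-information ranking), PCA, then RF.
import numpy as np
from imblearn.combine import SMOTEENN
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SequentialFeatureSelector, mutual_info_classif
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=30, weights=[0.9, 0.1],
                           random_state=0)  # stand-in for the IDS datasets
X_bal, y_bal = SMOTEENN(random_state=0).fit_resample(X, y)

k = 10
# A decision tree keeps the sequential search cheap in this sketch.
sfs = SequentialFeatureSelector(DecisionTreeClassifier(random_state=0),
                                n_features_to_select=k).fit(X_bal, y_bal)
mi_top = set(np.argsort(mutual_info_classif(X_bal, y_bal))[-k:])
common = sorted(set(np.flatnonzero(sfs.get_support())) & mi_top)
common = common or sorted(mi_top)  # guard against an empty intersection

X_sel = PCA(n_components=min(len(common), 5)).fit_transform(X_bal[:, common])
X_tr, X_te, y_tr, y_te = train_test_split(X_sel, y_bal, random_state=0)
rf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
print(accuracy_score(y_te, rf.predict(X_te)))
```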


Asunto(s)
Algoritmos , Internet de las Cosas , Seguridad Computacional , Programas Informáticos , Análisis de Componente Principal
7.
BMC Med Inform Decis Mak ; 24(1): 303, 2024 Oct 15.
Article in English | MEDLINE | ID: mdl-39407229

ABSTRACT

BACKGROUND: As digital healthcare services handle increasingly sensitive health data, robust access control methods are required. Especially in emergencies, where the patient's health is in peril, the healthcare providers involved in critical cases may need to be granted access to the patient's Electronic Health Records (EHRs). The research objective of this work is to develop a proactive access control method that can grant emergency clinicians access to sensitive health data, guaranteeing the integrity and security of the data and generating trust without the need for a trusted third party. METHODS: A contextual and blockchain-based mechanism is proposed that allows access to sensitive EHRs by applying prognostic procedures in which contextual information is used to identify critical situations and grant access to medical data. Specifically, to enable proactivity, Long Short-Term Memory (LSTM) Neural Networks (NNs) are applied that use the patient's recent health history to forecast health metric values for the next two hours. Fuzzy logic is used to evaluate the severity of the patient's health state. These techniques are incorporated into a private, permissioned Hyperledger Fabric blockchain network capable of securing patients' sensitive information. RESULTS: The developed access control method provides emergency clinicians with secure access to sensitive information while safeguarding the patient's well-being. Integrating the predictive mechanism within the blockchain network proved to be a robust way to enhance the performance of the access control mechanism. Furthermore, the blockchain network records who accessed a specific patient's sensitive EHRs and when, guaranteeing the integrity and security of the data; the latency of the mechanism was also recorded across three different access control cases. This access control mechanism is to be enforced in a real-life hospital scenario. CONCLUSIONS: The proposed mechanism proactively informs the emergency team of professional clinicians about patients' critical situations by combining fuzzy and predictive machine learning techniques incorporated in the private, permissioned blockchain network, and it exploits the distributed data of the blockchain architecture, guaranteeing the integrity and security of the data and thus enhancing users' trust in the access control mechanism.
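A sketch of the proactive component only, assuming a 12-reading window and toy thresholds: an LSTM forecasts upcoming vitals and a simple fuzzy-style membership grades severity; the Hyperledger Fabric layer is omitted.

```python
# Sketch: LSTM forecast of upcoming vital signs plus a toy fuzzy severity grade.
# Window size, architecture, scaling, and thresholds are assumptions.
import numpy as np
import tensorflow as tf

WINDOW, N_METRICS = 12, 3  # e.g. 12 past readings of heart rate, SpO2, systolic BP

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(WINDOW, N_METRICS)),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(N_METRICS),  # predicted next-interval metrics
])
model.compile(optimizer="adam", loss="mse")

# Synthetic training data standing in for patients' recent health history.
X = np.random.rand(256, WINDOW, N_METRICS).astype("float32")
y = X[:, -1, :] + 0.01 * np.random.randn(256, N_METRICS).astype("float32")
model.fit(X, y, epochs=2, verbose=0)

def severity(pred):
    """Toy fuzzy-style grade: membership of predicted heart rate in 'high'."""
    hr = pred[0] * 100 + 60                     # de-normalize (illustrative)
    return float(np.clip((hr - 100) / 40, 0.0, 1.0))  # ramp 100 -> 140 bpm

pred = model.predict(X[:1], verbose=0)[0]
print(severity(pred))  # > 0.5 could trigger emergency access in the chaincode
```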


Asunto(s)
Cadena de Bloques , Seguridad Computacional , Registros Electrónicos de Salud , Humanos , Seguridad Computacional/normas , Redes Neurales de la Computación , Confidencialidad/normas , Lógica Difusa
8.
PLoS One ; 19(9): e0307619, 2024.
Article in English | MEDLINE | ID: mdl-39264977

ABSTRACT

Medical image security is paramount in the digital era but remains a significant challenge. This paper introduces a zero-watermarking methodology tailored for medical imaging that ensures robust protection without compromising image quality. We use Speeded-Up Robust Features (SURF) for high-precision feature extraction and singular value decomposition (SVD) to embed watermarks into the frequency domain, preserving the original image's integrity. Our methodology encodes watermarks non-intrusively, leveraging the robustness of the extracted features and the resilience of the SVD approach. The embedded watermark is imperceptible, maintaining the diagnostic value of medical images. Extensive experiments under various attacks, including Gaussian noise, JPEG compression, and geometric distortions, demonstrate the methodology's superior performance. The results reveal exceptional robustness, with high Normalized Correlation (NC) and Peak Signal-to-Noise Ratio (PSNR) values, outperforming existing techniques. Specifically, under Gaussian noise and rotation attacks, the watermark retrieved from the encrypted domain maintained an NC value close to 1.00, signifying near-perfect resilience. Even under severe attacks such as 30% cropping, the methodology exhibited a significantly higher NC than current state-of-the-art methods.
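A sketch of the zero-watermarking core, assuming plain 8x8 blocks: a binary pattern derived from block-wise leading singular values is XORed with the watermark to form an ownership share. The SURF feature-point step is omitted here for brevity.

```python
# Sketch of SVD zero-watermarking: nothing is embedded in the image; instead a
# stable binary pattern is combined with the watermark into an ownership share.
import numpy as np

def block_svd_pattern(img: np.ndarray, block: int = 8) -> np.ndarray:
    h, w = (d - d % block for d in img.shape)
    sv = []
    for i in range(0, h, block):
        for j in range(0, w, block):
            s = np.linalg.svd(img[i:i+block, j:j+block], compute_uv=False)
            sv.append(s[0])  # leading singular value is robust to mild attacks
    sv = np.array(sv)
    return (sv > np.median(sv)).astype(np.uint8)  # binary feature pattern

def register(img, watermark_bits):
    pattern = block_svd_pattern(img)[: watermark_bits.size]
    return pattern ^ watermark_bits   # ownership share, kept by a trusted party

def verify(attacked_img, share):
    pattern = block_svd_pattern(attacked_img)[: share.size]
    return pattern ^ share            # recovered watermark bits

img = np.random.rand(64, 64)
wm = np.random.randint(0, 2, 64, dtype=np.uint8)
share = register(img, wm)
recovered = verify(img + 0.01 * np.random.randn(64, 64), share)  # noisy copy
print("NC-like agreement:", (recovered == wm).mean())
```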


Asunto(s)
Algoritmos , Seguridad Computacional , Humanos , Diagnóstico por Imagen/métodos , Relación Señal-Ruido , Procesamiento de Imagen Asistido por Computador/métodos , Compresión de Datos/métodos
9.
PLoS One ; 19(9): e0309809, 2024.
Article in English | MEDLINE | ID: mdl-39255289

ABSTRACT

Computer security is receiving ever more attention, and vulnerabilities urgently need more sensitive handling. Because the data in most vulnerability libraries are incomplete, it is difficult to obtain the pre- and post-permissions of vulnerabilities and to construct vulnerability exploitation chains, so vulnerabilities cannot be responded to in time. Therefore, a vulnerability extraction and prediction method based on an improved information gain algorithm is proposed. Considering accuracy and response speed, a deep neural network is adopted as the basic framework, and the Dropout method effectively reduces overfitting on incomplete data, improving the ability to extract and predict vulnerabilities. Experiments confirmed that the improved method reaches an excellent F1 score of 0.972 and recall of 0.968, with better convergence than the function-fingerprint vulnerability detection method and the K-nearest neighbor algorithm, and an excellent response time of 0.12 seconds. To ensure reliability and validity in the face of missing data, a mask test was performed: the false negative rate was 0.3% and the false positive rate 0.6%. The method predicts pre-existing permissions with 97.9% accuracy and adapts to changes in permissions, allowing companies to detect and discover vulnerabilities earlier; in security repair, it effectively improves repair speed and reduces response time. The prediction accuracy for post-existing permissions reaches 96.8%, indicating that the method can significantly improve the speed and efficiency of vulnerability response and strengthen the understanding and construction of vulnerability exploitation chains. Predicting post-permissions reduces the attack surface of a vulnerability, lowering the risk of a breach, speeding up vulnerability detection, and ensuring the timely implementation of security measures. The model can be applied to public network security and application security scenarios, to personal computer and enterprise cloud server security, and to analyzing attack paths and security gaps after security incidents. However, the prediction of post-permissions is susceptible to dynamic environments and relies heavily on updated security policy rules. Overall, the method improves the accuracy of vulnerability extraction and prediction, quickly identifies and responds to security vulnerabilities, shortens the vulnerability exploitation window, effectively reduces security risks, and improves overall network security defense capability, reducing the frequency of security vulnerability incidents and shortening vulnerability repair times.
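A sketch of the two stages as described, with mutual information standing in for the information gain ranking; layer sizes and the dropout rate are assumptions.

```python
# Sketch: information-gain-style feature ranking (mutual information with the
# label) followed by a small dropout-regularized DNN.
import numpy as np
import tensorflow as tf
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

X, y = make_classification(n_samples=1000, n_features=40, random_state=0)

# Stage 1: keep the 15 features with the highest information gain.
gain = mutual_info_classif(X, y, random_state=0)
top = np.argsort(gain)[-15:]
X_sel = X[:, top].astype("float32")

# Stage 2: DNN with Dropout to curb overfitting on incomplete data.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(15,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # vulnerability present?
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.Recall()])
model.fit(X_sel, y, epochs=3, validation_split=0.2, verbose=0)
```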


Asunto(s)
Algoritmos , Seguridad Computacional , Redes Neurales de la Computación , Reproducibilidad de los Resultados , Humanos
10.
PLoS One ; 19(9): e0308469, 2024.
Article in English | MEDLINE | ID: mdl-39259729

ABSTRACT

In an era marked by pervasive digital connectivity, cybersecurity concerns have escalated. The rapid evolution of technology has led to a spectrum of cyber threats, including sophisticated zero-day attacks. This research addresses the challenge of existing intrusion detection systems in identifying zero-day attacks using the CIC-MalMem-2022 dataset and autoencoders for anomaly detection. The trained autoencoder is integrated with XGBoost and Random Forest, resulting in the models XGBoost-AE and Random Forest-AE. The study demonstrates that incorporating an anomaly detector into traditional models significantly enhances performance. The Random Forest-AE model achieved 100% accuracy, precision, recall, F1 score, and Matthews Correlation Coefficient (MCC), outperforming the methods proposed by Balasubramanian et al., Khan, Mezina et al., Smith et al., and Dener et al. When tested on unseen data, the Random Forest-AE model achieved an accuracy of 99.9892%, precision of 100%, recall of 99.9803%, F1 score of 99.9901%, and MCC of 99.8313%. This research highlights the effectiveness of the proposed model in maintaining high accuracy even with previously unseen data.
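A sketch of the autoencoder-plus-forest combination; appending the per-sample reconstruction error as an extra feature is one plausible fusion, not necessarily the paper's exact integration, and the data are synthetic stand-ins for CIC-MalMem-2022.

```python
# Sketch: train an autoencoder on benign traffic, then feed its reconstruction
# error to a Random Forest as an extra anomaly feature ("Random Forest-AE").
import numpy as np
import tensorflow as tf
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X = X.astype("float32")
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

ae = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(20,)),
    tf.keras.layers.Dense(8, activation="relu"),   # bottleneck
    tf.keras.layers.Dense(20),
])
ae.compile(optimizer="adam", loss="mse")
ae.fit(X_tr[y_tr == 0], X_tr[y_tr == 0], epochs=5, verbose=0)  # benign-only fit

def with_anomaly_score(X_part):
    # High reconstruction error flags samples unlike the benign training data.
    err = np.mean((X_part - ae.predict(X_part, verbose=0)) ** 2, axis=1)
    return np.column_stack([X_part, err])

rf = RandomForestClassifier(random_state=0).fit(with_anomaly_score(X_tr), y_tr)
print(rf.score(with_anomaly_score(X_te), y_te))
```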


Asunto(s)
Seguridad Computacional , Aprendizaje Automático , Humanos , Algoritmos , Modelos Teóricos
12.
PLoS One ; 19(9): e0310094, 2024.
Article in English | MEDLINE | ID: mdl-39264886

ABSTRACT

In the development of web applications, the rapid advancement of Internet technologies has brought unprecedented opportunities and increased the demand for user authentication schemes. Before the emergence of blockchain technology, establishing trust between two unfamiliar entities relied on a trusted third party for identity verification; the failure or malicious behavior of such a third party could undermine these schemes (e.g., single points of failure, credential leaks). A secure authorization system is another requirement, as users must sometimes authorize other entities to act on their behalf. If the transfer of authentication permissions is not adequately restricted, security risks such as unauthorized onward transfer of permissions can occur. Blockchain-based decentralized user authentication solutions have been proposed to address these risks and enhance availability and auditability. However, most schemes that allow users to transfer authentication permissions to other entities require significant gas consumption when deployed and triggered in smart contracts. To address this issue, we propose an authentication scheme with transferability based solely on hash functions. By combining one-time passwords with Hashcash, the scheme can limit the number of times permissions are transferred while ensuring security. Because it relies solely on hash functions, our scheme has a clear advantage in computational complexity and smart contract gas consumption. We deployed smart contracts on the Goerli test network and demonstrated the practicality and efficiency of the scheme.
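A sketch of how a hash chain alone can cap the number of transfers, which is my reading of the general construction rather than the paper's exact protocol: the contract stores the chain head, each transfer reveals the next preimage, and verification costs a single hash.

```python
# Sketch: bounding permission transfers with a hash chain (hash functions only,
# matching the paper's low-gas argument).
import hashlib, os

def H(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

# Build a length-n hash chain; the head H^n(seed) is what the contract stores.
n = 3                                  # permission transferable at most 3 times
chain = [os.urandom(32)]
for _ in range(n):
    chain.append(H(chain[-1]))
head = chain[-1]

# The i-th transfer reveals chain[n - i]; the verifier needs a single hash to
# check it against the stored head, then replaces the head with the reveal.
for i in range(1, n + 1):
    reveal = chain[n - i]
    assert H(reveal) == head, "invalid one-time credential"
    head = reveal                      # contract state update
print(f"{n} transfers verified; credential chain exhausted")
```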


Asunto(s)
Cadena de Bloques , Seguridad Computacional , Internet , Algoritmos , Humanos , Confidencialidad
13.
PLoS One ; 19(9): e0308206, 2024.
Article in English | MEDLINE | ID: mdl-39264944

ABSTRACT

In response to the rapidly evolving threat landscape in network security, this paper proposes an Evolutionary Machine Learning Algorithm for robust intrusion detection, specifically addressing adaptability to new threats and scalability across diverse network environments. Our GA-based hybrid DT-SVM approach is validated on two distinct datasets: BoT-IoT, reflecting a range of IoT-specific attacks, and UNSW-NB15, offering a broader context of network intrusion scenarios. This selection enables a comprehensive evaluation of the algorithm's effectiveness across varying attack vectors. Performance metrics including accuracy, recall, and false positive rate demonstrate the algorithm's capability to accurately identify and adapt to both known and novel threats, substantiating its potential as a scalable and adaptable security solution. The study aims to advance intrusion detection systems that are not only reactive but preemptively adaptive to emerging cyber threats. During the feature selection step, a GA applies evolutionary principles to discover and preserve the most relevant characteristics of the dataset. This genetic optimization of the feature subset lets the subsequent classification model focus on the most relevant components of the network data. DT-SVM classification and GA-driven feature selection are integrated to strike a balance between efficiency and accuracy. The system is designed to handle data streams in real time, ensuring that intrusions are detected promptly and precisely. The empirical results corroborate that the IDS outperforms traditional methodologies.
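A compact sketch of GA-driven feature selection, using a decision tree alone as the fitness classifier (the paper's hybrid DT-SVM is simplified here); population size, generations, and mutation rate are illustrative.

```python
# Sketch: GA feature selection. Individuals are feature bitmasks; fitness is
# cross-validated accuracy of a lightweight classifier on the selected columns.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=600, n_features=25, n_informative=6,
                           random_state=0)

def fitness(mask):
    if not mask.any():
        return 0.0
    clf = DecisionTreeClassifier(random_state=0)
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()

pop = rng.integers(0, 2, size=(20, X.shape[1])).astype(bool)
for gen in range(15):
    scores = np.array([fitness(m) for m in pop])
    parents = pop[np.argsort(scores)[-10:]]            # selection: top half
    children = []
    for _ in range(10):
        a, b = parents[rng.integers(10, size=2)]
        cut = rng.integers(1, X.shape[1])
        child = np.concatenate([a[:cut], b[cut:]])     # one-point crossover
        child ^= rng.random(X.shape[1]) < 0.02         # bit-flip mutation
        children.append(child)
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(m) for m in pop])]
print("selected features:", np.flatnonzero(best))
```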


Asunto(s)
Algoritmos , Seguridad Computacional , Aprendizaje Automático , Humanos
14.
PLoS One ; 19(9): e0309919, 2024.
Article in English | MEDLINE | ID: mdl-39240999

ABSTRACT

In location-based services (LBS), private information retrieval (PIR) is an efficient strategy for preserving personal privacy. However, traditional schemes built on information indexing are often criticized for their processing time and are ineffective at preserving the attribute privacy of the user. To address these two weaknesses, this paper proposes a PIR scheme based on ciphertext-policy attribute-based encryption (CP-ABE) for preserving personal privacy in LBS (location privacy preservation scheme with CP-ABE-based PIR, LPPCAP for short). In this scheme, the query and the feedback are encrypted using secure two-party computation between the user and the LBS server, so that no personal privacy is violated and the time for encrypting the retrieved information is reduced. The scheme also preserves the attribute privacy of users, such as their query frequency and movement patterns. Finally, we analyze the availability and privacy of the proposed scheme and present several groups of comparison experiments, verifying its effectiveness and usability both theoretically and practically while preserving quality of service.
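The paper's construction rests on CP-ABE and secure two-party computation, which need a dedicated cryptographic library; as a stand-in, here is the PIR primitive itself in its classic two-server XOR form, plainly a simpler instantiation than LPPCAP.

```python
# Classic two-server XOR PIR: neither server learns which record (here, which
# point-of-interest entry) the client retrieved. Illustrative stand-in only.
import os, secrets

db = [os.urandom(16) for _ in range(8)]   # toy LBS records (e.g. POI blocks)

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def server_answer(query_set):
    acc = bytes(16)
    for idx in query_set:
        acc = xor_bytes(acc, db[idx])
    return acc

target = 5
S = {i for i in range(len(db)) if secrets.randbits(1)}   # random subset
S1, S2 = S, S ^ {target}           # the two queries differ only in the target

record = xor_bytes(server_answer(S1), server_answer(S2))
assert record == db[target]
print("retrieved record", target, "without revealing it to either server")
```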


Asunto(s)
Seguridad Computacional , Privacidad , Humanos , Almacenamiento y Recuperación de la Información/métodos , Algoritmos , Confidencialidad
15.
Stud Health Technol Inform ; 317: 11-19, 2024 Aug 30.
Article in English | MEDLINE | ID: mdl-39234702

ABSTRACT

BACKGROUND: In the context of the telematics infrastructure, new data usage regulations, and the growing potential of artificial intelligence, cloud computing plays a key role in driving the digitalization of the German hospital sector. METHODS: Against this background, the study aims to develop and validate a scale for assessing the cloud readiness of German hospitals. It uses the TPOM (Technology, People, Organization, Macro-Environment) framework to create a scoring system. A survey involving 110 Chief Information Officers (CIOs) from German hospitals was conducted, followed by an exploratory factor analysis and reliability testing to refine the items, resulting in a final set of 30 items. RESULTS: The analysis confirmed the statistical robustness of the scale and identified key factors contributing to cloud readiness: IT security (technology dimension), collaborative research and acceptance of the need to make high-quality data available (people dimension), scalability of IT resources (organization dimension), and legal aspects (macro-environment dimension). The macro-environment dimension emerged as particularly stable, highlighting the critical role of regulatory compliance in the healthcare sector. CONCLUSION: The findings suggest a certain degree of cloud readiness among German hospitals, with potential for improvement in all four dimensions. Systemically, legal requirements and a challenging political environment are the top concerns of CIOs, impacting their cloud readiness.
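A sketch of the reliability-testing step, computing Cronbach's alpha for one readiness dimension from synthetic Likert responses; the item count and scoring are assumptions.

```python
# Sketch: Cronbach's alpha for one scale dimension (synthetic 5-point responses
# standing in for the CIO survey data).
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: respondents x items matrix of scale scores."""
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

rng = np.random.default_rng(0)
latent = rng.normal(size=(110, 1))                  # 110 CIO respondents
responses = np.clip(np.rint(3 + latent + rng.normal(scale=0.7, size=(110, 6))),
                    1, 5)                           # six 5-point Likert items
print(f"Cronbach's alpha: {cronbach_alpha(responses):.2f}")
```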


Asunto(s)
Nube Computacional , Alemania , Hospitales , Seguridad Computacional , Humanos , Encuestas y Cuestionarios
16.
Stud Health Technol Inform ; 317: 85-93, 2024 Aug 30.
Article in English | MEDLINE | ID: mdl-39234710

ABSTRACT

INTRODUCTION: With the establishment of the Data Sharing Framework (DSF) as a distributed business process engine in German research networks, it is becoming increasingly important to coordinate authentication, authorization, and role information between peer-to-peer network components. This information is provided in the form of an allowlist. This paper presents the concept and implementation of an Allowlist Management Application. STATE OF THE ART: In research networks using the DSF, allowlists were initially generated manually. CONCEPT: The Allowlist Management Application provides comprehensive tool support for the participating organizations and the administrators of the application. It automates the process of creating and distributing allowlists and reduces the errors associated with manual entries. In addition, security is improved through extensive validation of entries and an enforced review of requested changes (four-eyes principle). IMPLEMENTATION: Our implementation, built with established frontend and backend frameworks, serves as a preliminary step toward fully automating the onboarding and allowlist management processes. The application has been deployed in the Medical Informatics Initiative and the Network University Medicine with over 40 participating organizations. LESSONS LEARNED: We learned that there is a need for user guidance, for unstructured communication within a structured tool, for generalizability, and for checks to ensure that the tool's outputs have actually been applied.


Asunto(s)
Difusión de la Información , Alemania , Seguridad Computacional , Humanos
17.
Stud Health Technol Inform ; 317: 59-66, 2024 Aug 30.
Article in English | MEDLINE | ID: mdl-39234707

ABSTRACT

INTRODUCTION: Supporting research projects that require medical data from multiple sites is one of the goals of the German Medical Informatics Initiative (MII). The data integration centers (DIC) at university medical centers in Germany provide patient data via FHIR® in compliance with the MII core data set (CDS). Data protection requirements and other legal bases for processing favor decentralized processing of the relevant data in the DICs, with subsequent exchange of aggregated results for cross-site evaluation. METHODS: Requirements were obtained from clinical experts in the context of the MII use case INTERPOLAR. A software architecture was then developed, modeled using 3LGM2, and finally implemented and published in a GitHub repository. RESULTS: With the CDS tool chain, we have created software components for decentralized processing on the basis of the MII CDS. The tool chain requires access to a local FHIR endpoint and transfers the data to an SQL database. This is accessed by the DataProcessor component, which performs calculations with the help of rules (input repo) and writes the results back to the database. The tool chain also has a frontend module (REDCap), which displays the output data and calculated results and allows verification, evaluation, comments and other responses. This feedback is also persisted in the database and is available for further use, analysis or data sharing. DISCUSSION: Other solutions are conceivable. Our solution utilizes the advantages of an SQL database, enabling flexible and direct processing of the stored data with established analysis methods. Thanks to its modularization, the tool chain can be adapted for use in other projects. We are planning further developments to support pseudonymization and data sharing. Initial experience is being gathered; an evaluation is pending and planned.
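A sketch of the first tool-chain step under stated assumptions (endpoint URL, resource type, and columns are invented): pull FHIR resources from a local endpoint and flatten them into an SQL table for rule-based processing.

```python
# Sketch: fetch resources from a local FHIR endpoint and flatten them into SQL.
# The endpoint URL, resource type, and column choice are illustrative.
import sqlite3
import requests

FHIR_BASE = "http://localhost:8080/fhir"   # hypothetical local DIC endpoint

bundle = requests.get(f"{FHIR_BASE}/Observation", params={"_count": 100}).json()

con = sqlite3.connect("cds.db")
con.execute("""CREATE TABLE IF NOT EXISTS observation
               (id TEXT PRIMARY KEY, patient TEXT, code TEXT, value REAL)""")

for entry in bundle.get("entry", []):
    r = entry["resource"]
    con.execute(
        "INSERT OR REPLACE INTO observation VALUES (?, ?, ?, ?)",
        (r["id"],
         r.get("subject", {}).get("reference"),
         r.get("code", {}).get("coding", [{}])[0].get("code"),
         r.get("valueQuantity", {}).get("value")),
    )
con.commit()
# A DataProcessor component would now apply rules to this table and write
# results back for review in the frontend.
```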


Asunto(s)
Programas Informáticos , Alemania , Registros Electrónicos de Salud , Humanos , Informática Médica , Seguridad Computacional , Conjuntos de Datos como Asunto
18.
Stud Health Technol Inform ; 317: 171-179, 2024 Aug 30.
Article in English | MEDLINE | ID: mdl-39234720

ABSTRACT

INTRODUCTION: The German Medical Text Project (GeMTeX) is one of the largest infrastructure efforts targeting German-language clinical documents. Here we introduce the architecture of the GeMTeX de-identification pipeline. METHODS: The pipeline comprises the export of raw clinical documents from the local hospital information system, import into the annotation platform INCEpTION, fully automatic pre-tagging of protected health information (PHI) items by the Averbis Health Discovery pipeline, manual curation of these pre-annotated data and, finally, automatic replacement of PHI items with type-conformant substitutes. This design was implemented in a pilot study involving six annotators and two curators each at the Data Integration Centers of the University Hospitals Leipzig and Erlangen. RESULTS: As a proof of concept, the publicly available Graz Synthetic Text Clinical Corpus (GRASSCO) was enhanced with PHI annotations in an annotation campaign with reasonable inter-annotator agreement (Krippendorff's α ≈ 0.97). CONCLUSION: The curated 1.4K PHI annotations are released as open-source data, constituting the first publicly available German clinical text corpus with PHI metadata.
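A toy illustration of the final step, replacing annotated PHI spans with type-conformant substitutes; the spans, types, and substitutes are invented and stand in for the INCEpTION/Averbis outputs.

```python
# Toy illustration: replace annotated PHI spans with type-conformant substitutes.
# Spans and substitutes are invented; this is not the Averbis pipeline.
SUBSTITUTES = {
    "PATIENT": "Max Mustermann",
    "DATE": "01.01.1970",
    "LOCATION": "Musterstadt",
}

# PHI spans as (start, end, type), e.g. as exported from INCEpTION annotations.
text = "Herr Meier wurde am 03.05.2021 in Leipzig aufgenommen."
phi_spans = [(5, 10, "PATIENT"), (20, 30, "DATE"), (34, 41, "LOCATION")]

# Replace from right to left so earlier offsets stay valid.
for start, end, phi_type in sorted(phi_spans, reverse=True):
    text = text[:start] + SUBSTITUTES[phi_type] + text[end:]
print(text)  # Herr Max Mustermann wurde am 01.01.1970 in Musterstadt aufgenommen.
```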


Asunto(s)
Registros Electrónicos de Salud , Proyectos Piloto , Alemania , Procesamiento de Lenguaje Natural , Confidencialidad , Humanos , Seguridad Computacional
19.
Stud Health Technol Inform ; 317: 75-84, 2024 Aug 30.
Article in English | MEDLINE | ID: mdl-39234709

ABSTRACT

INTRODUCTION: Medical research studies which involve electronic data capture of sensitive data about human subjects need to manage medical and identifying participant data in a secure manner. To protect the identity of data subjects, an independent trusted third party should be responsible for pseudonymization and management of the identifying data. METHODS: We have developed a web-based integrated solution that combines REDCap as an electronic data capture system with the trusted third party software tools of the University Medicine Greifswald, which provides study personnel with a single user interface for both clinical data entry and management of identities, pseudonyms and informed consents. RESULTS: Integration of the two platforms enables a seamless workflow of registering new participants, entering identifying and consent information, and generating pseudonyms in the trusted third party system, with subsequent capturing of medical data in the electronic data capture system, while maintaining strict separation of medical and identifying data in the two independently managed systems. CONCLUSION: Our solution enables a time-efficient data entry workflow, provides a high level of data protection by minimizing visibility of identifying information and pseudonym lists, and avoids errors introduced by manual transfer of pseudonyms between separate systems.
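A minimal sketch of the separation principle, with an HMAC-based pseudonym generator as an illustrative stand-in for the Greifswald trusted third party tools: identifying data stays with the trusted third party, and only the pseudonym accompanies the medical data.

```python
# Sketch: the trusted third party (TTP) keeps identifying data and consent and
# hands only a pseudonym to the data capture system. HMAC-based generation is
# an illustrative stand-in, not the actual TTP implementation.
import hmac, hashlib

TTP_SECRET = b"kept-only-by-the-trusted-third-party"   # hypothetical key

identities = {}   # TTP-side store: pseudonym -> identifying data + consent

def register_participant(first, last, birth, consent_given: bool) -> str:
    ident = f"{first}|{last}|{birth}".encode()
    pseudonym = "PSN-" + hmac.new(TTP_SECRET, ident, hashlib.sha256).hexdigest()[:10]
    identities[pseudonym] = {"name": (first, last), "birth": birth,
                             "consent": consent_given}
    return pseudonym   # only this value accompanies the medical data

psn = register_participant("Erika", "Mustermann", "1980-02-29", True)
medical_record = {"pseudonym": psn, "hba1c": 6.1}   # EDC side: no identifying data
print(psn, medical_record)
```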


Asunto(s)
Investigación Biomédica , Seguridad Computacional , Confidencialidad , Programas Informáticos , Consentimiento Informado , Anónimos y Seudónimos , Humanos , Registros Electrónicos de Salud , Integración de Sistemas , Interfaz Usuario-Computador
20.
Stud Health Technol Inform ; 317: 270-279, 2024 Aug 30.
Article in English | MEDLINE | ID: mdl-39234731

ABSTRACT

INTRODUCTION: A modern approach to ensuring privacy when sharing datasets is synthetic data generation, which often claims to outperform classic anonymization techniques in the trade-off between data utility and privacy. It was recently demonstrated that various deep learning-based approaches can generate useful synthesized datasets, often based on domain-specific analyses. However, evaluating the privacy implications of releasing synthetic data remains challenging, especially when the goal is to conform with data protection guidelines. METHODS: The recent privacy risk quantification framework Anonymeter evaluates several possible vulnerabilities based on the privacy risks considered by the European Data Protection Board: singling out, linkability, and attribute inference. This framework was applied to a synthetic data generation study from the epidemiological domain, in which the synthesization replicates time and age trends previously found in data collected during the DONALD cohort study (1312 participants, 16 time points). The conducted privacy analyses are presented, with a focus on the vulnerability of outliers. RESULTS: The resulting privacy scores, which vary greatly between the different types of attacks, are discussed. CONCLUSION: Challenges encountered during implementation and during the interpretation of the results are highlighted, and it is concluded that privacy risk assessment for synthetic data remains an open problem.
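A hedged sketch of one Anonymeter check (singling out), following the evaluator API of the open-source anonymeter package as published; the file names and attack count are placeholders.

```python
# Sketch of a singling-out evaluation with the anonymeter package; the CSV
# files are placeholders for the original, synthetic, and held-out control
# splits of a cohort dataset.
import pandas as pd
from anonymeter.evaluators import SinglingOutEvaluator

ori = pd.read_csv("donald_original.csv")      # hypothetical file names
syn = pd.read_csv("donald_synthetic.csv")
control = pd.read_csv("donald_control.csv")   # held-out records for the baseline

evaluator = SinglingOutEvaluator(ori=ori, syn=syn, control=control,
                                 n_attacks=500)
evaluator.evaluate(mode="univariate")
risk = evaluator.risk()
print(f"singling-out risk: {risk.value:.3f} (95% CI {risk.ci})")
```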


Asunto(s)
Seguridad Computacional , Medición de Riesgo , Humanos , Estudios Longitudinales , Confidencialidad , Privacidad