ABSTRACT
Effective data management is crucial for scientific integrity and reproducibility, a cornerstone of scientific progress. Well-organized and well-documented data enable others to validate results and build on them. Data management encompasses activities including organization, documentation, storage, sharing, and preservation. Robust data management establishes credibility, fostering trust within the scientific community and benefiting researchers' careers. In experimental biomedicine, comprehensive data management is vital due to the typically intricate protocols, extensive metadata, and large datasets. Low-throughput experiments, in particular, require careful management to address variations and errors in protocols and raw data quality. Transparent and accountable research practices rely on accurate documentation of procedures, data collection, and analysis methods. Proper data management ensures long-term preservation and accessibility of valuable datasets. Well-managed data can be revisited, contributing to cumulative knowledge and potential new discoveries. Publicly funded research has an added responsibility for transparency, resource allocation, and avoiding redundancy. Meeting funding agency expectations increasingly requires rigorous methodologies, adherence to standards, comprehensive documentation, and widespread sharing of data, code, and other auxiliary resources. This review provides critical insights into raw and processed data, metadata, high-throughput versus low-throughput datasets, a common language for documentation, experimental and reporting guidelines, efficient data management systems, sharing practices, and relevant repositories. We systematically present available resources and optimal practices for wide use by experimental biomedical researchers.
Subjects
Biomedical Research, Data Management, Information Dissemination, Biomedical Research/standards, Biomedical Research/methods, Information Dissemination/methods, Humans, Animals, Data Management/methods
ABSTRACT
The lack of interoperable data standards among reference genome data-sharing platforms inhibits cross-platform analysis while increasing the risk of data provenance loss. Here, we describe the FAIR bioHeaders Reference genome (FHR), a metadata standard guided by the principles of Findability, Accessibility, Interoperability and Reuse (FAIR) in addition to the principles of Transparency, Responsibility, User focus, Sustainability and Technology. The objective of FHR is to provide an extensive set of data serialisation methods and minimum data field requirements while still maintaining extensibility, flexibility and expressivity in an increasingly decentralised genomic data ecosystem. The effort needed to implement FHR is low; FHR's design philosophy ensures easy implementation while retaining the benefits gained from recording both machine- and human-readable provenance.
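The core idea of a minimum-field metadata header tied to the genome file it describes can be sketched as follows. This is a hypothetical illustration only: the field names (`schema`, `organism`, `assembly`, `checksum`) and the schema URL are assumptions for demonstration, not the actual FHR specification.

```python
# Hypothetical sketch: checking a minimal FHR-style reference-genome header
# for required fields, and tying it to the exact payload via a checksum.
# Field names are illustrative assumptions, not the FHR spec itself.
import hashlib

REQUIRED_FIELDS = {"schema", "organism", "assembly", "checksum"}  # assumed minimum

def validate_header(header: dict) -> list:
    """Return a sorted list of missing required fields (empty means valid)."""
    return sorted(REQUIRED_FIELDS - header.keys())

def file_checksum(data: bytes) -> str:
    """SHA-256 digest recording provenance of the exact FASTA payload."""
    return hashlib.sha256(data).hexdigest()

header = {
    "schema": "https://example.org/fhr/v1",  # assumed schema URL
    "organism": "Homo sapiens",
    "assembly": "GRCh38-demo",
}
fasta_bytes = b">chr1\nACGT\n"
header["checksum"] = file_checksum(fasta_bytes)

missing = validate_header(header)  # empty list: header is complete
```

A consumer can recompute the checksum on download and compare it against the header, detecting silently altered or mislabeled genome files across platforms.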
Subjects
Software, Humans, Genome, Genomics, Information Dissemination
ABSTRACT
BACKGROUND: In biomedical research, the growing volume and diversity of data have escalated the demand for statistical analysis, which is indispensable for synthesizing, interpreting, and publishing data. Hence, the need for accessible analysis tools has increased drastically. StatiCAL emerges as a user-friendly solution, enabling researchers to conduct basic analyses without extensive programming expertise. RESULTS: StatiCAL includes diverse functionalities: data management, visualization of variables, and statistical analysis. Data management functionalities allow the user to freely add or remove variables, select sub-populations, and visualize selected data to better inform the analysis. With this tool, users can freely perform descriptive, graphical, univariate, and multivariate statistical analyses. None of this requires learning R coding, as the software is a graphical user interface in which every action can be performed by clicking a button. CONCLUSIONS: StatiCAL represents a valuable contribution to the field of biomedical research. By being open access and providing an intuitive interface with robust features, StatiCAL allows researchers to gain autonomy in conducting their projects.
Subjects
Biomedical Research, Software, User-Computer Interface, Computational Biology/methods, Data Management/methods, Data Interpretation, Statistical
ABSTRACT
BACKGROUND: The increasing volume and complexity of genomic data pose significant challenges for effective data management and reuse. Public genomic data often undergo similar preprocessing across projects, leading to redundant or inconsistent datasets and inefficient use of computing resources. This is especially pertinent for bioinformaticians engaged in multiple projects. Tools have been created to address the challenges of managing and accessing curated genomic datasets; however, the practical utility of such tools is greatest for users who work with specific types of data or are technically inclined toward a particular programming language. Currently, there is a gap in the availability of an R-specific solution for efficient data management and versatile data reuse. RESULTS: Here we present ReUseData, an R software tool that overcomes some of the limitations of existing solutions and provides a versatile and reproducible approach to effective data management within R. ReUseData facilitates the transformation of ad hoc scripts for data preprocessing into Common Workflow Language (CWL)-based data recipes, allowing for the reproducible generation of curated data files in their generic formats. The data recipes are standardized and self-contained, making them easily portable and reproducible across computing platforms. ReUseData also streamlines the reuse of curated data files and their integration into downstream analysis tools and workflows with different frameworks. CONCLUSIONS: ReUseData provides a reliable and reproducible approach for genomic data management within the R environment to enhance the accessibility and reusability of genomic data. The package is available at Bioconductor ( https://bioconductor.org/packages/ReUseData/ ) with additional information on the project website ( https://rcwl.org/dataRecipes/ ).
Subjects
Data Management, Genomics, Software, Programming Languages, Workflow
ABSTRACT
In modern reproducible, hypothesis-driven plant research, scientists are increasingly relying on research data management (RDM) services and infrastructures to streamline the processes of collecting, processing, sharing, and archiving research data. FAIR (i.e., findable, accessible, interoperable, and reusable) research data play a pivotal role in enabling the integration of interdisciplinary knowledge and facilitating the comparison and synthesis of a wide range of analytical findings. The PLANTdataHUB offers a solution that realizes RDM of scientific (meta)data as evolving collections of files in a directory - yielding FAIR digital objects called ARCs - with tools that enable scientists to plan, communicate, collaborate, publish, and reuse data on the same platform while gaining continuous quality control insights. The centralized platform is scalable from personal use to global communities and provides advanced federation capabilities for institutions that prefer to host their own satellite instances. This approach borrows many concepts from software development and adapts them to fit the challenges of modern plant science undergoing digital transformation. The PLANTdataHUB supports researchers in each stage of a scientific project with adaptable continuous quality control insights, from the early planning phase to data publication. The central live instance of PLANTdataHUB is accessible at https://git.nfdi4plants.org, and it will continue to evolve as a community-driven and dynamic resource that serves the needs of contemporary plant science.
Subjects
Databases as Topic, Information Dissemination, Plants
In recent years, China's advanced light sources have entered a period of rapid construction and development. As modern X-ray detectors and data acquisition technologies advance, these facilities are expected to generate massive volumes of data annually, presenting significant challenges in data management and utilization. These challenges encompass data storage, metadata handling, data transfer and user data access. In response, the Data Organization Management Access Software (DOMAS) has been designed as a framework to address these issues. DOMAS encapsulates four fundamental modules of data management software: metadata catalogue, metadata acquisition, data transfer and data service. For light source facilities, building a data management system requires only parameter configuration and minimal code development within DOMAS. This paper first discusses the development of advanced light sources in China and the associated demands and challenges in data management, prompting a reconsideration of data management software framework design. It then outlines the architecture of the framework, detailing its components and functions. Lastly, it highlights the application progress and effectiveness of DOMAS when deployed for the High Energy Photon Source (HEPS) and Beijing Synchrotron Radiation Facility (BSRF).
ABSTRACT
The scientific community has entered an era of big data. However, with big data comes big responsibilities, and best practices for how data are contributed to databases have not kept pace with the collection, aggregation, and analysis of big data. Here, we rigorously assess the quantity of data for specific leaf area (SLA) available within the largest and most frequently used global plant trait database, the TRY Plant Trait Database, exploring how much of the data were applicable (i.e., original, representative, logical, and comparable) and traceable (i.e., published, cited, and consistent). Over three-quarters of the SLA data in TRY either lacked applicability or traceability, leaving only 22.9% of the original data usable compared with the 64.9% typically deemed usable by standard data cleaning protocols. The remaining usable data differed markedly from the original for many species, which led to altered interpretation of ecological analyses. Though the data we consider here make up only 4.5% of SLA data within TRY, similar issues of applicability and traceability likely apply to SLA data for other species as well as other commonly measured, uploaded, and downloaded plant traits. We end with suggested steps forward for global ecological databases, including suggestions for both uploaders to and curators of databases with the hope that, through addressing the issues raised here, we can increase data quality and integrity within the ecological community.
Subjects
Plant Leaves, Plants, Big Data, Databases, Factual, Phenotype
ABSTRACT
Bioimage data are generated in diverse research fields throughout the life and biomedical sciences. Their potential for advancing scientific progress via modern, data-driven discovery approaches reaches beyond disciplinary borders. To fully exploit this potential, it is necessary to make bioimaging data, in general, multidimensional microscopy images and image series, FAIR, that is, findable, accessible, interoperable and reusable. These FAIR principles for research data management are now widely accepted in the scientific community and have been adopted by funding agencies, policymakers and publishers. To remain competitive and at the forefront of research, implementing the FAIR principles into daily routines is an essential but challenging task for researchers and research infrastructures. Imaging core facilities, well-established providers of access to imaging equipment and expertise, are in an excellent position to lead this transformation in bioimaging research data management. They are positioned at the intersection of research groups, IT infrastructure providers, the institution's administration, and microscope vendors. In the frame of German BioImaging - Society for Microscopy and Image Analysis (GerBI-GMB), cross-institutional working groups and third-party funded projects were initiated in recent years to advance the bioimaging community's capability and capacity for FAIR bioimage data management. Here, we provide an imaging-core-facility-centric perspective outlining the experience and current strategies in Germany to facilitate the practical adoption of the FAIR principles, closely aligned with the international bioimaging community. We highlight which tools and services are ready to be implemented and what the future directions for FAIR bioimage data have to offer.
Subjects
Microscopy, Biomedical Research/methods, Data Management/methods, Image Processing, Computer-Assisted/methods, Microscopy/methods
ABSTRACT
Modern bioimaging core facilities at research institutions are essential for managing and maintaining high-end instruments, providing training and support for researchers in experimental design, image acquisition and data analysis. An important task for these facilities is the professional management of complex multidimensional bioimaging data, which are often produced in large quantities and in many different file formats. This article details the process that led to successfully implementing the OME Remote Objects system (OMERO) for bioimage-specific research data management (RDM) at the Core Facility Cellular Imaging (CFCI) at the Technische Universität Dresden (TU Dresden). Ensuring compliance with the FAIR (findable, accessible, interoperable, reusable) principles, we outline here the challenges that we faced in adapting data handling and storage to a new RDM system. These challenges included the introduction of a standardised group-specific naming convention, metadata curation with tagging and Key-Value pairs, and integration of existing image processing workflows. By sharing our experiences, this article aims to provide insights and recommendations for both individual researchers and educational institutions intending to implement OMERO as a management system for bioimaging data. We showcase how tailored decisions and structured approaches lead to successful outcomes in RDM practices.
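A standardised naming convention of the kind described above pays off because structured metadata (e.g. OMERO Key-Value pairs) can then be parsed directly from file names. The sketch below illustrates the idea; the pattern (`group_project_sample_date`) is an invented example, not the actual CFCI convention.

```python
# Illustrative sketch: parse Key-Value metadata out of a file name that
# follows an assumed convention "group_project_sample_YYYYMMDD.ext".
# The pattern is invented for demonstration, not the CFCI convention.
import re

NAME_RE = re.compile(
    r"^(?P<group>[a-z]+)_(?P<project>\w+)_(?P<sample>\w+)_(?P<date>\d{8})$"
)

def parse_name(filename: str):
    """Return a Key-Value dict for a convention-compliant name, else None."""
    stem = filename.rsplit(".", 1)[0]  # drop the file extension
    m = NAME_RE.match(stem)
    return m.groupdict() if m else None

kv = parse_name("lab_axon42_s01_20240131.tif")   # parses into four keys
bad = parse_name("BadName.tif")                  # non-compliant -> None
```

Names that fail to parse can be flagged at upload time, so the convention is enforced before inconsistent files enter the RDM system.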
ABSTRACT
BACKGROUND: Research Electronic Data Capture (REDCap) is a web application for creating and managing online surveys and databases. Clinical data management is an essential process before performing any statistical analysis to ensure the quality and reliability of study information. Processing REDCap data in R can be complex and often benefits from automation. While several R packages are available for specific tasks, none offers a comprehensive approach to data management. RESULTS: REDCapDM is an R package for accessing and managing REDCap data. It imports data from REDCap into R using either an API connection or the files in R format exported directly from REDCap. It provides several functions for data processing and transformation, and it helps generate and manage queries to clarify or resolve discrepancies found in the data. CONCLUSION: The REDCapDM package is a valuable tool for data scientists and clinical data managers who use REDCap and R. It assists in tasks such as importing, processing, and quality-checking data from their research studies.
Subjects
Data Management, Software, Humans, Reproducibility of Results, Surveys and Questionnaires, Records
ABSTRACT
The Solar eruptioN Integral Field Spectrograph (SNIFS) is a solar-gazing spectrograph scheduled to fly in the summer of 2025 on a NASA sounding rocket. Its goal is to view the solar chromosphere and transition region at a high cadence (1 s) with high spatial (0.5″) and spectral (33 mÅ) resolution, viewing wavelengths around Lyman alpha (1216 Å), Si III (1206 Å), and O V (1218 Å) to observe spicules, nanoflares, and possibly a solar flare. This time cadence will provide yet-unobserved detail about fast-changing features of the Sun. The instrument comprises a Gregorian-style reflecting telescope combined with a spectrograph via a specialized mirrorlet array that focuses the light from each spatial location in the image so that it may be spectrally dispersed without overlap from neighboring locations. This paper discusses the driving science, detailed instrument and subsystem design, and preintegration testing of the SNIFS instrument.
ABSTRACT
HXI on ASO-S and STIX onboard Solar Orbiter are the first simultaneously operating solar hard X-ray imaging spectrometers. ASO-S's low Earth orbit and Solar Orbiter's periodic displacement from the Sun-Earth line enable multi-viewpoint solar hard X-ray spectroscopic imaging analysis for the first time. Here, we demonstrate the potential of this new capability by reporting the first results of 3D triangulation of hard X-ray sources in the SOL2023-12-31T21:55 X5 flare. HXI and STIX observed the flare near the east limb with an observer separation angle of 18°. We triangulated the brightest regions within each source, which enabled us to characterise the large-scale hard X-ray geometry of the flare. The footpoints were found to be in the chromosphere within uncertainty, as expected, while the thermal looptop source was centred at an altitude of 15.1 ± 1 Mm. Given the footpoint separation, this implies a more elongated magnetic-loop structure than predicted by a semi-circular model. These results show the strong diagnostic power of joint HXI and STIX observations for understanding the 3D geometry of solar flares. We conclude by discussing the next steps required to fully exploit their potential.
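The geometry behind such stereoscopic triangulation can be sketched simply: each observer sees the source along a line of sight, and the 3D position is estimated at the closest approach of the two rays. The sketch below uses toy observer positions (separated by roughly 18°) and an invented source location, not actual flare data.

```python
# Sketch of two-viewpoint triangulation: the estimated source position is the
# midpoint of the closest approach between the two lines of sight.
# Observer positions and the source are toy values, not HXI/STIX data.
def sub(u, v): return tuple(a - b for a, b in zip(u, v))
def add(u, v): return tuple(a + b for a, b in zip(u, v))
def dot(u, v): return sum(a * b for a, b in zip(u, v))
def scale(u, k): return tuple(a * k for a in u)

def triangulate(p1, d1, p2, d2):
    """Midpoint of closest approach between rays p1 + t*d1 and p2 + s*d2."""
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    w = sub(p1, p2)
    den = a * c - b * b                      # > 0 for non-parallel rays
    t = (b * dot(d2, w) - c * dot(d1, w)) / den
    s = (a * dot(d2, w) - b * dot(d1, w)) / den
    q1 = add(p1, scale(d1, t))               # closest point on ray 1
    q2 = add(p2, scale(d2, s))               # closest point on ray 2
    return scale(add(q1, q2), 0.5)

source = (0.0, 0.0, 1.0)                     # "true" source (toy value)
obs1, obs2 = (0.0, -10.0, 0.0), (3.1, -9.5, 0.0)  # ~18 deg apart as seen from source
est = triangulate(obs1, sub(source, obs1), obs2, sub(source, obs2))
```

In practice the observed rays carry pointing uncertainty and do not intersect exactly, which is why the closest-approach midpoint (with error propagation) is used rather than a direct intersection.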
ABSTRACT
BACKGROUND: The future European Health Research and Innovation Cloud (HRIC), as a fundamental part of the European Health Data Space (EHDS), will promote the secondary use of data and the capability to push the boundaries of health research within an ethical and legally compliant framework that reinforces the trust of patients and citizens. OBJECTIVE: This study aimed to analyse health data management mechanisms in Europe to determine their alignment with FAIR principles and data discovery, generating best practices for new data hubs joining the HRIC ecosystem. In this line, the compliance of health data hubs with FAIR principles and data discovery was assessed, and a set of best practices for health data hubs was derived. METHODS: A survey was conducted in January 2022, involving 99 representative health data hubs from multiple countries; 42 responses had been obtained by June 2022. Stratification methods were employed to cover different levels of granularity. The survey data were analysed to assess compliance with FAIR and data discovery principles. The study started with a general analysis of survey responses, followed by the creation of specific profiles based on three categories: organization type, function, and level of data aggregation. RESULTS: The study produced specific best practices for data hubs regarding the adoption of FAIR principles and data discoverability. It also provided an overview of the survey study and the specific profiles derived from the category analysis, considering different types of data hubs. CONCLUSIONS: The study concluded that a significant number of health data hubs in Europe did not fully comply with FAIR and data discovery principles. However, it identified specific best practices that can guide new data hubs in adhering to these principles. The study highlighted the importance of aligning health data management mechanisms with FAIR principles to enhance interoperability and reusability in the future HRIC.
Subjects
Cloud Computing, Humans, Europe, Surveys and Questionnaires, Data Management/methods, Electronic Health Records, Medical Informatics/methods
ABSTRACT
Transfusion medicine requires meticulous record keeping from the time a blood donation is made to the time a patient receives a transfusion. As such, blood collection establishments and processing laboratories generate large amounts of data. This data must be managed, analyzed, and visualized appropriately to ensure safety of the blood supply. Today, the use of information technology (IT) solutions for data management in transfusion medicine varies widely between institutions. In this report, blood center professionals describe how they currently use IT solutions to improve their blood processing methods, the challenges they face, and how they envision IT solutions improving transfusion medicine in the future.
Subjects
Transfusion Medicine, Humans, Goals, Blood Transfusion, Blood Banks, Blood Donors
ABSTRACT
BACKGROUND: In conflict settings, as is the case in Syria, it is crucial to enhance health information management to facilitate an effective and sustainable approach to strengthening health systems in such contexts. In this study, we aim to provide a baseline understanding of the present state of health information management in Northwest Syria (NWS) to better plan for strengthening the health information system of an area that is transitioning to an early-recovery stage. METHODS: A combination of questionnaires and subsequent interviews was used for data collection. Purposive sampling was used to select twenty-one respondents directly involved in managing and directing different domains of health information in the NWS who worked with local NGOs, INGOs, UN agencies, or as part of the Health Working Group. A scoring system for each public health domain was constructed based on the number and quality of the available datasets for these domains, as established by Checci and others. RESULTS & CONCLUSIONS: Reliable and aggregate health information in the NWS is limited, despite some improvements made over the past decade. The conflict restricted and challenged efforts to establish a concentrated and harmonized HIS in the NWS, which led to a lack of leadership, poor coordination, and duplication of key activities. Although the UN established EWARN and HeRAMS as common data collection systems in the NWS, they are directed toward advocacy and managed by external experts, with little participation or access from local stakeholders. RECOMMENDATIONS: There is a need for participatory approaches and the empowerment of local actors and local NGOs, cooperation between local and international stakeholders to increase access to data, and a central domain for planning, organizing, and harmonizing the process.
To enhance the humanitarian health response in Syria and other crisis areas, it is imperative to invest in data collection and utilisation, mHealth and eHealth technologies, capacity building, and robust technical and autonomous leadership.
Subjects
Health Information Management, Syria, Humans, Surveys and Questionnaires, Armed Conflicts
ABSTRACT
OBJECTIVE: To explore how sports injury epidemiological outcomes (i.e., prevalence, average prevalence, incidence, burden, and time to first injury) vary depending on the response rates to a weekly online self-reported questionnaire for athletes. METHODS: Weekly information on athletics injuries and exposure from 391 athletics (track and field) athletes was prospectively collected over 39 weeks (control group of the PREVATHLE randomized controlled trial) using an online self-reported questionnaire. The data were used to calculate sports injury epidemiological outcomes (i.e., prevalence, average prevalence, incidence, burden, and time to first injury) for sub-groups with different minimum individual athletes' response rates (i.e., from at least 100%, at least 97%, at least 95%, to at least 0% response rate). We then calculated the relative variation between each sub-group and the sub-group with a 100% response rate as a reference. A substantial variation was considered when the relative variation was greater than one SD or 95% CI of the respective epidemiological outcome calculated in the sub-group with a 100% response rate. RESULTS: Of 15 249 expected weekly questionnaires, 7209 were completed and returned, resulting in an overall response rate of 47.3%. The individual athletes' response rates ranged from 0% (n = 51) to 100% (n = 100). The prevalence, average weekly prevalence, and time to first injury only varied substantially for the sub-groups below a 5%, 10% and 18% minimum individual response rate, respectively. The incidence and injury burden showed substantial variations for all sub-groups with a response rate below 100%. CONCLUSIONS: Epidemiological outcomes varied depending on the minimum individual athletes' response rate, with injury prevalence, average weekly prevalence, and time to first injury varying less than injury incidence and injury burden. 
These findings highlight the need to take the individual response rate into account when calculating epidemiological outcomes, and to determine optimal study-specific cut-offs for the minimum individual response rate.
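The sub-group construction used in this study — filtering athletes by their individual response rate before computing an outcome — can be sketched as follows. The data below are invented toy values for illustration, not the PREVATHLE data.

```python
# Sketch: filter athletes by minimum individual response rate, then compute
# average weekly injury prevalence over the retained responders.
# True = injury reported, False = no injury, None = questionnaire not returned.
def response_rate(weeks):
    """Fraction of expected weekly questionnaires actually returned."""
    return sum(w is not None for w in weeks) / len(weeks)

def weekly_prevalence(athletes, min_rate):
    """Mean weekly proportion of responders reporting an injury, restricted
    to athletes whose response rate is at or above min_rate."""
    kept = [w for w in athletes.values() if response_rate(w) >= min_rate]
    n_weeks = len(next(iter(athletes.values())))
    per_week = []
    for i in range(n_weeks):
        answers = [w[i] for w in kept if w[i] is not None]
        if answers:
            per_week.append(sum(answers) / len(answers))
    return sum(per_week) / len(per_week)

athletes = {
    "A": [True, False, False, False],   # 100% response rate
    "B": [False, None, False, True],    # 75%
    "C": [None, None, True, None],      # 25%
}
full = weekly_prevalence(athletes, min_rate=1.0)   # only athlete A retained
loose = weekly_prevalence(athletes, min_rate=0.0)  # everyone retained
```

Comparing `full` against `loose` mirrors the study's comparison of sub-groups against the 100%-response-rate reference: the estimate shifts as lower-responding athletes are admitted.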
Subjects
Athletic Injuries, Track and Field, Humans, Athletic Injuries/epidemiology, Follow-Up Studies, Athletes, Self Report
ABSTRACT
BACKGROUND: The effective management of surgical and anesthesia care relies on quality data and its ready availability for both patient-centered decision-making and facility-level improvement efforts. Recognizing this critical need, the Strengthening Systems for Improved Surgical Outcomes (SSISO) project addressed surgical care data management and information use practices across 23 health facilities from October 2019 to September 2022. This study aimed to evaluate the effectiveness of SSISO interventions in enhancing practices related to surgical data capture, reporting, analysis, and visualization. METHODS: This study employed a mixed-methods, pre-post intervention evaluation design to assess changes in data management and utilization practices at intervention facilities. The intervention packages included capacity-building trainings, monthly mentorship visits facilitated by a hub-and-spoke approach, provision of data capture tools, and reinforcement of performance review teams. Data collection occurred at baseline (February - April 2020) and endline (April - June 2022). The evaluation focused on the availability and appropriate use of data capture tools, as well as changes in performance review practices. Appropriate use of registers was defined as filling in all the necessary data, verified by the completeness of selected key data elements in the registers. RESULTS: The proportion of health facilities with Operating Room (OR) scheduling, referral, and surgical site infection registers increased significantly by 34.8%, 56.5%, and 87%, respectively, at project endline compared to baseline. Availability of OR and anesthesia registers remained high throughout the project, at 91.3% and 95.6%, respectively. Furthermore, the appropriate use of these registers improved, with a statistically significant increase observed for OR scheduling registers (34.8% increase). Increases were also noted for the OR register (9.5%) and the anesthesia register (4.5%), although these were not statistically significant. Assessing the prior three months' reports, report submissions to the Ministry of Health/Regional Health Bureau (MOH/RHB) rose from 85% to 100%, reflecting complete reporting at endline. Additionally, the proportion of surgical teams analyzing and displaying data for informed decision-making increased significantly from 30.4% at baseline to 60.8% at endline. CONCLUSION: The implemented interventions positively impacted surgical data management and utilization practices at intervention facilities. These positive changes were likely attributable to the capacity-building trainings and regular mentorship visits via the hub-and-spoke approach. Hence, we recommend further investigation into the effectiveness of similar intervention packages in improving surgical data management, analysis, and visualization practices in low- and middle-income country settings.
Subjects
Quality Improvement, Humans, Ethiopia, Health Facilities/standards, Health Facilities/statistics & numerical data, Surgical Procedures, Operative/statistics & numerical data, Surgical Procedures, Operative/standards, Capacity Building, Data Management, Operating Rooms/organization & administration, Operating Rooms/standards, Operating Rooms/statistics & numerical data
ABSTRACT
BACKGROUND: The record of the origin and history of data, known as provenance, is of central importance. Provenance information leads to higher interpretability of scientific results and enables reliable collaboration and data sharing. However, the lack of comprehensive evidence on provenance approaches hinders the uptake of good scientific practice in clinical research. OBJECTIVE: This scoping review aims to identify approaches and criteria for provenance tracking in the biomedical domain. We reviewed state-of-the-art frameworks, associated artifacts, and methodologies for provenance tracking. METHODS: This scoping review followed the methodological framework developed by Arksey and O'Malley. We searched the PubMed and Web of Science databases for English-language articles published from 2006 to 2022. Title and abstract screening were carried out by 4 independent reviewers using the Rayyan screening tool. A majority vote was required for consensus on the eligibility of papers based on the defined inclusion and exclusion criteria. Full-text reading and screening were performed independently by 2 reviewers, and information was extracted into a pretested template covering the 5 research questions. Disagreements were resolved by a domain expert. The study protocol has previously been published. RESULTS: The search resulted in a total of 764 papers. Of the 624 deduplicated papers, 66 (10.6%) fulfilled the inclusion criteria. We identified diverse provenance-tracking approaches, ranging from practical provenance processing and management to theoretical frameworks distinguishing diverse concepts and details of data and metadata models, provenance components, and notations. A substantial majority investigated underlying requirements to varying extents and validation intensities but lacked completeness in provenance coverage. Most cited requirements concerned knowledge about data integrity and reproducibility.
Moreover, these revolved around robust data quality assessments, consistent policies for sensitive data protection, improved user interfaces, and automated ontology development. We found that different stakeholder groups benefit from the availability of provenance information. We also recognized that the term provenance is subject to an evolutionary and technical process with multifaceted meanings and roles. Challenges included organizational and technical issues linked to data annotation, provenance modeling, and performance, amplified by subsequent matters such as enhanced provenance information and quality principles. CONCLUSIONS: As data volumes grow and computing power increases, the challenge of scaling provenance systems to handle data efficiently and support complex queries intensifies, necessitating automated and scalable solutions. With rising legal and scientific demands, there is an urgent need for greater transparency in implementing provenance systems in research projects, despite the challenges of unresolved granularity and knowledge bottlenecks. We believe that our recommendations support quality assurance and guide the implementation of auditable and measurable provenance approaches and solutions in the daily tasks of biomedical scientists. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): RR2-10.2196/31750.
Subjects
Workflow, Humans, Biomedical Research/methods
ABSTRACT
Artificial intelligence (AI) is emerging as a transformative technology in healthcare, including endodontics. A knowledge gap exists among endodontic experts in understanding AI's applications and limitations. This comprehensive review aims to (A) elaborate on the technical and ethical aspects of using data to implement AI models in endodontics; (B) elaborate on evaluation metrics; (C) review the current applications of AI in endodontics; and (D) review the limitations and barriers to real-world implementation of AI in the field of endodontics and its future potential and directions. The article shows that AI techniques have been applied in endodontics for critical tasks such as detection of radiolucent lesions, analysis of root canal morphology, and prediction of treatment outcomes and post-operative pain, among others. Deep learning models such as convolutional neural networks demonstrate high accuracy in these applications. However, challenges remain regarding model interpretability, generalizability, and adoption into clinical practice. When thoughtfully implemented, AI has great potential to aid diagnostics, treatment planning, clinical interventions, and education in the field of endodontics. However, concerted efforts are still needed to address limitations and to facilitate integration into clinical workflows.
Subjects
Artificial Intelligence, Endodontics, Artificial Intelligence/standards, Computer Security, Endodontics/education, Endodontics/ethics, Endodontics/trends, Humans
ABSTRACT
OBJECTIVE: This study aimed to develop and validate a quantitative index system for evaluating the data quality of Electronic Medical Records (EMR) in disease risk prediction using Machine Learning (ML). MATERIALS AND METHODS: The index system was developed in four steps: (1) a preliminary index system was outlined based on a literature review; (2) the Delphi method was used to structure the indicators at all levels; (3) the weights of these indicators were determined using the Analytic Hierarchy Process (AHP) method; and (4) the developed index system was empirically validated using real-world EMR data in an ML-based disease risk prediction task. RESULTS: The synthesis of review findings and the expert consultations led to the formulation of a three-level index system with four first-level, 11 second-level, and 33 third-level indicators. The weights of these indicators were obtained through the AHP method. Results from the empirical analysis illustrated a positive relationship between the scores assigned by the proposed index system and the predictive performance of the datasets. DISCUSSION: The proposed index system for evaluating EMR data quality is grounded in extensive literature analysis and expert consultation. Moreover, the system's high reliability and suitability have been affirmed through empirical validation. CONCLUSION: The novel index system offers a robust framework for assessing the quality and suitability of EMR data in ML-based disease risk prediction. It can serve as a guide for building EMR databases, improving EMR data quality control, and generating reliable real-world evidence.
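The AHP weighting step used in such index systems can be sketched briefly: expert pairwise comparisons between indicators form a reciprocal matrix, which is reduced to normalized priority weights. The sketch below uses the common geometric-mean approximation of AHP; the three indicator names and comparison values are invented for illustration, not taken from this study.

```python
# Sketch of AHP weight derivation via the geometric-mean approximation:
# each weight is the geometric mean of its matrix row, normalized to sum 1.
# Indicators and comparison values below are invented examples.
import math

def ahp_weights(matrix):
    """Normalized priority weights from a reciprocal pairwise-comparison matrix."""
    gm = [math.prod(row) ** (1.0 / len(row)) for row in matrix]
    total = sum(gm)
    return [g / total for g in gm]

# Hypothetical comparisons among 3 data-quality indicators:
# completeness is judged 3x as important as accuracy and 5x as important as
# timeliness; accuracy is 2x as important as timeliness (reciprocals below).
pairwise = [
    [1.0,   3.0, 5.0],   # completeness vs (completeness, accuracy, timeliness)
    [1 / 3, 1.0, 2.0],   # accuracy
    [1 / 5, 1 / 2, 1.0], # timeliness
]
weights = ahp_weights(pairwise)  # sums to 1, ordered by judged importance
```

Full AHP uses the principal eigenvector and a consistency-ratio check on the judgments; the geometric-mean form shown here is a widely used approximation that coincides with it for perfectly consistent matrices.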