This report presents the findings of a comprehensive assessment of Somalia’s health information system undertaken by WHO in 2022 at the request of the Federal Ministry of Health and Human Services. Health information systems, including civil registration and vital statistics systems, provide data for programme and performance monitoring, quality of care, planning and policymaking. The assessment resulted in a set of recommendations for the Ministry and other stakeholders to develop comprehensive and efficient systems to: monitor health risks and determinants; track health status and outcomes, including cause-specific mortality; and assess health system performance. The recommendations also provide an opportunity for the country to respond to the growing demand for health data to measure progress towards the health-related Sustainable Development Goals.

Systems Biology Markup Language (SBML) has emerged as a standard for representing biological models, facilitating model sharing and interoperability. It stores many types of data and complex relationships, complicating data management and analysis. Traditional database management systems struggle to effectively capture these complex networks of interactions within biological systems. Graph-oriented databases perform well in managing interactions between different entities. We present neo4jsbml, a new solution that bridges the gap between the Systems Biology Markup Language data and the Neo4j database, for storing, querying and analyzing data. The Systems Biology Markup Language organizes biological entities in a hierarchical structure, reflecting their interdependencies. The inherent graphical structure represents these hierarchical relationships, offering a natural and efficient means of navigating and exploring the model's components. Neo4j is an excellent solution for handling this type of data. By representing entities as nodes and their relationships as edges, Cypher, Neo4j's query language, efficiently traverses this type of graph representing complex biological networks. We have developed neo4jsbml, a Python library for importing Systems Biology Markup Language data into a Neo4j database using a user-defined schema. By leveraging Neo4j's graphical database technology, exploration of complex biological networks becomes intuitive and information retrieval efficient. Neo4jsbml is a tool designed to import Systems Biology Markup Language data into a Neo4j database. Only the desired data is loaded into the Neo4j database. neo4jsbml is user-friendly and can become a useful new companion for visualizing and analyzing metabolic models through the Neo4j graphical database. neo4jsbml is open source software and available at
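To make the node-and-edge mapping described above concrete, the following is a minimal sketch that reads a tiny SBML fragment with Python's standard library and turns species and reactions into graph nodes and edges. It does not use or reproduce the neo4jsbml API; the toy model, the function name `sbml_to_graph`, and the labels `Species`, `Reaction`, `REACTANT_OF` and `PRODUCES` are assumptions chosen for illustration only.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 3 Version 1 core documents.
SBML_NS = {"sbml": "http://www.sbml.org/sbml/level3/version1/core"}

# Hypothetical toy model with two species and one reaction.
SBML_TEXT = """
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy">
    <listOfSpecies>
      <species id="glc" compartment="c"/>
      <species id="g6p" compartment="c"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="HEX1" reversible="false">
        <listOfReactants><speciesReference species="glc"/></listOfReactants>
        <listOfProducts><speciesReference species="g6p"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
""".strip()


def sbml_to_graph(text):
    """Return (nodes, edges) extracted from an SBML string.

    nodes: set of (label, id) tuples
    edges: set of (source_id, relationship, target_id) tuples
    """
    root = ET.fromstring(text)
    nodes, edges = set(), set()

    # Every species becomes a node.
    for species in root.findall(".//sbml:species", SBML_NS):
        nodes.add(("Species", species.get("id")))

    # Every reaction becomes a node linked to its reactants and products.
    for rxn in root.findall(".//sbml:reaction", SBML_NS):
        rid = rxn.get("id")
        nodes.add(("Reaction", rid))
        for ref in rxn.findall("sbml:listOfReactants/sbml:speciesReference", SBML_NS):
            edges.add((ref.get("species"), "REACTANT_OF", rid))
        for ref in rxn.findall("sbml:listOfProducts/sbml:speciesReference", SBML_NS):
            edges.add((rid, "PRODUCES", ref.get("species")))
    return nodes, edges


nodes, edges = sbml_to_graph(SBML_TEXT)
print(sorted(nodes))
print(sorted(edges))
```

In a graph database, each tuple in `nodes` would correspond to a labelled node and each tuple in `edges` to a typed relationship, which is the structure that a path query then traverses.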

With the rapidly growing amount of biological data, powerful but also flexible data management and visualization systems are of increasingly crucial importance. The COVID-19 pandemic has more than highlighted this need and the challenges scientists are facing. Here, we provide an example and a step-by-step template for non-IT personnel to easily implement an intuitive, interactive data management solution to manage and visualize the high influx of biological samples and associated metadata in a laboratory setting. Our approach is illustrated with the genomic surveillance for SARS-CoV-2 in Germany, covering over 11 600 internal and 130 000 external samples from multiple datasets. We compare three data management options used in laboratories: (i) simple, yet error-prone and inefficient spreadsheets, (ii) complex and long-to-implement laboratory information management systems and (iii) high-performance database management systems. We highlight the advantages and pitfalls of each option and outline why a document-oriented NoSQL option via MongoDB Atlas can be a suitable solution for many labs. Our example can be treated as a template and easily adapted to allow scientists to focus on their core work and not on complex data administration.

The increasing amount and complexity of clinical data require an appropriate way of storing and analyzing those data. Traditional approaches use a tabular structure (relational databases) for storing data and thereby complicate storing and retrieving interlinked data from the clinical domain. Graph databases provide a great solution for this by storing data in a graph as nodes (vertices) that are connected by edges (links). The underlying graph structure can be used for the subsequent data analysis (graph learning). Graph learning consists of two parts: graph representation learning and graph analytics. Graph representation learning aims to reduce high-dimensional input graphs to low-dimensional representations. Then, graph analytics uses the obtained representations for analytical tasks like visualization, classification, link prediction and clustering which can be used to solve domain-specific problems. In this survey, we review current state-of-the-art graph database management systems, graph learning algorithms and a variety of graph applications in the clinical domain. Furthermore, we provide a comprehensive use case for a clearer understanding of complex graph learning algorithms.

BACKGROUND: A data management system for diabetes clinical trials is used to support clinical data management processes. The purpose of this study was to evaluate the quality and usability of this system from the users' perspectives. METHODS: This study was conducted in 2020, and a pre-post evaluation method was used to examine the quality and usability of the designed system. Initially, a questionnaire was designed and distributed among the researchers who were involved in the diabetes clinical trials (n = 30) to investigate their expectations. Then, the researchers were asked to use the system and give their perspectives on it by completing two questionnaires. RESULTS: There were no statistically significant differences between the users' perspectives on information quality, service quality, achievements, and communication before and after using the system. However, in terms of system quality (P = 0.042) and users' autonomy (P = 0.026), the users' expectations were greater than the system performance. The system usability was at a good level based on the users' opinions. CONCLUSION: It seems that the designed system largely met the users' expectations in most areas. However, system quality and users' autonomy need further attention. In addition, the system should be used in multicenter trials and re-evaluated by a larger group of users.

Introdução: Um grande desafio para a utilização de registros e bases de dados secundárias é a qualidade do registro e o percentual de perdas em variáveis estratégicas e necessárias à plena utilização do banco. Objetivo: Propor um método de correção para a variável de estadiamento no âmbito dos Registros Hospitalares de Câncer (RHC), a fim de aprimorar sua completude e qualidade. Método: Estudo descritivo, abrangendo as Unidades da Federação, utilizando-se como fonte de informação o RHC, de janeiro de 2013 a dezembro de 2019. O câncer de pulmão foi escolhido como caso para a correção do banco, em razão da sua alta taxa de mortalidade no Brasil e no mundo. As análises foram realizadas com o software de análises estatísticas SAS Studio e a base de dados organizada em Excel. Resultados: O total de casos registrados no RHC foi de 86.026, e a variável de interesse, o estadiamento, teve um total de 32,0% de perda. Ao final de todas as etapas de correção, a perda foi de 9,8%, correspondendo a 22,2% de recuperação. Conclusão: A metodologia proposta representa um avanço na correção do banco do RHC, possibilitando a utilização dos dados de base secundária, com melhor representatividade das diferentes Regiões do país, sobre o tratamento de câncer de pulmão, com possibilidade de expansão de seu uso para outras topografias

Introduction: A major challenge in using registries and secondary databases is the quality of the records and the proportion of missing values in strategic variables needed to make full use of the database. Objective: To propose a correction method for the cancer staging variable of the Hospital-Based Cancer Registry (HBCR) in order to improve its completeness and quality. Method: Descriptive study covering Brazil's Federation Units, using the HBCR as the data source, from January 2013 to December 2019. Lung cancer was selected as the case for database correction because of its high mortality in Brazil and worldwide. The analyses were performed with the statistical software SAS Studio and the data were organized in Excel. Results: A total of 86,026 cases were registered in the HBCR, and the variable of interest, staging, was missing in 32.0% of them. At the end of all correction steps, the proportion missing fell to 9.8%, a recovery of 22.2 percentage points. Conclusion: The proposed methodology is an advance in the correction of the HBCR database, allowing secondary data on lung cancer treatment to be used with better representativeness of the country's different regions, and it may be extended to other topographies.
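The recovery figure in the Results can be checked with a few lines of arithmetic. The sketch below uses only the three numbers reported in the abstract; the approximate case counts and the relative share of recovered records are derived here by rounding and are not figures reported by the authors.

```python
# Figures reported in the abstract.
total_cases = 86_026
missing_before_pct = 32.0   # staging missing before correction (%)
missing_after_pct = 9.8     # staging missing after correction (%)

# Absolute recovery, in percentage points of all registered cases.
recovered_points = missing_before_pct - missing_after_pct

# Relative recovery: share of the initially missing records that were filled in.
share_recovered_pct = recovered_points / missing_before_pct * 100

# Approximate case counts implied by the percentages (derived, not reported).
approx_missing_before = round(total_cases * missing_before_pct / 100)
approx_missing_after = round(total_cases * missing_after_pct / 100)

print(f"Recovered: {recovered_points:.1f} percentage points")
print(f"Share of missing records recovered: {share_recovered_pct:.1f}%")
print(f"Approx. missing before/after: {approx_missing_before} / {approx_missing_after}")
```

Separating the absolute recovery (percentage points of all cases) from the relative recovery (share of the missing records) avoids reading the 22.2 figure as a proportion of the missing data.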

Introducción: Un gran desafío para el uso de registros y bases de datos secundarias es la calidad del registro en sí, el porcentaje de pérdidas en variables estratégicas y necesarias para el pleno uso de la base de datos. Objetivo: Proponer un método de corrección de la variable estadificación en el ámbito de los Registros Hospitalarios de Cáncer (RHC), con el fin de mejorar su exhaustividad y calidad. Método: Análisis descriptivo, abarcando las Unidades de la Federación. Se utilizó el RHC como fuente de información, de enero de 2013 a diciembre de 2019. El cáncer de pulmón fue elegido como caso para la corrección de la base de datos, debido a su alta tasa de mortalidad en el Brasil y en el mundo. Los análisis se realizaron con el software de análisis estadístico SAS Studio y los datos se organizaron en Excel. Resultados: El total de casos registrados en el RHC fue de 86.026, y la variable de interés, la estadificación, tuvo una pérdida total del 32,0% Al final de todas las etapas esta fue de 9,8%, es decir el 22,2% de recuperación. Conclusión: La metodología propuesta representa un avance en la corrección del RHC, permitiendo una mejor utilización de la base de datos, con una mejor representatividad de las diferentes regiones del país, sobre el tratamiento del cáncer de pulmón, con la posibilidad de expandir su uso a otras topografías

The effective use of data in the management and delivery of public health services has long been understood as critical. The Data Management Competency Framework was developed as a practical tool that provides both a structure and a methodology to enable the health workforce, including both decision-makers and implementors, to identify capacity gaps and define the competencies required for the whole data life cycle at all levels of health organizations. It includes four areas, which are further subdivided into 17 domains, each with a set of knowledge and skills across four proficiency levels. The framework can empower Member States to drive strategic, integrated, and sustainable health workforce capacity building.

Introdução: Diante da realidade virtual que se encontram os procedimentos burocráticos, observa-se a necessidade de se idealizar programas de triagem nas clínicas-escola com os objetivos de se encaminhar pacientes para a clínica mais compatível com as suas necessidades, e substituir os prontuários físicos pelos eletrônicos, numa alternativa ambientalmente correta.Objetivo: Avaliar a efetividade de um modelo de triagem informatizado, comparando-o ao modelo utilizado atualmente, no serviço de Serviço de Triagem e Documentação Odontológica do Departamento de Odontologia da Universidade Federal do Rio Grande do Norte. Metodologia: O estudo realizado foi do tipo descritivo, constituído de uma amostra de 50 pacientes, que foram submetidos ao modelo de triagem utilizado atualmente no Serviço de Triagem e Documentação Odontológica do Departamento de Odontologia da Universidade Federal do Rio Grande do Norte e a triagem com aplicação de um programa informatizado. Foi avaliada a efetividade do dispositivo e feita uma comparação entre os modelos. A análise estatística foi feita por meio do índice de correlação intra-classe, utilizando-se um banco de dados criado no software Statistical Package for Social Sciences, versão 20.0, adotando significância de 95% (p< 0,05).Resultados: Após análise estatística, com realização de correlação entre os resultados do software e o modelo atual de triagem, obteve-se coeficiente de correlação intra-classe de 0,578, com o nível de significância, para avaliação dos dados obtidos de (P<0,05), foi possível evidenciar que ocorreu correlação satisfatória positiva e significativa entre os resultados do software e o modelo atual de triagem.Conclusões:Os resultados denotam concordância entre os modelos de triagem estudados e demonstram que a utilização destes recursos apresenta resultados satisfatórios. Notadamente, evidenciando-se a vantagem da utilização do modelo de triagem informatizado (AU).

Introduction: As bureaucratic procedures increasingly move to virtual environments, it is necessary to devise screening programs in teaching clinics that refer patients to the clinic most compatible with their needs and replace physical records with electronic ones as an environmentally friendly alternative. Objective: To evaluate the effectiveness of a computerized screening model, comparing it to the model currently used in the Dental Documentation and Screening Service of the Dentistry Department of the Federal University of Rio Grande do Norte. Methodology: The descriptive study consisted of a sample of 50 patients who underwent both the screening model currently used in the abovementioned service and the computerized screening model. The effectiveness of the device was evaluated and a comparison was made between the models. Statistical analysis was performed using the intra-class correlation coefficient and a database created in Statistical Package for Social Sciences version 20.0, adopting a 95% confidence level (p < 0.05). Results: An intra-class correlation coefficient of 0.578 was obtained (p < 0.05), indicating a satisfactory, positive and significant correlation between the software results and the current screening model. Conclusions: There was agreement between the studied models, and the use of these resources yields satisfactory results. The advantage of using the computerized screening model was therefore evident (AU).

Introducción: Ante la realidad virtual de los trámites burocráticos, surge la necesidad de diseñar programas de cribado en las clínicas docentes con el objetivo de enviar a los pacientes a la clínica más compatible con sus necesidades, reemplazando los registros físicos y electrónicos en una alternativa ambientalmente correcta.Objetivo: Evaluar la efectividad de un modelo de cribado informatizado, comparándolo con el modelo utilizado actualmente en el Servicio de Cribado y Documentación Dental del Departamento de Odontología de la Universidad Federal de Rio Grande do Norte.Metodología: El estudio realizado fue de tipo descriptivo, constituido por una muestra de 50 pacientes que fueron sometidos al modelo de cribado actualmente utilizado en el dicho servicio y al cribado mediante programa informatizado. Se evaluó la efectividad del dispositivo y se realizó una comparación entre los modelos. El análisis estadístico se realizó mediante el índice de correlación intraclase, utilizando una base de datos creada en el software Statistical Package for Social Sciences, versión 20.0, adoptando un nivel de significación del 95% (p< 0,05).Resultados: Luego del análisis estadístico, con correlación entre los resultados del software y el modelo de cribadoactual, se obtuvo un coeficiente de correlación intraclase de 0.578, con nivel de significancia, para evaluación de los datos obtenidos de (P<0.05). Fue posible mostrar que hubo una correlación positiva y significativa satisfactoria entre los resultados del software y el modelo de cribado actual. Conclusiones: Los resultados muestran concordancia entre los modelos de cribado estudiados y demuestran que el uso de estos recursos presenta resultados satisfactorios. En particular, demostrando la ventaja de usar el modelo de cribado computarizado (AU).

Despite genomic sequencing rapidly transforming from being a bench-side tool to a routine procedure in a hospital, there is a noticeable lack of genomic analysis software that supports both clinical and research workflows as well as crowdsourcing. Furthermore, most existing software packages are not forward-compatible in regards to supporting ever-changing diagnostic rules adopted by the genetics community. Regular updates of genomics databases pose challenges for reproducible and traceable automated genetic diagnostics tools. Lastly, most of the software tools score low on explainability amongst clinicians. We have created a fully open-source variant curation tool, AnFiSA, with the intention to invite and accept contributions from clinicians, researchers, and professional software developers. The design of AnFiSA addresses the aforementioned issues via the following architectural principles: using a multidimensional database management system (DBMS) for genomic data to address reproducibility, curated decision trees adaptable to changing clinical rules, and a crowdsourcing-friendly interface to address difficult-to-diagnose cases. We discuss how we have chosen our technology stack and describe the design and implementation of the software. Finally, we show in detail how selected workflows can be implemented using the current version of AnFiSA by a medical geneticist.

Advances in network technology have led to extensive information technology construction work in all walks of life; universities, as a key component of national development, cannot be overlooked in this regard. In today's universities, the Web-based integrated academic management information system is widely used, promoting higher education management system innovation and improving the management level of education departments and teaching management. The traditional management mode is incapable of locating "knowledge" in the mountains of student transcripts, and the original management mode must be improved. In business, finance, insurance, marketing, and other fields, digital exploration technology is widely used. This article describes the design approach for a data mining-based analysis and management system for PE course teaching quality, as well as the application of information technology and data mining technology in PE by combining actual PE teaching in schools, with the goal of realizing a data mining-based PE performance management system to serve PE teaching in schools and improve PE teaching quality. The results show that the time required to find frequent itemsets using a traditional algorithm running on a single machine, as well as the time required to scan the database several times for frequent itemset search in a distributed cluster of 20 computing nodes, is significantly longer than that required by the data mining algorithm. As a result, the proposed sports performance management system is functional, simple, and scalable, with each functional module operating independently and cooperatively, reflecting the concept of "high cohesion and low coupling."
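The abstract does not say which frequent-itemset algorithm the system uses, so the following is only a minimal single-machine sketch of an Apriori-style search, the classic level-wise approach that rescans the data once per itemset size. The function name `frequent_itemsets`, the attribute labels and the four student records are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets found in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)

    k = 2
    while current:
        # Join step: combine frequent (k-1)-itemsets into size-k candidates.
        candidates = {a | b for a, b in combinations(list(current), 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        # One pass over the data per level to count candidate support.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: n for s, n in counts.items() if n >= min_support}
        result.update(current)
        k += 1
    return result


# Hypothetical PE records: one set of discretized attributes per student.
records = [
    {"endurance:high", "attendance:high", "grade:A"},
    {"endurance:high", "attendance:high", "grade:A"},
    {"endurance:low", "attendance:high", "grade:B"},
    {"endurance:high", "attendance:low", "grade:B"},
]

for itemset, count in sorted(frequent_itemsets(records, 2).items(), key=lambda kv: len(kv[0])):
    print(sorted(itemset), count)
```

The repeated pass over `transactions` at each level is the cost that the abstract's timing comparison refers to when it mentions scanning the database several times.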

The proposed Edge-based Trust Management System (E-TMS) uses an Eigenvector-based approach to eliminate the security threats present in Internet of Things (IoT)-enabled smart city environments. In most existing trust management systems, the trust aggregation process depends completely on the direct trust ratings obtained from both legitimate and malicious neighboring IoT devices. E-TMS uses an edge-assisted two-level trust computation approach to keep the trust evaluation of IoT devices free of malicious input. It aims to remove false contributions from the aggregated trust data and uses the properties of the Eigenvector to identify compromised IoT devices. The Eigenvector analysis also helps to avoid false detection. The analysis compares all the contributed trust data about every connected device. A spectral matrix is generated from the contributions, and the received trust is scaled according to the obtained spectral values, so that the absolute sum of the obtained values contains only true contributions. Accurately identifying false data removes the effect of malicious contributions from the final trust value of a connected IoT device. Since the final trust value calculated by the edge node contains only trustworthy data, the prediction of malicious nodes is accurate. Finally, the performance of E-TMS has been validated: its throughput and network resilience are higher than those of the existing system.
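The abstract does not give the construction of the spectral matrix, so the sketch below is not the E-TMS algorithm. It illustrates the general idea with a swapped-in, generic technique: weighting each rater by the principal eigenvector of a rater-agreement matrix, computed by power iteration. The ratings, the similarity measure and all function names are assumptions made for illustration.

```python
def agreement_matrix(ratings):
    """Similarity between raters: 1 minus the mean absolute rating difference."""
    n = len(ratings)
    m = len(ratings[0])
    return [
        [1.0 - sum(abs(a - b) for a, b in zip(ratings[i], ratings[k])) / m for k in range(n)]
        for i in range(n)
    ]


def principal_eigenvector(matrix, iterations=200):
    """Power iteration; the result is normalized so that its entries sum to 1."""
    n = len(matrix)
    v = [1.0 / n] * n
    for _ in range(iterations):
        w = [sum(matrix[i][k] * v[k] for k in range(n)) for i in range(n)]
        total = sum(w)
        v = [x / total for x in w]
    return v


def aggregate_trust(ratings, weights):
    """Weighted average of the ratings each device received."""
    devices = len(ratings[0])
    return [sum(w * r[j] for w, r in zip(weights, ratings)) for j in range(devices)]


# Hypothetical ratings in [0, 1]: rows are raters, columns are rated devices.
# The last rater inverts its ratings to mimic a malicious contributor.
ratings = [
    [0.9, 0.8, 0.1],
    [0.8, 0.9, 0.2],
    [0.9, 0.9, 0.1],
    [0.1, 0.2, 0.9],
]

weights = principal_eigenvector(agreement_matrix(ratings))
trust = aggregate_trust(ratings, weights)
print("rater weights:", [round(w, 3) for w in weights])
print("aggregated trust:", [round(t, 3) for t in trust])
```

Because the inverted ratings disagree with the majority, that rater receives the smallest weight, and the aggregated trust values move towards the ratings of the raters that agree with each other.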

Resumen Evaluar el funcionamiento de los gestores de información y conocimiento implementados en el Instituto Superior de Tecnologías y Ciencias Aplicadas (InSTEC) es el objetivo de la investigación. Ello, no solo posibilita el mejoramiento futuro del desempeño de estos, sino también sirve como estudio preliminar para la inserción de otros gestores en el futuro. Los métodos de análisis documental y evaluación heurística sustentan las bases teóricas, mientras que las herramientas automáticas Nibbler, GooglePageRank, SEOptimer, Website Grader, la entrevista y las encuestas a usuarios evidencian el desempeño de los gestores. Los resultados revelaron que la calidad del sitio externo y de la intranet obtuvo 69,4% de competencia global. Varios de los indicadores de estos sitios se deben perfeccionar en aras de brindar un mejor servicio a los usuarios del Instituto en la gestión de la información y el conocimiento, tan necesaria en las universidades.

Abstract The goal of the paper is to assess the information and knowledge management tools already implemented at InSTEC. This makes it possible to improve their future performance and serves as a preliminary study for introducing other tools in the future. Documentary analysis and heuristic evaluation provide the theoretical basis, while the automatic tools Nibbler, GooglePageRank, SEOptimer and Website Grader, together with interviews and user surveys, provide evidence of the tools' performance. The results revealed that the quality of InSTEC's external website and its intranet reached 69.4% overall. Several of these sites' indicators must be improved to provide a better service to the Institute's users in information and knowledge management, which is so necessary in universities.

This work introduces CGRdb2.0, an open-source database management system for molecules, reactions, and chemical data. CGRdb2.0 is a Python package connecting to a PostgreSQL database that enables native searches for molecules and reactions without complicated SQL syntax. The library provides out-of-the-box implementations for similarity and substructure searches for molecules, as well as similarity and substructure searches for reactions in two ways: based on reaction components and based on the Condensed Graph of Reaction approach, the latter significantly accelerating the performance. In benchmarking studies with the RDKit database cartridge, we demonstrate that CGRdb2.0 performs searches faster for smaller data sets, while allowing for interactive access to the retrieved data.

The Chemical Effects in Biological Systems database (CEBS) contains extensive toxicology study results and metadata from the Division of the National Toxicology Program (NTP) and other studies of environmental health interest. This resource grants public access to search and collate data from over 10 250 studies for 12 750 test articles (chemicals, environmental agents). CEBS has made considerable strides over the last 5 years to integrate growing internal data repositories into data warehouses and data marts to better serve the public with high quality curated datasets. This effort includes harmonizing legacy terms and metadata to current standards, mapping test articles to external identifiers, and aligning terms to OBO (Open Biological and Biomedical Ontology) Foundry ontologies. The data are made available through the CEBS Homepage, guided search applications, flat files on FTP (file transfer protocol), and APIs (application programming interface) for user access and to provide a bridge for computational tools. The user interface is intuitive with a single search bar to query keywords related to study metadata, publications, and data availability. Results are consolidated to single pages for each test article with NTP conclusions, publications, individual studies, data collections, and links to related test articles and projects available together.

The objective of our study was to provide practical directions on the storage of genomic information and novel phenotypes (treated here as unstructured data) using a non-relational database. The MongoDB technology was assessed for this purpose, enabling frequent data transactions involving numerous individuals under genetic evaluation. Our study investigated different genomic (Illumina Final Report, PLINK, 0125, FASTQ, and VCF formats) and phenotypic (including media files) information, using both real and simulated datasets. Advantages of our centralized database concept include sublinear query running time as the number of samples/markers increases exponentially, in addition to the comprehensive management of distinct data formats while searching for specific genomic regions. A comparison of our generic non-relational solution with an existing relational approach (developed for tabular data types using 2 bits to store genotypes) showed that the relational schema achieved a shorter import time when handling 50M SNPs (PLINK format). Our experimental results also reinforce that data conversion is a costly step required to load genomic data into both relational and non-relational database systems and must therefore be handled carefully in large applications.
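The 2-bit genotype storage mentioned for the relational approach can be illustrated with a short packing routine. This is a minimal sketch under stated assumptions: genotypes are coded as integers 0 to 3 (for example 0, 1 and 2 copies of an allele, plus 3 for missing), four codes are packed per byte, and the bit order is chosen arbitrarily. It is not the schema used in the study and not the PLINK binary layout.

```python
def pack_genotypes(codes):
    """Pack genotype codes (integers 0..3) into bytes, four codes per byte."""
    packed = bytearray((len(codes) + 3) // 4)
    for i, code in enumerate(codes):
        if not 0 <= code <= 3:
            raise ValueError(f"genotype code out of range: {code}")
        # Each code occupies two bits; position i % 4 selects the bit pair.
        packed[i // 4] |= code << (2 * (i % 4))
    return bytes(packed)


def unpack_genotypes(packed, n):
    """Recover the first n genotype codes from a packed byte string."""
    return [(packed[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(n)]


# Hypothetical genotypes for one individual at five markers (3 = missing).
genotypes = [0, 1, 2, 3, 2]
blob = pack_genotypes(genotypes)
print(len(genotypes), "genotypes stored in", len(blob), "bytes")
print(unpack_genotypes(blob, len(genotypes)))
```

The number of genotypes has to be stored alongside the packed bytes, because the padding bits in the last byte are otherwise indistinguishable from genotype code 0.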

