Results 1 - 20 of 275
1.
J Safety Res ; 89: 91-104, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38858066

ABSTRACT

INTRODUCTION: Workplace accidents in the petroleum industry can cause catastrophic damage to people, property, and the environment. Earlier studies in this domain indicate that the majority of the accident report information is available in unstructured text format. Conventional techniques for the analysis of accident data are time-consuming and heavily dependent on experts' subject knowledge, experience, and judgment. There is a need to develop a machine learning-based decision support system to analyze the vast amounts of unstructured text data that are frequently overlooked due to a lack of appropriate methodology. METHOD: To address this gap in the literature, we propose a hybrid methodology that uses improved text-mining techniques combined with an unbiased group decision-making framework to combine the output of objective weights (based on text mining) and subjective weights (based on expert opinion) of risk factors to prioritize them. Based on the contextual word embedding models and term frequencies, we extracted five important clusters of risk factors comprising more than 32 risk sub-factors. A heterogeneous group of experts and employees in the petroleum industry were contacted to obtain their opinions on the extracted risk factors, and the best-worst method was used to convert their opinions to weights. CONCLUSIONS AND PRACTICAL APPLICATIONS: The applicability of our proposed framework was tested on the data compiled from the accident data released by the petroleum industries in India. Our framework can be extended to accident data from any industry, to reduce analysis time and improve the accuracy in classifying and prioritizing risk factors.


Subjects
Accidents, Occupational , Data Mining , Risk Management , Humans , Accidents, Occupational/prevention & control , Risk Management/methods , Data Mining/methods , India , Consensus , Risk Factors , Oil and Gas Industry , Machine Learning , Decision Support Techniques
2.
Food Chem Toxicol ; 187: 114638, 2024 May.
Article in English | MEDLINE | ID: mdl-38582341

ABSTRACT

With a society increasingly demanding alternative protein food sources, new strategies for evaluating protein safety issues, such as allergenic potential, are needed. Large-scale and systemic studies on allergenic proteins are hindered by the limited and non-harmonized clinical information available for these substances in dedicated databases. A key missing piece of information is the symptomatology of the allergens, especially expressed in terms of standard vocabularies, which would allow connecting with other biomedical resources to carry out different studies related to human health. In this work, we have generated the first resource with a comprehensive annotation of allergens' symptomatology, using a text-mining approach that extracts significant co-mentions between these entities from the scientific literature (PubMed, ∼36 million abstracts). The method identifies statistically significant co-mentions between the textual descriptions of the two types of entities in the literature as an indication of a relationship. 1,180 clinical signs extracted from the Human Phenotype Ontology, the Medical Subject Heading terms of PubMed together with other allergen-specific symptoms, were linked to 1,036 unique allergens annotated in two main allergen-related public databases via 14,009 relationships. This novel resource, publicly available through an interactive web interface, could serve as a starting point for future manually curated compilation of allergen symptomatology.
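
The abstract above does not say which statistical test was used to call a co-mention significant. As a rough illustration only, the sketch below assumes an upper-tail hypergeometric test over abstract counts; the function name `comention_p_value` and all counts are hypothetical and are not taken from the paper.

```python
from math import comb


def comention_p_value(n_total, n_allergen, n_symptom, n_both):
    """Upper-tail hypergeometric p-value (an assumed test, not the paper's).

    Probability of observing at least n_both abstracts mentioning both
    terms if the allergen and the symptom were mentioned independently.
    """
    max_k = min(n_allergen, n_symptom)
    denom = comb(n_total, n_symptom)
    tail = sum(
        comb(n_allergen, k) * comb(n_total - n_allergen, n_symptom - k)
        for k in range(n_both, max_k + 1)
    )
    return tail / denom


# Hypothetical toy counts: 1,000 abstracts, 50 mention the allergen,
# 40 mention the symptom, 10 mention both (2 expected by chance).
p_value = comention_p_value(1000, 50, 40, 10)
print(p_value < 0.001)
```

In a real pipeline the resulting p-values would also need a multiple-testing correction, since every allergen-symptom pair is tested.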


Subjects
Allergens , Data Mining , Humans , Data Mining/methods , Databases, Factual , Proteins/metabolism
3.
J Med Syst ; 48(1): 47, 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38662184

ABSTRACT

Ontologies serve as comprehensive frameworks for organizing domain-specific knowledge, offering significant benefits for managing clinical data. This study presents the development of the Fall Risk Management Ontology (FRMO), designed to enhance clinical text mining, facilitate integration and interoperability between disparate data sources, and streamline clinical data analysis. By representing major entities within the fall risk management domain, the FRMO supports the unification of clinical language and decision-making processes, ultimately contributing to the prevention of falls among older adults. We used Ontology Web Language (OWL) to build the FRMO in Protégé. Of the seven steps of the Stanford approach, six steps were utilized in the development of the FRMO: (1) defining the domain and scope of the ontology, (2) reusing existing ontologies when possible, (3) enumerating ontology terms, (4) specifying the classes and their hierarchy, (5) defining the properties of the classes, and (6) defining the facets of the properties. We evaluated the FRMO using four main criteria: consistency, completeness, accuracy, and clarity. The developed ontology comprises 890 classes arranged in a hierarchical structure, including six top-level classes with a total of 43 object properties and 28 data properties. FRMO is the first comprehensively described semantic ontology for fall risk management. Healthcare providers can use the ontology as the basis of clinical decision technology for managing falls among older adults.


Subjects
Accidental Falls , Data Mining , Risk Management , Accidental Falls/prevention & control , Humans , Data Mining/methods , Biological Ontologies , Electronic Health Records/organization & administration , Semantics
4.
Support Care Cancer ; 32(5): 314, 2024 Apr 29.
Article in English | MEDLINE | ID: mdl-38683417

ABSTRACT

PURPOSE: This study aimed to assess the different needs of patients with breast cancer and their families in online health communities at different treatment phases using a Latent Dirichlet Allocation (LDA) model. METHODS: Using Python, breast cancer-related posts were collected from two online health communities: patient-to-patient and patient-to-doctor. After data cleaning, eligible posts were categorized based on the treatment phase. Subsequently, an LDA model identifying the distinct need-related topics for each phase of treatment, including data preprocessing and LDA topic modeling, was established. Additionally, the demographic and interactive features of the posts were manually analyzed. RESULTS: We collected 84,043 posts, of which 9504 posts were included after data cleaning. Early diagnosis and rehabilitation treatment phases had the highest and lowest number of posts, respectively. LDA identified 11 topics: three in the initial diagnosis phase and two in each of the remaining treatment phases. The topics included disease outcomes, diagnosis analysis, treatment information, and emotional support in the initial diagnosis phase; surgical options and outcomes, postoperative care, and treatment planning in the perioperative treatment phase; treatment options and costs, side effects management, and disease prognosis assessment in the non-operative treatment phase; diagnosis and treatment options, disease prognosis, and emotional support in the relapse and metastasis treatment phase; and follow-up and recurrence concerns, physical symptoms, and lifestyle adjustments in the rehabilitation treatment phase. CONCLUSION: The needs of patients with breast cancer and their families differ across various phases of cancer therapy. Therefore, specific information or emotional assistance should be tailored to each phase of treatment based on the unique needs of patients and their families.


Subjects
Breast Neoplasms , Data Mining , Humans , Breast Neoplasms/psychology , Breast Neoplasms/therapy , Breast Neoplasms/rehabilitation , Female , Data Mining/methods , Needs Assessment , Internet
5.
J Biomed Inform ; 153: 104642, 2024 May.
Article in English | MEDLINE | ID: mdl-38621641

ABSTRACT

OBJECTIVE: To develop a natural language processing (NLP) package to extract social determinants of health (SDoH) from clinical narratives, examine the bias among race and gender groups, test the generalizability of extracting SDoH for different disease groups, and examine population-level extraction ratio. METHODS: We developed SDoH corpora using clinical notes identified at the University of Florida (UF) Health. We systematically compared 7 transformer-based large language models (LLMs) and developed an open-source package - SODA (i.e., SOcial DeterminAnts) to facilitate SDoH extraction from clinical narratives. We examined the performance and potential bias of SODA for different race and gender groups, tested the generalizability of SODA using two disease domains including cancer and opioid use, and explored strategies for improvement. We applied SODA to extract 19 categories of SDoH from the breast (n = 7,971), lung (n = 11,804), and colorectal cancer (n = 6,240) cohorts to assess patient-level extraction ratio and examine the differences among race and gender groups. RESULTS: We developed an SDoH corpus using 629 clinical notes of cancer patients with annotations of 13,193 SDoH concepts/attributes from 19 categories of SDoH, and another cross-disease validation corpus using 200 notes from opioid use patients with 4,342 SDoH concepts/attributes. We compared 7 transformer models and the GatorTron model achieved the best mean average strict/lenient F1 scores of 0.9122 and 0.9367 for SDoH concept extraction and 0.9584 and 0.9593 for linking attributes to SDoH concepts. There is a small performance gap (∼4%) between Males and Females, but a large performance gap (>16 %) among race groups. The performance dropped when we applied the cancer SDoH model to the opioid cohort; fine-tuning using a smaller opioid SDoH corpus improved the performance. 
The extraction ratio varied in the three cancer cohorts, in which 10 SDoH could be extracted from over 70 % of cancer patients, but 9 SDoH could be extracted from less than 70 % of cancer patients. Individuals from the White and Black groups have a higher extraction ratio than other minority race groups. CONCLUSIONS: Our SODA package achieved good performance in extracting 19 categories of SDoH from clinical narratives. The SODA package with pre-trained transformer models is available at https://github.com/uf-hobi-informatics-lab/SODA_Docker.
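
The abstract reports strict and lenient F1 scores without defining them. A minimal sketch of one common convention follows, assuming that a strict match requires identical span boundaries and label while a lenient match only requires overlapping spans with the same label. The function `f1_scores` and the toy spans are hypothetical and are not part of the SODA package.

```python
def f1_scores(gold, predicted):
    """Return (strict_f1, lenient_f1) for sets of (start, end, label) spans.

    Span ends are exclusive. Strict = exact same span and label;
    lenient = overlapping span with the same label (assumed definitions).
    """
    def overlaps(a, b):
        return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]

    def f1(matched_pred, matched_gold):
        precision = matched_pred / len(predicted) if predicted else 0.0
        recall = matched_gold / len(gold) if gold else 0.0
        total = precision + recall
        return 2 * precision * recall / total if total else 0.0

    strict_tp = len(gold & predicted)
    lenient_pred = sum(any(overlaps(p, g) for g in gold) for p in predicted)
    lenient_gold = sum(any(overlaps(g, p) for p in predicted) for g in gold)
    return f1(strict_tp, strict_tp), f1(lenient_pred, lenient_gold)


# Hypothetical annotations: one exact match, one partial overlap, one miss.
gold = {(0, 5, "Employment"), (10, 18, "Housing"), (25, 30, "Tobacco")}
predicted = {(0, 5, "Employment"), (11, 18, "Housing"), (40, 44, "Alcohol")}
strict_f1, lenient_f1 = f1_scores(gold, predicted)
print(strict_f1, lenient_f1)
```

The partially overlapping "Housing" span counts only toward the lenient score, which is why lenient F1 is never lower than strict F1 under these definitions.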


Subjects
Narration , Natural Language Processing , Social Determinants of Health , Humans , Female , Male , Bias , Electronic Health Records , Documentation/methods , Data Mining/methods
6.
J Biomed Inform ; 153: 104630, 2024 May.
Article in English | MEDLINE | ID: mdl-38548007

ABSTRACT

OBJECTIVE: To develop soft prompt-based learning architecture for large language models (LLMs), examine prompt-tuning using frozen/unfrozen LLMs, and assess their abilities in transfer learning and few-shot learning. METHODS: We developed a soft prompt-based learning architecture and compared 4 strategies including (1) fine-tuning without prompts; (2) hard-prompting with unfrozen LLMs; (3) soft-prompting with unfrozen LLMs; and (4) soft-prompting with frozen LLMs. We evaluated GatorTron, a clinical LLM with up to 8.9 billion parameters, and compared GatorTron with 4 existing transformer models for clinical concept and relation extraction on 2 benchmark datasets for adverse drug events and social determinants of health (SDoH). We evaluated the few-shot learning ability and generalizability for cross-institution applications. RESULTS AND CONCLUSION: When LLMs are unfrozen, GatorTron-3.9B with soft prompting achieves the best strict F1-scores of 0.9118 and 0.8604 for concept extraction, outperforming the traditional fine-tuning and hard prompt-based models by 0.6 ∼ 3.1 % and 1.2 ∼ 2.9 %, respectively; GatorTron-345M with soft prompting achieves the best F1-scores of 0.8332 and 0.7488 for end-to-end relation extraction, outperforming other two models by 0.2 ∼ 2 % and 0.6 ∼ 11.7 %, respectively. When LLMs are frozen, small LLMs have a big gap to be competitive with unfrozen models; scaling LLMs up to billions of parameters makes frozen LLMs competitive with unfrozen models. Soft prompting with a frozen GatorTron-8.9B model achieved the best performance for cross-institution evaluation.
We demonstrate that (1) machines can learn soft prompts better than hard prompts composed by human, (2) frozen LLMs have good few-shot learning ability and generalizability for cross-institution applications, (3) frozen LLMs reduce computing cost to 2.5 ∼ 6 % of previous methods using unfrozen LLMs, and (4) frozen LLMs require large models (e.g., over several billions of parameters) for good performance.


Subjects
Natural Language Processing , Humans , Machine Learning , Data Mining/methods , Algorithms , Social Determinants of Health , Drug-Related Side Effects and Adverse Reactions
7.
Stud Health Technol Inform ; 301: 192-197, 2023 May 02.
Article in English | MEDLINE | ID: mdl-37172179

ABSTRACT

BACKGROUND: Many components must work together to continuously improve processes in healthcare organizations. Process mining has recently developed into a discipline that can make a significant contribution here. OBJECTIVES: We want to extend an existing management tool to assess and improve the capability of organizations in this area. METHOD: We add a dimension to the adoption readiness assessment and maturity model for sharable clinical pathways to assess and improve event data quality. RESULTS: We present different approaches for formal and checkpoint assessments and an embedding of the improvement strategy with examples. CONCLUSION: The additional dimension from the process mining domain integrates with the existing model. At all levels, links can be established between the various aspects of event data quality with existing dimensions. The model has yet to be tested in a real-world use case.


Subjects
Data Management , Delivery of Health Care , Health Facilities , Organizations , Data Mining/methods
8.
IEEE Trans Vis Comput Graph ; 29(6): 2849-2861, 2023 06.
Article in English | MEDLINE | ID: mdl-37030774

ABSTRACT

Collusive fraud, in which multiple fraudsters collude to defraud health insurance funds, threatens the operation of the healthcare system. However, existing statistical and machine learning-based methods have limited ability to detect fraud in the scenario of health insurance due to the high similarity of fraudulent behaviors to normal medical visits and the lack of labeled data. To ensure the accuracy of the detection results, expert knowledge needs to be integrated with the fraud detection process. By working closely with health insurance audit experts, we propose FraudAuditor, a three-stage visual analytics approach to collusive fraud detection in health insurance. Specifically, we first allow users to interactively construct a co-visit network to holistically model the visit relationships of different patients. Second, an improved community detection algorithm that considers the strength of fraud likelihood is designed to detect suspicious fraudulent groups. Finally, through our visual interface, users can compare, investigate, and verify suspicious patient behavior with tailored visualizations that support different time scales. We conducted case studies in a real-world healthcare scenario, i.e., to help locate the actual fraud group and exclude the false positive group. The results and expert feedback proved the effectiveness and usability of the approach.
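
The abstract does not specify how the co-visit network is built; in FraudAuditor the construction is interactive. As a loose sketch of the general idea, the code below assumes that two patients are linked whenever they visit the same hospital on the same date, and that an edge is kept only after a minimum number of shared visits. The function name, the `min_shared` threshold, and the records are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_covisit_network(visits, min_shared=2):
    """Build a weighted co-visit graph from (patient, hospital, date) records.

    Returns {(patient_a, patient_b): shared_visit_count} for patient pairs
    seen at the same hospital on the same date at least min_shared times.
    The linking rule and threshold are illustrative assumptions.
    """
    groups = defaultdict(set)
    for patient, hospital, date in visits:
        groups[(hospital, date)].add(patient)

    edge_weights = defaultdict(int)
    for patients in groups.values():
        for a, b in combinations(sorted(patients), 2):
            edge_weights[(a, b)] += 1

    return {pair: w for pair, w in edge_weights.items() if w >= min_shared}


# Hypothetical visit records.
visits = [
    ("p1", "h1", "2023-01-02"),
    ("p2", "h1", "2023-01-02"),
    ("p3", "h1", "2023-01-02"),
    ("p1", "h1", "2023-01-09"),
    ("p2", "h1", "2023-01-09"),
    ("p3", "h2", "2023-01-09"),
]
network = build_covisit_network(visits)
print(network)
```

A community detection step, such as the fraud-likelihood-aware algorithm the abstract mentions, would then run on this weighted graph.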


Subjects
Computer Graphics , Data Mining , Humans , Data Mining/methods , Insurance, Health , Algorithms , Fraud
9.
Comput Inform Nurs ; 41(6): 426-433, 2023 Jun 01.
Article in English | MEDLINE | ID: mdl-36225163

ABSTRACT

Text-mining algorithms can identify the most prevalent factors of risk-benefit assessment on the use of complementary and integrative health approaches that are found in healthcare professionals' written notes. The aims of this study were to discover the key factors of decision-making on patients' complementary and integrative health use by healthcare professionals and to build a consensus-derived decision algorithm on the benefit-risk assessment of complementary and integrative health use in diabetes. The retrospective study of an archival dataset used a text-mining method designed to extract and analyze unstructured textual data from healthcare professionals' responses. The techniques of classification, clustering, and extraction were performed with 1398 unstructured clinical notes made by healthcare professionals between 2019 and 2020. The most important factor for decision-making by healthcare professionals about complementary and integrative health use in patients with diabetes was the ingredients of the product. Other important factors were the patient's diabetes control, the undesirable effects from complementary and integrative health, evidence-based complementary and integrative health, medical laboratory data, and the product's affordability. This exploratory text-mining study provides insight into how healthcare professionals decide complementary and integrative health use for patients with diabetes after a risk-benefit assessment from clinical narrative notes.


Subjects
Complementary Therapies , Diabetes Mellitus , Humans , Retrospective Studies , Diabetes Mellitus/therapy , Data Mining/methods , Delivery of Health Care
10.
Comput Intell Neurosci ; 2022: 2901167, 2022.
Article in English | MEDLINE | ID: mdl-36275959

ABSTRACT

In an era of data explosion and the rise of the Internet, the rapid growth of information and the data needs of small- and medium-sized enterprises (SMEs) affect traditional cost management. How to better mine useful information from data to provide effective approaches and methods merits consideration. As the core technology for processing big data, data mining can handle large amounts of complex data efficiently. Accordingly, this paper discusses the process and method of applying data mining to the cost management of SMEs in order to improve their competitiveness. Based on the study and analysis of SME financial data, combined with data mining technology, this research extracts and uses the large volume of financial data generated in the day-to-day management of SME finance departments, and designs and implements a data mining-based financial data analysis system for SMEs. Combined with software design ideas, and following preliminary requirements research and analysis and given the many advantages of the current B/S architecture, it was decided to use the Java programming language, the MyEclipse11 development tool, the Microsoft SQL Server 2008 database management tool, the J2EE development platform, and the classic Apriori algorithm in data mining. Mining methods such as association rules, clustering algorithms, and decision tree algorithms are used to analyze SME financial data thoroughly, automatically and reliably provide SME financial management departments and senior managers with useful financial information, and help SME business leaders make quick decisions.
The designed and implemented data mining-based financial analysis system includes the basic functions of SME financial management, asset inventory management, asset allocation management, asset depreciation and write-off, asset data maintenance management, and so on. The SME financial management system combines data mining technology, software technology, and SME financial management. This efficient, reliable, and convenient approach has improved the core competitiveness of SMEs to some extent and achieved the ultimate goal of the system design.


Subjects
Commerce , Data Mining , Data Mining/methods
11.
Comput Intell Neurosci ; 2022: 1467195, 2022.
Article in English | MEDLINE | ID: mdl-36156958

ABSTRACT

In order to better solve the problems of low efficiency, large consumption of human resources, and relatively low degree of intelligence in the abnormal data monitoring system of financial accounting in colleges and universities under the background of current accounting computerization, this article takes data mining and neural network algorithm as the technical basis to build the abnormal data monitoring system of financial accounting in colleges and universities. This article uses data mining algorithm and neural network analysis technology to process the original accounting information of colleges and universities, effectively eliminate invalid data, retain valuable data, and improve the detection efficiency of abnormal accounting data. The system test results show that the accuracy of identifying 50 abnormal situations in the original accounting data of colleges and universities is more than 90% by using data mining and neural network model.


Subjects
Algorithms , Neural Networks, Computer , Data Mining/methods , Decision Trees , Humans , Universities
12.
Comput Intell Neurosci ; 2022: 1518202, 2022.
Article in English | MEDLINE | ID: mdl-35655506

ABSTRACT

With the implementation of national strategies such as sports power and national fitness, the sports economy has become an important element of high-quality national development, and the demand for sports economy and management talents is greatly increased. Particularly in the new area with big data as the typical feature, the teaching content, teaching method, and teaching mode of sports economics and management majors have put forward new requirements. The continuous progress of storage and network technology has prompted the generation of massive multisource spatiotemporal data in various fields. The advantage of association analysis algorithms is that they are easy to code and implement. The relationships found by association analysis can take two forms: frequent itemsets or association rules. We use correlation analysis methods to perform correlation learning between sports economy and related big data and thus improve the development of sports economy. Mining and analyzing the relevant big data can precisely reveal the problems of sports economic development and can realize the fine management of sports, thus contributing to the healthy development of sports. Mastering the skills of acquiring, analyzing, and applying big data is the core content of sports economic analysis. The sports economy has refined and intelligent management means, and its adoption of virtual reality reflects the current situation and development trend of the sports business, which further highlights the status and role of multisource big data in the sports economy. Based on these, this paper proposed a sports economy mining algorithm in view of the correlation analysis and big data model. Then, we verified the effectiveness of the model through experiments, which laid the foundation for the development of the sports economy.
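
The abstract notes that association analysis yields either frequent itemsets or association rules. The sketch below illustrates the two standard measures behind those outputs: support of an itemset and confidence of a rule. The transactions are hypothetical toy data and have nothing to do with the paper's dataset or its specific algorithm.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    joint = support(transactions, set(antecedent) | set(consequent))
    base = support(transactions, antecedent)
    return joint / base if base else 0.0


# Hypothetical purchase records from a sports venue.
transactions = [
    {"ticket", "jersey"},
    {"ticket", "jersey", "snack"},
    {"ticket", "snack"},
    {"jersey"},
]
pair_support = support(transactions, {"ticket", "jersey"})
rule_confidence = confidence(transactions, {"ticket"}, {"jersey"})
print(pair_support, rule_confidence)
```

An itemset is called frequent when its support meets a chosen minimum, and a rule is reported when its confidence also meets a chosen minimum; both thresholds are tuning choices.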


Subjects
Big Data , Sports , Algorithms , Data Mining/methods
13.
J Biomed Inform ; 127: 104009, 2022 03.
Article in English | MEDLINE | ID: mdl-35196579

ABSTRACT

Health monitoring systems (HMSs) capture physiological measurements through biosensors (sensing), obtain significant properties and measures from the output signal (perceiving), use algorithms for data analysis (reasoning), and trigger warnings or alarms (acting) when an emergency occurs. These systems have the potential to enhance health care delivery in different application domains, showing promising benefits for health diagnosis, early symptom detection, disease prediction, among others. However, the implementation of HMS presents challenges for sensing, perceiving, reasoning, and acting based on monitored data, mainly when data processing should be performed in real time. Thus, the quality of these diagnoses relies heavily on the data and data analysis methods applied. Data mining techniques have been broadly investigated in health systems; however, it is not clear what real-time data analysis techniques are best suited for each context. This work carries out a search in five scientific electronic databases to identify recent studies that investigated HMS using real-time data analysis techniques. Thirty-six research studies were selected after screening 2,822 works. Applied data analysis methods, application domains, utilized sensors, physiological parameters, extracted features, claimed benefits, limitations, datasets used, and published results were described, compared and analyzed. The findings indicate that machine learning methods are trending in such studies. There is no universal solution for all health domains; however, support vector machines are a predominant method. Among the application domains, cardiovascular disease is the most investigated. Most reviewed studies reported improvements in performing data mining tasks or operational modes of solutions. Although studies tested algorithms and presented promising results, those are particular for each experiment. 
This review gives a comprehensive overview of HMS real-time data analysis and points to directions for future research.


Subjects
Data Analysis , Machine Learning , Algorithms , Data Mining/methods , Monitoring, Physiologic
14.
Comput Math Methods Med ; 2022: 5115089, 2022.
Article in English | MEDLINE | ID: mdl-35198037

ABSTRACT

Studies have shown that the physical, psychological, and social problems of liver cancer patients are more serious than those of other cancer patients and their quality of life is significantly reduced. This may be related to the poor treatment effect of patients with advanced liver cancer. Patients often have adverse symptoms such as cancer pain, pleural effusion, and ascites, etc., which have a great impact on patients' psychology and recovery from illness. With the change of the medical model, it has become history to rely solely on drugs to care for patients with advanced liver cancer and comprehensive nursing intervention has become very important. Continuous nursing intervention focuses on individualized and full-hearted care, effectively alleviating patients' anxiety and fear and improving patients' environmental adaptability and psychological defense mechanisms. However, in the field of liver cancer, there is no detailed comparison between the efficacy of continuous nursing and traditional conventional nursing. This article applies the hidden Markov model, starts with medical data mining, and describes the process achieved by the application of this article and the analysis of the results obtained by the two nursing methods, which reflect the difference in curative effect evaluation, and it proves that continuous nursing has more advantages in the curative effect of patients with liver tumors.


Subjects
Data Mining/methods , Liver Neoplasms/nursing , Models, Nursing , Algorithms , China , Computational Biology , Data Mining/statistics & numerical data , Humans , Markov Chains
15.
Clin Pharmacol Ther ; 111(1): 209-217, 2022 01.
Article in English | MEDLINE | ID: mdl-34260087

ABSTRACT

Many real-world evidence (RWE) studies that utilize existing healthcare data to evaluate treatment effects incur substantial but avoidable bias from methodologically flawed study design; however, the extent of preventable methodological pitfalls in current RWE is unknown. To characterize the prevalence of avoidable methodological pitfalls with potential for bias in published claims-based studies of medication safety or effectiveness, we conducted an English-language search of PubMed for articles published from January 1, 2010 to May 20, 2019 and randomly selected 75 studies (10 case-control and 65 cohort studies) that evaluated safety or effectiveness of cardiovascular, diabetes, or osteoporosis medications using US health insurance claims. General and methodological study characteristics were extracted independently by two reviewers, and potential for bias was assessed across nine bias domains. Nearly all studies (95%) had at least one avoidable methodological issue known to incur bias, and 81% had potentially at least one of the four issues considered major due to their potential to undermine study validity: time-related bias (57%), potential for depletion of outcome-susceptible individuals (44%), inappropriate adjustment for postbaseline variables (41%), or potential for reverse causation (39%). The median number of major issues per study was 2 (interquartile range (IQR), 1-3) and was lower in cohort studies with a new-user, active-comparator design (median 1, IQR 0-1) than in cohort studies of prevalent users with a nonuser comparator (median 3, IQR 3-4). Recognizing and avoiding known methodological study design pitfalls could substantially improve the utility of RWE and confidence in its validity.


Subjects
Data Mining/methods , Drug-Related Side Effects and Adverse Reactions/epidemiology , Bias , Case-Control Studies , Cohort Studies , Data Analysis , Databases, Factual , Humans , Insurance Claim Review , Methods , Prevalence , Research Design
16.
Comput Math Methods Med ; 2021: 2059432, 2021.
Article in English | MEDLINE | ID: mdl-34819987

ABSTRACT

Traditional audit data analysis algorithms have many shortcomings, such as the lack of means to mine the hidden audit clues behind the data, the difficulty of finding increasingly hidden cheating techniques caused by the electronic and networked environment, and the inability to solve the quality defects of the audited data. Correlation analysis algorithm in data mining technology is an effective means to obtain knowledge from massive data, which can complete, muffle, clean, and reduce defective data and then can analyze massive data and obtain audit trails under the guidance of expert experience or analysts. Therefore, on the basis of summarizing and analyzing previous research works, this paper expounds the research status and significance of audit data analysis and application; elaborates the development background, current status, and future challenges of correlation analysis algorithm; introduces the methods and principles of data model and its conversion and audit model construction; conducts audit data collection and cleaning; implements audit data preprocessing and its algorithm description; performs audit data analysis based on correlation analysis algorithm; analyzes the hidden node activation value and audit rule extraction in correlation analysis algorithm; proposes the application of audit data based on correlation analysis algorithm; discusses the relationship between audit data quality and audit risk; and finally compares different data mining algorithms in audit data analysis. The findings demonstrate that by analyzing association rules, the correlation analysis algorithm can determine the significance of a huge quantity of audit data and characterise the degree to which linked events would occur concurrently or sequentially in a probabilistic manner. 
The correlation analysis algorithm first inputs the collected audit data through preprocessing module to filter out useless data and then organizes the obtained data into a format that can be recognized by data mining algorithm and executes the correlation analysis algorithm on the sorted data; finally, the obtained hidden data is divided into normal data and suspicious data by comparing it with the pattern in the rule base. The algorithm can conduct in-depth analysis of the company's accounting vouchers, account books, and large volumes of financial accounting data and other data of various kinds; reveal their original characteristics and internal connections; and turn them into the more direct and useful information that auditors need. The study results of this paper provide a reference for further research on audit data analysis and application based on the correlation analysis algorithm.


Subjects
Algorithms , Big Data , Data Analysis , Financial Audit/methods , Computational Biology , Correlation of Data , Data Mining/methods , Data Mining/statistics & numerical data , Financial Audit/statistics & numerical data , Humans
17.
PLoS One ; 16(9): e0257686, 2021.
Article in English | MEDLINE | ID: mdl-34555076

ABSTRACT

Transfer Entropy was applied to analyze the correlations and flow of information between 200,500 tweets and 23 of the largest capitalized companies over the 6-year period 2013-2018. The set of tweets was obtained by applying a text mining algorithm and classified according to date and company mentioned. We proposed the construction of a Sentiment Index by applying a Natural Language Processing algorithm and structuring the sentiment polarity for each data set. Bootstrapped Simulations of Transfer Entropy were performed between stock prices and Sentiment Indexes. The results of the Transfer Entropy simulations show a clear information flux between general public opinion and companies' stock prices. There is a considerable amount of information flowing from general opinion to stock prices, even between different Sentiment Indexes. Our results suggest a deep relationship between general public opinion and stock prices. This is important for trading strategies and the information release policies for each company.
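
To make the notion of transfer entropy concrete, here is a minimal sketch of a plug-in estimator for discrete series with a history length of one step. It is not the authors' implementation: the bootstrapped simulations mentioned in the abstract are not reproduced, and the toy `sentiment` and `price` series are synthetic assumptions constructed so that the price symbol simply repeats the previous sentiment symbol.

```python
import math
import random
from collections import Counter
from typing import Hashable, Sequence


def transfer_entropy(source: Sequence[Hashable], target: Sequence[Hashable]) -> float:
    """Plug-in estimate of TE(source -> target) in bits, history length 1."""
    if len(source) != len(target):
        raise ValueError("series must have the same length")

    # Count (target_next, target_now, source_now) triples.
    triples = Counter()
    for t in range(len(target) - 1):
        triples[(target[t + 1], target[t], source[t])] += 1
    n = sum(triples.values())
    if n == 0:
        return 0.0

    # Marginal counts derived from the triples.
    now_pairs = Counter()      # (target_now, source_now)
    target_pairs = Counter()   # (target_next, target_now)
    target_now = Counter()     # target_now
    for (y_next, y_now, x_now), count in triples.items():
        now_pairs[(y_now, x_now)] += count
        target_pairs[(y_next, y_now)] += count
        target_now[y_now] += count

    te = 0.0
    for (y_next, y_now, x_now), count in triples.items():
        p_joint = count / n
        p_with_source = count / now_pairs[(y_now, x_now)]
        p_without_source = target_pairs[(y_next, y_now)] / target_now[y_now]
        te += p_joint * math.log2(p_with_source / p_without_source)
    return te


# Synthetic example: "price" copies "sentiment" with a one-step delay.
rng = random.Random(0)
sentiment = [rng.randint(0, 1) for _ in range(5000)]
price = [0] + sentiment[:-1]

te_sentiment_to_price = transfer_entropy(sentiment, price)
te_price_to_sentiment = transfer_entropy(price, sentiment)
print(te_sentiment_to_price, te_price_to_sentiment)
```

In this constructed case the estimate from sentiment to price is close to one bit, while the reverse direction is close to zero. That asymmetry is the property that makes transfer entropy a measure of directed information flow rather than plain correlation.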


Subjects
Data Mining/methods , Private Sector/economics , Social Media , Commerce , Entropy , Humans , Natural Language Processing
18.
Biochem Med (Zagreb) ; 31(3): 030902, 2021 Oct 15.
Article in English | MEDLINE | ID: mdl-34393596

ABSTRACT

INTRODUCTION: It is common for patients to switch between several healthcare providers. In this context, the long-term follow-up of medical conditions based on laboratory test results obtained from different laboratories is a challenge. The measurement uncertainty in an inter-laboratory context should also be considered in data mining research based on routine results from randomly selected laboratories. As a proof-of-concept study, we aimed at estimating the inter-laboratory reference change value (IL-RCV) for exemplary analytes from publicly available data on external quality assessment (EQA) and biological variation. MATERIALS AND METHODS: External quality assessment data of the Reference Institute for Bioanalytics (RfB, Bonn, Germany) for serum creatinine, calcium, aldosterone, PSA, and of whole blood HbA1c from campaigns sent out in 2019 were analysed. The median CVs of all EQA participants were calculated based on 8 samples from 4 EQA campaigns per analyte. Using intra-individual biological variation data from the EFLM database, positive and negative IL-RCV were estimated with a formula based on log transformation under the assumption that the analytes under examination have a skewed distribution. RESULTS: We estimated IL-RCVs for all exemplary analytes, ranging from 13.3% to 203% for the positive IL-RCV and -11.8% to -67.0% for the negative IL-RCV (serum calcium - serum aldosterone), respectively. CONCLUSION: External quality assessment data together with data on the biological variation - both freely available - allow the estimation of inter-laboratory RCVs. These differ substantially between different analytes and can help to assess the boundaries of interoperability in laboratory medicine.
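
The abstract does not state the formula it used. As an illustration only, the sketch below applies one commonly used log-normal form of the reference change value, in which the combined CV is converted to a log-scale standard deviation before the limits are exponentiated. The function name, the z value of 1.96, and the input CVs are assumptions, not figures from the study.

```python
import math
from typing import Tuple


def log_normal_rcv(cv_analytical: float,
                   cv_intra_individual: float,
                   z: float = 1.96) -> Tuple[float, float]:
    """Return (positive RCV, negative RCV) as fractions, e.g. 0.25 for +25%.

    Inputs are coefficients of variation expressed as fractions (0.05 = 5%).
    Assumes a log-normal model: sigma = sqrt(ln(CV^2 + 1)).
    """
    cv_total = math.sqrt(cv_analytical ** 2 + cv_intra_individual ** 2)
    sigma = math.sqrt(math.log(cv_total ** 2 + 1.0))
    spread = z * math.sqrt(2.0) * sigma
    return math.exp(spread) - 1.0, math.exp(-spread) - 1.0


# Hypothetical inputs: 5% inter-laboratory CV and 5% intra-individual CV.
rcv_pos, rcv_neg = log_normal_rcv(0.05, 0.05)
print(f"positive RCV: {rcv_pos:+.1%}, negative RCV: {rcv_neg:+.1%}")
```

Under this model the positive limit is always larger in magnitude than the negative one, which matches the asymmetry of the ranges reported in the RESULTS section. Here the inter-laboratory CV from EQA data is assumed to take the place of the analytical CV.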


Subjects
Blood Chemical Analysis/standards , Clinical Laboratory Techniques , Data Mining/methods , Aldosterone/blood , Calcium/blood , Creatinine/blood , Data Collection , Decision Making , Equipment Design , Glycated Hemoglobin/biosynthesis , Humans , Models, Theoretical , Prostate-Specific Antigen/blood , Quality Control , Reference Values , Reproducibility of Results
19.
Methods Mol Biol ; 2328: 191-202, 2021.
Article in English | MEDLINE | ID: mdl-34251627

ABSTRACT

The system-wide complexity of genome regulation that encodes an organism's phenotypic diversity is well understood. However, a major challenge remains: finding an appropriate method to describe systematic, dynamic genome-regulation events using enormous multi-omics datasets. Here, we describe the Interactive Dynamic Regulatory Events Miner (iDREM), which reconstructs gene-regulatory networks from temporal transcriptome, proteome, and epigenome datasets during stress to identify "master" regulators by simulating cascades of temporal transcription-regulatory and interactome events. iDREM is Java-based software that integrates static and time-series transcriptomics and proteomics datasets, transcription factor (TF)-target interactions, microRNA (miRNA)-target interactions, and protein-protein interactions to reconstruct a temporal regulatory network and identify significant regulators in an unsupervised manner. The hidden Markov model detects specific manipulated pathways and genes in order to recognize statistically significant regulators (TFs/miRNAs) that diverge in temporal activity. This method can be applied to any biotic or abiotic stress in plants and animals to predict the master regulators from condition-specific multi-omics datasets, including host-pathogen interactions, for a comprehensive understanding of manipulated biological pathways.


Subjects
Computational Biology/methods , Data Mining/methods , Gene Regulatory Networks , Host-Pathogen Interactions/genetics , RNA-Seq/methods , Epigenomics , Gene Expression Regulation, Plant/genetics , Genomics , Host-Pathogen Interactions/immunology , Markov Chains , Metabolomics , MicroRNAs/genetics , MicroRNAs/metabolism , Plants/genetics , Plants/immunology , Plants/metabolism , Programming Languages , Signal Transduction/genetics , Software , Spatio-Temporal Analysis , Transcription Factors/genetics , Transcription Factors/metabolism
20.
JAMA Pediatr ; 175(9): 957-965, 2021 09 01.
Article in English | MEDLINE | ID: mdl-34097007

ABSTRACT

Importance: Although there is no pharmacological treatment for autism spectrum disorder (ASD) itself, behavioral and pharmacological therapies have been used to address its symptoms and common comorbidities. A better understanding of the medications used to manage comorbid conditions in this growing population is critical; however, most previous efforts have been limited in size, duration, and lack of broad representation. Objective: To use a nationally representative database to uncover trends in the prevalence of co-occurring conditions and medication use in the management of symptoms and comorbidities over time among US individuals with ASD. Design, Setting, and Participants: This retrospective, population-based cohort study mined a nationwide, managed health plan claims database containing more than 86 million unique members. Data from January 1, 2014, to December 31, 2019, were used to analyze prescription frequency and diagnoses of comorbidities. A total of 26 722 individuals with ASD who had been prescribed at least 1 of 24 medications most commonly prescribed to treat ASD symptoms or comorbidities during the 6-year study period were included in the analysis. Exposures: Diagnosis codes for ASD based on International Classification of Diseases, Ninth Revision, and International Statistical Classification of Diseases and Related Health Problems, Tenth Revision. Main Outcomes and Measures: Quantitative estimates of prescription frequency for the 24 most commonly prescribed medications among the study cohort and the most common comorbidities associated with each medication in this population. Results: Among the 26 722 individuals with ASD included in the analysis (77.7% male; mean [SD] age, 14.45 [9.40] years), polypharmacy was common, ranging from 28.6% to 31.5%. Individuals' prescription regimens changed frequently within medication classes, rather than between classes. 
The prescription frequency of a specific medication varied considerably, depending on the coexisting diagnosis of a given comorbidity. Of the 24 medications assessed, 15 were associated with at least a 15% prevalence of a mood disorder, and 11 were associated with at least a 15% prevalence of attention-deficit/hyperactivity disorder. For patients taking antipsychotics, the 2 most common comorbidities were combined type attention-deficit/hyperactivity disorder (11.6%-17.8%) and anxiety disorder (13.1%-30.1%). Conclusions and Relevance: This study demonstrated considerable variability and transiency in the use of prescription medications by US clinicians to manage symptoms and comorbidities associated with ASD. These findings support the importance of early and ongoing surveillance of patients with ASD and co-occurring conditions and offer clinicians insight on the targeted therapies most commonly used to manage co-occurring conditions. Future research and policy efforts are critical to assess the extent to which pharmacological management of comorbidities affects quality of life and functioning in patients with ASD while continuing to optimize clinical guidelines, to ensure effective care for this growing population.


Subjects
Autism Spectrum Disorder/economics , Comorbidity , Health Services Accessibility/statistics & numerical data , Insurance/standards , Adolescent , Amphetamines/administration & dosage , Amphetamines/therapeutic use , Atomoxetine Hydrochloride/administration & dosage , Atomoxetine Hydrochloride/therapeutic use , Attention Deficit Disorder with Hyperactivity/drug therapy , Autism Spectrum Disorder/epidemiology , Bupropion/administration & dosage , Bupropion/therapeutic use , Child , Child, Preschool , Cohort Studies , Data Mining/methods , Data Mining/statistics & numerical data , Depressive Disorder, Major/drug therapy , Dexmethylphenidate Hydrochloride/administration & dosage , Dexmethylphenidate Hydrochloride/therapeutic use , Dextroamphetamine/administration & dosage , Dextroamphetamine/therapeutic use , Female , Humans , Insurance/statistics & numerical data , Lisdexamfetamine Dimesylate/administration & dosage , Lisdexamfetamine Dimesylate/therapeutic use , Male , Managed Care Programs/organization & administration , Managed Care Programs/statistics & numerical data , Prevalence , Retrospective Studies