Results 1 - 20 of 363
1.
Comput Math Methods Med ; 2022: 6503402, 2022.
Article in English | MEDLINE | ID: mdl-35178118

ABSTRACT

The selection of MOOC teaching resources is influenced by diversified resource positioning methods, which leads to low index efficiency of resource mining. Therefore, this paper proposes a multiresource mining method based on association rules to collect the learning behavior data of MOOC users and establish the MOOC teaching resource warehouse. Aiming at the attribute set of information association positioning, the association rules of teaching resources are designed. In addition, the association rules are combined with the shortest path scheduling scheme of teaching resources to establish the location and mining of diversified MOOC teaching-associated resources. Finally, the clustering method is used to process the results of teaching resource mining and complete the clustering of diversified teaching resources. Experimental results show that the index time required by the proposed mining method is 0.1 s, which is only 1/6 of other resource mining methods.
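
The abstract does not spell out how its association rules are scored, so the following is only a minimal sketch of the two standard measures (support and confidence) that association-rule mining usually relies on. The session data and resource names (`video_intro`, `quiz_1`, `forum`) are hypothetical and are not taken from the paper.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical learner sessions: the set of resources each user touched.
sessions = [
    {"video_intro", "quiz_1"},
    {"video_intro", "quiz_1", "forum"},
    {"video_intro", "forum"},
    {"quiz_1"},
]

rule_support = support(sessions, {"video_intro", "quiz_1"})          # 2 of 4 sessions
rule_confidence = confidence(sessions, {"video_intro"}, {"quiz_1"})  # 2 of 3 sessions
print(rule_support, rule_confidence)
```

A rule is typically kept only when both values exceed thresholds chosen by the analyst; the paper's own thresholds are not given in the abstract.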


Subject(s)
Computer-Assisted Instruction/methods , Data Mining/methods , Education, Distance/methods , Algorithms , Association Learning , China , Computational Biology , Computer Simulation , Computer-Assisted Instruction/statistics & numerical data , Data Mining/statistics & numerical data , Education, Distance/statistics & numerical data , Humans , Internet , Language , Models, Educational , Software
2.
Comput Math Methods Med ; 2022: 5115089, 2022.
Article in English | MEDLINE | ID: mdl-35198037

ABSTRACT

Studies have shown that the physical, psychological, and social problems of liver cancer patients are more serious than those of other cancer patients and that their quality of life is significantly reduced. This may be related to the poor treatment effect in patients with advanced liver cancer. Patients often have adverse symptoms such as cancer pain, pleural effusion, and ascites, which have a great impact on their psychology and recovery from illness. With the change of the medical model, relying solely on drugs to care for patients with advanced liver cancer has become a thing of the past, and comprehensive nursing intervention has become very important. Continuous nursing intervention focuses on individualized and full-hearted care, effectively alleviating patients' anxiety and fear and improving their environmental adaptability and psychological defense mechanisms. However, in the field of liver cancer, there is no detailed comparison between the efficacy of continuous nursing and traditional conventional nursing. Starting from medical data mining, this article applies the hidden Markov model, describes the process followed, and analyzes the results obtained with the two nursing methods. The results reflect the difference in curative effect evaluation and show that continuous nursing has more advantages in the curative effect for patients with liver tumors.


Subject(s)
Data Mining/methods , Liver Neoplasms/nursing , Models, Nursing , Algorithms , China , Computational Biology , Data Mining/statistics & numerical data , Humans , Markov Chains
3.
Comput Math Methods Med ; 2022: 9339905, 2022.
Article in English | MEDLINE | ID: mdl-35103072

ABSTRACT

With the rapid progress of life science and technology, molecular biology research is developing quickly and has produced many major results. Consequently, the output of biological genome data has increased exponentially, creating a huge data analysis workload. This seemingly chaotic mass of data actually contains information of great scientific significance and value. Genomic data not only describes the characteristics of human life but also expresses the essence of the biological organism. It includes macro-level information that reflects the basic structure and capabilities of living organisms and micro-level information from related fields of molecular biology. These massive genetic data are usually closely interrelated, influence each other, and do not exist in isolation. This article introduces the causes and classification of uncertain data and explains the basic concepts and related algorithms of data mining. Focusing on outlier detection and clustering algorithms in uncertain data mining, it addresses the problem of how to obtain more diverse and accurate outlier detection and cluster analysis results in uncertain data. The results showed that, regardless of whether obesity was involved, the Lp(a) level of the sarcopenia group was significantly higher than that of the nonsarcopenia group. The correlation analysis showed that ASM/height was negatively correlated with Lp(a). ASM/height is one of the criteria for diagnosing sarcopenia, and it is also the core of the analysis. Among the 1956 tumor patients collected in this study, 432 had sarcopenia, accounting for 22.08%, and the incidence of sarcopenia in patients with gastrointestinal tumors increased.


Subject(s)
Data Mining/methods , Exercise/physiology , Sarcopenia/etiology , Aged , Aged, 80 and over , Algorithms , Computational Biology , Data Mining/statistics & numerical data , Exercise/statistics & numerical data , Female , Hand Strength/physiology , Humans , Lipoprotein(a)/blood , Logistic Models , Male , Middle Aged , Models, Biological , Muscle, Skeletal/physiology , Neoplasms/complications , Neoplasms/physiopathology , Sarcopenia/diagnosis , Sarcopenia/physiopathology
4.
Comput Math Methods Med ; 2022: 9288452, 2022.
Article in English | MEDLINE | ID: mdl-35154361

ABSTRACT

Heart disease is one of the leading causes of death around the globe. The heart is the organ responsible for supplying blood to every part of the body. Coronary artery disease (CAD) and chronic heart failure (CHF) often lead to heart attack. Traditional medical procedures for diagnosing heart disease, such as angiography, are costly and raise serious health concerns. Therefore, researchers have developed various automated diagnostic systems based on machine learning (ML) and data mining techniques. ML-based automated diagnostic systems provide affordable, efficient, and reliable solutions for heart disease detection. Various ML and data mining methods and data modalities have been utilized in the past. Many previous review papers have presented systematic reviews based on one type of data modality. This study, therefore, presents a systematic review of automated diagnosis for heart disease prediction based on different types of modalities, i.e., clinical feature-based data, images, and ECG. Moreover, this paper critically evaluates the previous methods and presents their limitations. Finally, the article provides some future research directions in the domain of automated heart disease detection based on machine learning and multiple data modalities.


Subject(s)
Diagnosis, Computer-Assisted/methods , Heart Failure/diagnosis , Machine Learning , Algorithms , Arrhythmias, Cardiac/diagnosis , Arrhythmias, Cardiac/diagnostic imaging , Computational Biology , Coronary Artery Disease/diagnosis , Coronary Artery Disease/diagnostic imaging , Data Mining/statistics & numerical data , Databases, Factual/statistics & numerical data , Diagnosis, Computer-Assisted/statistics & numerical data , Diagnosis, Computer-Assisted/trends , Electrocardiography/statistics & numerical data , Heart Failure/diagnostic imaging , Humans , Image Interpretation, Computer-Assisted/statistics & numerical data , Machine Learning/trends , Neural Networks, Computer
5.
Nucleic Acids Res ; 50(D1): D222-D230, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34850920

ABSTRACT

MicroRNAs (miRNAs) are noncoding RNAs with 18-26 nucleotides; they pair with target mRNAs to regulate gene expression and produce significant changes in various physiological and pathological processes. In recent years, the interaction between miRNAs and their target genes has become one of the mainstream directions for drug development. As a large-scale biological database that mainly provides miRNA-target interactions (MTIs) verified by biological experiments, miRTarBase has undergone five revisions and enhancements. The database has accumulated >2 200 449 verified MTIs from 13 389 manually curated articles and CLIP-seq data. An optimized scoring system is adopted to enhance this update's critical recognition of MTI-related articles and corresponding disease information. In addition, single-nucleotide polymorphisms and disease-related variants related to the binding efficiency of miRNA and target were characterized in miRNAs and gene 3' untranslated regions. miRNA expression profiles across extracellular vesicles, blood and different tissues, including exosomal miRNAs and tissue-specific miRNAs, were integrated to explore miRNA functions and biomarkers. For the user interface, we have classified attributes, including RNA expression, specific interaction, protein expression and biological function, for various validation experiments related to the role of miRNA. We also used seed sequence information to evaluate the binding sites of miRNA. In summary, these enhancements render miRTarBase as one of the most research-amicable MTI databases that contain comprehensive and experimentally verified annotations. The newly updated version of miRTarBase is now available at https://miRTarBase.cuhk.edu.cn/.


Subject(s)
3' Untranslated Regions , Databases, Nucleic Acid , Gene Regulatory Networks , MicroRNAs/genetics , Neoplasms/genetics , RNA, Untranslated/genetics , Animals , Binding Sites , Biomarkers/metabolism , Data Mining/statistics & numerical data , Exosomes/chemistry , Exosomes/metabolism , Gene Expression Regulation , Humans , Internet , Mice , MicroRNAs/classification , MicroRNAs/metabolism , Molecular Sequence Annotation , Neoplasms/metabolism , Neoplasms/pathology , Polymorphism, Single Nucleotide , RNA, Untranslated/classification , RNA, Untranslated/metabolism , Tumor Cells, Cultured , User-Computer Interface
6.
Am J Emerg Med ; 53: 285.e1-285.e5, 2022 03.
Article in English | MEDLINE | ID: mdl-34602329

ABSTRACT

STUDY OBJECTIVES: COVID-19 brought unique challenges; however, it remains unclear what effect the pandemic had on violence in healthcare. The objective of this study was to identify the impact of the pandemic on workplace violence at an academic emergency department (ED). METHODS: This mixed-methods study involved a prospective descriptive survey study and electronic medical record review. Within our hospital referral region (HRR), the first COVID-19 case was documented on 3/11/2020 and cases peaked in mid-November 2020. We compared the monthly HRR COVID-19 case rate per 100,000 people to the rate of violent incidents per 1000 ED visits. Multidisciplinary ED staff were surveyed both pre/early-pandemic (April 2020) and mid/late-pandemic (December 2020) regarding workplace violence experienced over the prior 6-months. The study was deemed exempt by the Mayo Clinic Institutional Review Board. RESULTS: There was a positive association between the monthly HRR COVID-19 case rate and rate of violent ED incidents (r = 0.24). Violent incidents increased overall during the pandemic (2.53 incidents per 1000 visits) compared to the 3 months prior (1.13 incidents per 1000 visits, p < .001), as well as compared to the previous year (1.24 incidents per 1000 patient visits, p < .001). Survey respondents indicated a higher incidence of assault during the pandemic, compared to before (p = .019). DISCUSSION: Incidents of workplace violence at our ED increased during the pandemic and there was a positive association of these incidents with the COVID-19 case rate. Our findings indicate health systems should prioritize employee safety during future pandemics.
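
The two quantities compared in this abstract, a rate per 1000 ED visits and a Pearson correlation coefficient, can be sketched as follows. The function names are illustrative, and the numbers used in the usage lines are invented to show the arithmetic; they are not the study's data.

```python
import math


def rate_per_1000(incidents, visits):
    """Incidents expressed per 1000 visits."""
    return 1000.0 * incidents / visits


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented monthly figures, for illustration only.
print(rate_per_1000(5, 2000))              # 2.5 incidents per 1000 visits
print(pearson_r([1, 2, 3], [2, 4, 6]))     # perfectly linear, so r is 1 up to rounding
```

In the study's setting, `xs` would hold the monthly COVID-19 case rate and `ys` the monthly rate of violent incidents.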


Subject(s)
COVID-19/psychology , Emergency Service, Hospital/statistics & numerical data , Workplace Violence/statistics & numerical data , Academic Medical Centers/organization & administration , Academic Medical Centers/statistics & numerical data , Adult , COVID-19/prevention & control , COVID-19/transmission , Chi-Square Distribution , Crime Victims/rehabilitation , Data Mining/statistics & numerical data , Emergency Service, Hospital/organization & administration , Female , Health Personnel/psychology , Health Personnel/statistics & numerical data , Humans , Male , Middle Aged , Prospective Studies , Surveys and Questionnaires , Workplace Violence/trends
7.
Comput Math Methods Med ; 2021: 6323357, 2021.
Article in English | MEDLINE | ID: mdl-34887940

ABSTRACT

This paper assesses and compares the seasonal check-in behavior of individuals in Shanghai, China, using location-based social network (LBSN) data and a variety of spatiotemporal analytic techniques. It demonstrates the use of LBSN data by analyzing trends in check-ins over a three-year period for health purposes. We obtained the geolocation data from Sina Weibo, one of the largest and best-known Chinese microblogs. The collected data were converted to geographic information system (GIS) format and assessed using temporal statistical analysis and spatial statistical analysis with kernel density estimation (KDE). We applied various algorithms and trained several machine learning models, and we settled on the sequential model because it achieved the highest accuracy among them. Locations are categorized using facts about the characteristics of physical places. The findings demonstrate that visitors' spatial activity is more intense than residents', notably downtown. However, locals also visited outlying regions, and tourists' temporal behavior varies significantly, while residents' movements are more stable. These findings may be used in destination management, metro planning, and the creation of digital cities.


Subject(s)
Big Data , Data Mining/statistics & numerical data , Machine Learning/statistics & numerical data , Social Media/statistics & numerical data , Travel/statistics & numerical data , China , Cities , Computational Biology , Decision Trees , Geographic Information Systems , Humans , Seasons , Social Networking , Spatio-Temporal Analysis
8.
Comput Math Methods Med ; 2021: 2059432, 2021.
Article in English | MEDLINE | ID: mdl-34819987

ABSTRACT

Traditional audit data analysis algorithms have many shortcomings, such as the lack of means to mine the audit clues hidden behind the data, the difficulty of detecting the increasingly concealed cheating techniques enabled by the electronic and networked environment, and the inability to address quality defects in the audited data. The correlation analysis algorithm in data mining technology is an effective means of obtaining knowledge from massive data: it can complete, denoise, clean, and reduce defective data and can then analyze massive data and obtain audit trails under the guidance of expert experience or analysts. Therefore, on the basis of summarizing and analyzing previous research, this paper expounds the research status and significance of audit data analysis and application; elaborates the development background, current status, and future challenges of the correlation analysis algorithm; introduces the methods and principles of the data model, its conversion, and audit model construction; conducts audit data collection and cleaning; implements audit data preprocessing and describes its algorithm; performs audit data analysis based on the correlation analysis algorithm; analyzes the hidden node activation value and audit rule extraction in the correlation analysis algorithm; proposes the application of audit data based on the correlation analysis algorithm; discusses the relationship between audit data quality and audit risk; and finally compares different data mining algorithms in audit data analysis. The findings demonstrate that, by analyzing association rules, the correlation analysis algorithm can determine the significance of a huge quantity of audit data and characterise, in a probabilistic manner, the degree to which linked events occur concurrently or sequentially. The collected audit data are first passed through a preprocessing module that filters out useless data; the remaining data are then organized into a format that the data mining algorithm can recognize, and the correlation analysis algorithm is executed on the sorted data; finally, the hidden data obtained are divided into normal data and suspicious data by comparison with the patterns in the rule base. The algorithm can conduct in-depth analysis of the company's accounting vouchers, account books, and large volumes of financial accounting and other data of various kinds; reveal their original characteristics and internal connections; and turn them into the more direct and useful information that auditors need. The results of this study provide a reference for further research on audit data analysis and application based on the correlation analysis algorithm.


Subject(s)
Algorithms , Big Data , Data Analysis , Financial Audit/methods , Computational Biology , Correlation of Data , Data Mining/methods , Data Mining/statistics & numerical data , Financial Audit/statistics & numerical data , Humans
9.
Comput Math Methods Med ; 2021: 7690902, 2021.
Article in English | MEDLINE | ID: mdl-34812270

ABSTRACT

The intelligent diagnosis of cervical cancer using data mining algorithms has important practical significance. In particular, the useful information contained in a large quantity of medical data may not only advance the development of medical technology but also help detect cervical cancer in the future. This paper improves the data mining algorithm and combines image recognition technology with data mining technology to extract and analyze image features. Moreover, it makes full use of the information contained in the image to segment the cervical cancer cell image, select the feature vector according to the characteristics of cervical cancer cells, and design the classifier using a statistical classification method. The test results show that the automatic recognition performance of this system is good and that it is effective as a diagnostic aid. It can therefore be verified in clinical practice in follow-up work.


Subject(s)
Algorithms , Data Mining/statistics & numerical data , Diagnosis, Computer-Assisted/statistics & numerical data , Uterine Cervical Neoplasms/diagnosis , Computational Biology , Female , Humans , Image Interpretation, Computer-Assisted/statistics & numerical data , Logistic Models , Uterine Cervical Neoplasms/diagnostic imaging
10.
Comput Math Methods Med ; 2021: 7937573, 2021.
Article in English | MEDLINE | ID: mdl-34795792

ABSTRACT

Semantic mining has always been a challenge for big biomedical text data. Ontologies have been widely validated and used to extract semantic information. However, ontology-based semantic similarity calculation is so complex that it cannot measure similarity for big text data. To solve this problem, we propose a parallelized semantic similarity measurement method based on Hadoop MapReduce for big text data. First, we preprocess the documents and extract their semantic features. Then, we calculate document semantic similarity based on the ontology network structure under the MapReduce framework. Finally, based on the resulting document similarities, document clusters are generated via clustering algorithms. To validate the effectiveness of the method, we use two kinds of open datasets. The experimental results show that the traditional methods can hardly work for more than ten thousand biomedical documents. The proposed method remains efficient and accurate on big datasets and offers high parallelism and scalability.
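
To make the map, shuffle, and reduce pattern concrete, here is a small in-memory sketch. It deliberately swaps in plain Jaccard overlap of concept sets as the similarity measure, because the abstract does not specify the ontology-based measure the authors use, and it runs in a single process rather than on Hadoop. The document ids and concept labels are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(docs):
    """Map: emit one (concept, doc_id) pair per distinct concept in each document."""
    for doc_id, concepts in docs.items():
        for concept in set(concepts):
            yield concept, doc_id


def shuffle(pairs):
    """Shuffle: group the emitted doc ids by concept key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: count how many concepts each pair of documents shares."""
    overlap = defaultdict(int)
    for doc_ids in groups.values():
        for a, b in combinations(sorted(doc_ids), 2):
            overlap[(a, b)] += 1
    return overlap


def jaccard_similarities(docs):
    """Jaccard similarity for every document pair sharing at least one concept."""
    overlap = reduce_phase(shuffle(map_phase(docs)))
    sizes = {doc_id: len(set(concepts)) for doc_id, concepts in docs.items()}
    return {
        (a, b): shared / (sizes[a] + sizes[b] - shared)
        for (a, b), shared in overlap.items()
    }


# Hypothetical documents already reduced to ontology concept labels.
docs = {
    "d1": ["neoplasm", "liver", "therapy"],
    "d2": ["neoplasm", "liver", "nursing"],
    "d3": ["genome"],
}
print(jaccard_similarities(docs))  # {('d1', 'd2'): 0.5}
```

Keying the map output by concept means that only documents sharing a concept are ever compared, which is the property that lets this kind of computation scale out across workers.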


Subject(s)
Big Data , Cluster Analysis , Data Mining/methods , Semantics , Algorithms , Biological Ontologies/statistics & numerical data , Computational Biology , Data Mining/statistics & numerical data , Documentation/methods , Documentation/statistics & numerical data , Humans , MEDLINE/statistics & numerical data , Machine Learning
11.
Comput Math Methods Med ; 2021: 6842752, 2021.
Article in English | MEDLINE | ID: mdl-34646337

ABSTRACT

Clustering analysis is one of the most important technologies for single-cell data mining. It is widely used in the division of different gene sequences, the identification of functional genes, and the detection of new cell types. Although the traditional unsupervised clustering method does not require label data, the distribution of the original data, the setting of hyperparameters, and other factors all affect the effectiveness of the clustering algorithm. In some cases, however, the types of some cells are known, and high accuracy can be expected if this prior information is fully utilized. In this study, we propose SCMAG (a semisupervised single-cell clustering method based on a matrix aggregation graph convolutional neural network), which takes full account of the prior information for single-cell data. To evaluate the performance of the proposed method, we test it on different single-cell datasets and compare it with current semisupervised clustering algorithms in recognizing cell types on various real scRNA-seq data; the results show that it is a more accurate and significant model.


Subject(s)
Cluster Analysis , Neural Networks, Computer , Single-Cell Analysis/statistics & numerical data , Supervised Machine Learning , Algorithms , Computational Biology , Data Mining/statistics & numerical data , Databases, Nucleic Acid , Humans , RNA-Seq
12.
Comput Math Methods Med ; 2021: 3854518, 2021.
Article in English | MEDLINE | ID: mdl-34691237

ABSTRACT

There is currently no effective analytical method in colorectal image analysis, which leads to certain errors in colorectal image analysis. In order to improve the accuracy of colorectal imaging detection, this study used a genetic algorithm as the data mining algorithm and combined it with image processing technology to perform image analysis. At the same time, combined with the actual requirements of image detection, the gray theory model is used as the basic theory of image processing, and the image detection prediction model is constructed to predict the data. In addition, in order to study the effectiveness of the algorithm, the experiment is carried out to analyze the validity of the data of the study, and the predicted value is compared with the actual value. The research shows that the proposed algorithm has certain accuracy and can provide theoretical reference for subsequent related research.


Subject(s)
Algorithms , Colorectal Neoplasms/diagnostic imaging , Data Mining/methods , Image Interpretation, Computer-Assisted/methods , Adenocarcinoma/diagnostic imaging , Adenocarcinoma/secondary , Colorectal Neoplasms/pathology , Computational Biology , Data Mining/statistics & numerical data , Humans , Image Interpretation, Computer-Assisted/statistics & numerical data , Lymphatic Metastasis/diagnostic imaging , Rectal Neoplasms/diagnostic imaging , Rectal Neoplasms/pathology , Tomography, X-Ray Computed/statistics & numerical data
13.
Mol Syst Biol ; 17(10): e10387, 2021 10.
Article in English | MEDLINE | ID: mdl-34664389

ABSTRACT

We need to effectively combine the knowledge from surging literature with complex datasets to propose mechanistic models of SARS-CoV-2 infection, improving data interpretation and predicting key targets of intervention. Here, we describe a large-scale community effort to build an open access, interoperable and computable repository of COVID-19 molecular mechanisms. The COVID-19 Disease Map (C19DMap) is a graphical, interactive representation of disease-relevant molecular mechanisms linking many knowledge sources. Notably, it is a computational resource for graph-based analyses and disease modelling. To this end, we established a framework of tools, platforms and guidelines necessary for a multifaceted community of biocurators, domain experts, bioinformaticians and computational biologists. The diagrams of the C19DMap, curated from the literature, are integrated with relevant interaction and text mining databases. We demonstrate the application of network analysis and modelling approaches by concrete examples to highlight new testable hypotheses. This framework helps to find signatures of SARS-CoV-2 predisposition, treatment response or prioritisation of drug candidates. Such an approach may help deal with new waves of COVID-19 or similar pandemics in the long-term perspective.


Subject(s)
COVID-19/immunology , Computational Biology/methods , Databases, Factual , SARS-CoV-2/immunology , Software , Antiviral Agents/therapeutic use , COVID-19/genetics , COVID-19/virology , Computer Graphics , Cytokines/genetics , Cytokines/immunology , Data Mining/statistics & numerical data , Gene Expression Regulation , Host Microbial Interactions/genetics , Host Microbial Interactions/immunology , Humans , Immunity, Cellular/drug effects , Immunity, Humoral/drug effects , Immunity, Innate/drug effects , Lymphocytes/drug effects , Lymphocytes/immunology , Lymphocytes/virology , Metabolic Networks and Pathways/genetics , Metabolic Networks and Pathways/immunology , Myeloid Cells/drug effects , Myeloid Cells/immunology , Myeloid Cells/virology , Protein Interaction Mapping , SARS-CoV-2/drug effects , SARS-CoV-2/genetics , SARS-CoV-2/pathogenicity , Signal Transduction , Transcription Factors/genetics , Transcription Factors/immunology , Viral Proteins/genetics , Viral Proteins/immunology , COVID-19 Drug Treatment
14.
PLoS One ; 16(9): e0256603, 2021.
Article in English | MEDLINE | ID: mdl-34473761

ABSTRACT

From administrative registers of last names in Santiago, Chile, we create a surname affinity network that encodes socioeconomic data. This network is a multi-relational graph with nodes representing surnames and edges representing the prevalence of interactions between surnames by socioeconomic decile. We model the prediction of links as a knowledge base completion problem, and find that sharing neighbors is highly predictive of the formation of new links. Importantly, we distinguish between grounded neighbors and neighbors in the embedding space, and find that the latter is more predictive of tie formation. The paper discusses the implications of this finding in explaining the high levels of elite endogamy in Santiago.


Subject(s)
Data Mining/statistics & numerical data , Machine Learning , Names , Pedigree , Chile , Consanguinity , Female , Humans , Male , Social Class
15.
Med Ref Serv Q ; 40(3): 329-336, 2021.
Article in English | MEDLINE | ID: mdl-34495798

ABSTRACT

The explosive growth of digital information in recent years has amplified the information overload experienced by today's health-care professionals. In particular, the wide variety of unstructured text makes it difficult for researchers to find meaningful data without spending a considerable amount of time reading. Text mining can be used to facilitate better discoverability and analysis, and aid researchers in identifying critical trends and connections. This column will introduce key text-mining terms, recent use cases of biomedical text mining, and current applications for this technology in medical libraries.


Subject(s)
Biomedical Research/trends , COVID-19 , Data Collection/trends , Data Mining/trends , Research Report/trends , Biomedical Research/statistics & numerical data , Data Collection/statistics & numerical data , Data Mining/statistics & numerical data , Forecasting , Humans
16.
PLoS One ; 16(9): e0256940, 2021.
Article in English | MEDLINE | ID: mdl-34520453

ABSTRACT

Fake news is a complex problem, and different approaches have been used to identify it. In our paper, we focus on identifying fake news using its content. The dataset, which contains fake and real news, was pre-processed using syntactic analysis. Dependency grammar methods were applied to the sentences of the dataset, and the importance of each word within its sentence was determined from them. This information about the importance of words in sentences was used to create the input vectors for classification. The paper aims to find out whether dependency grammar can be used to improve the classification of fake news. We compared these methods with the TfIdf method. The results show that it is possible to use dependency grammar information with acceptable accuracy for the classification of fake news. An important finding is that dependency grammar can improve existing techniques: in our experiment, it improved the traditional TfIdf technique.
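
For readers unfamiliar with the TfIdf baseline mentioned in this abstract, the sketch below computes one common variant (relative term frequency multiplied by the natural log of inverse document frequency). The paper's exact weighting and tokenisation are not stated in the abstract, so both are assumptions here, and the two example sentences are invented.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document using tf * ln(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Invented headlines, for illustration only.
headlines = ["the senator denied the claim", "the claim went viral"]
vectors = tfidf(headlines)
print(vectors[0]["the"])      # 0.0, because "the" occurs in every document
print(vectors[0]["senator"])  # positive, because "senator" occurs in only one
```

Terms that appear everywhere receive zero weight under this variant, which is why a signal based on sentence structure, such as the dependency-based word importance studied in the paper, can add information that TfIdf alone does not capture.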


Subject(s)
Data Mining/statistics & numerical data , Deception , Linguistics/statistics & numerical data , Social Media/ethics , Datasets as Topic , Humans
17.
PLoS Comput Biol ; 17(8): e1008844, 2021 08.
Article in English | MEDLINE | ID: mdl-34370723

ABSTRACT

Many biological processes are mediated by protein-protein interactions (PPIs). Because protein domains are the building blocks of proteins, PPIs likely rely on domain-domain interactions (DDIs). Several attempts exist to infer DDIs from PPI networks but the produced datasets are heterogeneous and sometimes not accessible, while the PPI interactome data keeps growing. We describe a new computational approach called "PPIDM" (Protein-Protein Interactions Domain Miner) for inferring DDIs using multiple sources of PPIs. The approach is an extension of our previously described "CODAC" (Computational Discovery of Direct Associations using Common neighbors) method for inferring new edges in a tripartite graph. The PPIDM method has been applied to seven widely used PPI resources, using as "Gold-Standard" a set of DDIs extracted from 3D structural databases. Overall, PPIDM has produced a dataset of 84,552 non-redundant DDIs. Statistical significance (p-value) is calculated for each source of PPI and used to classify the PPIDM DDIs in Gold (9,175 DDIs), Silver (24,934 DDIs) and Bronze (50,443 DDIs) categories. Dataset comparison reveals that PPIDM has inferred from the 2017 releases of PPI sources about 46% of the DDIs present in the 2020 release of the 3did database, not counting the DDIs present in the Gold-Standard. The PPIDM dataset contains 10,229 DDIs that are consistent with more than 13,300 PPIs extracted from the IMEx database, and nearly 23,300 DDIs (27.5%) that are consistent with more than 214,000 human PPIs extracted from the STRING database. Examples of newly inferred DDIs covering more than 10 PPIs in the IMEx database are provided. Further exploitation of the PPIDM DDI reservoir includes the inventory of possible partners of a protein of interest and characterization of protein interactions at the domain level in combination with other methods. The result is publicly available at http://ppidm.loria.fr/.


Subject(s)
Protein Interaction Domains and Motifs , Protein Interaction Mapping/statistics & numerical data , Protein Interaction Maps , Algorithms , Computational Biology , Data Mining/statistics & numerical data , Databases, Protein/statistics & numerical data , Humans , Software
18.
Cytogenet Genome Res ; 161(6-7): 382-394, 2021.
Article in English | MEDLINE | ID: mdl-34433169

ABSTRACT

Embryonal carcinoma (EC) and seminoma (SE) are both derived from germ cell neoplasia in situ but differ markedly in growth pattern and clinical prognosis. Epigenetic regulation may play an important role in the development of EC and SE. This study investigated DNA methylation-based genetic alterations between EC and SE by analyzing mRNA expression and DNA methylation profiling datasets downloaded from the Gene Expression Omnibus database. Differentially expressed genes (DEGs) between EC and SE were identified with the limma package in R. Gene function enrichment analysis of the DEGs was performed with the DAVID tool; the results suggested differences in pluripotency and genomic stability between EC and SE. The minfi package and the wANNOVAR tool were used to identify differentially methylated genes. A total of 37 genes showed both mRNA expression changes and concordant DNA methylation changes. The findings were verified against sequencing data from The Cancer Genome Atlas database, and Kaplan-Meier survival analysis was performed. Finally, 5 genes (PRDM1, LMO2, FAM53B, HCN4, and FAM124B) showed both low expression and high methylation in EC and were significantly associated with relapse-free survival. These methylation-based genetic features distinguishing EC from SE might be helpful in studying the role of DNA methylation in cancer development.
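For readers unfamiliar with the Kaplan-Meier survival analysis mentioned above, the following is a minimal sketch of the product-limit estimator in plain Python. It is not the authors' pipeline; the function name `kaplan_meier` and the follow-up times and event flags are hypothetical toy values.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times: follow-up time per subject.
    events: 1 if the event (e.g. relapse) was observed, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        removed = 0
        # Group all subjects sharing the same time point.
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            removed += 1
            i += 1
        if observed > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set.
        at_risk -= removed
    return curve


# Hypothetical toy data: five subjects, two of them censored.
km_curve = kaplan_meier(times=[2, 3, 3, 5, 8], events=[1, 1, 0, 1, 0])
```

Censored subjects lower the number at risk without lowering the survival estimate, which is why the curve only steps down at times where an event was actually observed.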


Subject(s)
Biomarkers, Tumor/genetics , DNA Methylation , Data Mining/methods , Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic , Neoplasms, Germ Cell and Embryonal/genetics , Testicular Neoplasms/genetics , Data Mining/statistics & numerical data , Epigenesis, Genetic , Gene Ontology , Humans , Kaplan-Meier Estimate , Male , Signal Transduction/genetics
19.
Comput Math Methods Med ; 2021: 4602465, 2021.
Article in English | MEDLINE | ID: mdl-34335861

ABSTRACT

Dementia interferes with an individual's motor, behavioural, and intellectual functions, leaving them unable to perform instrumental activities of daily living. This study aimed to identify, through data mining, the best-performing algorithm and the most relevant characteristics for categorising individuals with HIV/AIDS at high risk of dementia. Principal component analysis (PCA) was applied, and the following machine learning algorithms were compared: logistic regression, decision tree, neural network, KNN, and random forest. The database was built from data collected from 270 individuals infected with HIV/AIDS and followed up at the outpatient clinic of a reference hospital for infectious and parasitic diseases in the State of Ceará, Brazil, from January to April 2019. The performance of the algorithms was first analysed using all 104 characteristics available in the database; after dimensionality reduction, the quality of the machine learning algorithms improved during the tests, even though about 30% of the variation was lost. When only 23 characteristics were considered, the precision of the algorithms was 86% for random forest, 56% for logistic regression, 68% for decision tree, 60% for KNN, and 59% for neural network. The random forest algorithm proved more effective than the others, obtaining 84% precision and 86% accuracy.
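Because the abstract reports both precision and accuracy, a short sketch of how the two metrics differ may help. The confusion-matrix counts below are hypothetical and are not taken from the study; `precision_and_accuracy` is an illustrative helper, not part of any library.

```python
def precision_and_accuracy(tp, fp, tn, fn):
    """Compute precision and accuracy from binary confusion-matrix counts."""
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp)
    # Accuracy: share of all predictions that are correct.
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return precision, accuracy


# Hypothetical counts for a binary "high risk of dementia" classifier.
precision, accuracy = precision_and_accuracy(tp=30, fp=10, tn=50, fn=10)
```

With these toy counts the two metrics diverge (0.75 precision against 0.80 accuracy), which shows why a classifier comparison should state which metric each figure refers to.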


Subject(s)
AIDS Dementia Complex/diagnosis , Acquired Immunodeficiency Syndrome/complications , Algorithms , Dementia/etiology , AIDS Dementia Complex/epidemiology , AIDS Dementia Complex/etiology , Aged , Brazil/epidemiology , Computational Biology , Data Mining/methods , Data Mining/statistics & numerical data , Databases, Factual , Decision Trees , Female , Follow-Up Studies , Humans , Logistic Models , Machine Learning , Male , Middle Aged , Neural Networks, Computer , Risk Factors
20.
Medicine (Baltimore) ; 100(32): e26713, 2021 Aug 13.
Article in English | MEDLINE | ID: mdl-34397874

ABSTRACT

OBJECTIVE: The aim of this study is to investigate the impact of Coronavirus disease 2019 (COVID-19) on toothache patients through posts on Sina Weibo. METHODS: Using Gooseeker, we searched and screened 24,108 posts about toothache on Weibo during the dental clinic closure period in China (February 1, 2020-February 29, 2020), and then divided them into 4 categories (causes of toothache, treatments of toothache, impacts of COVID-19 on toothache treatment, popular science articles about toothache), comprising 10 subcategories, to analyze the proportion of posts in each category. RESULTS: There were 12,603 posts closely related to toothache. Among them, 87.6% did not indicate a specific cause of pain, and 92.8% did not clearly indicate a specific method of treatment. In 38.9% of the posts, dental treatment of toothache was clearly affected by COVID-19: in 10.5% of the posts, patients were afraid to see a dentist because of COVID-19, and in 28.4%, patients were unable to see a dentist because the dental clinic was closed. Only 3.5% of all posts were about popular science of toothache. CONCLUSIONS: We analyzed social media data about toothache during the COVID-19 epidemic to provide insights that can help government organizations, the media, and dentists better guide the public, through social media, to pay attention to oral health. Research on social media data can help formulate public health policies.


Subject(s)
COVID-19/complications , Social Media/statistics & numerical data , Toothache/complications , COVID-19/epidemiology , COVID-19/psychology , China/epidemiology , Data Mining/methods , Data Mining/statistics & numerical data , Humans , Oral Health/standards , Oral Health/trends , Toothache/epidemiology , Toothache/psychology