1.
J Med Internet Res ; 23(2): e25108, 2021 02 09.
Article in English | MEDLINE | ID: mdl-33497351

ABSTRACT

BACKGROUND: The Centers for Disease Control and Prevention (CDC) is a national public health protection agency in the United States. With the escalating impact of the COVID-19 pandemic on society in the United States and around the world, the CDC has become one of the focal points of public discussion. OBJECTIVE: This study aims to identify the topics and their overarching themes emerging from the public COVID-19-related discussion about the CDC on Twitter and to further provide insight into the public's concerns, focus of attention, perception of the CDC's current performance, and expectations from the CDC. METHODS: Tweets were downloaded from a large-scale COVID-19 Twitter chatter data set from March 11, 2020, when the World Health Organization declared COVID-19 a pandemic, to August 14, 2020. We used R (The R Foundation) to clean the tweets and retain tweets that contained any of five specific keywords (cdc, CDC, centers for disease control and prevention, CDCgov, and cdcgov) while eliminating all 91 tweets posted by the CDC itself. The final data set included in the analysis consisted of 290,764 unique tweets from 152,314 different users. We used R to perform the latent Dirichlet allocation algorithm for topic modeling. RESULTS: The Twitter data generated 16 topics that the public linked to the CDC when they talked about COVID-19. Among the topics, the most discussed was COVID-19 death counts, accounting for 12.16% (n=35,347) of the total 290,764 tweets in the analysis, followed by general opinions about the credibility of the CDC and other authorities and the CDC's COVID-19 guidelines, with over 20,000 tweets for each. The 16 topics fell into four overarching themes: knowing the virus and the situation, policy and government actions, response guidelines, and general opinion about credibility. CONCLUSIONS: Social media platforms, such as Twitter, provide valuable databases for public opinion. In a protracted pandemic, such as COVID-19, quickly and efficiently identifying the topics within the public discussion on Twitter would help public health agencies improve the next-round communication with the public.


Subjects
Centers for Disease Control and Prevention, U.S. , Data Mining , Public Opinion , Social Media , Communication , Humans , Pandemics , Public Health , Public Policy , United States
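As a rough illustration of the tweet-selection step described in the methods of the record above, the sketch below filters tweets by the listed keywords, drops the agency's own posts, keeps unique texts, and recomputes the reported share of the top topic. The study itself used R; the Python version, the dictionary layout (`user`, `text`), the `own_account` value, case-insensitive substring matching, and de-duplication by text are all assumptions made for illustration only.

```python
# Keywords named in the abstract, lowercased. "cdc"/"CDC" and "CDCgov"/"cdcgov"
# collapse to one entry each because matching here is case-insensitive (assumption).
KEYWORDS = ("cdc", "centers for disease control and prevention", "cdcgov")


def mentions_cdc(text):
    """Return True if the tweet text contains any keyword as a substring."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in KEYWORDS)


def filter_tweets(tweets, own_account="CDCgov"):
    """Keep unique keyword-matching tweets that the agency did not post.

    `tweets` is a list of dicts with the hypothetical keys "user" and "text".
    """
    seen_texts = set()
    kept = []
    for tweet in tweets:
        if tweet["user"].lower() == own_account.lower():
            continue  # drop the agency's own posts
        if not mentions_cdc(tweet["text"]):
            continue  # no keyword present
        if tweet["text"] in seen_texts:
            continue  # keep only unique tweets
        seen_texts.add(tweet["text"])
        kept.append(tweet)
    return kept


def percent_share(count, total):
    """Share of `count` in `total`, as a percentage rounded to two decimals."""
    return round(100 * count / total, 2)


if __name__ == "__main__":
    # Share of the most discussed topic, from the counts given in the abstract.
    print(percent_share(35347, 290764))
```

With the counts given in the abstract, `percent_share(35347, 290764)` agrees with the reported 12.16%.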
2.
J Med Internet Res ; 23(2): e26254, 2021 02 03.
Article in English | MEDLINE | ID: mdl-33468449

ABSTRACT

BACKGROUND: The COVID-19 pandemic is affecting people with dementia in numerous ways. Nevertheless, there is a paucity of research on the COVID-19 impact on people with dementia and their care partners. OBJECTIVE: Using Twitter, the purpose of this study is to understand the experiences of COVID-19 for people with dementia and their care partners. METHODS: We collected tweets on COVID-19 and dementia using the GetOldTweets application in Python from February 15 to September 7, 2020. Thematic analysis was used to analyze the tweets. RESULTS: From the 5063 tweets analyzed with line-by-line coding, we identified 4 main themes including (1) separation and loss; (2) COVID-19 confusion, despair, and abandonment; (3) stress and exhaustion exacerbation; and (4) unpaid sacrifices by formal care providers. CONCLUSIONS: There is an imminent need for governments to rethink using a one-size-fits-all response to COVID-19 policy and use a collaborative approach to support people with dementia. Collaboration and more evidence-informed research are essential to reducing COVID-19 mortality and improving the quality of life for people with dementia and their care partners.


Subjects
Caregivers , Dementia , Family , Health Personnel , Social Media , Bereavement , Data Mining , Humans , Nursing Homes , Pandemics , Quality of Life , Risk , Stress, Psychological , Visitors to Patients
3.
Nat Commun ; 12(1): 711, 2021 01 29.
Article in English | MEDLINE | ID: mdl-33514699

ABSTRACT

Sepsis is a leading cause of death in hospitals. Early prediction and diagnosis of sepsis, which is critical in reducing mortality, is challenging as many of its signs and symptoms are similar to other less critical conditions. We develop an artificial intelligence algorithm, SERA algorithm, which uses both structured data and unstructured clinical notes to predict and diagnose sepsis. We test this algorithm with independent, clinical notes and achieve high predictive accuracy 12 hours before the onset of sepsis (AUC 0.94, sensitivity 0.87 and specificity 0.87). We compare the SERA algorithm against physician predictions and show the algorithm's potential to increase the early detection of sepsis by up to 32% and reduce false positives by up to 17%. Mining unstructured clinical notes is shown to improve the algorithm's accuracy compared to using only clinical measures for early warning 12 to 48 hours before the onset of sepsis.


Subjects
Clinical Decision Rules , Data Mining/methods , Electronic Health Records/statistics & numerical data , Machine Learning , Sepsis/diagnosis , Early Diagnosis , Feasibility Studies , Humans , Intensive Care Units/statistics & numerical data , Predictive Value of Tests , Prevalence , ROC Curve , Risk Assessment , Sepsis/epidemiology , Severity of Illness Index , Time Factors
4.
Medicine (Baltimore) ; 100(2): e24029, 2021 Jan 15.
Article in English | MEDLINE | ID: mdl-33466147

ABSTRACT

BACKGROUND: Functional constipation is a common functional problem of the digestive system that has a negative impact on the physical and mental health and the quality of life of patients. Acupoint herbal patching as an adjuvant therapy is currently undergoing clinical trials in different medical centers. However, no relevant systematic review or meta-analysis has been designed to evaluate the effects of acupoint herbal patching on functional constipation. There is also a lack of systematic evaluation and analysis of acupoints and herbs. METHODS: We will search the following 8 databases from their inception to November 15, 2020, without language restrictions: the Cochrane Central Register of Controlled Trials, PubMed, Embase, the Web of Science, the Chinese Biomedical Literature Database, the Chinese Scientific Journal Database, the Wan-Fang Database and the China National Knowledge Infrastructure. The primary outcome measures will be clinical effective rate, functional outcomes, and quality of life. Data that meet the inclusion criteria will be extracted and analyzed using RevMan V.5.3 software. Two reviewers will evaluate the studies using the Cochrane Collaboration risk of bias tool. We will use the GRADE approach to assess the overall quality of evidence supporting the primary outcomes. We will also use Spass software (Version 19.0) for complex network analysis to explore the potential core prescription of acupoint herbal patching for functional constipation. RESULTS: This study will analyze the clinical effective rate, functional outcomes, quality of life, improvement of clinical symptoms of functional constipation, and effective prescriptions of acupoint herbal patching for patients with functional constipation. CONCLUSION: Our findings will provide evidence for the effectiveness and potential treatment prescriptions of acupoint herbal patching for patients with functional constipation. PROSPERO REGISTRATION NUMBER: PROSPERO CRD 42020193489.


Subjects
Acupuncture Points , Acupuncture Therapy/methods , Constipation/therapy , Plants, Medicinal , Data Mining/methods , Humans , Network Meta-Analysis , Quality of Life , Randomized Controlled Trials as Topic , Research Design
5.
Acta Pharm ; 71(2): 175-184, 2021 Jun 01.
Article in English | MEDLINE | ID: mdl-33151168

ABSTRACT

Recently, an outbreak of a fatal coronavirus, SARS-CoV-2, has emerged from China and is rapidly spreading worldwide. Possible interaction of SARS-CoV-2 with DPP4 peptidase may partly contribute to the viral pathogenesis. An integrative bioinformatics approach starting with mining the biomedical literature for high confidence DPP4-protein/gene associations followed by functional analysis using network analysis and pathway enrichment was adopted. The results indicate that the identified DPP4 networks are highly enriched in viral processes required for viral entry and infection, and as a result, we propose DPP4 as an important putative target for the treatment of COVID-19. Additionally, our protein-chemical interaction networks identified important interactions between DPP4 and sitagliptin. We conclude that sitagliptin may be beneficial for the treatment of COVID-19 disease, either as monotherapy or in combination with other therapies, especially for diabetic patients and patients with pre-existing cardiovascular conditions who are already at higher risk of COVID-19 mortality.


Subjects
Coronavirus Infections/drug therapy , Dipeptidyl Peptidase 4/drug effects , Dipeptidyl-Peptidase IV Inhibitors/pharmacology , Dipeptidyl-Peptidase IV Inhibitors/therapeutic use , Pneumonia, Viral/drug therapy , Sitagliptin Phosphate/pharmacology , Sitagliptin Phosphate/therapeutic use , Cardiovascular Diseases/complications , Cardiovascular Diseases/drug therapy , Computational Biology , Coronavirus Infections/complications , Crystallography, X-Ray , Data Mining , Diabetes Complications/drug therapy , Drug Delivery Systems , Drug Repositioning , Gene Regulatory Networks , Humans , Molecular Structure , Pandemics , Pneumonia, Viral/complications
6.
J Environ Manage ; 280: 111858, 2021 Feb 15.
Article in English | MEDLINE | ID: mdl-33360552

ABSTRACT

Flash floods are among the most dangerous hydrologic and natural phenomena and are ranked at the top of such events among natural disasters because of their fast onset and the proportion of individual fatalities. Mapping the probability of flash flood events remains challenging because of their complexity and the rapid onset of precipitation. Thus, this study aims to propose a state-of-the-art data mining approach based on a hybrid equilibrium optimized SysFor, namely, the HE-SysFor model, for spatial prediction of flash floods. A tropical storm region located in the Northwest areas of Vietnam is selected as a case study. For this purpose, 1866 flash-flooded locations and ten indicators were used. The results show that the proposed HE-SysFor model yielded the highest predictive performance (total accuracy = 93.8%, Kappa index = 0.875, F1-score = 0.939, and AUC = 0.975) and performed better than the C4.5 decision tree (C4.5), the radial basis function-based support vector machine (SVM-RBF), the logistic regression (LReg), and the deep learning neural network (DeepLNN) models in both the training and the testing phases. Among the ten indicators, elevation, slope, and land cover are the most important. It is concluded that the proposed model provides an alternative tool that may help in effectively monitoring flash floods in tropical areas and in forming robust policies for decision making to mitigate flash flood impacts.


Subjects
Cyclonic Storms , Floods , Data Mining , Rivers , Vietnam
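For readers unfamiliar with the evaluation measures quoted in the record above (total accuracy, Kappa index, F1-score), the following sketch shows how they can be derived from a binary confusion matrix. The counts used are invented for illustration and are not the study's data; AUC is left out because it needs predicted scores rather than counts, and the function assumes no denominator is zero.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, Cohen's kappa, and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total

    # Agreement expected by chance, from the row and column marginals.
    p_yes = ((tp + fp) / total) * ((tp + fn) / total)
    p_no = ((fn + tn) / total) * ((fp + tn) / total)
    expected = p_yes + p_no
    kappa = (accuracy - expected) / (1 - expected)

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "kappa": kappa, "f1": f1}


if __name__ == "__main__":
    # Invented counts: 100 locations, half flooded, 90 classified correctly.
    print(binary_metrics(tp=45, fp=5, fn=5, tn=45))
```

Kappa corrects accuracy for chance agreement, which is why it is lower than accuracy for the same predictions.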
7.
Sci Total Environ ; 753: 141821, 2021 Jan 20.
Article in English | MEDLINE | ID: mdl-32891993

ABSTRACT

Intense human disturbance has made algal bloom a prominent environmental problem in gate-controlled urban water bodies. Urban water bodies present the characteristics of natural rivers and lakes simultaneously, whose algal blooms may manifest multi-factor interactions. Hence, effective regulation strategies require a multi-factor analysis to understand local blooming mechanisms. This study designed a holistic multi-factor analysis framework by integrating five data mining techniques. First, the Kolmogorov-Smirnov test was conducted to screen out the possible explanatory variables. Then, correlation analyses and principal component analyses were performed to identify variable collinearity and mutual causality, respectively. After collinearity and mutual causality were treated prudently by using orthogonalization and instrumental variables, multilinear regression can be properly conducted to quantify factor contributions to algae growth. Lastly, a decision tree was used innovatively to depict the limiting threshold curves of each driving factor that restricts algae growth under different circumstances. The driving factors, their contributions, and the limiting threshold curves compose the complete blooming mechanisms, thus providing a clear direction for the targeted regulation task. A typical case study was performed in Suzhou, a Chinese city with an intricate gate-controlled river network. Results confirmed that climatic factors (i.e., water temperature and solar radiation), hydrodynamic factors (i.e., flow velocity), nutrients (i.e., phosphorus and nitrogen), and external loadings contributed 49.3%, 21.7%, 21.3%, and 7.7%, respectively, to algae growth. These results indicate that a joint regulation strategy is urgently required. Future studies can focus on coupling the revealed mechanisms with an ecological model to provide a comprehensive toolkit for the optimization of an adaptive joint regulation plan under the background of global warming.


Subjects
Environmental Monitoring , Eutrophication , China , Cities , Data Mining , Factor Analysis, Statistical , Humans , Lakes , Phosphorus/analysis
8.
Anesth Analg ; 132(2): 545-555, 2021 02 01.
Article in English | MEDLINE | ID: mdl-33323789

ABSTRACT

BACKGROUND: High-quality and high-utility feedback allows for the development of improvement plans for trainees. The current manual assessment of the quality of this feedback is time consuming and subjective. We propose the use of machine learning to rapidly distinguish the quality of attending feedback on resident performance. METHODS: Using a preexisting databank of 1925 manually reviewed feedback comments from 4 anesthesiology residency programs, we trained machine learning models to predict whether comments contained 6 predefined feedback traits (actionable, behavior focused, detailed, negative feedback, professionalism/communication, and specific) and predict the utility score of the comment on a scale of 1-5. Comments with ≥4 feedback traits were classified as high-quality and comments with ≥4 utility scores were classified as high-utility; otherwise comments were considered low-quality or low-utility, respectively. We used RapidMiner Studio (RapidMiner, Inc, Boston, MA), a data science platform, to train, validate, and score performance of models. RESULTS: Models for predicting the presence of feedback traits had accuracies of 74.4%-82.2%. Predictions on utility category were 82.1% accurate, with 89.2% sensitivity, and 89.8% class precision for low-utility predictions. Predictions on quality category were 78.5% accurate, with 86.1% sensitivity, and 85.0% class precision for low-quality predictions. Fifteen to 20 hours were spent by a research assistant with no prior experience in machine learning to become familiar with software, create models, and review performance on predictions made. The program read data, applied models, and generated predictions within minutes. In contrast, a recent manual feedback scoring effort by an author took 15 hours to manually collate and score 200 comments during the course of 2 weeks. CONCLUSIONS: Harnessing the potential of machine learning allows for rapid assessment of attending feedback on resident performance. Using predictive models to rapidly screen for low-quality and low-utility feedback can aid programs in improving feedback provision, both globally and by individual faculty.


Subjects
Anesthesiologists/education , Anesthesiology/education , Clinical Competence , Data Mining , Education, Medical, Graduate , Formative Feedback , Internship and Residency , Machine Learning , Medical Staff, Hospital , Databases, Factual , Employee Performance Appraisal , Humans , Task Performance and Analysis , United States
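The labeling rule stated in the methods of the record above (at least 4 of the 6 traits means high-quality; a utility score of at least 4 means high-utility) can be sketched as below. This only illustrates the thresholding step, not the machine learning models or the RapidMiner workflow; the function name and the input layout are hypothetical.

```python
# The six predefined feedback traits named in the abstract.
TRAITS = (
    "actionable",
    "behavior focused",
    "detailed",
    "negative feedback",
    "professionalism/communication",
    "specific",
)


def classify_comment(traits_present, utility_score):
    """Apply the abstract's thresholds to one feedback comment.

    `traits_present` is a collection of trait names found in the comment;
    `utility_score` is an integer on the 1-5 scale.
    """
    trait_count = sum(1 for trait in TRAITS if trait in traits_present)
    quality = "high-quality" if trait_count >= 4 else "low-quality"
    utility = "high-utility" if utility_score >= 4 else "low-utility"
    return quality, utility


if __name__ == "__main__":
    print(classify_comment({"actionable", "detailed", "specific"}, 4))
```

In this sketch a comment with three traits and a utility score of 4 is labeled low-quality but high-utility, which shows that the two categories are assigned independently.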
9.
Biomed Pharmacother ; 133: 111074, 2021 Jan.
Article in English | MEDLINE | ID: mdl-33378973

ABSTRACT

In the era of big data, massive genetic data, as a new industry, have quickly swept through almost all industries, especially the pharmaceutical industry. As countries around the world start to build their own gene banks, scientists study the data to explore the origins and migration of humans. Moreover, big data encourage the development of cancer therapy and bring good news to cancer patients. Big data have been involved in the study of many diseases, and it has been found that analyzing diseases at the gene level can lead to more beneficial treatment options than ordinary treatments. This review introduces the development of extensive data in medical research from the perspectives of big data and tumors, neurological and psychiatric diseases, cardiovascular diseases, other applications, and the development direction of big data in medicine.


Subjects
Artificial Intelligence , Big Data , Biomedical Research , Genomics , Precision Medicine , Animals , Cardiovascular Diseases/diagnosis , Cardiovascular Diseases/genetics , Cardiovascular Diseases/therapy , Data Mining , Humans , Mental Disorders/diagnosis , Mental Disorders/genetics , Mental Disorders/therapy , Neoplasms/diagnosis , Neoplasms/genetics , Neoplasms/therapy
10.
Zootaxa ; 4881(3): zootaxa.4881.3.13, 2020 Nov 20.
Article in English | MEDLINE | ID: mdl-33311311

ABSTRACT

The spotted bumblebee shrimp Gnathophyllum elegans (Risso, 1816) is a caridean species of the family Palaemonidae Rafinesque, 1815 widely distributed in the eastern Atlantic and the entire Mediterranean Sea (Zariquiey Alvarez 1968; d'Udekem d'Acoz 1999; De Grave et al. 2015). It is a solitary sciaphilous taxon that grows up to 40 mm in total length and, in the daytime, hides under stones, in crevices or amidst Posidonia oceanica (Linnaeus) Delile rhizomes from the intertidal to about 30 m depth, with some authors even considering it as preferring coralligenous environments (Pérès Picard 1964; Ledoyer 1968; d'Udekem d'Acoz 1999). Such cryptic behavior often makes G. elegans difficult to detect in the field, although the species is easily distinguishable from the other eastern Atlantic-Mediterranean shrimp species by its colourful appearance, mostly its dark purple-brown body entirely covered by yellow-orange dots (Zariquiey Alvarez 1968; Falciai Minervini 1992).


Subjects
Palaemonidae , Social Media , Adult , Animals , Bees , Citizen Science , Color , Data Mining , Humans
11.
BMC Bioinformatics ; 21(Suppl 23): 580, 2020 Dec 29.
Article in English | MEDLINE | ID: mdl-33372589

ABSTRACT

BACKGROUND: Syntactic analysis, or parsing, is a key task in natural language processing and a required component for many text mining approaches. In recent years, Universal Dependencies (UD) has emerged as the leading formalism for dependency parsing. While a number of recent tasks centering on UD have substantially advanced the state of the art in multilingual parsing, there has been only little study of parsing texts from specialized domains such as biomedicine. METHODS: We explore the application of state-of-the-art neural dependency parsing methods to biomedical text using the recently introduced CRAFT-SA shared task dataset. The CRAFT-SA task broadly follows the UD representation and recent UD task conventions, allowing us to fine-tune the UD-compatible Turku Neural Parser and UDify neural parsers to the task. We further evaluate the effect of transfer learning using a broad selection of BERT models, including several models pre-trained specifically for biomedical text processing. RESULTS: We find that recently introduced neural parsing technology is capable of generating highly accurate analyses of biomedical text, substantially improving on the best performance reported in the original CRAFT-SA shared task. We also find that initialization using a deep transfer learning model pre-trained on in-domain texts is key to maximizing the performance of the parsing methods.


Subjects
Biomedical Research , Data Mining , Software , Humans , Language , Models, Statistical , Natural Language Processing
12.
J Transl Med ; 18(1): 494, 2020 12 30.
Article in English | MEDLINE | ID: mdl-33380328

ABSTRACT

BACKGROUND: Tracking the genetic variability of Severe Acute Respiratory Syndrome CoronaVirus 2 (SARS-CoV-2) is a crucial challenge, mainly to identify target sequences in order to generate robust vaccines and neutralizing monoclonal antibodies, but also to track viral genetic temporal and geographic evolution and to mine for variants associated with reduced or increased disease severity. Several online tools and bioinformatic phylogenetic analyses have been released, but the main interest lies in the Spike protein, which is the pivotal element of current vaccine design, and in the Receptor Binding Domain (RBD), which accounts for most of the neutralizing antibody activity. METHODS: Here, we present an open-source bioinformatic protocol and a web portal focused on SARS-CoV-2 single mutations and minimal consensus sequence building as a companion vaccine design tool. Furthermore, we provide immunogenomic analyses to understand the impact of the most frequent RBD variations. RESULTS: Results on the whole GISAID sequence dataset at the time of writing (October 2020) reveal an emerging mutation, S477N, located on the central part of the Spike protein Receptor Binding Domain, the Receptor Binding Motif. Immunogenomic analyses revealed some variation in mutated epitope MHC compatibility, T-cell recognition, and B-cell epitope probability for the most frequent human HLAs. CONCLUSIONS: This work provides a framework able to track down SARS-CoV-2 genomic variability.


Subjects
/virology , Spike Glycoprotein, Coronavirus/genetics , Binding Sites/genetics , /genetics , Computational Biology , Data Mining , Genetic Variation , Humans , Immunogenetic Phenomena , Models, Molecular , Mutation , Pandemics/statistics & numerical data , Protein Domains , Receptors, Virus , Software , Spike Glycoprotein, Coronavirus/immunology , Translational Medical Research
13.
Stroke Vasc Neurol ; 5(4): 381-387, 2020 12.
Article in English | MEDLINE | ID: mdl-33376199

ABSTRACT

The discovery of targeted drugs heavily relies on three-dimensional (3D) structures of target proteins. When the 3D structure of a protein target is unknown, it is very difficult to design its corresponding targeted drugs. Although the 3D structures of some proteins (the so-called undruggable targets) are known, their targeted drugs are still absent. As an increasing number of crystal and cryogenic electron microscopy structures are deposited in the Protein Data Bank, discovering targeted drugs is becoming more feasible. Moreover, it is also highly probable that previously undruggable targets can be turned into druggable ones once their hidden allosteric sites are identified. In this review, we focus on the currently available advanced methods for the discovery of novel compounds targeting proteins without 3D structure and on how to turn undruggable targets into druggable ones.


Subjects
Artificial Intelligence , Big Data , Computer-Aided Design , Data Mining , Drug Design , Drug Discovery , Pharmaceutical Preparations/chemistry , Proteins/chemistry , Animals , Databases, Protein , Humans , Ligands , Molecular Structure , Molecular Targeted Therapy , Protein Conformation , Structure-Activity Relationship
14.
Sci Data ; 7(1): 437, 2020 12 16.
Article in English | MEDLINE | ID: mdl-33328476

ABSTRACT

Stressful experiences are part of everyday life and animals have evolved physiological and behavioral responses aimed at coping with stress and maintaining homeostasis. However, repeated or intense stress can induce maladaptive reactions leading to behavioral disorders. Adaptations in the brain, mediated by changes in gene expression, have a crucial role in the stress response. Recent years have seen a tremendous increase in studies on the transcriptional effects of stress. The input raw data are freely available from public repositories and represent a wealth of information for further global and integrative retrospective analyses. We downloaded from the Sequence Read Archive 751 samples (SRA-experiments), from 18 independent BioProjects studying the effects of different stressors on the brain transcriptome in mice. We performed a massive bioinformatics re-analysis applying a single, standardized pipeline for computing differential gene expression. This data mining allowed the identification of novel candidate stress-related genes and specific signatures associated with different stress conditions. The large amount of computational results produced was systematized in the interactive "Stress Mice Portal".


Subjects
Brain/physiology , Gene Expression , Stress, Physiological , Stress, Psychological , Transcriptome , Animals , Computational Biology , Data Mining , Datasets as Topic , Female , Male , Mice
15.
Zhongguo Zhong Yao Za Zhi ; 45(22): 5356-5361, 2020 Nov.
Article in Chinese | MEDLINE | ID: mdl-33350194

ABSTRACT

This article analyzes acupoint selection and the characteristics of plaster therapy for stable chronic obstructive pulmonary disease (COPD) by data mining. CNKI, VIP, CBM, WanFang, PubMed, EMbase, and the Cochrane Library were searched to collect clinical studies of plaster therapy for stable COPD. After literature screening, a total of 46 systematic reviews were included. Frequency statistics, cluster analysis, and Apriori correlation analysis were used to analyze the pattern and characteristics of plaster therapy for stable COPD. The results showed that the main acupoints for stable COPD were BL13, Dingchuan, CV22, BL23 and BL20. The acupoints used are mainly concentrated on the chest and back. The most frequently used meridian is the bladder meridian. Analysis of the acupoints yielded 27 correlation rules, and cluster analysis grouped the high-frequency acupoints into 5 categories. The results of the study showed that the current choice of acupoints is rather concentrated. "Local acupuncture points" and "matching points with front and back" were the main acupoint selection rules. The choice of acupuncture points reflected the traditional Chinese medicine treatment principles of strengthening healthy Qi to eliminate pathogenic factors, treating both the manifestation and the root cause of disease, and taking preventive measures after the occurrence of disease.


Subjects
Acupuncture Therapy , Meridians , Pulmonary Disease, Chronic Obstructive , Acupuncture Points , Data Mining , Humans , Pulmonary Disease, Chronic Obstructive/drug therapy
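The record above mentions Apriori correlation analysis of acupoint combinations. As a minimal sketch of the two measures such rule mining relies on, support and confidence, the code below derives pairwise rules from a handful of prescriptions. It covers item pairs only and does not implement the level-wise candidate generation of the full Apriori algorithm; the prescriptions and the thresholds are invented for illustration, with only the acupoint names taken from the abstract.

```python
from collections import Counter
from itertools import combinations


def pair_rules(prescriptions, min_support=0.5, min_confidence=0.8):
    """Return (lhs, rhs, support, confidence) rules for acupoint pairs.

    support    = share of prescriptions containing both acupoints
    confidence = share of prescriptions with `lhs` that also contain `rhs`
    """
    total = len(prescriptions)
    item_counts = Counter()
    pair_counts = Counter()
    for points in prescriptions:
        unique_points = sorted(set(points))
        item_counts.update(unique_points)
        pair_counts.update(combinations(unique_points, 2))

    rules = []
    for (first, second), count in pair_counts.items():
        support = count / total
        if support < min_support:
            continue  # pair is too rare to report
        for lhs, rhs in ((first, second), (second, first)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


if __name__ == "__main__":
    # Invented prescriptions; not data from the study.
    example = [
        ["BL13", "Dingchuan", "CV22"],
        ["BL13", "Dingchuan"],
        ["BL13", "BL23"],
        ["BL20", "BL23", "BL13"],
    ]
    for rule in pair_rules(example):
        print(rule)
```

A rule such as `("Dingchuan", "BL13", 0.5, 1.0)` reads: both points appear together in half of the prescriptions, and every prescription with Dingchuan also includes BL13.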
16.
BMC Bioinformatics ; 21(Suppl 16): 543, 2020 Dec 16.
Article in English | MEDLINE | ID: mdl-33323106

ABSTRACT

BACKGROUND: Although biomedical publications and literature are growing rapidly, there is still a lack of structured knowledge that can be easily processed by computer programs. To extract such knowledge from plain text and transform it into structured form, relation extraction becomes an important problem. Datasets play a critical role in the development of relation extraction methods. However, existing relation extraction datasets in the biomedical domain are mainly human-annotated, and their scale is usually limited because annotation is labor-intensive and time-consuming. RESULTS: We construct BioRel, a large-scale dataset for the biomedical relation extraction problem, using the Unified Medical Language System as the knowledge base and Medline as the corpus. We first identify mentions of entities in sentences of Medline and link them to the Unified Medical Language System with Metamap. Then, we assign each sentence a relation label by using distant supervision. Finally, we adapt state-of-the-art deep learning and statistical machine learning methods as baseline models and conduct comprehensive experiments on the BioRel dataset. CONCLUSIONS: Based on the extensive experimental results, we have shown that BioRel is a suitable large-scale dataset for biomedical relation extraction, which provides both reasonable baseline performance and many remaining challenges for both deep learning and statistical methods.


Subjects
Biomedical Research , Data Mining , Software , Databases as Topic , Humans , Machine Learning , Neural Networks, Computer
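The distant supervision step mentioned in the record above can be pictured with the small sketch below: a sentence whose linked entity pair appears in the knowledge base inherits that pair's relation, and any other sentence gets a "no relation" label. This is a generic illustration of the idea, not the BioRel construction code; the knowledge-base entry, the sentences, the relation name, and the `NA` label are all hypothetical, and entity linking (done with Metamap in the paper) is assumed to have happened already.

```python
def label_sentences(sentences, knowledge_base, no_relation="NA"):
    """Assign each sentence the knowledge-base relation of its entity pair.

    `sentences` holds (text, head_entity, tail_entity) tuples whose entities
    are already linked; `knowledge_base` maps (head, tail) to a relation name.
    """
    labeled = []
    for text, head, tail in sentences:
        relation = knowledge_base.get((head, tail), no_relation)
        labeled.append((text, head, tail, relation))
    return labeled


if __name__ == "__main__":
    # Hypothetical knowledge-base entry and sentences, for illustration only.
    kb = {("drug_a", "disease_b"): "may_treat"}
    corpus = [
        ("drug_a was given to patients with disease_b.", "drug_a", "disease_b"),
        ("drug_a was stored next to drug_c.", "drug_a", "drug_c"),
    ]
    for row in label_sentences(corpus, kb):
        print(row)
```

Because labels come from the knowledge base rather than from reading each sentence, some sentences will be labeled with a relation they do not actually express, which is the usual noise trade-off of distant supervision.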
17.
Nat Commun ; 11(1): 6338, 2020 12 11.
Article in English | MEDLINE | ID: mdl-33311500

ABSTRACT

The transcriptional regulatory network (TRN) of Bacillus subtilis coordinates cellular functions of fundamental interest, including metabolism, biofilm formation, and sporulation. Here, we use unsupervised machine learning to modularize the transcriptome and quantitatively describe regulatory activity under diverse conditions, creating an unbiased summary of gene expression. We obtain 83 independently modulated gene sets that explain most of the variance in expression and demonstrate that 76% of them represent the effects of known regulators. The TRN structure and its condition-dependent activity uncover putative or recently discovered roles for at least five regulons, such as a relationship between histidine utilization and quorum sensing. The TRN also facilitates quantification of population-level sporulation states. As this TRN covers the majority of the transcriptome and concisely characterizes the global expression state, it could inform research on nearly every aspect of transcriptional regulation in B. subtilis.


Subjects
Bacillus subtilis/genetics , Bacillus subtilis/metabolism , Gene Regulatory Networks , Machine Learning , Transcriptome , Bacterial Proteins/metabolism , DNA Damage , Data Mining , Ethanol/metabolism , Gene Expression Regulation, Bacterial , Histidine/metabolism , Iron Chelating Agents , Quorum Sensing , Tryptophan/biosynthesis
18.
BMC Med Inform Decis Mak ; 20(Suppl 4): 283, 2020 12 14.
Article in English | MEDLINE | ID: mdl-33317518

ABSTRACT

BACKGROUND: Semantic web technology has been applied widely in the biomedical informatics field. Large numbers of biomedical datasets are available online in the resource description framework (RDF) format. Semantic relationship mining among genes, disorders, and drugs is widely used in, for example, precision medicine and drug repositioning. However, most of the existing studies focused on a single dataset. It is not easy to find the most current disorder-gene-drug relationships because they are distributed across heterogeneous datasets. How to mine their semantic relationships from different biomedical datasets is an important issue. METHODS: First, a variety of biomedical datasets were converted into RDF triple data; then, multisource biomedical datasets were integrated into a storage system using a data integration algorithm. Second, nine query patterns among genes, disorders, and drugs from different biomedical datasets were designed. Third, the gene-disorder-drug semantic relationship mining algorithm is presented. This algorithm can query the relationships among various entities from different datasets. RESULTS AND CONCLUSIONS: We focused on mining the putative and the most current disorder-gene-drug relationships about Parkinson's disease (PD). The results demonstrate that our method has significant advantages in mining and integrating multisource heterogeneous biomedical datasets. Twenty-five new relationships among the genes, disorders, and drugs were mined from four different datasets. The query results showed that most of them came from different datasets. The precision of the method increased by 2.51% compared to that of the multisource linked open data fusion method presented in the 4th International Workshop on Semantics-Powered Data Mining and Analytics (SEPDA 2019). Moreover, the number of query results increased by 7.7%, and the number of correct queries increased by 9.5%.


Subjects
Pharmaceutical Preparations , Semantics , Algorithms , Data Mining , Humans , Research Design
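To make the idea of a query pattern over integrated RDF triples in the record above more concrete, the sketch below joins gene-disorder and drug-gene triples that could come from different datasets. It uses plain Python tuples rather than an RDF store or SPARQL, and the predicate names (`associatedWith`, `targets`) and the entity identifiers are hypothetical placeholders, not the paper's vocabulary or its nine patterns.

```python
def find_disorder_gene_drug(triples):
    """Join triples on the shared gene to get (disorder, gene, drug) paths.

    `triples` is an iterable of (subject, predicate, object) tuples with the
    hypothetical predicates "associatedWith" (gene -> disorder) and
    "targets" (drug -> gene).
    """
    disorders_by_gene = {}
    drugs_by_gene = {}
    for subject, predicate, obj in triples:
        if predicate == "associatedWith":
            disorders_by_gene.setdefault(subject, set()).add(obj)
        elif predicate == "targets":
            drugs_by_gene.setdefault(obj, set()).add(subject)

    paths = []
    for gene, disorders in disorders_by_gene.items():
        for disorder in disorders:
            for drug in drugs_by_gene.get(gene, ()):
                paths.append((disorder, gene, drug))
    return sorted(paths)


if __name__ == "__main__":
    # Placeholder triples standing in for two separate source datasets.
    dataset_one = [("gene:G1", "associatedWith", "disorder:D1")]
    dataset_two = [("drug:X1", "targets", "gene:G1"),
                   ("drug:X2", "targets", "gene:G2")]
    print(find_disorder_gene_drug(dataset_one + dataset_two))
```

The join succeeds only when both datasets refer to the gene by the same identifier, which is why an integration step precedes querying in the described method.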
19.
BMC Med Inform Decis Mak ; 20(Suppl 4): 315, 2020 12 14.
Article in English | MEDLINE | ID: mdl-33317524

ABSTRACT

In this introduction, we first summarize the Fourth International Workshop on Semantics-Powered Data Mining and Analytics (SEPDA 2019) held on October 26, 2019 in conjunction with the 18th International Semantic Web Conference (ISWC 2019) in Auckland, New Zealand, and then briefly introduce seven research articles included in this supplement issue, covering the topics on Knowledge Graph, Ontology-Powered Analytics, and Deep Learning.


Subjects
Data Mining , Semantics , Humans , New Zealand