Results 1 - 20 of 78
1.
BMC Med Res Methodol ; 24(1): 108, 2024 May 09.
Article in English | MEDLINE | ID: mdl-38724903

ABSTRACT

OBJECTIVE: Systematic literature reviews (SLRs) are critical for life-science research. However, the manual selection and retrieval of relevant publications can be a time-consuming process. This study aims to (1) develop two disease-specific annotated corpora, one for human papillomavirus (HPV) associated diseases and the other for pneumococcal-associated pediatric diseases (PAPD), and (2) optimize machine- and deep-learning models to facilitate automation of SLR abstract screening. METHODS: This study constructed two disease-specific SLR screening corpora for HPV and PAPD, which contained citation metadata and corresponding abstracts. Performance was evaluated using precision, recall, accuracy, and F1-score of multiple combinations of machine- and deep-learning algorithms and features such as keywords and MeSH terms. RESULTS AND CONCLUSIONS: The HPV corpus contained 1697 entries, with 538 relevant and 1159 irrelevant articles. The PAPD corpus included 2865 entries, with 711 relevant and 2154 irrelevant articles. Adding features beyond title and abstract improved the performance (measured in accuracy) of machine learning models by 3% for the HPV corpus and 2% for the PAPD corpus. Transformer-based deep learning models consistently outperformed conventional machine learning algorithms, highlighting the strength of domain-specific pre-trained language models for SLR abstract screening. This study provides a foundation for the development of more intelligent SLR systems.
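
The four evaluation measures named in this abstract can be illustrated with a minimal sketch. It assumes binary labels (1 = relevant, 0 = irrelevant); the function and class names are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class ScreeningMetrics:
    precision: float
    recall: float
    accuracy: float
    f1: float


def screening_metrics(y_true, y_pred):
    """Compute screening metrics for binary labels (1 = relevant, 0 = irrelevant)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # relevant, kept
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # irrelevant, kept
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # relevant, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # irrelevant, dropped

    # Guard against empty denominators so the sketch never divides by zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return ScreeningMetrics(precision, recall, accuracy, f1)


if __name__ == "__main__":
    # Toy labels, invented purely for illustration.
    truth = [1, 1, 1, 0, 0, 0, 0, 0]
    predicted = [1, 1, 0, 1, 0, 0, 0, 0]
    print(screening_metrics(truth, predicted))
```

In abstract screening, recall is often watched most closely, because a missed relevant article cannot be recovered later in the review.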


Subjects
Machine Learning , Papillomavirus Infections , Humans , Papillomavirus Infections/diagnosis , Economics, Medical , Algorithms , Outcome Assessment, Health Care/methods , Deep Learning , Abstracting and Indexing/methods
2.
Circ Res ; 126(3): 350-360, 2020 01 31.
Article in English | MEDLINE | ID: mdl-31801406

ABSTRACT

Rationale: GWAS (Genome-Wide Association Studies) have identified hundreds of genetic loci associated with atrial fibrillation (AF). However, these loci explain only a small proportion of AF heritability. Objective: To develop an approach to identify additional AF-related genes by integrating multiple omics data. Methods and Results: Three types of omics data were integrated: (1) summary statistics from the AFGen 2017 GWAS; (2) a whole blood EWAS (Epigenome-Wide Association Study) of AF; and (3) a whole blood TWAS (Transcriptome-Wide Association Study) of AF. The variant-level GWAS results were collapsed into gene-level associations using fast set-based association analysis. The CpG-level EWAS results were also collapsed into gene-level associations by an adapted SNP-set Kernel Association Test approach. Both GWAS and EWAS gene-based associations were then meta-analyzed with TWAS using a fixed-effects model weighted by the sample size of each data set. A tissue-specific network was subsequently constructed using the NetWAS (Network-Wide Association Study). The identified genes were then compared with the AFGen 2018 GWAS that contained more than triple the number of AF cases compared with AFGen 2017 GWAS. We observed that the multiomics approach identified many more relevant AF-related genes than using AFGen 2018 GWAS alone (1931 versus 206 genes). Many of these genes are involved in the development and regulation of heart- and muscle-related biological processes. Moreover, the gene set identified by multiomics approach explained much more AF variance than those identified by GWAS alone (10.4% versus 3.5%). Conclusions: We developed a strategy to integrate multiple omics data to identify AF-related genes. Our integrative approach may be useful to improve the power of traditional GWAS, which might be particularly useful for rare traits and diseases with limited sample size.
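
The abstract does not give the formula behind the sample-size-weighted meta-analysis. One common scheme of this kind is a weighted Z-score combination (Stouffer's method with weights equal to the square root of each sample size). The sketch below shows that scheme as an assumption, not as the authors' implementation; the function names and the numbers in the usage example are hypothetical.

```python
import math


def weighted_z_meta(z_scores, sample_sizes):
    """Combine per-study Z-scores for one gene, weighting by sqrt(sample size).

    Assumption: Z_meta = sum(w_i * z_i) / sqrt(sum(w_i^2)), with w_i = sqrt(N_i).
    """
    if not z_scores or len(z_scores) != len(sample_sizes):
        raise ValueError("need one sample size per Z-score")
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


def two_sided_p(z):
    """Two-sided p-value of a standard normal Z-score."""
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Hypothetical gene-level Z-scores from three analyses of different sizes.
    z_meta = weighted_z_meta([2.1, 1.4, 2.8], [60000, 2000, 2000])
    print(round(z_meta, 3), two_sided_p(z_meta))
```

With this weighting, the largest data set dominates the combined statistic, so the smaller data sets mainly help when they agree in direction with it.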


Subjects
Atrial Fibrillation/genetics , Genome-Wide Association Study/methods , Genomics/methods , Databases, Genetic , Epigenesis, Genetic , Humans , Polymorphism, Single Nucleotide , Software , Transcriptome
3.
J Biomed Inform ; 125: 103976, 2022 01.
Article in English | MEDLINE | ID: mdl-34906737

ABSTRACT

Broader patient-reported experiences in oncology are largely unknown due to the lack of available information from traditional data sources. Online health community data provide an exploratory way to uncover these experiences at a large scale. Analyzing these data can guide further studies towards understanding patients' needs and experiences. However, analysis of online health data is inherently difficult due to the unstructured nature of these data and the variety of ways information can be expressed over text. Specifically, subscribers may not disclose critical information such as the age of the patient in their posts. In fact, the number of health forum posts that explicitly mention the age of the patient is significantly lower than the number of posts that do not include this information in the Reddit r/Cancer health forum under consideration in the present paper. Health-focused studies often need to consider or control for age as a confounder, hence the importance of having sufficient age data. This paper presents a methodology that can help classify health forum posts according to four age groups (0-17, 18-39, 40-64 and 65 + years) even when the posts do not contain explicit mention of the age of the patient. First, the subset of the posts that include explicit mention of the age of the patient is identified. Second, the explicit age clues are removed from these posts and used to train the proposed age classifier. The resulting classifier is able to infer the age of the patient using only implicit age clues with an average true positive rate (TPR) of 71%. This TPR is comparable to the average TPR of 69% obtained from human annotations for the same set of posts.
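
The two preparation steps described in this abstract (finding posts with an explicit age, then removing the age clue before training) can be sketched as follows. The regular expression, the `[AGE]` placeholder, and the function names are hypothetical choices for illustration; the paper's actual patterns are not given in the abstract.

```python
import re

# Hypothetical pattern for explicit age mentions such as
# "67 years old", "34-year-old", "5 yo", or "45 y/o".
AGE_PATTERN = re.compile(
    r"\b(\d{1,3})[\s-]*(?:years?[\s-]*old|yrs?[\s-]*old|y/?o)\b",
    re.IGNORECASE,
)


def age_group(age):
    """Map an age in years to the four groups named in the abstract."""
    if age <= 17:
        return "0-17"
    if age <= 39:
        return "18-39"
    if age <= 64:
        return "40-64"
    return "65+"


def label_and_mask(post):
    """Return (masked_post, age_group) or None when no explicit age is found.

    The label comes from the first explicit age mention; every mention is then
    masked so a classifier trained on the result sees only implicit clues.
    """
    match = AGE_PATTERN.search(post)
    if match is None:
        return None
    label = age_group(int(match.group(1)))
    masked = AGE_PATTERN.sub("[AGE]", post)
    return masked, label


if __name__ == "__main__":
    print(label_and_mask("My dad is 67 years old and was just diagnosed."))
    print(label_and_mask("Looking for advice on chemo side effects."))
```

Posts for which `label_and_mask` returns `None` are the ones the trained classifier would later be applied to.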


Subjects
Health Records, Personal , Age Factors , Humans
4.
J Biomed Inform ; 134: 104162, 2022 10.
Article in English | MEDLINE | ID: mdl-36029954

ABSTRACT

The Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) provides a unified model to integrate disparate real-world data (RWD) sources. An integral part of the OMOP CDM is the Standardized Vocabularies (henceforth referred to as the OMOP vocabulary), which enables organization and standardization of medical concepts across various clinical domains of the OMOP CDM. For concepts with the same meaning from different source vocabularies, one is designated as the standard concept, while the others are specified as non-standard or source concepts and mapped to the standard one. However, due to the heterogeneity of source vocabularies, mapping issues such as erroneous mappings and missing mappings may exist in the OMOP vocabulary, which could affect the results of downstream analyses with RWD. In this paper, we focus on quality assurance of vaccine concept mappings in the OMOP vocabulary, which is necessary to accurately harness the power of RWD on vaccines. We introduce a semi-automated lexical approach to audit vaccine mappings in the OMOP vocabulary. We generated two types of vaccine-pairs: mapped and unmapped, where mapped vaccine-pairs are pairs of vaccine concepts with a "Maps to" relationship, while unmapped vaccine-pairs are those without a "Maps to" relationship. We represented each vaccine concept name as a set of words, and derived term-difference pairs (i.e., name differences) for mapped and unmapped vaccine-pairs. If the same term-difference pair can be obtained from both mapped and unmapped vaccine-pairs, then this is considered a potential mapping inconsistency. Applying this approach to the vaccine mappings in OMOP yielded a total of 2087 potential mapping inconsistencies. A random sample of 200 was evaluated by domain experts to identify, validate, and categorize the inconsistencies. Experts identified 95 cases revealing valid mapping issues. The remaining 105 cases were found to be invalid because the mappings relied on external and/or contextual information that was not reflected in the concept names of the vaccines. This indicates that our semi-automated approach shows promise in identifying mapping inconsistencies among vaccine concepts in the OMOP vocabulary.
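
The term-difference idea in this abstract can be sketched in a few lines. This is a simplified reading: names are lowercased and split on whitespace, and each difference pair is treated as ordered (words only in the first name, words only in the second). Both choices are assumptions, and the concept names in the usage example are invented for illustration rather than taken from the OMOP vocabulary.

```python
from collections import defaultdict


def words(name):
    """Represent a concept name as a set of lowercase words."""
    return frozenset(name.lower().split())


def term_difference(name_a, name_b):
    """Return (words only in name_a, words only in name_b)."""
    a, b = words(name_a), words(name_b)
    return (a - b, b - a)


def find_inconsistencies(mapped_pairs, unmapped_pairs):
    """Flag term-difference pairs produced by both mapped and unmapped pairs."""
    mapped_by_diff = defaultdict(list)
    for pair in mapped_pairs:
        mapped_by_diff[term_difference(*pair)].append(pair)

    unmapped_by_diff = defaultdict(list)
    for pair in unmapped_pairs:
        unmapped_by_diff[term_difference(*pair)].append(pair)

    # A difference seen on both sides suggests the mapping rule is applied
    # inconsistently, so the pairs behind it are candidates for expert review.
    shared = mapped_by_diff.keys() & unmapped_by_diff.keys()
    return {d: (mapped_by_diff[d], unmapped_by_diff[d]) for d in shared}


if __name__ == "__main__":
    mapped = [("influenza vaccine injection", "influenza vaccine"),
              ("rabies vaccine", "rabies virus vaccine")]
    unmapped = [("hepatitis b vaccine injection", "hepatitis b vaccine")]
    for diff, (m, u) in find_inconsistencies(mapped, unmapped).items():
        print(diff, m, u)
```

As the abstract notes, flagged pairs are only candidates: roughly half of the reviewed sample turned out to be valid issues, so expert review remains part of the process.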


Subjects
Vaccines , Vocabulary , Quality Improvement , Vocabulary, Controlled
5.
J Med Internet Res ; 24(1): e17273, 2022 01 11.
Article in English | MEDLINE | ID: mdl-35014964

ABSTRACT

BACKGROUND: Patient-clinician secure messaging is an important function in patient portals and enables patients and clinicians to communicate on a wide spectrum of issues in a timely manner. With its growing adoption and patient engagement, it is time to comprehensively study the secure messages and user behaviors in order to improve patient-centered care. OBJECTIVE: The aim of this paper was to analyze the secure messages sent by patients and clinicians in a large multispecialty health system at Mayo Clinic, Rochester. METHODS: We performed message-based, sender-based, and thread-based analyses of more than 5 million secure messages between 2010 and 2017. We summarized the message volumes, patient and clinician population sizes, message counts per patient or clinician, as well as the trends of message volumes and user counts over the years. In addition, we calculated the time distribution of clinician-sent messages to understand their workloads at different times of a day. We also analyzed the time delay in clinician responses to patient messages to assess their communication efficiency and the back-and-forth rounds to estimate the communication complexity. RESULTS: During 2010-2017, the patient portal at Mayo Clinic, Rochester experienced a significant growth in terms of the count of patient users and the total number of secure messages sent by patients and clinicians. Three clinician categories, namely "physician-primary care," "registered nurse-specialty," and "physician-specialty," bore the majority of message volume increase. The patient portal also demonstrated growing trends in message counts per patient and clinician. The "nurse practitioner or physician assistant-primary care" and "physician-primary care" categories had the heaviest per-clinician workload each year. Most messages by the clinicians were sent from 7 AM to 5 PM during a day. 
Yet, between 5 PM and 7 PM, the physicians sent 7.0% (95,785/1,377,006) of their daily messages, and the nurse practitioners or physician assistants sent 5.4% (22,121/408,526) of their daily messages. The clinicians replied to 72.2% (1,272,069/1,761,739) of patient messages within 1 day and 90.6% (1,595,702/1,761,739) within 3 days. In 95.1% (1,499,316/1,576,205) of the message threads, the patients communicated with their clinicians back and forth for no more than 4 rounds. CONCLUSIONS: Our study found steady increases in patient adoption of the secure messaging system and in the average workload per clinician over 8 years. However, most clinicians responded in a timely manner to meet the patients' needs. Our study also revealed differential patient-clinician communication patterns across different practice roles and care settings. These findings suggest opportunities for care teams to optimize messaging tasks and to balance the workload for optimal efficiency.


Subjects
Medicine , Patient Portals , Communication , Humans , Patient Participation , Retrospective Studies
6.
Mol Cell Proteomics ; 18(8): 1683-1699, 2019 08.
Article in English | MEDLINE | ID: mdl-31097671

ABSTRACT

Label-free proteome quantification (LFQ) is a multistep workflow, collectively defined by quantification tools and subsequent data manipulation methods, that has been extensively applied in current biomedical, agricultural, and environmental studies. Despite recent advances, in-depth and high-quality quantification remains extremely challenging and requires the optimization of LFQs by comparatively evaluating their performance. However, the evaluation results using different criteria (precision, accuracy, and robustness) vary greatly, and the huge number of potential LFQs has become one of the bottlenecks in comprehensively optimizing proteome quantification. In this study, a novel strategy was therefore proposed that enables the discovery of LFQs with simultaneously enhanced performance from thousands of workflows (integrating 18 quantification tools with 3,128 manipulation chains). First, the feasibility of achieving simultaneous improvement in the precision, accuracy, and robustness of LFQ was systematically assessed by collectively optimizing its multistep manipulation chains. Second, based on a variety of benchmark datasets acquired by various quantification measurements of different modes of acquisition, this novel strategy successfully identified a number of manipulation chains that simultaneously improved the performance across multiple criteria. Finally, to further enhance proteome quantification and discover the LFQs of optimal performance, an online tool (https://idrblab.org/anpela/) enabling collective performance assessment (from multiple perspectives) of the entire LFQ workflow was developed. This study confirmed the feasibility of achieving simultaneous improvement in precision, accuracy, and robustness. The novel strategy proposed and validated in this study, together with the online tool, might provide useful guidance for research fields requiring mass-spectrometry-based LFQ techniques.


Subjects
Proteomics/methods , Proteome , Software , Workflow
7.
J Med Internet Res ; 23(8): e25670, 2021 08 04.
Article in English | MEDLINE | ID: mdl-34346903

ABSTRACT

BACKGROUND: Genealogical information, such as that found in family trees, is imperative for biomedical research such as disease heritability and risk prediction. Researchers have used policyholder and their dependent information in medical claims data and emergency contacts in electronic health records (EHRs) to infer family relationships at a large scale. We have previously demonstrated that online obituaries can be a novel data source for building more complete and accurate family trees. OBJECTIVE: Aiming at supplementing EHR data with family relationships for biomedical research, we built an end-to-end information extraction system using a multitask-based artificial neural network model to construct genealogical knowledge graphs (GKGs) from online obituaries. GKGs are enriched family trees with detailed information including age, gender, death and birth dates, and residence. METHODS: Built on a predefined family relationship map consisting of 4 types of entities (eg, people's name, residence, birth date, and death date) and 71 types of relationships, we curated a corpus containing 1700 online obituaries from the metropolitan area of Minneapolis and St Paul in Minnesota. We also adopted data augmentation technology to generate additional synthetic data to alleviate the issue of data scarcity for rare family relationships. A multitask-based artificial neural network model was then built to simultaneously detect names, extract relationships between them, and assign attributes (eg, birth dates and death dates, residence, age, and gender) to each individual. In the end, we assemble related GKGs into larger ones by identifying people appearing in multiple obituaries. RESULTS: Our system achieved satisfying precision (94.79%), recall (91.45%), and F-1 measures (93.09%) on 10-fold cross-validation. We also constructed 12,407 GKGs, with the largest one made up of 4 generations and 30 people. 
CONCLUSIONS: In this work, we discussed the meaning of GKGs for biomedical research, presented a new version of a corpus with a predefined family relationship map and augmented training data, and proposed a multitask deep neural system to construct and assemble GKGs. The results show that our system can extract this information and demonstrate its potential for enriching EHR data for further genetic research. We share the source code and system with the entire scientific community on GitHub, without the corpus, for privacy protection.


Subjects
Neural Networks, Computer , Pattern Recognition, Automated , Electronic Health Records , Humans , Information Storage and Retrieval , Knowledge
8.
J Med Internet Res ; 23(7): e26770, 2021 07 30.
Article in English | MEDLINE | ID: mdl-34328444

ABSTRACT

BACKGROUND: Patient portals tethered to electronic health records systems have become attractive web platforms since the enacting of the Medicare Access and Children's Health Insurance Program Reauthorization Act and the introduction of the Meaningful Use program in the United States. Patients can conveniently access their health records and seek consultation from providers through secure web portals. With increasing adoption and patient engagement, the volume of patient secure messages has risen substantially, which opens up new research and development opportunities for patient-centered care. OBJECTIVE: This study aims to develop a data model for patient secure messages based on the Fast Healthcare Interoperability Resources (FHIR) standard to identify and extract significant information. METHODS: We initiated the first draft of the data model by analyzing FHIR and manually reviewing 100 sentences randomly sampled from more than 2 million patient-generated secure messages obtained from the online patient portal at the Mayo Clinic Rochester between February 18, 2010, and December 31, 2017. We then annotated additional sets of 100 randomly selected sentences using the Multi-purpose Annotation Environment tool and updated the data model and annotation guideline iteratively until the interannotator agreement was satisfactory. We then created a larger corpus by annotating 1200 randomly selected sentences and calculated the frequency of the identified medical concepts in these sentences. Finally, we performed topic modeling analysis to learn the hidden topics of patient secure messages related to 3 highly mentioned microconcepts, namely, fatigue, prednisone, and patient visit, and to evaluate the proposed data model independently. RESULTS: The proposed data model has a 3-level hierarchical structure of health system concepts, including 3 macroconcepts, 28 mesoconcepts, and 85 microconcepts. 
Foundation and base macroconcepts comprise 33.99% (841/2474), clinical macroconcepts comprise 64.38% (1593/2474), and financial macroconcepts comprise 1.61% (40/2474) of the annotated corpus. The top 3 mesoconcepts among the 28 mesoconcepts are condition (505/2474, 20.41%), medication (424/2474, 17.13%), and practitioner (243/2474, 9.82%). Topic modeling identified hidden topics of patient secure messages related to fatigue, prednisone, and patient visit. A total of 89.2% (107/120) of the top-ranked topic keywords are actually the health concepts of the data model. CONCLUSIONS: Our data model and annotated corpus enable us to identify and understand important medical concepts in patient secure messages and prepare us for further natural language processing analysis of such free texts. The data model could be potentially used to automatically identify other types of patient narratives, such as those in various social media and patient forums. In the future, we plan to develop a machine learning and natural language processing solution to enable automatic triaging solutions to reduce the workload of clinicians and perform more granular content analysis to understand patients' needs and improve patient-centered care.


Subjects
Electronic Health Records , Medicare , Aged , Child , Humans , Meaningful Use , Natural Language Processing , Patient Participation , United States
9.
Mol Biol Rep ; 47(4): 2605-2617, 2020 Apr.
Article in English | MEDLINE | ID: mdl-32130618

ABSTRACT

Atrial fibrillation (AF) is a commonly encountered heart arrhythmia and a risk factor for the cardiovascular system. The purpose of the present study was to explore the role of long non-coding RNA myocardial infarction-associated transcript (MIAT) in AF and AF-induced myocardial fibrosis and the possible mechanisms involved in this process. We successfully induced an AF rat model. Expression of MIAT increased dramatically, while microRNA (miR)-133a-3p decreased dramatically, in atrium tissues of rats with AF induction. In addition, we found that MIAT was highly expressed and miR-133a-3p was significantly reduced in peripheral blood leukocytes of AF patients. To explore the biological function of the MIAT/miR-133a-3p axis, MIAT was knocked down by small hairpin RNA (shRNA) lentivirus injection, and rescue experiments were performed simultaneously by inhibiting miR-133a-3p with anti-miR-133a-3p lentivirus injection in rats with AF. MIAT downregulation significantly alleviated AF, increased the atrial effective refractory period (AERP), and reduced the duration of AF as well as cardiomyocyte apoptosis. These effects of MIAT downregulation on AF were reversed by anti-miR-133a-3p administration. A luciferase reporter assay revealed that miR-133a-3p was directly regulated by MIAT. Moreover, MIAT knockdown effectively reduced AF-induced atrial fibrosis, as shown by reduced collagen in the right atria, and inhibited the expression of the fibrosis-related genes collagen I, collagen III, connective tissue growth factor (CTGF) and transforming growth factor-β1 (TGF-β1) in rats with AF; these findings were in contrast with those for rats with inhibition of miR-133a-3p. In conclusion, our study demonstrated the role of MIAT downregulation in alleviating AF and AF-induced myocardial fibrosis, and the functional regulatory pathway of MIAT targeting miR-133a-3p.


Subjects
Atrial Fibrillation/genetics , MicroRNAs/genetics , RNA, Long Noncoding/genetics , Animals , Apoptosis/genetics , Atrial Fibrillation/physiopathology , Cardiomyopathies/genetics , Cardiomyopathies/metabolism , China , Endomyocardial Fibrosis/genetics , Endomyocardial Fibrosis/metabolism , Female , Fibrosis/metabolism , Humans , Male , MicroRNAs/metabolism , Myocardial Infarction/metabolism , Myocardium/metabolism , RNA, Long Noncoding/metabolism , Rats , Rats, Sprague-Dawley , Signal Transduction/genetics
10.
Nucleic Acids Res ; 46(D1): D911-D917, 2018 01 04.
Article in English | MEDLINE | ID: mdl-30053268

ABSTRACT

Delivering safe and effective therapeutic treatment to patients is one of the grand challenges in modern medicine. However, drug safety research has been progressing slowly in recent years compared to other fields such as biotechnology and precision medicine, due to the mechanistic complexity of adverse drug reactions (ADRs). To fill this gap, we developed a new database, the Adverse Drug Reaction Classification System-Target Profile (ADReCS-Target, http://bioinf.xmu.edu.cn/ADReCS-Target), which provides comprehensive information about ADRs caused by drug interactions with proteins, genes and genetic variations. In total, ADReCS-Target includes 66,573 pairwise relations, among which 1710 are protein-ADR associations, 2613 are genetic variation-ADR associations, and 63,298 are gene-ADR associations. In a case study exploring the mechanism of rash, we found that HLAs, C1QA and APOA1 are the key gene players and thus can be potential targets (or biomarkers) in monitoring or countering rashes. In summary, ADReCS-Target can be a useful resource for the biomedical scientific community, serving researchers in the fields of drug development, clinical pharmacology, and precision medicine, from wet lab to high-throughput computational platforms. In particular, it helps to identify drugs with better ADR profiles and to design safer drug therapy regimens.


Subjects
Databases, Pharmaceutical , Drug Delivery Systems , Drug-Related Side Effects and Adverse Reactions/genetics , Biotransformation/genetics , Data Collection , Data Curation , Data Mining/methods , Drug Interactions , Drug-Related Side Effects and Adverse Reactions/prevention & control , Humans , Proteins/metabolism , User-Computer Interface
11.
Medicina (Kaunas) ; 56(4)2020 Apr 03.
Article in English | MEDLINE | ID: mdl-32260044

ABSTRACT

Background and objectives: It is unclear why many patients with hypothyroidism prefer desiccated thyroid extract (DTE) as a thyroid hormone replacement formulation over levothyroxine (LT4) treatment, which is recommended by clinical practice guidelines. We analyzed patient-reported information from patient online forums to better understand patient preferences for and attitudes toward the use of DTE to treat hypothyroidism. Materials and Methods: We conducted a mixed-methods study by evaluating the content of online posts from three popular hypothyroidism forums from patients currently taking DTE (n = 673). From these posts, we extracted descriptive information on patient demographics and clinical characteristics and qualitatively analyzed the posts' content to further explore patient perceptions of DTE and other therapies. Results: Nearly half (46%) of the patients reported that a clinician initially drove their interest in trying DTE. Patients described many reasons for switching from a previous therapeutic approach to DTE, including lack of improvement in hypothyroidism-related symptoms (58%) and the development of side effects (22%). The majority of patients described DTE as moderately to majorly effective overall (81%) and more effective than the previous therapy (77%). The most frequently described benefits associated with DTE use were an improvement in symptoms (56%) and a change in overall well-being (34%). One-fifth of patients described side effects related to the use of DTE. Qualitative analysis of the posts' content supported these findings and raised additional issues around the need for individualizing therapy approaches for hypothyroidism (e.g., a sense that each patient has different needs), as well as difficulties obtaining DTE (e.g., issues with pharmacy availability). Conclusions: Lack of individualized treatment and a feeling of not being listened to were recurrent themes among DTE users. 
A subset of patients may prefer DTE to LT4 for many reasons, including perceived better effectiveness and improved overall well-being, despite the risks associated with DTE.


Subjects
Patients/psychology , Perception , Thyroid (USP)/therapeutic use , Adult , Aged , Female , Humans , Hypothyroidism/drug therapy , Hypothyroidism/psychology , Male , Middle Aged , Social Media/instrumentation , Social Media/statistics & numerical data , Thyroid (USP)/adverse effects , Thyroid (USP)/pharmacology
12.
Plant Physiol ; 177(1): 422-433, 2018 05.
Article in English | MEDLINE | ID: mdl-29530937

ABSTRACT

An advanced functional understanding of omics data is important for elucidating the design logic of physiological processes in plants and effectively controlling desired traits in plants. We present the latest versions of the Predicted Arabidopsis Interactome Resource (PAIR) and of the gene set linkage analysis (GSLA) tool, which enable the interpretation of an observed transcriptomic change (differentially expressed genes [DEGs]) in Arabidopsis (Arabidopsis thaliana) with respect to its functional impact for biological processes. PAIR version 5.0 integrates functional association data between genes in multiple forms and infers 335,301 putative functional interactions. GSLA relies on this high-confidence inferred functional association network to expand our perception of the functional impacts of an observed transcriptomic change. GSLA then interprets the biological significance of the observed DEGs using established biological concepts (annotation terms), describing not only the DEGs themselves but also their potential functional impacts. This unique analytical capability can help researchers gain deeper insights into their experimental results and highlight prospective directions for further investigation. We demonstrate the utility of GSLA with two case studies in which GSLA uncovered how molecular events may have caused physiological changes through their collective functional influence on biological processes. Furthermore, we showed that typical annotation-enrichment tools were unable to produce similar insights to PAIR/GSLA. The PAIR version 5.0-inferred interactome and GSLA Web tool both can be accessed at http://public.synergylab.cn/pair/.


Subjects
Arabidopsis/genetics , Gene Expression Profiling , Genes, Plant , Abscisic Acid/pharmacology , Algorithms , Arabidopsis/drug effects , Gene Expression Regulation, Plant/drug effects , Gene Ontology , Mutation/genetics , Open Reading Frames/genetics
13.
Epilepsy Behav ; 94: 65-71, 2019 05.
Article in English | MEDLINE | ID: mdl-30893617

ABSTRACT

OBJECTIVE: Epilepsy is among the most common chronic neurologic diseases. There is a need for more data on patient perspectives of treatment to guide patient-centered care initiatives. Patients with epilepsy share their experiences on social media anonymously, but little is known about those discussions. Our aim was to learn what patients with epilepsy discuss regarding their condition and identify treatment-related themes from online patient support groups. METHODS: A total of 355,838 posts were collected from three online support groups for patients with epilepsy through a crawling script, and an analytical pipeline was built to identify patient conversation content through leveraging of multiple text mining methods. Results were also displayed by network visualization methods. RESULTS: Patients with epilepsy sought information about medical treatments, shared their treatment experiences, and sought help from other posters. Key themes related to treatments included the search for optimal personalized treatment strategies as well as identifying and coping with adverse effects. SIGNIFICANCE: This study showed the feasibility of learning about concerns of patients with epilepsy, especially treatment issues, through text mining methods. However, some manual selection and filtering were necessary to ensure quality results for the treatment analysis. Providers should be aware of online discussions and use analyses of such discussions to help guide effective patient engagement during care.


Subjects
Epilepsy/psychology , Self-Help Groups , Social Media , Social Networking , Adaptation, Psychological/physiology , Chronic Disease/psychology , Chronic Disease/therapy , Data Mining/methods , Epilepsy/therapy , Humans , Patient Participation
15.
J Med Internet Res ; 21(4): e13316, 2019 04 30.
Article in English | MEDLINE | ID: mdl-31038462

ABSTRACT

BACKGROUND: Patents are important intellectual property protecting technological innovations that inspire efficient research and development in biomedicine. The number of awarded patents serves as an important indicator of economic growth and technological innovation. Researchers have mined patents to characterize the focuses and trends of technological innovations in many fields. OBJECTIVE: To expand patent mining to biomedicine and facilitate future resource allocation in biomedical research for the United States, we analyzed US patent documents to determine the focuses and trends of protected technological innovations across the entire disease landscape. METHODS: We analyzed more than 5 million US patent documents between 1995 and 2017, using summary statistics and dynamic topic modeling. More specifically, we investigated the disease coverage and latent topics in patent documents over time. We also incorporated the patent data into the calculation of our recently developed Research Opportunity Index (ROI) and Public Health Index (PHI), to recalibrate the resource allocation in biomedical research. RESULTS: Our analysis showed that protected technological innovations have been primarily focused on socioeconomically critical diseases such as "other cancers" (malignant neoplasm of head, face, neck, abdomen, pelvis, or limb; disseminated malignant neoplasm; Merkel cell carcinoma; and malignant neoplasm, malignant carcinoid tumors, neuroendocrine tumor, and carcinoma in situ of an unspecified site), diabetes mellitus, and obesity. The United States has significantly improved resource allocation to biomedical research and development over the past 17 years, as illustrated by the decreasing PHI. Diseases with positive ROI, such as ankle and foot fracture, indicate potential research opportunities for the future. Development of novel chemical or biological drugs and electrical devices for diagnosis and disease management is the dominant topic in patented inventions. CONCLUSIONS: This multifaceted analysis of patent documents provides a deep understanding of the focuses and trends of patented technological innovations in disease management. Our findings offer insights into future research and innovation opportunities and provide actionable information to help policy makers, payers, and investors make better evidence-based decisions regarding resource allocation in biomedicine.


Subjects
Data Mining/methods , Disease Management , History, 20th Century , History, 21st Century , Humans , Inventions , Technology , United States
16.
BMC Med Inform Decis Mak ; 19(Suppl 6): 263, 2019 12 19.
Article in English | MEDLINE | ID: mdl-31856819

ABSTRACT

BACKGROUND: Sequence alignment is a way of arranging sequences (e.g., DNA, RNA, protein, natural language, financial data, or medical events) to identify the relatedness between two or more sequences and regions of similarity. For Electronic Health Records (EHR) data, sequence alignment helps to identify patients with similar disease trajectories for more relevant and precise prognosis, diagnosis and treatment of patients. METHODS: We tested two cutting-edge global sequence alignment methods, namely dynamic time warping (DTW) and the Needleman-Wunsch algorithm (NWA), together with their local modifications, DTW for Local alignment (DTWL) and the Smith-Waterman algorithm (SWA), for aligning patient medical records. We also used 4 sets of synthetic patient medical records generated from a large real-world EHR database as gold standard data, to objectively evaluate these sequence alignment algorithms. RESULTS: For global sequence alignments, 47 out of 80 DTW alignments and 11 out of 80 NWA alignments had higher similarity scores than the reference alignments, while the remaining 33 DTW alignments and 69 NWA alignments had the same similarity scores as the reference alignments. Forty-six out of 80 DTW alignments had better similarity scores than NWA alignments, with the remaining 34 cases having equal similarity scores under both algorithms. For local sequence alignments, 70 out of 80 DTWL alignments and 68 out of 80 SWA alignments had larger coverage and higher similarity scores than the reference alignments, while the remaining DTWL and SWA alignments received the same coverage and similarity scores as the reference alignments. Six out of 80 DTWL alignments showed larger coverage and higher similarity scores than SWA alignments. Thirty DTWL alignments had equal coverage but better similarity scores than SWA. DTWL and SWA received equal coverage and similarity scores in the remaining 44 cases. CONCLUSIONS: DTW, NWA, DTWL and SWA outperformed the reference alignments. DTW (or DTWL) seems to align better than NWA (or SWA) by inserting new daily events and identifying more similarities between patient medical records. The evaluation results could provide valuable information on the strengths and weaknesses of these sequence alignment methods for future development of sequence alignment methods and patient similarity-based studies.
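
As a rough illustration of the two global methods named in this abstract, the following is a minimal sketch of textbook Needleman-Wunsch scoring and textbook dynamic time warping applied to short sequences of event labels. The scoring parameters (`match`, `mismatch`, `gap`), the 0/1 event cost, and the example event names are assumptions chosen for illustration; the abstract does not state the scoring scheme the authors actually used.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of sequences a and b (textbook NW)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against an empty sequence costs one gap per element.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[-1][-1]


def dtw_distance(a, b, cost=lambda x, y: 0 if x == y else 1):
    """Textbook DTW distance; lower means more similar (0/1 cost is an assumption)."""
    inf = float("inf")
    d = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    d[0][0] = 0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            # An element may be matched to several elements of the other sequence.
            d[i][j] = cost(a[i - 1], b[j - 1]) + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[-1][-1]


# Hypothetical daily event sequences for two patients.
patient_a = ["flu", "cough", "fever"]
patient_b = ["flu", "fever"]
nw = needleman_wunsch_score(patient_a, patient_b)
dtw = dtw_distance(patient_a, patient_b)
```

Note that Needleman-Wunsch returns a similarity score (higher is better) while this DTW variant returns a distance (lower is better), so the two values are not directly comparable without a conversion such as the one the study would have had to define.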


Subjects
Cross-Cultural Comparison , Electronic Health Records/statistics & numerical data , Sequence Alignment , Algorithms , Diagnosis , Electronic Health Records/classification , Humans , Prognosis , Therapeutics
17.
BMC Med Inform Decis Mak ; 19(Suppl 3): 80, 2019 04 04.
Article in English | MEDLINE | ID: mdl-30943977

ABSTRACT

BACKGROUND: Accurate information in provider directories is vital in health care, including health information exchange, health benefits exchange, quality reporting, and the reimbursement and delivery of care. Maintaining provider directory data and keeping it up to date is challenging. The objective of this study is to determine the feasibility of using natural language processing (NLP) techniques to combine disparate resources and acquire accurate information on health providers. METHODS: Publicly available state licensure lists in Connecticut were obtained along with National Plan and Provider Enumeration System (NPPES) public use files. The Connecticut licensure lists contain textual information on each health professional who is licensed to practice within the state. An NLP-based system was developed based on healthcare provider taxonomy code, location, name and address information to identify textual data within the state and federal records. Qualitative and quantitative evaluations were performed, and the recall and precision were calculated. RESULTS: We identified nurse midwives, nurse practitioners, and dentists in the State of Connecticut. The recall and precision were 0.95 and 0.93, respectively. Using the system, we were able to accurately acquire 6849 of the 7177 records of health provider directory information. CONCLUSIONS: The authors demonstrated that the NLP-based approach was effective at acquiring health provider information. Furthermore, the NLP-based system can be reapplied at any time to update information, further reducing the processing burden as data change.
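
For readers unfamiliar with the two metrics reported here, the sketch below shows how precision and recall are conventionally computed from match counts. The function name and the counts in the usage lines are hypothetical and are not taken from the study, which reports only the final values.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from record-matching counts."""
    # Precision: share of records the system returned that were correct.
    precision = true_positives / (true_positives + false_positives)
    # Recall: share of correct records that the system managed to return.
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall


# Hypothetical counts for illustration only.
precision, recall = precision_recall(true_positives=90, false_positives=10, false_negatives=30)
```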


Subjects
Dentists , Directories as Topic , Midwifery , Natural Language Processing , Nurse Practitioners , Connecticut , Humans
18.
J Med Internet Res ; 20(5): e10047, 2018 05 08.
Article in English | MEDLINE | ID: mdl-29739741

ABSTRACT

BACKGROUND: Society always has limited resources to expend on health care, or anything else. What are the unmet medical needs? How do we allocate limited resources to maximize the health and welfare of the people? These challenging questions might be re-examined systematically within an infodemiological frame on a much larger scale, leveraging the latest advancements in information technology and data science. OBJECTIVE: We expanded our previous work by investigating news media data to reveal the coverage of different diseases and medical conditions, together with their sentiments and topics in news articles over three decades. We were motivated to do so because news media play a significant role in politics and affect public policy making. METHODS: We analyzed over 3.5 million archived news articles from Reuters media during the periods of 1996/1997, 2008 and 2016, using summary statistics, sentiment analysis, and topic modeling. Summary statistics illustrated the coverage of various diseases and medical conditions during the last 3 decades. Sentiment analysis and topic modeling helped us automatically detect the sentiments of news articles (ie, positive versus negative) and topics (ie, a series of keywords) associated with each disease over time. RESULTS: The percentages of news articles mentioning diseases and medical conditions were 0.44%, 0.57% and 0.81% in the three time periods, suggesting that news media or the public have gradually increased their interest in medicine since 1996. Certain diseases such as other malignant neoplasm (34%), other infectious diseases (20%), and influenza (11%) represented the most covered diseases. Two hundred and twenty-six diseases and medical conditions (97.8%) were found to have neutral or negative sentiments in the news articles. Using topic modeling, we identified meaningful topics on these diseases and medical conditions. For instance, the smoking theme appeared in the news articles on other malignant neoplasm only during 1996/1997. The topic phrases HIV and Zika virus were linked to other infectious diseases during 1996/1997 and 2016, respectively. CONCLUSIONS: The multi-dimensional analysis of news media data allows the discovery of the focus, sentiments and topics of news media in terms of diseases and medical conditions. These infodemiological discoveries could shed light on unmet medical needs and research priorities for the future and provide guidance for decision making in public policy.


Subjects
Information Services/trends , Internet/trends , Mass Media/trends , Public Opinion , Humans
19.
Molecules ; 23(11)2018 Nov 19.
Article in English | MEDLINE | ID: mdl-30463177

ABSTRACT

The interaction of death-associated protein kinase 1 (DAPK1) with the 2B subunit (GluN2B) C-terminus of N-methyl-D-aspartate receptor (NMDAR) plays a critical role in the pathophysiology of depression and is considered a potential target for the structure-based discovery of new antidepressants. However, the 3D structures of C-terminus residues 1290-1310 of GluN2B (GluN2B-CT1290-1310) remain elusive and the interaction between GluN2B-CT1290-1310 and DAPK1 is unknown. In this study, the mechanism of interaction between DAPK1 and GluN2B-CT1290-1310 was predicted by computational simulation methods including protein-peptide docking and molecular dynamics (MD) simulation. Based on the equilibrated MD trajectory, the total binding free energy between GluN2B-CT1290-1310 and DAPK1 was computed by the molecular mechanics/generalized Born surface area (MM/GBSA) approach. The simulation results showed that hydrophobic, van der Waals, and electrostatic interactions are responsible for the binding of GluN2B-CT1290-1310/DAPK1. Moreover, through per-residue free energy decomposition and in silico alanine scanning analysis, hotspot residues at the GluN2B-CT1290-1310/DAPK1 interface were identified. In conclusion, this work predicted the binding mode and quantitatively characterized the protein-peptide interface, which will aid in the discovery of novel drugs targeting the GluN2B-CT1290-1310 and DAPK1 interface.
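
For orientation, the MM/GBSA binding free energy mentioned in this abstract is conventionally written as the difference below. This is the standard textbook decomposition, not a formula quoted from the paper, and the entropy term $-T\Delta S$ is often omitted in practice; whether the authors included it is not stated in the abstract.

```latex
\begin{aligned}
\Delta G_{\text{bind}} &= G_{\text{complex}} - \left( G_{\text{DAPK1}} + G_{\text{peptide}} \right), \\
G &= E_{\text{MM}} + G_{\text{GB}} + G_{\text{SA}} - TS, \\
E_{\text{MM}} &= E_{\text{internal}} + E_{\text{vdW}} + E_{\text{elec}}.
\end{aligned}
```

Here $E_{\text{MM}}$ is the gas-phase molecular mechanics energy, $G_{\text{GB}}$ the polar solvation term from the generalized Born model, and $G_{\text{SA}}$ the nonpolar solvation term estimated from the solvent-accessible surface area, each averaged over snapshots of the MD trajectory.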


Subjects
Death-Associated Protein Kinases , Molecular Docking Simulation , Molecular Dynamics Simulation , Receptors, N-Methyl-D-Aspartate , Humans , Hydrogen Bonding , Hydrophobic and Hydrophilic Interactions , Protein Binding , Thermodynamics
20.
BMC Med Inform Decis Mak ; 17(Suppl 2): 74, 2017 Jul 05.
Article in English | MEDLINE | ID: mdl-28699568

ABSTRACT

BACKGROUND: To deliver evidence-based medicine, clinicians often reference resources that are useful to their respective medical practices. Owing to their busy schedules, however, clinicians typically find it challenging to locate these relevant resources among the rapidly growing number of journals and articles currently being published. A literature-recommender system may provide a possible solution to this issue if the individual needs of clinicians can be identified and applied. METHODS: We thus collected from the CiteULike website a sample of 96 clinicians and 6,221 scientific articles that they read. We examined the journal distributions, publication types, reading times, and geographic locations. We then compared the distributions of MeSH terms associated with these articles with those of randomly sampled MEDLINE articles using a two-sample Z-test and multiple comparison correction, in order to identify the important topics relevant to clinicians. RESULTS: We determined that the sampled clinicians followed the latest literature in a timely manner and read papers that are considered landmarks in medical research history. They preferred to read scientific discoveries from human experiments instead of molecular-, cellular- or animal-model-based experiments. Furthermore, the country of publication may impact reading preferences, particularly for clinicians from Egypt, India, Norway, Senegal, and South Africa. CONCLUSION: These findings provide useful guidance for developing personalized literature-recommender systems for clinicians.
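
The comparison of MeSH term frequencies described in the methods can be sketched as a two-sample Z-test on proportions followed by a multiple comparison correction. The abstract does not say which correction was applied, so the Bonferroni step below is an assumption, as are the function names and the counts in the usage lines.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided Z-test for the difference between proportions x1/n1 and x2/n2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def bonferroni_significant(p_values, alpha=0.05):
    """Flag p-values that survive a Bonferroni correction (assumed correction)."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


# Hypothetical counts: a MeSH term appears in 60 of 100 clinician-read
# articles versus 40 of 100 randomly sampled articles.
z, p = two_proportion_z_test(60, 100, 40, 100)
flags = bonferroni_significant([p, 0.03])
```

With one test per MeSH term, the number of comparisons can be large, which is why some correction of this kind is needed before calling a term over-represented.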


Subjects
Bibliographies as Topic , Biomedical Research , Choice Behavior , MEDLINE , Medical Subject Headings , Humans