Results 1 - 11 of 11
1.
Health Serv Res ; 53(2): 1110-1136, 2018 04.
Article in English | MEDLINE | ID: mdl-28295260

ABSTRACT

OBJECTIVE: To evaluate the prevalence of seven social factors using physician notes as compared to claims and structured electronic health records (EHRs) data and the resulting association with 30-day readmissions. STUDY SETTING: A multihospital academic health system in southeastern Massachusetts. STUDY DESIGN: An observational study of 49,319 patients with cardiovascular disease admitted from January 1, 2011, to December 31, 2013, using multivariable logistic regression to adjust for patient characteristics. DATA COLLECTION/EXTRACTION METHODS: All-payer claims, EHR data, and physician notes extracted from a centralized clinical registry. PRINCIPAL FINDINGS: All seven social characteristics were identified at the highest rates in physician notes. For example, we identified 14,872 patient admissions with poor social support in physician notes, increasing the prevalence from 0.4 percent using ICD-9 codes and structured EHR data to 16.0 percent. Compared to an 18.6 percent baseline readmission rate, risk-adjusted analysis showed higher readmission risk for patients with housing instability (readmission rate 24.5 percent; p < .001), depression (20.6 percent; p < .001), drug abuse (20.2 percent; p = .01), and poor social support (20.0 percent; p = .01). CONCLUSIONS: The seven social risk factors studied are substantially more prevalent than represented in administrative data. Automated methods for analyzing physician notes may enable better identification of patients with social needs.


Subjects
Documentation/statistics & numerical data , Electronic Health Records/statistics & numerical data , Patient Readmission/statistics & numerical data , Physicians , Accidental Falls/statistics & numerical data , Adolescent , Adult , Age Factors , Aged , Aged, 80 and over , Depression/epidemiology , Female , Ill-Housed Persons/statistics & numerical data , Humans , Insurance Claim Review/statistics & numerical data , Logistic Models , Male , Massachusetts , Middle Aged , Natural Language Processing , Risk Factors , Sex Factors , Social Support , Socioeconomic Factors , Substance-Related Disorders/epidemiology , Time Factors , Young Adult
2.
IEEE Trans Vis Comput Graph ; 24(1): 215-225, 2018 01.
Article in English | MEDLINE | ID: mdl-28866563

ABSTRACT

Finding patterns in graphs has become a vital challenge in many domains, from biological systems and network security to finance (e.g., finding money laundering rings of bankers and business owners). While there is significant interest in graph databases and querying techniques, less research has focused on helping analysts make sense of underlying patterns within a group of subgraph results. Visualizing graph query results is challenging, requiring effective summarization of a large number of subgraphs, each having potentially shared node-values, rich node features, and flexible structure across queries. We present VIGOR, a novel interactive visual analytics system for exploring and making sense of query results. VIGOR uses multiple coordinated views, leveraging different data representations and organizations to streamline analysts' sensemaking process. VIGOR contributes: (1) an exemplar-based interaction technique, where an analyst starts with a specific result and relaxes constraints to find other similar results, or starts with only the structure (i.e., without node value constraints) and adds constraints to narrow in on specific results; and (2) a novel feature-aware subgraph result summarization. Through a collaboration with Symantec, we demonstrate how VIGOR helps tackle real-world problems through the discovery of security blindspots in a cybersecurity dataset with over 11,000 incidents. We also evaluate VIGOR with a within-subjects study, demonstrating VIGOR's ease of use over a leading graph database management system, and its ability to help analysts understand their results at higher speed and make fewer errors.

3.
J Biomed Inform ; 61: 267-75, 2016 06.
Article in English | MEDLINE | ID: mdl-27064059

ABSTRACT

OBJECTIVE: A significant challenge in treating rare forms of cancer such as Glioblastoma (GBM) is to find optimal personalized treatment plans for patients. The goals of our study are to predict which patients survive longer than the median survival time for GBM based on clinical and genomic factors, and to assess the predictive power of treatment patterns. METHOD: We developed a predictive model based on the clinical and genomic data from approximately 300 newly diagnosed GBM patients for a period of 2 years. We proposed sequential mining algorithms with novel clinical constraints, namely, 'exact-order' and 'temporal overlap' constraints, to extract treatment patterns as features used in predictive modeling. With diverse features from clinical and genomic information and treatment patterns, we applied both a logistic regression model and Cox regression to model patient survival outcome. RESULTS: The most predictive features influencing the survival period of GBM patients included mRNA expression levels of certain genes, some clinical characteristics such as age and Karnofsky performance score, and therapeutic agents prescribed in treatment patterns. Our models achieved a c-statistic of 0.85 for logistic regression and 0.84 for Cox regression. CONCLUSIONS: We demonstrated the importance of diverse sources of features in predicting GBM patient survival outcome. The predictive model presented in this study is a preliminary step in a long-term plan of developing personalized treatment plans for GBM patients that can later be extended to other types of cancers.
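
For readers unfamiliar with the c-statistic reported above, the following is a minimal sketch of how it can be computed for a binary outcome such as "survived longer than the median". The function name and the toy scores are illustrative assumptions, not the authors' code or data. The Cox-regression variant (which must handle censoring) is not covered here.

```python
from itertools import product


def c_statistic(scores, labels):
    """Concordance for a binary outcome.

    Returns the fraction of (event, non-event) pairs in which the event
    case received the higher predicted score; ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case of each class")
    concordant = 0.0
    for p, n in product(pos, neg):
        if p > n:
            concordant += 1.0
        elif p == n:
            concordant += 0.5
    return concordant / (len(pos) * len(neg))


# Hypothetical predicted probabilities for four patients.
print(c_statistic([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which gives context for the reported 0.85 and 0.84.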


Subjects
Brain Neoplasms , Data Mining , Genetic Markers , Glioblastoma , Algorithms , Humans , Models, Theoretical , Prognosis , RNA, Messenger/metabolism , Survival Rate
4.
AVI ; 2016: 272-279, 2016 Jun.
Article in English | MEDLINE | ID: mdl-28553670

ABSTRACT

Extracting useful patterns from large network datasets has become a fundamental challenge in many domains. We present VISAGE, an interactive visual graph querying approach that empowers users to construct expressive queries (e.g., finding money laundering rings of bankers and business owners) without writing complex code. Our contributions are as follows: (1) we introduce graph autocomplete, an interactive approach that guides users to construct and refine queries, preventing over-specification; (2) VISAGE guides the construction of graph queries using a data-driven approach, enabling users to specify queries with varying levels of specificity, from concrete and detailed (e.g., query by example), to abstract (e.g., with "wildcard" nodes of any type), to purely structural matching; (3) a twelve-participant, within-subject user study demonstrates VISAGE's ease of use and the ability to construct graph queries significantly faster than using a conventional query language; (4) VISAGE works on real graphs with over 468K edges, achieving sub-second response times for common queries.

5.
Stud Health Technol Inform ; 216: 629-33, 2015.
Article in English | MEDLINE | ID: mdl-26262127

ABSTRACT

About 1 in 10 adults are reported to exhibit clinical depression, and the associated personal, societal, and economic costs are significant. In this study, we applied the MTERMS NLP system and machine learning classification algorithms to identify patients with depression using discharge summaries. Domain experts reviewed both the training and test cases, and classified these cases as depression with high, intermediate, or low confidence. For depression cases with high confidence, all of the algorithms we tested performed similarly, with MTERMS' knowledge-based decision tree slightly better than the machine learning classifiers, achieving an F-measure of 89.6%. MTERMS also achieved the highest F-measure (70.6%) on intermediate confidence cases. The RIPPER rule learner was the best performing machine learning method, with an F-measure of 70.0%, and a higher precision but lower recall than MTERMS. The proposed NLP-based approach was able to identify a significant portion of the depression cases (about 20%) that were not on the coded diagnosis list.
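
The F-measure figures above combine precision and recall into one number. As a minimal sketch of the standard definition (the function name and the counts in the usage line are hypothetical, not taken from the study), it could be computed as:

```python
def f_measure(tp, fp, fn, beta=1.0):
    """F-measure from true-positive, false-positive and false-negative counts.

    With beta=1.0 this is the harmonic mean of precision and recall.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts: 6 correct detections, 2 false alarms, 6 misses.
print(f_measure(tp=6, fp=2, fn=6))
```

Because the harmonic mean is pulled toward the lower of its two inputs, two systems with similar F-measures (such as 70.6% and 70.0%) can still differ in their precision/recall balance, as the abstract notes for RIPPER and MTERMS.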


Subjects
Data Mining/methods , Decision Support Systems, Clinical/organization & administration , Depression/diagnosis , Diagnosis, Computer-Assisted/methods , Electronic Health Records/classification , Natural Language Processing , Boston , Depression/classification , Humans , Machine Learning , Reproducibility of Results , Sensitivity and Specificity
6.
Int J Data Min Bioinform ; 1(1): 88-110, 2006.
Article in English | MEDLINE | ID: mdl-18402044

ABSTRACT

One of the key challenges of microarray studies is to derive biological insights from the gene-expression patterns. Clustering genes by functional keyword association can provide direct information about the functional links among genes. However, the quality of the keyword lists significantly affects the clustering results. We compared two keyword weighting schemes: normalised z-score and term frequency-inverse document frequency (TFIDF). Two gene sets were tested to evaluate the effectiveness of the weighting schemes for keyword extraction for gene clustering. Using established measures of cluster quality, the results produced from TFIDF-weighted keywords outperformed those produced from normalised z-score weighted keywords. The optimised algorithms should be useful for partitioning genes from microarray lists into functionally discrete clusters.
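
To make the TFIDF weighting scheme concrete, here is a minimal sketch of one common TF-IDF variant applied to per-gene keyword lists. The abstract does not give the exact formula the authors used, so the normalisation, the function name, and the toy tokens are all assumptions for illustration.

```python
import math
from collections import Counter


def tfidf_weights(docs_by_gene):
    """Weight each keyword per gene by term frequency x inverse document frequency.

    docs_by_gene maps a gene name to the list of tokens pooled from its abstracts.
    """
    n = len(docs_by_gene)
    # Document frequency: in how many genes' token lists does each term occur?
    df = Counter()
    for tokens in docs_by_gene.values():
        df.update(set(tokens))
    weights = {}
    for gene, tokens in docs_by_gene.items():
        tf = Counter(tokens)
        total = len(tokens)
        weights[gene] = {
            term: (count / total) * math.log(n / df[term])
            for term, count in tf.items()
        }
    return weights


# Toy input: "cell" appears for every gene, so its weight drops to zero.
toy = {"g1": ["kinase", "cell", "cell"], "g2": ["cell", "histone"]}
print(tfidf_weights(toy))
```

The resulting per-gene weight dictionaries can serve as feature vectors for clustering; terms shared by every gene carry no discriminating weight under this variant.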


Subjects
Electronic Data Processing , Gene Expression Regulation, Fungal/physiology , Genes, Fungal/physiology , MEDLINE , Saccharomyces cerevisiae/physiology , Vocabulary, Controlled
7.
Article in English | MEDLINE | ID: mdl-16447994

ABSTRACT

Searching for a specific topic in the PubMed database, one of the most important information resources for the scientific community, presents a big challenge to users. The researcher typically formulates Boolean queries and then scans the retrieved records for relevance, which is very time consuming and error prone. We applied Support Vector Machines (SVM) for automatic retrieval of PubMed articles related to human genome epidemiological research at the Centers for Disease Control and Prevention (CDC). In this paper, we discuss various investigations into biomedical literature classification and analyze the effect of various issues related to the choice of keywords, training sets, kernel functions and parameters for the SVM technique. We report on these factors to show that SVM is a viable technique for automatic classification of biomedical literature into topics of interest such as epidemiology, cancer, and birth defects. In all our experiments, we achieved high values of PPV, sensitivity and specificity.


Subjects
Abstracting and Indexing/methods , Database Management Systems , Information Storage and Retrieval/methods , Natural Language Processing , Pattern Recognition, Automated/methods , Periodicals as Topic , PubMed , Algorithms , Artificial Intelligence , Vocabulary, Controlled
8.
Article in English | MEDLINE | ID: mdl-17044165

ABSTRACT

Partitioning closely related genes into clusters has become an important element of practically all statistical analyses of microarray data. A number of computer algorithms have been developed for this task. Although these algorithms have demonstrated their usefulness for gene clustering, some basic problems remain. This paper describes our work on extracting functional keywords from MEDLINE for a set of genes that are isolated for further study from microarray experiments based on their differential expression patterns. The sharing of functional keywords among genes is used as a basis for clustering in a new approach called BEA-PARTITION in this paper. Functional keywords associated with genes were extracted from MEDLINE abstracts. We modified the Bond Energy Algorithm (BEA), which is widely accepted in psychology and database design but is virtually unknown in bioinformatics, to cluster genes by functional keyword associations. The results showed that BEA-PARTITION and the hierarchical clustering algorithm outperformed k-means clustering and the self-organizing map by correctly assigning 25 of 26 genes in a test set of four known gene groups. To evaluate the effectiveness of BEA-PARTITION for clustering genes identified by microarray profiles, 44 yeast genes that are differentially expressed during the cell cycle and have been widely studied in the literature were used as a second test set. Using established measures of cluster quality, the results produced by BEA-PARTITION had higher purity, lower entropy, and higher mutual information than those produced by k-means and the self-organizing map. Whereas BEA-PARTITION and hierarchical clustering produced clusters of similar quality, BEA-PARTITION provides clearer cluster boundaries than hierarchical clustering. BEA-PARTITION is simple to implement and provides a powerful approach to clustering genes or to any clustering problem where starting matrices are available from experimental observations.
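
The purity and entropy measures named above can be sketched as follows. This uses the usual textbook definitions (size-weighted over clusters, entropy in bits). The abstract does not spell out the exact variants used, so treat the function name, the weighting, and the toy labels as assumptions.

```python
import math
from collections import Counter


def purity_and_entropy(clusters, truth):
    """Return (purity, entropy) for a clustering against known group labels.

    clusters and truth are parallel lists: cluster ID and true group per gene.
    Higher purity and lower entropy indicate cleaner clusters.
    """
    n = len(truth)
    by_cluster = {}
    for c, t in zip(clusters, truth):
        by_cluster.setdefault(c, []).append(t)
    purity = 0.0
    entropy = 0.0
    for members in by_cluster.values():
        counts = Counter(members)
        size = len(members)
        # Purity: share of all genes that belong to their cluster's majority group.
        purity += max(counts.values()) / n
        # Entropy of the group mix inside this cluster, weighted by cluster size.
        h = -sum((k / size) * math.log2(k / size) for k in counts.values())
        entropy += (size / n) * h
    return purity, entropy


# Toy example: the second cluster mixes two known groups.
print(purity_and_entropy([0, 0, 1, 1], ["a", "a", "a", "b"]))
```

A clustering that reproduces the known groups exactly scores purity 1.0 and entropy 0.0, which is the direction of improvement the abstract reports for BEA-PARTITION.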


Subjects
Algorithms , MEDLINE , Multigene Family/physiology , Natural Language Processing , Periodicals as Topic , Protein Interaction Mapping/methods , Proteins/metabolism , Abstracting and Indexing/methods , Gene Expression Profiling/methods , Information Storage and Retrieval/methods , Proteins/classification , Vocabulary, Controlled
9.
Nucleic Acids Res ; 33(Database issue): D611-3, 2005 Jan 01.
Article in English | MEDLINE | ID: mdl-15608272

ABSTRACT

MITOMAP (http://www.MITOMAP.org), a database for the human mitochondrial genome, has grown rapidly in data content over the past several years as interest in the role of mitochondrial DNA (mtDNA) variation in human origins, forensics, degenerative diseases, cancer and aging has increased dramatically. To accommodate this information explosion, MITOMAP has implemented a new relational database and an improved search engine, and all programs have been rewritten. System administrative changes have been made to improve security and efficiency, and to make MITOMAP compatible with a new automatic mtDNA sequence analyzer known as Mitomaster.


Subjects
DNA, Mitochondrial/chemistry , Databases, Nucleic Acid , Genome, Human , Mitochondria/genetics , Database Management Systems , Genetic Predisposition to Disease , Genetic Variation , Genomics , Humans , Mutation , Systems Integration , User-Computer Interface
10.
Stud Health Technol Inform ; 107(Pt 1): 292-6, 2004.
Article in English | MEDLINE | ID: mdl-15360821

ABSTRACT

Modern experimental techniques provide the ability to gather vast amounts of biological data in a single experiment (e.g. DNA microarray experiment), making it extremely difficult for the researcher to interpret the data and form conclusions about the functions of the genes. Current approaches provide useful information that organizes or relates genes, but a major shortcoming is that they either do not address specific functions of the genes or are constrained by functions predefined in other databases, which can be biased, incomplete, or out-of-date. We extended Andrade and Valencia's method [1] to statistically mine functional keywords associated with genes from MEDLINE abstracts. The MEDLINE abstracts are analyzed statistically to score and rank keywords for each gene using a background set of words for baseline frequencies. We generally obtained very good functional keyword information about the genes we tested, which was confirmed by searching for the individual keywords in context. The keywords extracted by our algorithm reveal a wealth of potential functional concepts that were not represented in existing public databases. We feel that this approach is general enough to apply to medical and biological literature to find other relationships: drugs vs. genes, risk-factors vs. genes, etc.
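
The idea of scoring keywords against a background set can be illustrated with a simple z-score: how far a word's frequency for one gene lies from its mean frequency across background sets, in units of standard deviation. This is a hedged sketch of that general idea only. The function name, the input layout, and the toy frequencies are assumptions, and the authors' extension of the cited method may differ in its details.

```python
import statistics


def keyword_z_scores(gene_freq, background_freqs):
    """Rank a gene's keywords by z-score relative to background frequencies.

    gene_freq: {word: frequency in this gene's abstracts}
    background_freqs: {word: list of frequencies across background sets}
    Words with fewer than two background values, or zero spread, are skipped.
    """
    scores = {}
    for word, freq in gene_freq.items():
        bg = background_freqs.get(word, [])
        if len(bg) < 2:
            continue
        sd = statistics.stdev(bg)
        if sd == 0:
            continue
        scores[word] = (freq - statistics.mean(bg)) / sd
    # Highest-scoring (most gene-specific) keywords first.
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))


# Toy frequencies: "kinase" is far more frequent for this gene than in the background.
gene = {"kinase": 0.5, "cell": 0.2}
background = {"kinase": [0.1, 0.2, 0.3], "cell": [0.1, 0.2, 0.3]}
print(keyword_z_scores(gene, background))
```

Words that are common everywhere (here "cell") score near zero, while words unusually frequent for the gene rise to the top of the ranking.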


Subjects
Genes , Information Storage and Retrieval , Subject Headings , Algorithms , Databases, Genetic , Gene Expression Profiling , MEDLINE , Oligonucleotide Array Sequence Analysis , Statistics as Topic
11.
Article in English | MEDLINE | ID: mdl-16448032

ABSTRACT

One of the key challenges of microarray studies is to derive biological insights from the unprecedented quantities of data on gene-expression patterns. Clustering genes by functional keyword association can provide direct information about the nature of the functional links among genes within the derived clusters. However, the quality of the keyword lists extracted from biomedical literature for each gene significantly affects the clustering results. We extracted keywords from MEDLINE that describe the most prominent functions of the genes, and used the resulting weights of the keywords as feature vectors for gene clustering. By analyzing the resulting cluster quality, we compared two keyword weighting schemes: normalized z-score and term frequency-inverse document frequency (TFIDF). The best combination of background comparison set, stop list and stemming algorithm was selected based on precision and recall metrics. In a test set of four known gene groups, a hierarchical algorithm correctly assigned 25 of 26 genes to the appropriate clusters based on keywords extracted by the TFIDF weighting scheme, but only 23 of 26 with the z-score method. To evaluate the effectiveness of the weighting schemes for keyword extraction for gene clusters from microarray profiles, 44 yeast genes that are differentially expressed during the cell cycle were used as a second test set. Using established measures of cluster quality, the results produced from TFIDF-weighted keywords had higher purity, lower entropy, and higher mutual information than those produced from normalized z-score weighted keywords. The optimized algorithms should be useful for sorting genes from microarray lists into functionally discrete clusters.


Subjects
Artificial Intelligence , Cluster Analysis , MEDLINE , Multigene Family/genetics , Natural Language Processing , Oligonucleotide Array Sequence Analysis/methods , Vocabulary, Controlled , Information Storage and Retrieval/methods , Structure-Activity Relationship