Pesquisa | Portal Regional da BVS

1.

SocioPedia+: a visual analytics system for social knowledge graph-based event exploration.

Nguyen, Tra My; Chun, Hong-Woo; Hwang, Myunggwon; Kwon, Lee-Nam; Lee, Jae-Min; Park, Kanghee; Jung, Jason J.

PeerJ Comput Sci ; 9: e1277, 2023.

Artigo em Inglês | MEDLINE | ID: mdl-37346548

RESUMO

In the recent era of information explosion, exploring event from social networks has recently been a crucial task for many applications. To derive valuable comprehensive and thorough insights on social events, visual analytics (VA) system have been broadly used as a promising solution. However, due to the enormous social data volume with highly diversity and complexity, the number of event exploration tasks which can be enabled in a conventional real-time visual analytics systems has been limited. In this article, we introduce SocioPedia+, a real-time visual analytics system for social event exploration in time and space domains. By introducing the dimension of social knowledge graph analysis into the system multivariate analysis, the process of event explorations in SocioPedia+ can be significantly enhanced and thus enabling system capability on performing full required tasks of visual analytics and social event explorations. Furthermore, SocioPedia+ has been optimized for visualizing event analysis on different levels from macroscopic (events level) to microscopic (knowledge level). The system is then implemented and investigated with a detailed case study for evaluating its usefulness and visualization effectiveness for the application of event explorations.

2.

Automated Classification of Normal Control and Early-Stage Dementia Based on Activities of Daily Living (ADL) Data Acquired from Smart Home Environment.

Kwon, Lee-Nam; Yang, Dong-Hun; Hwang, Myung-Gwon; Lim, Soo-Jin; Kim, Young-Kuk; Kim, Jae-Gyum; Cho, Kwang-Hee; Chun, Hong-Woo; Park, Kun-Woo.

Int J Environ Res Public Health ; 18(24)2021 12 15.

Artigo em Inglês | MEDLINE | ID: mdl-34948842

RESUMO

With the global trend toward an aging population, the increasing number of dementia patients and elderly living alone has emerged as a serious social issue in South Korea. The assessment of activities of daily living (ADL) is essential for diagnosing dementia. However, since the assessment is based on the ADL questionnaire, it relies on subjective judgment and lacks objectivity. Seven healthy seniors and six with early-stage dementia participated in the study to obtain ADL data. The derived ADL features were generated by smart home sensors. Statistical methods and machine learning techniques were employed to develop a model for auto-classifying the normal controls and early-stage dementia patients. The proposed approach verified the developed model as an objective ADL evaluation tool for the diagnosis of dementia. A random forest algorithm was used to compare a personalized model and a non-personalized model. The comparison result verified that the accuracy (91.20%) of the personalized model was higher than that (84.54%) of the non-personalized model. This indicates that the cognitive ability-based personalization showed encouraging performance in the classification of normal control and early-stage dementia and it is expected that the findings of this study will serve as important basic data for the objective diagnosis of dementia.

Assuntos

Atividades Cotidianas , Demência , Idoso , Envelhecimento , Cognição , Demência/diagnóstico , Demência/epidemiologia , Ambiente Domiciliar , Humanos

3.

Medical Health Records-Based Mild Cognitive Impairment (MCI) Prediction for Effective Dementia Care.

Lim, Soo-Jin; Lee, Zoonky; Kwon, Lee-Nam; Chun, Hong-Woo.

Int J Environ Res Public Health ; 18(17)2021 09 01.

Artigo em Inglês | MEDLINE | ID: mdl-34501812

RESUMO

Dementia is a cognitive impairment that poses a global threat. Current dementia treatments slow the progression of the disease. The timing of starting such treatment markedly affects the effectiveness of the treatment. Some experts mentioned that the optimal timing for starting the currently available treatment in order to delay progression to dementia is the mild cognitive impairment stage, which is the prior stage of dementia. However, medical records are typically only available at a later stage, i.e., from the early or middle stage of dementia. In order to address this limitation, this study developed a model using national health information data from 5 years prior, to predict dementia development 5 years in the future. The Senior Cohort Database, comprising 550,000 samples, were used for model development. The F-measure of the model predicting dementia development after a 5-year incubation period was 77.38%. Models for a 1- and 3-year incubation period were also developed for comparative analysis of dementia risk factors. The three models had some risk factors in common, but also had unique risk factors, depending on the stage. For the common risk factors, a difference in disease severity was confirmed. These findings indicate that the diagnostic criteria and treatment strategy for dementia should differ depending on the timing. Furthermore, since the results of this study present new dementia risk factors that have not been reported previously, this study may also contribute to identification of new dementia risk factors.

Assuntos

Doença de Alzheimer , Disfunção Cognitiva , Disfunção Cognitiva/diagnóstico , Disfunção Cognitiva/epidemiologia , Progressão da Doença , Humanos , Prontuários Médicos

4.

Gender-Based Analysis of Risk Factors for Dementia Using Senior Cohort.

Choi, Jaekue; Kwon, Lee-Nam; Lim, Heuiseok; Chun, Hong-Woo.

Int J Environ Res Public Health ; 17(19)2020 10 05.

Artigo em Inglês | MEDLINE | ID: mdl-33027971

RESUMO

Globally, one of the biggest problems with the increase in the elderly population is dementia. However, dementia still has no fundamental cure. Therefore, it is important to predict and prevent dementia early. For early prediction of dementia, it is crucial to find dementia risk factors that increase a person's risk of developing dementia. In this paper, the subject of dementia risk factor analysis and discovery studies were limited to gender, because it is assumed that the difference in the prevalence of dementia in men and women will lead to differences in the risk factors for dementia among men and women. This study analyzed the Korean National Health Information System-Senior Cohort using machine-learning techniques. By using the machine-learning technique, it was possible to reveal a very small causal relationship between data that are ignored using existing statistical techniques. By using the senior cohort, it was possible to analyze 6000 data that matched the experimental conditions out of 558,147 sample subjects over 14 years. In order to analyze the difference in dementia risk factors between men and women, three machine-learning-based dementia risk factor analysis models were constructed and compared. As a result of the experiment, it was found that the risk factors for dementia in men and women are different. In addition, not only did the results include most of the known dementia risk factors, previously unknown candidates for dementia risk factors were also identified. We hope that our research will be helpful in finding new dementia risk factors.

Assuntos

Demência/epidemiologia , Idoso , Estudos de Coortes , Feminino , Humanos , Aprendizado de Máquina , Masculino , Medição de Risco , Fatores de Risco

5.

Wave2Vec: Vectorizing Electroencephalography Bio-Signal for Prediction of Brain Disease.

Kim, Seonho; Kim, Jungjoon; Chun, Hong-Woo.

Int J Environ Res Public Health ; 15(8)2018 08 15.

Artigo em Inglês | MEDLINE | ID: mdl-30111710

RESUMO

Interest in research involving health-medical information analysis based on artificial intelligence, especially for deep learning techniques, has recently been increasing. Most of the research in this field has been focused on searching for new knowledge for predicting and diagnosing disease by revealing the relation between disease and various information features of data. These features are extracted by analyzing various clinical pathology data, such as EHR (electronic health records), and academic literature using the techniques of data analysis, natural language processing, etc. However, still needed are more research and interest in applying the latest advanced artificial intelligence-based data analysis technique to bio-signal data, which are continuous physiological records, such as EEG (electroencephalography) and ECG (electrocardiogram). Unlike the other types of data, applying deep learning to bio-signal data, which is in the form of time series of real numbers, has many issues that need to be resolved in preprocessing, learning, and analysis. Such issues include leaving feature selection, learning parts that are black boxes, difficulties in recognizing and identifying effective features, high computational complexities, etc. In this paper, to solve these issues, we provide an encoding-based Wave2vec time series classifier model, which combines signal-processing and deep learning-based natural language processing techniques. To demonstrate its advantages, we provide the results of three experiments conducted with EEG data of the University of California Irvine, which are a real-world benchmark bio-signal dataset. After converting the bio-signals (in the form of waves), which are a real number time series, into a sequence of symbols or a sequence of wavelet patterns that are converted into symbols, through encoding, the proposed model vectorizes the symbols by learning the sequence using deep learning-based natural language processing. The models of each class can be constructed through learning from the vectorized wavelet patterns and training data. The implemented models can be used for prediction and diagnosis of diseases by classifying the new data. The proposed method enhanced data readability and intuition of feature selection and learning processes by converting the time series of real number data into sequences of symbols. In addition, it facilitates intuitive and easy recognition, and identification of influential patterns. Furthermore, real-time large-capacity data analysis is facilitated, which is essential in the development of real-time analysis diagnosis systems, by drastically reducing the complexity of calculation without deterioration of analysis performance by data simplification through the encoding process.

Assuntos

Encefalopatias/diagnóstico , Eletroencefalografia , Processamento de Sinais Assistido por Computador , Algoritmos , Inteligência Artificial , Humanos

6.

Longitudinal Study-Based Dementia Prediction for Public Health.

Kim, HeeChel; Chun, Hong-Woo; Kim, Seonho; Coh, Byoung-Youl; Kwon, Oh-Jin; Moon, Yeong-Ho.

Int J Environ Res Public Health ; 14(9)2017 08 30.

Artigo em Inglês | MEDLINE | ID: mdl-28867810

RESUMO

The issue of public health in Korea has attracted significant attention given the aging of the country's population, which has created many types of social problems. The approach proposed in this article aims to address dementia, one of the most significant symptoms of aging and a public health care issue in Korea. The Korean National Health Insurance Service Senior Cohort Database contains personal medical data of every citizen in Korea. There are many different medical history patterns between individuals with dementia and normal controls. The approach used in this study involved examination of personal medical history features from personal disease history, sociodemographic data, and personal health examinations to develop a prediction model. The prediction model used a support-vector machine learning technique to perform a 10-fold cross-validation analysis. The experimental results demonstrated promising performance (80.9% F-measure). The proposed approach supported the significant influence of personal medical history features during an optimal observation period. It is anticipated that a biomedical "big data"-based disease prediction model may assist the diagnosis of any disease more correctly.

Assuntos

Demência/epidemiologia , Modelos Teóricos , Bases de Dados Factuais , Previsões , Registros de Saúde Pessoal , Humanos , Estudos Longitudinais , Programas Nacionais de Saúde , Saúde Pública , República da Coreia/epidemiologia , Máquina de Vetores de Suporte

7.

Overview of the Cancer Genetics and Pathway Curation tasks of BioNLP Shared Task 2013.

Pyysalo, Sampo; Ohta, Tomoko; Rak, Rafal; Rowley, Andrew; Chun, Hong-Woo; Jung, Sung-Jae; Choi, Sung-Pil; Tsujii, Jun'ichi; Ananiadou, Sophia.

BMC Bioinformatics ; 16 Suppl 10: S2, 2015.

Artigo em Inglês | MEDLINE | ID: mdl-26202570

RESUMO

BACKGROUND: Since their introduction in 2009, the BioNLP Shared Task events have been instrumental in advancing the development of methods and resources for the automatic extraction of information from the biomedical literature. In this paper, we present the Cancer Genetics (CG) and Pathway Curation (PC) tasks, two event extraction tasks introduced in the BioNLP Shared Task 2013. The CG task focuses on cancer, emphasizing the extraction of physiological and pathological processes at various levels of biological organization, and the PC task targets reactions relevant to the development of biomolecular pathway models, defining its extraction targets on the basis of established pathway representations and ontologies. RESULTS: Six groups participated in the CG task and two groups in the PC task, together applying a wide range of extraction approaches including both established state-of-the-art systems and newly introduced extraction methods. The best-performing systems achieved F-scores of 55% on the CG task and 53% on the PC task, demonstrating a level of performance comparable to the best results achieved in similar previously proposed tasks. CONCLUSIONS: The results indicate that existing event extraction technology can generalize to meet the novel challenges represented by the CG and PC task settings, suggesting that extraction methods are capable of supporting the construction of knowledge bases on the molecular mechanisms of cancer and the curation of biomolecular pathway models. The CG and PC tasks continue as open challenges for all interested parties, with data, tools and resources available from the shared task homepage.

Assuntos

Redes Reguladoras de Genes , Genes , Armazenamento e Recuperação da Informação , Bases de Conhecimento , Modelos Teóricos , Neoplasias/genética , Neoplasias/patologia , Humanos , Processamento de Linguagem Natural

8.

A Case of the Drug Reaction with Eosinophilia and Systemic Symptom (DRESS) Following Isoniazid Treatment.

Lee, Jin-Yong; Seol, Yun-Jae; Shin, Dong-Woo; Kim, Dae-Young; Chun, Hong-Woo; Kim, Bo-Young; Jeong, Shin-Ok; Lim, Sang-Hyok; Jang, An-Soo.

Tuberc Respir Dis (Seoul) ; 78(1): 27-30, 2015 Jan.

Artigo em Inglês | MEDLINE | ID: mdl-25653694

RESUMO

The drug reaction with eosinophilia and systemic symptom (DRESS) syndrome is a severe adverse drug-induced reaction which includes a severe skin eruption, fever, hematologic abnormalities (eosinophilia or atypical lymphocytes) and internal organ involvement. The most frequently reported drug was anticonvulsants. The diagnosis of DRESS syndrome is challenging because the pattern of cutaneous eruption and the types of organs involved are various. The treatments for DRESS syndrome are culprit drug withdrawal and corticosteroids. Here we report a 71-year-old man with skin eruption with eosinophilia and hepatic and renal involvement that appeared 4 weeks after he had taken anti-tuberculosis drugs (isoniazid, ethambutol, rifampicin, and pyrazinamide), and resolved after stopping anti-tuberculosis drugs and the administration of systemic corticosteroids. DRESS recurred after re-challenging isoniazid, we identified isoniazid was causative drug.

9.

Risk of malignancy in thyroid incidentalomas identified by fluorodeoxyglucose-positron emission tomography.

Chun, A Reum; Jo, Hye Min; Lee, Seoung Ho; Chun, Hong Woo; Park, Jung Mi; Kim, Kyu Jin; Jung, Chan Hee; Mok, Ji Oh; Kang, Sung Koo; Kim, Chul Hee; Kim, Bo Yeon.

Endocrinol Metab (Seoul) ; 30(1): 71-7, 2015 Mar 27.

Artigo em Inglês | MEDLINE | ID: mdl-25325277

RESUMO

BACKGROUND: Thyroid incidentalomas detected by 2-deoxy-2-18F-fluoro-D-glucose positron emission tomography/computed tomography (18F-FDG PET/CT) have been reported in 1% to 4% of the population, with a risk of malignancy of 27.8% to 74%. We performed a retrospective review of FDG-avid thyroid incidentalomas in cancer screening subjects and patients with nonthyroid cancer. The risk of malignancy in thyroid incidentaloma and its association with the maximal standardized uptake value (SUVmax) in 18F-FDG PET/CT were evaluated to define the predictor variables in assessing risk of malignancy. METHODS: A total of 2,584 subjects underwent 18F-FDG PET/CT for metastatic evaluation or cancer screening from January 2005 to January 2010. Among them, 36 subjects with FDG-avid thyroid incidentalomas underwent further diagnostic evaluation (thyroid ultrasonography-guided fine needle aspiration cytology [FNAC] or surgical resection). We retrospectively reviewed the database of these subjects. RESULTS: Of the 2,584 subjects who underwent 18F-FDG PET/CT (319 for cancer screening and 2,265 for metastatic evaluation), 52 (2.0%) were identified as having FDG-avid thyroid incidentaloma and cytologic diagnosis was obtained by FNAC in 36 subjects. Of the subjects, 15 were proven to have malignant disease: 13 by FNAC and two by surgical resection. The positive predictive value of malignancy in FDG-avid thyroid incidentaloma was 41.7%. Median SUVmax was higher in malignancy than in benign lesions (4.7 [interquartile range (IQR), 3.4 to 6.0] vs. 2.8 [IQR, 2.6 to 4.0], P=0.001). CONCLUSION: Thyroid incidentalomas found on 18F-FDG PET/CT have a high risk of malignancy, with a positive predictive value of 41.7%. FDG-avid thyroid incidentalomas with higher SUVmax tended to be malignant.

10.

The 3rd DBCLS BioHackathon: improving life science data integration with Semantic Web technologies.

Katayama, Toshiaki; Wilkinson, Mark D; Micklem, Gos; Kawashima, Shuichi; Yamaguchi, Atsuko; Nakao, Mitsuteru; Yamamoto, Yasunori; Okamoto, Shinobu; Oouchida, Kenta; Chun, Hong-Woo; Aerts, Jan; Afzal, Hammad; Antezana, Erick; Arakawa, Kazuharu; Aranda, Bruno; Belleau, Francois; Bolleman, Jerven; Bonnal, Raoul Jp; Chapman, Brad; Cock, Peter Ja; Eriksson, Tore; Gordon, Paul Mk; Goto, Naohisa; Hayashi, Kazuhiro; Horn, Heiko; Ishiwata, Ryosuke; Kaminuma, Eli; Kasprzyk, Arek; Kawaji, Hideya; Kido, Nobuhiro; Kim, Young Joo; Kinjo, Akira R; Konishi, Fumikazu; Kwon, Kyung-Hoon; Labarga, Alberto; Lamprecht, Anna-Lena; Lin, Yu; Lindenbaum, Pierre; McCarthy, Luke; Morita, Hideyuki; Murakami, Katsuhiko; Nagao, Koji; Nishida, Kozo; Nishimura, Kunihiro; Nishizawa, Tatsuya; Ogishima, Soichi; Ono, Keiichiro; Oshita, Kazuki; Park, Keun-Joon; Prins, Pjotr.

J Biomed Semantics ; 4(1): 6, 2013 Feb 11.

Artigo em Inglês | MEDLINE | ID: mdl-23398680

RESUMO

BACKGROUND: BioHackathon 2010 was the third in a series of meetings hosted by the Database Center for Life Sciences (DBCLS) in Tokyo, Japan. The overall goal of the BioHackathon series is to improve the quality and accessibility of life science research data on the Web by bringing together representatives from public databases, analytical tool providers, and cyber-infrastructure researchers to jointly tackle important challenges in the area of in silico biological research. RESULTS: The theme of BioHackathon 2010 was the 'Semantic Web', and all attendees gathered with the shared goal of producing Semantic Web data from their respective resources, and/or consuming or interacting those data using their tools and interfaces. We discussed on topics including guidelines for designing semantic data and interoperability of resources. We consequently developed tools and clients for analysis and visualization. CONCLUSION: We provide a meeting report from BioHackathon 2010, in which we describe the discussions, decisions, and breakthroughs made as we moved towards compliance with Semantic Web technologies - from source provider, through middleware, to the end-consumer.

11.

U-Compare bio-event meta-service: compatible BioNLP event extraction services.

Kano, Yoshinobu; Björne, Jari; Ginter, Filip; Salakoski, Tapio; Buyko, Ekaterina; Hahn, Udo; Cohen, K Bretonnel; Verspoor, Karin; Roeder, Christophe; Hunter, Lawrence E; Kilicoglu, Halil; Bergler, Sabine; Van Landeghem, Sofie; Van Parys, Thomas; Van de Peer, Yves; Miwa, Makoto; Ananiadou, Sophia; Neves, Mariana; Pascual-Montano, Alberto; Özgür, Arzucan; Radev, Dragomir R; Riedel, Sebastian; Sætre, Rune; Chun, Hong-Woo; Kim, Jin-Dong; Pyysalo, Sampo; Ohta, Tomoko; Tsujii, Jun'ichi.

BMC Bioinformatics ; 12: 481, 2011 Dec 18.

Artigo em Inglês | MEDLINE | ID: mdl-22177292

RESUMO

BACKGROUND: Bio-molecular event extraction from literature is recognized as an important task of bio text mining and, as such, many relevant systems have been developed and made available during the last decade. While such systems provide useful services individually, there is a need for a meta-service to enable comparison and ensemble of such services, offering optimal solutions for various purposes. RESULTS: We have integrated nine event extraction systems in the U-Compare framework, making them intercompatible and interoperable with other U-Compare components. The U-Compare event meta-service provides various meta-level features for comparison and ensemble of multiple event extraction systems. Experimental results show that the performance improvements achieved by the ensemble are significant. CONCLUSIONS: While individual event extraction systems themselves provide useful features for bio text mining, the U-Compare meta-service is expected to improve the accessibility to the individual systems, and to enable meta-level uses over multiple event extraction systems such as comparison and ensemble.

Assuntos

Mineração de Dados , Sistemas Computacionais , Publicações Periódicas como Assunto , Software

12.

The 2nd DBCLS BioHackathon: interoperable bioinformatics Web services for integrated applications.

Katayama, Toshiaki; Wilkinson, Mark D; Vos, Rutger; Kawashima, Takeshi; Kawashima, Shuichi; Nakao, Mitsuteru; Yamamoto, Yasunori; Chun, Hong-Woo; Yamaguchi, Atsuko; Kawano, Shin; Aerts, Jan; Aoki-Kinoshita, Kiyoko F; Arakawa, Kazuharu; Aranda, Bruno; Bonnal, Raoul Jp; Fernández, José M; Fujisawa, Takatomo; Gordon, Paul Mk; Goto, Naohisa; Haider, Syed; Harris, Todd; Hatakeyama, Takashi; Ho, Isaac; Itoh, Masumi; Kasprzyk, Arek; Kido, Nobuhiro; Kim, Young-Joo; Kinjo, Akira R; Konishi, Fumikazu; Kovarskaya, Yulia; von Kuster, Greg; Labarga, Alberto; Limviphuvadh, Vachiranee; McCarthy, Luke; Nakamura, Yasukazu; Nam, Yunsun; Nishida, Kozo; Nishimura, Kunihiro; Nishizawa, Tatsuya; Ogishima, Soichi; Oinn, Tom; Okamoto, Shinobu; Okuda, Shujiro; Ono, Keiichiro; Oshita, Kazuki; Park, Keun-Joon; Putnam, Nicholas; Senger, Martin; Severin, Jessica; Shigemoto, Yasumasa.

J Biomed Semantics ; 2: 4, 2011 Aug 02.

Artigo em Inglês | MEDLINE | ID: mdl-21806842

RESUMO

BACKGROUND: The interaction between biological researchers and the bioinformatics tools they use is still hampered by incomplete interoperability between such tools. To ensure interoperability initiatives are effectively deployed, end-user applications need to be aware of, and support, best practices and standards. Here, we report on an initiative in which software developers and genome biologists came together to explore and raise awareness of these issues: BioHackathon 2009. RESULTS: Developers in attendance came from diverse backgrounds, with experts in Web services, workflow tools, text mining and visualization. Genome biologists provided expertise and exemplar data from the domains of sequence and pathway analysis and glyco-informatics. One goal of the meeting was to evaluate the ability to address real world use cases in these domains using the tools that the developers represented. This resulted in i) a workflow to annotate 100,000 sequences from an invertebrate species; ii) an integrated system for analysis of the transcription factor binding sites (TFBSs) enriched based on differential gene expression data obtained from a microarray experiment; iii) a workflow to enumerate putative physical protein interactions among enzymes in a metabolic pathway using protein structure data; iv) a workflow to analyze glyco-gene-related diseases by searching for human homologs of glyco-genes in other species, such as fruit flies, and retrieving their phenotype-annotated SNPs. CONCLUSIONS: Beyond deriving prototype solutions for each use-case, a second major purpose of the BioHackathon was to highlight areas of insufficiency. We discuss the issues raised by our exploration of the problem/solution space, concluding that there are still problems with the way Web services are modeled and annotated, including: i) the absence of several useful data or analysis functions in the Web service "space"; ii) the lack of documentation of methods; iii) lack of compliance with the SOAP/WSDL specification among and between various programming-language libraries; and iv) incompatibility between various bioinformatics data formats. Although it was still difficult to solve real world problems posed to the developers by the biological researchers in attendance because of these problems, we note the promise of addressing these issues within a semantic framework.

13.

The DBCLS BioHackathon: standardization and interoperability for bioinformatics web services and workflows. The DBCLS BioHackathon Consortium*.

Katayama, Toshiaki; Arakawa, Kazuharu; Nakao, Mitsuteru; Ono, Keiichiro; Aoki-Kinoshita, Kiyoko F; Yamamoto, Yasunori; Yamaguchi, Atsuko; Kawashima, Shuichi; Chun, Hong-Woo; Aerts, Jan; Aranda, Bruno; Barboza, Lord Hendrix; Bonnal, Raoul Jp; Bruskiewich, Richard; Bryne, Jan C; Fernández, José M; Funahashi, Akira; Gordon, Paul Mk; Goto, Naohisa; Groscurth, Andreas; Gutteridge, Alex; Holland, Richard; Kano, Yoshinobu; Kawas, Edward A; Kerhornou, Arnaud; Kibukawa, Eri; Kinjo, Akira R; Kuhn, Michael; Lapp, Hilmar; Lehvaslaiho, Heikki; Nakamura, Hiroyuki; Nakamura, Yasukazu; Nishizawa, Tatsuya; Nobata, Chikashi; Noguchi, Tamotsu; Oinn, Thomas M; Okamoto, Shinobu; Owen, Stuart; Pafilis, Evangelos; Pocock, Matthew; Prins, Pjotr; Ranzinger, René; Reisinger, Florian; Salwinski, Lukasz; Schreiber, Mark; Senger, Martin; Shigemoto, Yasumasa; Standley, Daron M; Sugawara, Hideaki; Tashiro, Toshiyuki.

J Biomed Semantics ; 1(1): 8, 2010 Aug 21.

Artigo em Inglês | MEDLINE | ID: mdl-20727200

RESUMO

Web services have become a key technology for bioinformatics, since life science databases are globally decentralized and the exponential increase in the amount of available data demands for efficient systems without the need to transfer entire databases for every step of an analysis. However, various incompatibilities among database resources and analysis services make it difficult to connect and integrate these into interoperable workflows. To resolve this situation, we invited domain specialists from web service providers, client software developers, Open Bio* projects, the BioMoby project and researchers of emerging areas where a standard exchange data format is not well established, for an intensive collaboration entitled the BioHackathon 2008. The meeting was hosted by the Database Center for Life Science (DBCLS) and Computational Biology Research Center (CBRC) and was held in Tokyo from February 11th to 15th, 2008. In this report we highlight the work accomplished and the common issues arisen from this event, including the standardization of data exchange formats and services in the emerging fields of glycoinformatics, biological interaction networks, text mining, and phyloinformatics. In addition, common shared object development based on BioSQL, as well as technical challenges in large data management, asynchronous services, and security are discussed. Consequently, we improved interoperability of web services in several fields, however, further cooperation among major database centers and continued collaborative efforts between service providers and software developers are still necessary for an effective advance in bioinformatics web service technologies.

14.

The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts.

Yamasaki, Chisato; Murakami, Katsuhiko; Fujii, Yasuyuki; Sato, Yoshiharu; Harada, Erimi; Takeda, Jun-ichi; Taniya, Takayuki; Sakate, Ryuichi; Kikugawa, Shingo; Shimada, Makoto; Tanino, Motohiko; Koyanagi, Kanako O; Barrero, Roberto A; Gough, Craig; Chun, Hong-Woo; Habara, Takuya; Hanaoka, Hideki; Hayakawa, Yosuke; Hilton, Phillip B; Kaneko, Yayoi; Kanno, Masako; Kawahara, Yoshihiro; Kawamura, Toshiyuki; Matsuya, Akihiro; Nagata, Naoki; Nishikata, Kensaku; Noda, Akiko Ogura; Nurimoto, Shin; Saichi, Naomi; Sakai, Hiroaki; Sanbonmatsu, Ryoko; Shiba, Rie; Suzuki, Mami; Takabayashi, Kazuhiko; Takahashi, Aiko; Tamura, Takuro; Tanaka, Masayuki; Tanaka, Susumu; Todokoro, Fusano; Yamaguchi, Kaori; Yamamoto, Naoyuki; Okido, Toshihisa; Mashima, Jun; Hashizume, Aki; Jin, Lihua; Lee, Kyung-Bum; Lin, Yi-Chueh; Nozaki, Asami; Sakai, Katsunaga; Tada, Masahito.

Nucleic Acids Res ; 36(Database issue): D793-9, 2008 Jan.

Artigo em Inglês | MEDLINE | ID: mdl-18089548

RESUMO

Here we report the new features and improvements in our latest release of the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/), a comprehensive annotation resource for human genes and transcripts. H-InvDB, originally developed as an integrated database of the human transcriptome based on extensive annotation of large sets of full-length cDNA (FLcDNA) clones, now provides annotation for 120 558 human mRNAs extracted from the International Nucleotide Sequence Databases (INSD), in addition to 54 978 human FLcDNAs, in the latest release H-InvDB_4.6. We mapped those human transcripts onto the human genome sequences (NCBI build 36.1) and determined 34 699 human gene clusters, which could define 34 057 (98.1%) protein-coding and 642 (1.9%) non-protein-coding loci; 858 (2.5%) transcribed loci overlapped with predicted pseudogenes. For all these transcripts and genes, we provide comprehensive annotation including gene structures, gene functions, alternative splicing variants, functional non-protein-coding RNAs, functional domains, predicted sub cellular localizations, metabolic pathways, predictions of protein 3D structure, mapping of SNPs and microsatellite repeat motifs, co-localization with orphan diseases, gene expression profiles, orthologous genes, protein-protein interactions (PPI) and annotation for gene families. The current H-InvDB annotation resources consist of two main views: Transcript view and Locus view and eight sub-databases: the DiseaseInfo Viewer, H-ANGEL, the Clustering Viewer, G-integra, the TOPO Viewer, Evola, the PPI view and the Gene family/group.

Assuntos

Bases de Dados Genéticas , Genes , RNA Mensageiro/química , Animais , Mapeamento Cromossômico , DNA Complementar/química , Humanos , Internet , Proteínas/química , Proteínas/genética , Proteínas/metabolismo , RNA Mensageiro/genética , Interface Usuário-Computador

15.

Automatic recognition of topic-classified relations between prostate cancer and genes using MEDLINE abstracts.

Chun, Hong-Woo; Tsuruoka, Yoshimasa; Kim, Jin-Dong; Shiba, Rie; Nagata, Naoki; Hishiki, Teruyoshi; Tsujii, Jun'ichi.

BMC Bioinformatics ; 7 Suppl 3: S4, 2006 Nov 24.

Artigo em Inglês | MEDLINE | ID: mdl-17134477

RESUMO

BACKGROUND: Automatic recognition of relations between a specific disease term and its relevant genes or protein terms is an important practice of bioinformatics. Considering the utility of the results of this approach, we identified prostate cancer and gene terms with the ID tags of public biomedical databases. Moreover, considering that genetics experts will use our results, we classified them based on six topics that can be used to analyze the type of prostate cancers, genes, and their relations. METHODS: We developed a maximum entropy-based named entity recognizer and a relation recognizer and applied them to a corpus-based approach. We collected prostate cancer-related abstracts from MEDLINE, and constructed an annotated corpus of gene and prostate cancer relations based on six topics by biologists. We used it to train the maximum entropy-based named entity recognizer and relation recognizer. RESULTS: Topic-classified relation recognition achieved 92.1% precision for the relation (an increase of 11.0% from that obtained in a baseline experiment). For all topics, the precision was between 67.6 and 88.1%. CONCLUSION: A series of experimental results revealed two important findings: a carefully designed relation recognition system using named entity recognition can improve the performance of relation recognition, and topic-classified relation recognition can be effectively addressed through a corpus-based approach using manual annotation and machine learning techniques.

Assuntos

Indexação e Redação de Resumos/métodos , Inteligência Artificial , Armazenamento e Recuperação da Informação/métodos , MEDLINE , Processamento de Linguagem Natural , Proteínas de Neoplasias/classificação , Neoplasias da Próstata/classificação , Algoritmos , Bases de Dados Factuais , Genes/genética , Humanos , Masculino , Proteínas de Neoplasias/genética , Publicações Periódicas como Assunto , Neoplasias da Próstata/genética , Semântica , Software , Terminologia como Assunto , Vocabulário Controlado

16.

Extraction of gene-disease relations from Medline using domain dictionaries and machine learning.

Chun, Hong-Woo; Tsuruoka, Yoshimasa; Kim, Jin-Dong; Shiba, Rie; Nagata, Naoki; Hishiki, Teruyoshi; Tsujii, Jun'ichi.

Pac Symp Biocomput ; : 4-15, 2006.

Artigo em Inglês | MEDLINE | ID: mdl-17094223

RESUMO

We describe a system that extracts disease-gene relations from Medline. We constructed a dictionary for disease and gene names from six public databases and extracted relation candidates by dictionary matching. Since dictionary matching produces a large number of false positives, we developed a method of machine learning-based named entity recognition (NER) to filter out false recognitions of disease/gene names. We found that the performance of relation extraction is heavily dependent upon the performance of NER filtering and that the filtering improves the precision of relation extraction by 26.7% at the cost of a small reduction in recall.

Assuntos

Inteligência Artificial , Doença , Genes , MEDLINE , Animais , Metodologias Computacionais , Dicionários Médicos como Assunto , Humanos , Terminologia como Assunto , Unified Medical Language System

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA