Indirect association and ranking hypotheses for literature based discovery.
Henry, Sam; McInnes, Bridget T.
Affiliation
  • Henry S; Department of Computer Science, Virginia Commonwealth University, 601 W. Main St. Rm 435, Richmond, 23284, USA. henryst@vcu.edu.
  • McInnes BT; Department of Computer Science, Virginia Commonwealth University, 601 W. Main St. Rm 435, Richmond, 23284, USA.
BMC Bioinformatics ; 20(1): 425, 2019 Aug 15.
Article in English | MEDLINE | ID: mdl-31416434
BACKGROUND: Literature Based Discovery (LBD) produces more potential hypotheses than can be manually reviewed, making automatic ranking of these hypotheses critical. In this paper, we introduce the indirect association measures of Linking Term Association (LTA), Minimum Weight Association (MWA), and Shared B to C Set Association (SBC), and compare them to Linking Set Association (LSA), concept embeddings vector cosine, Linking Term Count (LTC), and direct co-occurrence vector cosine. Our proposed indirect association measures extend traditional association measures to quantify indirect rather than direct associations while preserving valuable statistical properties.

RESULTS: We compare several existing hypothesis ranking methods for LBD against our proposed indirect association measures. We intrinsically evaluate each method by its ability to estimate semantic relatedness on standard evaluation datasets. We extrinsically evaluate each method's ability to rank hypotheses in LBD using a time-slicing dataset based on co-occurrence information, and another time-slicing dataset based on SemRep-extracted relationships. Precision and recall curves are generated by ranking term pairs and applying a threshold at each rank.

CONCLUSIONS: Results differ depending on the evaluation methods and datasets, but it is unclear whether this is a result of biases in the evaluation datasets or whether one method is truly better than another. We conclude that LTC and SBC are the best-suited methods for hypothesis ranking in LBD, but there is value in having a variety of methods to choose from.
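The evaluation step described in the abstract (rank candidate term pairs, then apply a threshold at each rank) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The scoring function is a simplified reading of Linking Term Count, assumed here to be the number of linking terms that co-occur with both the start term and the candidate target term. The names `linking_term_count`, `precision_recall_at_each_rank`, the toy co-occurrence sets, and the gold set are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple


def linking_term_count(a: str, c: str, cooc: Dict[str, Set[str]]) -> int:
    """Assumed simplification of LTC: count terms co-occurring with both a and c."""
    return len(cooc.get(a, set()) & cooc.get(c, set()))


def precision_recall_at_each_rank(
    ranked: List[str], gold: Set[str]
) -> List[Tuple[float, float]]:
    """Return (precision, recall) after thresholding the ranking at each rank k."""
    if not gold:
        raise ValueError("gold set must not be empty")
    curve = []
    hits = 0
    for k, term in enumerate(ranked, start=1):
        if term in gold:
            hits += 1
        curve.append((hits / k, hits / len(gold)))
    return curve


# Toy co-occurrence data (hypothetical): each term maps to the terms it co-occurs with.
cooc = {
    "A": {"b1", "b2", "b3"},
    "c1": {"b1", "b2", "b3"},
    "c2": {"b1"},
    "c3": {"b2", "b3"},
    "c4": set(),
}
candidates = ["c1", "c2", "c3", "c4"]

# Hypothetical gold standard: targets that become linked to "A" in the later time slice.
gold = {"c1", "c2"}

# Rank candidate targets by descending score, then compute the curve.
ranked = sorted(
    candidates, key=lambda c: linking_term_count("A", c, cooc), reverse=True
)
curve = precision_recall_at_each_rank(ranked, gold)

for k, (p, r) in enumerate(curve, start=1):
    print(f"rank {k}: precision={p:.2f} recall={r:.2f}")
```

Any of the other ranking measures named in the abstract could be substituted for the scoring function without changing the curve computation, which is what makes a side-by-side comparison of ranking methods straightforward.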

Full text: 1 Databases: MEDLINE Main subject: Knowledge Discovery / Models, Theoretical Study type: Prognostic_studies / Risk_factors_studies Limit: Humans Language: En Journal: BMC Bioinformatics Journal subject: MEDICAL INFORMATICS Publication year: 2019 Document type: Article Affiliation country: United States