DSEATM: drug set enrichment analysis uncovering disease mechanisms by biomedical text mining.
Brief Bioinform; 23(4), 2022 07 18.
Article in En | MEDLINE | ID: mdl-35679594
ABSTRACT
Disease pathogenesis has long been a major topic in biomedical research. With the exponential growth of biomedical information, drug effect analysis for specific phenotypes has shown great promise in uncovering disease-associated pathways. However, this method has only been applied to a limited number of drugs. Here, we extracted data on 4634 diseases, 3671 drugs, 112 809 disease-drug associations and 81 527 drug-gene associations by text mining 29 168 919 publications. On this basis, we proposed a 'Drug Set Enrichment Analysis by Text Mining (DSEATM)' pipeline and applied it to 3250 diseases, where it outperformed the state-of-the-art method. Furthermore, the disease pathways enriched by DSEATM were similar to those obtained using differentially expressed genes from TCGA cancer RNA-seq data. In addition, the number of drugs plays a determining role in the performance of DSEATM, showing a remarkable positive correlation of 0.73 with the AUC. Taken together, DSEATM is a promising and accurate disease research tool that offers fresh insights.
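The abstract does not specify the statistic behind the enrichment step. As a rough illustration only, the general idea of drug set enrichment can be sketched as a hypergeometric over-representation test: collect the genes linked to a disease's drugs, then ask whether each pathway contains more of them than chance would predict. Everything below is an assumption made for illustration, not the DSEATM implementation: the function names `hypergeom_sf` and `drug_set_enrichment`, the toy identifiers (`drugX`, `G1`, `P_a`), and the choice of test itself.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) for X ~ Hypergeometric(population M, n successes, N draws)."""
    upper = min(n, N)
    total = comb(M, N)
    return sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1)) / total


def drug_set_enrichment(disease_drugs, drug_genes, pathways, background):
    """Rank pathways by over-representation of genes linked to a disease's drugs.

    Hypothetical helper: inputs mirror the disease-drug and drug-gene
    associations mentioned in the abstract, but the scoring is an assumed
    hypergeometric test, not the published method.
    """
    # Pool the genes associated with any drug in the disease's drug set.
    hits = set()
    for drug in disease_drugs:
        hits |= drug_genes.get(drug, set())
    hits &= background

    results = {}
    for name, members in pathways.items():
        members = members & background
        overlap = len(hits & members)
        results[name] = hypergeom_sf(overlap, len(background), len(members), len(hits))
    # Smallest p-value first.
    return sorted(results.items(), key=lambda kv: kv[1])


# Toy data, invented purely for the example.
background = {f"G{i}" for i in range(1, 11)}
pathways = {"P_a": {"G1", "G2", "G3"}, "P_b": {"G7", "G8", "G9"}}
drug_genes = {"drugX": {"G1", "G2"}, "drugY": {"G2", "G3"}, "drugZ": {"G9"}}

ranked = drug_set_enrichment(["drugX", "drugY"], drug_genes, pathways, background)
for pathway, p_value in ranked:
    print(pathway, p_value)
```

In this toy case the two drugs together hit every gene in `P_a` and none in `P_b`, so `P_a` ranks first. A real analysis would also need multiple-testing correction across pathways, which is omitted here.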
Collection: 01-internacional
Database: MEDLINE
Main subject: Biomedical Research / Data Mining
Language: En
Journal: Brief Bioinform
Journal subject: Biology / Medical Informatics
Year: 2022
Document type: Article