Automatic Extraction of Medication Mentions from Tweets - Overview of the BioCreative VII Shared Task 3 Competition.
Weissenbacher, Davy; O'Connor, Karen; Rawal, Siddharth; Zhang, Yu; Tsai, Richard Tzong-Han; Miller, Timothy; Xu, Dongfang; Anderson, Carol; Liu, Bo; Han, Qing; Zhang, Jinfeng; Kulev, Igor; Köprü, Berkay; Rodriguez-Esteban, Raul; Ozkirimli, Elif; Ayach, Ammer; Roller, Roland; Piccolo, Stephen; Han, Peijin; Vydiswaran, V G Vinod; Tekumalla, Ramya; Banda, Juan M; Bagherzadeh, Parsa; Bergler, Sabine; Silva, João F; Almeida, Tiago; Martinez, Paloma; Rivera-Zavala, Renzo; Wang, Chen-Kai; Dai, Hong-Jie; Alberto Robles Hernandez, Luis; Gonzalez-Hernandez, Graciela.
Affiliation
  • Weissenbacher D; Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA.
  • O'Connor K; DBEI, The Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
  • Rawal S; DBEI, The Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
  • Zhang Y; Department of Computer Science and Information Engineering, National Central University, No. 300, Zhongda Rd, Zhongli District, Taoyuan 320, Taiwan.
  • Tsai RT; Department of Computer Science and Information Engineering, National Central University, No. 300, Zhongda Rd, Zhongli District, Taoyuan 320, Taiwan.
  • Miller T; IoX Center, National Taiwan University, Da'an District, Section 4, Roosevelt Rd, No. 1, Barry Lam Hall, Taipei 106, Taiwan.
  • Xu D; Research Center for Humanities and Social Sciences, Academia Sinica, No. 128, Section 2, Academia Rd, Nangang District, Taipei 115, Taiwan.
  • Anderson C; Computational Health Informatics Program, Boston Children's Hospital, Boston, MA, USA.
  • Liu B; Department of Pediatrics, Harvard Medical School, Boston, MA, USA.
  • Han Q; Computational Health Informatics Program, Boston Children's Hospital, Boston, MA, USA.
  • Zhang J; Department of Pediatrics, Harvard Medical School, Boston, MA, USA.
  • Kulev I; NVIDIA, Santa Clara, CA, USA.
  • Köprü B; NVIDIA, Santa Clara, CA, USA.
  • Rodriguez-Esteban R; Department of Statistics, Florida State University, Tallahassee, FL, USA.
  • Ozkirimli E; Department of Statistics, Florida State University, Tallahassee, FL, USA.
  • Ayach A; Data and Analytics Chapter, F. Hoffmann-La Roche Ltd, Switzerland.
  • Roller R; Data and Analytics Chapter, F. Hoffmann-La Roche Ltd, Switzerland.
  • Piccolo S; Pharmaceutical Research and Early Development, Roche Innovation Center Basel, Switzerland.
  • Han P; Data and Analytics Chapter, F. Hoffmann-La Roche Ltd, Switzerland.
  • Vydiswaran VGV; Speech and Language Technology Lab, DFKI, Berlin, Germany.
  • Tekumalla R; Speech and Language Technology Lab, DFKI, Berlin, Germany.
  • Banda JM; Department of Biology, Brigham Young University, Provo, UT, USA.
  • Bagherzadeh P; Department of Computational Medicine and Bioinformatics, Medical School, University of Michigan, Ann Arbor, MI, USA.
  • Bergler S; Department of Learning Health Sciences, Medical School, University of Michigan, Ann Arbor, MI, USA.
  • Silva JF; School of Information, University of Michigan, Ann Arbor, MI, USA.
  • Almeida T; Department of Computer Science, Georgia State University, Atlanta, GA, USA.
  • Martinez P; Department of Computer Science, Georgia State University, Atlanta, GA, USA.
  • Rivera-Zavala R; CLaC Labs, Concordia University, Montreal, Canada.
  • Wang CK; CLaC Labs, Concordia University, Montreal, Canada.
  • Dai HJ; DETI, Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Portugal.
  • Alberto Robles Hernandez L; DETI, Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Portugal.
  • Gonzalez-Hernandez G; Department of Computation, University of A Coruña, Spain.
Database (Oxford) ; 2023; 2023 02 03.
Article in En | MEDLINE | ID: mdl-36734300
ABSTRACT
This study presents the outcomes of the shared task competition BioCreative VII (Task 3) focusing on the extraction of medication names from a Twitter user's publicly available tweets (the user's 'timeline'). In general, detecting health-related tweets is notoriously challenging for natural language processing tools. The main challenge, aside from the informality of the language used, is that people tweet about any and all topics, and most of their tweets are not related to health. Thus, finding those tweets in a user's timeline that mention specific health-related concepts such as medications requires addressing extreme imbalance. Task 3 called for detecting tweets in a user's timeline that mention a medication name and, for each detected mention, extracting its span. The organizers made available a corpus consisting of 182 049 tweets publicly posted by 212 Twitter users with all medication mentions manually annotated. The corpus exhibits the natural distribution of positive tweets, with only 442 tweets (0.2%) mentioning a medication. This task was an opportunity for participants to evaluate methods that are robust to class imbalance beyond the simple lexical match. A total of 65 teams registered, and 16 teams submitted a system run. This study summarizes the corpus created by the organizers and the approaches taken by the participating teams for this challenge. The corpus is freely available at https://biocreative.bioinformatics.udel.edu/tasks/biocreative-vii/track-3/. The methods and the results of the competing systems are analyzed with a focus on the approaches taken for learning from class-imbalanced data.
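To make the task definition concrete, the following is a minimal Python sketch of the "simple lexical match" baseline that the abstract contrasts with the participants' methods, together with the class-imbalance arithmetic from the reported corpus counts. The lexicon, the example timeline, and the function names extract_spans and positive_rate are hypothetical and chosen only for illustration; none of them come from the task's data or from any participating system.

```python
import re
from typing import List, Tuple

# Hypothetical toy lexicon (an assumption for illustration only).
LEXICON = ["ibuprofen", "tylenol", "metformin"]

# One case-insensitive pattern with word boundaries; longer names are tried first.
PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(name) for name in sorted(LEXICON, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def extract_spans(tweet: str) -> List[Tuple[int, int, str]]:
    """Return (start, end, text) for each lexicon match; end is exclusive."""
    return [(m.start(), m.end(), m.group(0)) for m in PATTERN.finditer(tweet)]


def positive_rate(n_positive: int, n_total: int) -> float:
    """Fraction of tweets that mention a medication."""
    return n_positive / n_total


# Hypothetical three-tweet timeline; only the second tweet is a positive.
timeline = [
    "Great game last night!",
    "Took some Tylenol for this headache",
    "Coffee first, then emails",
]

# Map tweet index -> detected spans, keeping only tweets with at least one match.
detected = {}
for index, tweet in enumerate(timeline):
    spans = extract_spans(tweet)
    if spans:
        detected[index] = spans

# Imbalance computed from the counts reported in the abstract.
corpus_rate_percent = round(positive_rate(442, 182049) * 100, 2)
```

A lexicon lookup of this kind misses misspellings, slang, and names absent from the list, and it flags ambiguous words regardless of context, which is one reason the task asked for methods that go beyond lexical matching.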
Subject(s)

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Natural Language Processing / Data Mining Limits: Humans Language: En Journal: Database (Oxford) Year: 2023 Document type: Article Affiliation country: United States
