1.
Genetics ; 217(1): 1-17, 2021 03 03.
Article in English | MEDLINE | ID: mdl-33683370

ABSTRACT

Infection with antibiotic-resistant bacteria is an emerging life-threatening issue worldwide. Enterohemorrhagic Escherichia coli O157:H7 (EHEC) causes hemorrhagic colitis and hemolytic uremic syndrome via contaminated food. Treatment of EHEC infection with antibiotics is contraindicated because of the risk of worsening the syndrome through the secreted toxins. Identifying the host factors involved in bacterial infection provides information about how to combat this pathogen. In our previous study, we showed that EHEC colonizes the intestine of Caenorhabditis elegans. However, the host factors involved in EHEC colonization remain elusive. Thus, in this study, we aimed to identify the host factors involved in EHEC colonization. We conducted forward genetic screens to isolate mutants that enhanced EHEC colonization and named this phenotype enhanced intestinal colonization (Inc). Intriguingly, four mutants with the Inc phenotype showed significantly increased EHEC-resistant survival, which contrasts with our current knowledge. Genetic mapping and whole-genome sequencing (WGS) revealed that these mutants have loss-of-function mutations in unc-89. Furthermore, we showed that the tolerance of unc-89(wf132) to EHEC relied on HLH-30/TFEB activation. These findings suggest that hlh-30 plays a key role in pathogen tolerance in C. elegans.


Subject(s)
Basic Helix-Loop-Helix Transcription Factors/genetics , Caenorhabditis elegans Proteins/genetics , Escherichia coli Infections/genetics , Immunity, Innate , Animals , Basic Helix-Loop-Helix Transcription Factors/metabolism , Caenorhabditis elegans , Caenorhabditis elegans Proteins/metabolism , Enterohemorrhagic Escherichia coli/pathogenicity , Escherichia coli Infections/immunology , Intestines/microbiology , Muscle Proteins/genetics , Muscle Proteins/metabolism
2.
Mol Microbiol ; 116(1): 168-183, 2021 07.
Article in English | MEDLINE | ID: mdl-33567149

ABSTRACT

Enterohemorrhagic Escherichia coli (EHEC), an enteropathogen that colonizes in the intestine, causes severe diarrhea and hemorrhagic colitis in humans by the expression of the type III secretion system (T3SS) and Shiga-like toxins (Stxs). However, how EHEC can sense and respond to the changes in the alimentary tract and coordinate the expression of these virulence genes remains elusive. The T3SS-related genes are known to be regulated by the locus of enterocyte effacement (LEE)-encoded regulators, such as Ler, as well as non-LEE-encoded regulators in response to different environmental cues. Herein, we report that OmpR, which participates in the adaptation of E. coli to osmolarity and pH alterations, is required for EHEC infection in Caenorhabditis elegans. OmpR protein was able to directly bind to the promoters of ler and stx1 (Shiga-like toxin 1) and regulate the expression of T3SS and Stx1, respectively, at the transcriptional level. Moreover, we demonstrated that the expression of ler in EHEC is in response to the intestinal environment and is regulated by OmpR in C. elegans. Taken together, we reveal that OmpR is an important regulator of EHEC which coordinates the expression of virulence factors during gastrointestinal infection in vivo.


Subject(s)
Bacterial Proteins/genetics , Caenorhabditis elegans/microbiology , Enterohemorrhagic Escherichia coli/pathogenicity , Shiga Toxin 1/biosynthesis , Trans-Activators/genetics , Virulence Factors/biosynthesis , Animals , Bacterial Proteins/metabolism , Digestive System/microbiology , Enterohemorrhagic Escherichia coli/genetics , Escherichia coli Proteins/biosynthesis , Escherichia coli Proteins/genetics , Gene Expression Regulation, Bacterial/genetics , Promoter Regions, Genetic/genetics , Shiga Toxin 1/genetics , Trans-Activators/biosynthesis , Trans-Activators/metabolism , Transcription, Genetic/genetics , Transcriptional Activation/genetics , Type III Secretion Systems/biosynthesis , Type III Secretion Systems/genetics , Virulence Factors/genetics
3.
Nat Commun ; 12(1): 90, 2021 01 04.
Article in English | MEDLINE | ID: mdl-33397943

ABSTRACT

Enterohemorrhagic Escherichia coli (EHEC) induces changes to the intestinal cell cytoskeleton and formation of attaching and effacing lesions, characterized by the effacement of microvilli and then formation of actin pedestals to which the bacteria are tightly attached. Here, we use a Caenorhabditis elegans model of EHEC infection to show that microvillar effacement is mediated by a signalling pathway including mitotic cyclin-dependent kinase 1 (CDK1) and diaphanous-related formin 1 (CYK1). Similar observations are also made using EHEC-infected human intestinal cells in vitro. Our results support the use of C. elegans as a host model for studying attaching and effacing lesions in vivo, and reveal that the CDK1-formin signal axis is necessary for EHEC-induced microvillar effacement.


Subject(s)
Caenorhabditis elegans Proteins/metabolism , Cell Cycle Proteins/metabolism , Enterohemorrhagic Escherichia coli/physiology , Host-Pathogen Interactions , Microvilli/microbiology , Microvilli/pathology , Actins/metabolism , Animals , Caco-2 Cells , Caenorhabditis elegans/metabolism , Caenorhabditis elegans/microbiology , Caenorhabditis elegans/ultrastructure , Carbohydrate Epimerases/metabolism , Enterohemorrhagic Escherichia coli/pathogenicity , Formins , Humans , Intestines/microbiology , Microvilli/metabolism , Phosphorylation , Phosphothreonine/metabolism , Virulence
4.
Front Immunol ; 11: 561337, 2020.
Article in English | MEDLINE | ID: mdl-33329523

ABSTRACT

Enterohemorrhagic Escherichia coli (EHEC), a human pathogen, also infects Caenorhabditis elegans. We demonstrated previously that C. elegans activates the p38 MAPK innate immune pathway to defend against EHEC infection. However, whether a C. elegans pattern recognition receptor (PRR) exists to regulate the immune pathway remains unknown. PRRs identified in other metazoans contain several conserved domains, including the leucine-rich repeat (LRR). By screening a focused RNAi library, we identified IGLR-2, a transmembrane protein containing the LRR domain, as a potential immune regulator in C. elegans. Our data showed that iglr-2 regulates host susceptibility to EHEC infection. Moreover, iglr-2 is required for avoidance of EHEC. The iglr-2-overexpressing strain, which was originally more resistant to EHEC, showed hypersusceptibility to EHEC upon knockdown of the p38 MAPK pathway. Together, our data suggest that iglr-2 plays an important role in the defense of C. elegans against EHEC by regulating pathogen-avoidance behavior and the p38 MAPK pathway.


Subject(s)
Caenorhabditis elegans Proteins/immunology , Caenorhabditis elegans/immunology , Enterohemorrhagic Escherichia coli/pathogenicity , Escherichia coli Infections/immunology , Host Microbial Interactions/immunology , Membrane Proteins/immunology , Animals , Animals, Genetically Modified , CRISPR-Cas Systems , Caenorhabditis elegans/metabolism , Caenorhabditis elegans Proteins/genetics , Escherichia coli Infections/microbiology , Gene Knockdown Techniques , Immunity, Innate , Membrane Proteins/genetics , p38 Mitogen-Activated Protein Kinases/metabolism
5.
Virulence ; 11(1): 502-520, 2020 12.
Article in English | MEDLINE | ID: mdl-32434424

ABSTRACT

Aeromonas dhakensis is an emerging human pathogen which causes fast and severe infections worldwide. With useful antibiotics increasingly lacking, finding a new strategy against A. dhakensis infection is urgent. To understand its pathogenesis, we created an A. dhakensis AAK1 mini-Tn10 transposon library to study the mechanism of A. dhakensis infection. By using a Caenorhabditis elegans model, we established a screening platform for the purpose of identifying attenuated mutants. The uvrY mutant, which conferred the most attenuated toxicity toward C. elegans, was identified. The uvrY mutant was also less virulent in C2C12 fibroblast and mouse models, in line with in vitro results. To further elucidate the mechanism of UvrY in controlling the toxicity in A. dhakensis, we conducted a transcriptomic analysis. The RNAseq results showed that the expression of a unique hemolysin ahh1 and other virulence factors was regulated by UvrY. Complementation of Ahh1, one of the most important virulence factors, rescued the pore-formation phenotype of the uvrY mutant in C. elegans; however, complementation with ahh1 driven by its endogenous promoter could neither produce Ahh1 nor rescue the virulence of the uvrY mutant. These findings suggest that UvrY is required for the expression of Ahh1 in A. dhakensis. Taken together, our results suggest that UvrY controls several different virulence factors and is required for the full virulence of A. dhakensis. The two-component regulator UvrY is therefore a potential therapeutic target which is worthy of further study.


Subject(s)
Aeromonas/genetics , Aeromonas/pathogenicity , Bacterial Proteins/genetics , Transcription Factors/genetics , Virulence Factors/genetics , Animals , Biofilms/growth & development , Caenorhabditis elegans , Female , Fibroblasts/microbiology , Gene Expression Profiling , Hemolysin Proteins/genetics , Mice , Mice, Inbred BALB C , Mutation , Sequence Analysis, RNA , Virulence
6.
J Vis Exp ; (134)2018 04 09.
Article in English | MEDLINE | ID: mdl-29683443

ABSTRACT

Enterohemorrhagic E. coli (EHEC) O157:H7, a foodborne pathogen that causes diarrhea, hemorrhagic colitis (HC), and hemolytic uremic syndrome (HUS), colonizes the intestinal tract of humans. To study the detailed mechanism of EHEC colonization in vivo, it is essential to have animal models to monitor and quantify EHEC colonization. We demonstrate here a mouse-EHEC colonization model in which EHEC is transformed with a bioluminescence-expressing plasmid to monitor and quantify EHEC colonization in living hosts. Mice inoculated with bioluminescence-labeled EHEC show intense bioluminescent signals when imaged with a non-invasive in vivo imaging system. At 1 and 2 days post infection, bioluminescent signals could still be detected in infected animals, which suggests that EHEC colonizes hosts for at least 2 days. We also demonstrate from ex vivo images that these bioluminescent EHEC localize to the mouse intestine, specifically the cecum and colon. This mouse-EHEC colonization model may serve as a tool to advance the current knowledge of the EHEC colonization mechanism.


Subject(s)
Enterohemorrhagic Escherichia coli/isolation & purification , Escherichia coli Infections/microbiology , Luminescent Measurements/methods , Animals , Disease Models, Animal , Female , Mice , Mice, Inbred C57BL
7.
Cell Death Dis ; 9(3): 381, 2018 03 07.
Article in English | MEDLINE | ID: mdl-29515100

ABSTRACT

The enteric pathogen enterohemorrhagic Escherichia coli (EHEC) is responsible for outbreaks of bloody diarrhea and hemolytic uremic syndrome (HUS) worldwide. Several molecular mechanisms have been described for the pathogenicity of EHEC; however, the role of bacterial metabolism in the virulence of EHEC during infection in vivo remains unclear. Here we show that aerobic metabolism plays an important role in the regulation of EHEC virulence in Caenorhabditis elegans. Our functional genomic analyses showed that disruption of the genes encoding the succinate dehydrogenase complex (Sdh) of EHEC, including the sdhA gene, attenuated its toxicity toward C. elegans animals. Sdh converts succinate to fumarate and links the tricarboxylic acid (TCA) cycle and the electron transport chain (ETC) simultaneously. Succinate accumulation and fumarate depletion in the EHEC sdhA mutant cells were also demonstrated to be concomitant by metabolomic analyses. Moreover, fumarate replenishment to the sdhA mutant significantly increased its virulence toward C. elegans. These results suggest that the TCA cycle, ETC, and alteration in metabolome all account for the attenuated toxicity of the sdhA mutant, and Sdh catabolite fumarate in particular plays a critical role in the regulation of EHEC virulence. In addition, we identified the tryptophanase (TnaA) as a downstream virulence determinant of SdhA using a label-free proteomic method. We demonstrated that expression of tnaA is regulated by fumarate in EHEC. Taken together, our multi-omic analyses demonstrate that sdhA is required for the virulence of EHEC, and aerobic metabolism plays important roles in the pathogenicity of EHEC infection in C. elegans. Moreover, our study highlights the potential targeting of SdhA, if druggable, as alternative preventive or therapeutic strategies by which to combat EHEC infection.


Subject(s)
Enterohemorrhagic Escherichia coli/drug effects , Enterohemorrhagic Escherichia coli/metabolism , Fumarates/pharmacology , Animals , Enterohemorrhagic Escherichia coli/pathogenicity , Humans , Mass Spectrometry , Metabolomics/methods , Proteomics/methods , Real-Time Polymerase Chain Reaction , Virulence
8.
Autophagy ; 14(2): 233-242, 2018.
Article in English | MEDLINE | ID: mdl-29130360

ABSTRACT

Macroautophagy/autophagy is a fundamental intracellular degradation process with multiple roles in immunity, including direct elimination of intracellular microorganisms via 'xenophagy.' In this review, we summarize studies from the fruit fly Drosophila melanogaster and the nematode Caenorhabditis elegans that highlight the roles of autophagy in innate immune responses to viral, bacterial, and fungal pathogens. Research from these genetically tractable invertebrates has uncovered several conserved immunological paradigms, such as direct targeting of intracellular pathogens by xenophagy and regulation of autophagy by pattern recognition receptors in D. melanogaster. Although C. elegans has no known pattern recognition receptors, this organism has been particularly useful in understanding many aspects of innate immunity. Indeed, work in C. elegans was the first to show xenophagic targeting of microsporidia, a fungal pathogen that infects all animals, and to identify TFEB/HLH-30, a helix-loop-helix transcription factor, as an evolutionarily conserved regulator of autophagy gene expression and host tolerance. Studies in C. elegans have also highlighted the more recently appreciated relationship between autophagy and tolerance to extracellular pathogens. Studies of simple, short-lived invertebrates such as flies and worms will continue to provide valuable insights into the molecular mechanisms by which autophagy and immunity pathways intersect and their contribution to organismal survival. 
Abbreviations: Atg: autophagy related; BECN1: Beclin 1; CALCOCO2: calcium binding and coiled-coil domain 2; Cry5B: crystal toxin 5B; Daf: abnormal dauer formation; DKF-1: D kinase family-1; EPG-7: Ectopic P Granules-7; FuDR: fluorodeoxyuridine; GFP: green fluorescent protein; HLH-30: Helix Loop Helix-30; Imd: immune deficiency; ins-18: INSulin related-18; LET-363: LEThal-363; lgg-1: LC3, GABARAP and GATE-16 family-1; MAPK: mitogen-activated protein kinase; MATH: the meprin and TRAF homology; MTOR: mechanistic target of rapamycin; NBR1: neighbor of BRCA1 gene 1; NFKB: nuclear factor of kappa light polypeptide gene enhancer in B cells; NOD: nucleotide-binding oligomerization domain containing; OPTN: optineurin; PAMPs: pathogen-associated molecular patterns; Park2: Parkinson disease (autosomal recessive, juvenile) 2, parkin; pdr-1: Parkinson disease related; PFTs: pore-forming toxins; PGRP: peptidoglycan-recognition proteins; PIK3C3: phosphatidylinositol 3-kinase catalytic subunit type 3; pink-1: PINK (PTEN-I induced kinase) homolog; PRKD: protein kinase D; PLC: phospholipase C; PRKN: parkin RBR E3 ubiquitin protein ligase; PRRs: pattern-recognition receptors; PtdIns3P: phosphatidylinositol-3-phosphate; rab-5: RAB family-5; RB1CC1: RB1-inducible coiled-coil 1; RNAi: RNA interference; sqst: SeQueSTosome related; SQSTM1: sequestosome 1; TBK1: TANK-binding kinase 1; TFEB: transcription factor EB; TGFB/TGF-β: transforming growth factor beta; TLRs: toll-like receptors; unc-51: UNCoordinated-51; VPS: vacuolar protein sorting; VSV: vesicular stomatitis virus; VSV-G: VSV surface glycoprotein G; Wipi2: WD repeat domain, phosphoinositide interacting 2.


Subject(s)
Autophagy/immunology , Bacterial Infections/immunology , Caenorhabditis elegans , Drosophila melanogaster , Immunity, Innate , Models, Animal , Mycoses/immunology , Virus Diseases/immunology , Animals , Bacterial Infections/microbiology , Humans , Mycoses/microbiology , Receptors, Pattern Recognition/physiology , Signal Transduction/immunology , Virus Diseases/virology
9.
Autophagy ; 13(2): 371-385, 2017 Feb.
Article in English | MEDLINE | ID: mdl-27875098

ABSTRACT

Autophagy is an evolutionarily conserved intracellular system that maintains cellular homeostasis by degrading and recycling damaged cellular components. The transcription factor HLH-30/TFEB-mediated autophagy has been reported to regulate tolerance to bacterial infection, but less is known about the bona fide bacterial effector that activates HLH-30 and autophagy. Here, we reveal that bacterial membrane pore-forming toxin (PFT) induces autophagy in an HLH-30-dependent manner in Caenorhabditis elegans. Moreover, autophagy controls the susceptibility of animals to PFT toxicity through xenophagic degradation of PFT and repair of membrane-pore cell-autonomously in the PFT-targeted intestinal cells in C. elegans. These results demonstrate that autophagic pathways and autophagy are induced partly at the transcriptional level through HLH-30 activation and are required to protect metazoan upon PFT intoxication. Together, our data show a new and powerful connection between HLH-30-mediated autophagy and epithelium intrinsic cellular defense against the single most common mode of bacterial attack in vivo.


Subject(s)
Autophagy , Bacterial Proteins/toxicity , Basic Helix-Loop-Helix Transcription Factors/metabolism , Caenorhabditis elegans Proteins/metabolism , Caenorhabditis elegans/cytology , Caenorhabditis elegans/microbiology , Endotoxins/toxicity , Epithelium/metabolism , Hemolysin Proteins/toxicity , Animals , Autophagy/drug effects , Autophagy-Related Proteins/genetics , Autophagy-Related Proteins/metabolism , Bacillus thuringiensis Toxins , Base Sequence , Caenorhabditis elegans/drug effects , Caenorhabditis elegans/metabolism , Epithelium/drug effects , Epithelium/ultrastructure , Gene Expression Regulation/drug effects , Intestines/microbiology , Intestines/pathology , Models, Biological , Transcription, Genetic/drug effects
10.
Article in English | MEDLINE | ID: mdl-27570746

ABSTRACT

Enterohemorrhagic Escherichia coli (EHEC) O157:H7 is an important foodborne pathogen causing severe diseases in humans worldwide. Currently, there is no specific treatment available for EHEC infection and the use of conventional antibiotics is contraindicated. Therefore, identification of potential therapeutic targets and development of effective measures to control and treat EHEC infection are needed. Lipopolysaccharides (LPS) are surface glycolipids found on the outer membrane of gram-negative bacteria, including EHEC, and LPS biosynthesis has long been considered a potential antibacterial target. Here, we demonstrated that the EHEC rfaD gene, which functions in the biosynthesis of the LPS inner core, is required for the intestinal colonization and pathogenesis of EHEC in vivo. Disruption of the EHEC rfaD confers attenuated toxicity in Caenorhabditis elegans and less bacterial colonization in the intestines of C. elegans and mice. Moreover, rfaD is also involved in the control of susceptibility of EHEC to antimicrobial peptides and host intestinal immunity. It is worth noting that rfaD mutation did not interfere with the growth kinetics when compared to the wild-type EHEC cells. Taken together, we demonstrated that mutations of the EHEC rfaD confer hypersusceptibility to host intestinal innate immunity in vivo, and suggested that targeting the RfaD or the core LPS synthesis pathway may provide alternative therapeutic regimens for EHEC infection.


Subject(s)
Carbohydrate Epimerases/genetics , Carbohydrate Epimerases/metabolism , Escherichia coli O157/enzymology , Escherichia coli O157/genetics , Intestines/immunology , Lipopolysaccharides/biosynthesis , Sequence Deletion , Actins/immunology , Actins/metabolism , Animals , Antimicrobial Cationic Peptides/pharmacology , Caenorhabditis elegans , Caenorhabditis elegans Proteins/immunology , Caenorhabditis elegans Proteins/metabolism , Carbohydrate Epimerases/immunology , Disease Models, Animal , Escherichia coli Infections/immunology , Escherichia coli Infections/microbiology , Escherichia coli Infections/therapy , Escherichia coli Proteins/immunology , Escherichia coli Proteins/metabolism , Female , Humans , Immunity, Innate , Intestinal Diseases/immunology , Intestinal Diseases/microbiology , Intestines/microbiology , Intestines/pathology , Lipopolysaccharides/chemistry , Mice , Mice, Inbred C57BL , Virulence Factors/genetics , Virulence Factors/metabolism , Cathelicidins
11.
Biochem Biophys Res Commun ; 435(1): 107-12, 2013 May 24.
Article in English | MEDLINE | ID: mdl-23624506

ABSTRACT

Epigenetic regulation via abnormal activation of histone deacetylases (HDACs) is a mechanism that leads to cancer initiation and promotion. Activation of HDACs results in transcriptional upregulation of human telomerase reverse transcriptase (hTERT) and increases telomerase activity during cellular immortalization and tumorigenesis. However, the effects of HDAC inhibitors on the transcription of hTERT vary in different cancer cells. Here, we studied the effects of a novel HDAC inhibitor, AR42, on telomerase activity in a PTEN-null U87MG glioma cell line. AR42 increased hTERT mRNA in U87MG glioma cells, but suppressed total telomerase activity in a dose-dependent manner. Further analyses suggested that AR42 decreases the phosphorylation of hTERT via an Akt-dependent mechanism. Suppression of Akt phosphorylation and telomerase activity was also observed with PI3K inhibitor LY294002 further supporting the hypothesis that Akt signaling is involved in suppression of AR42-induced inhibition of telomerase activity. Finally, ectopic expression of a constitutive active form of Akt restored telomerase activity in AR42-treated cells. Taken together, our results demonstrate that the novel HDAC inhibitor AR42 can suppress telomerase activity by inhibiting Akt-mediated hTERT phosphorylation, indicating that the PI3K/Akt pathway plays an important role in the regulation of telomerase activity in response to this HDAC inhibitor.


Subject(s)
Histone Deacetylase Inhibitors/pharmacology , Phenylbutyrates/pharmacology , Proto-Oncogene Proteins c-akt/metabolism , Telomerase/antagonists & inhibitors , Cell Line, Tumor , Chromones/pharmacology , Enzyme-Linked Immunosorbent Assay , Gene Expression Regulation, Enzymologic/drug effects , Gene Expression Regulation, Neoplastic/drug effects , Glioma/genetics , Glioma/metabolism , Glioma/pathology , Humans , Morpholines/pharmacology , Mutation , PTEN Phosphohydrolase/genetics , Phosphatidylinositol 3-Kinases/metabolism , Phosphoinositide-3 Kinase Inhibitors , Phosphorylation/drug effects , Reverse Transcriptase Polymerase Chain Reaction , Signal Transduction/drug effects , Telomerase/genetics , Telomerase/metabolism
12.
Exp Gerontol ; 48(3): 371-9, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23318476

ABSTRACT

Aging is a process of gradual functional decline leading to death. Reactive oxygen species (ROS) not only contribute to oxidative stress and cell damage that lead to aging but also serve as signaling molecules. Sestrins are evolutionarily conserved in all multicellular organisms and are required for regenerating hyperoxidized forms of peroxiredoxins and ROS clearance. However, whether sestrins regulate longevity in metazoans is still unclear. Here, we demonstrated that SESN-1, the only sestrin ortholog in Caenorhabditis elegans, is a positive regulator of lifespan. sesn-1 gene mutant worms had significantly shorter lifespans compared to wild-type animals, and overexpression of sesn-1 prolonged lifespan. Moreover, sesn-1 was found to play a key role in defense against several life stressors, including heat, hydrogen peroxide and the heavy metal copper; and sesn-1 mutants expressed higher levels of ROS and showed a decline in body muscle function. Surprisingly, loss of sesn-1 did not weaken the innate immune function of the worms. Together, these results suggest that SESN-1 is required for normal lifespan and its function in muscle cells prevents muscle degeneration over a lifetime.


Subject(s)
Caenorhabditis elegans Proteins/physiology , Caenorhabditis elegans/physiology , Heat-Shock Proteins/physiology , Longevity/physiology , Aging/physiology , Animals , Animals, Genetically Modified , Caenorhabditis elegans Proteins/genetics , Heat-Shock Proteins/deficiency , Heat-Shock Proteins/genetics , Hot Temperature , Immunity, Innate/physiology , Locomotion/physiology , Muscle Strength/physiology , Oxidative Stress/physiology , RNA Interference , Reactive Oxygen Species/metabolism , Stress, Physiological/physiology
13.
BMC Bioinformatics ; 12 Suppl 8: S2, 2011 Oct 03.
Article in English | MEDLINE | ID: mdl-22151901

ABSTRACT

BACKGROUND: We report the Gene Normalization (GN) challenge in BioCreative III where participating teams were asked to return a ranked list of identifiers of the genes detected in full-text articles. For training, 32 fully and 500 partially annotated articles were prepared. A total of 507 articles were selected as the test set. Due to the high annotation cost, it was not feasible to obtain gold-standard human annotations for all test articles. Instead, we developed an Expectation Maximization (EM) algorithm approach for choosing a small number of test articles for manual annotation that were most capable of differentiating team performance. Moreover, the same algorithm was subsequently used for inferring ground truth based solely on team submissions. We report team performance on both gold standard and inferred ground truth using a newly proposed metric called Threshold Average Precision (TAP-k). RESULTS: We received a total of 37 runs from 14 different teams for the task. When evaluated using the gold-standard annotations of the 50 articles, the highest TAP-k scores were 0.3297 (k=5), 0.3538 (k=10), and 0.3535 (k=20), respectively. Higher TAP-k scores of 0.4916 (k=5, 10, 20) were observed when evaluated using the inferred ground truth over the full test set. When combining team results using machine learning, the best composite system achieved TAP-k scores of 0.3707 (k=5), 0.4311 (k=10), and 0.4477 (k=20) on the gold standard, representing improvements of 12.4%, 21.8%, and 26.6% over the best team results, respectively. CONCLUSIONS: By using full text and being species non-specific, the GN task in BioCreative III has moved closer to a real literature curation task than similar tasks in the past and presents additional challenges for the text mining community, as revealed in the overall team results. 
By evaluating teams using the gold standard, we show that the EM algorithm allows team submissions to be differentiated while keeping the manual annotation effort feasible. Using the inferred ground truth we show measures of comparative performance between teams. Finally, by comparing team rankings on gold standard vs. inferred ground truth, we further demonstrate that the inferred ground truth is as effective as the gold standard for detecting good team performance.
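The improvement percentages quoted in the RESULTS of this abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes the percentages are relative gains of the composite system over the best single-team TAP-k score at each k, and the function and variable names are ours, not from the paper.

```python
# TAP-k scores on the gold standard, copied from the abstract.
BEST_TEAM = {5: 0.3297, 10: 0.3538, 20: 0.3535}
COMPOSITE = {5: 0.3707, 10: 0.4311, 20: 0.4477}


def relative_improvement(baseline: float, improved: float) -> float:
    """Relative gain of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


# Assumption: the abstract reports relative (not absolute) improvements.
improvements = {
    k: round(relative_improvement(BEST_TEAM[k], COMPOSITE[k]), 1)
    for k in BEST_TEAM
}

for k, pct in improvements.items():
    print(f"k={k}: {pct}% improvement")
```

Under that assumption the computed values agree with the 12.4%, 21.8%, and 26.6% reported in the abstract.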


Subject(s)
Algorithms , Data Mining/methods , Genes , Animals , Data Mining/standards , Humans , National Library of Medicine (U.S.) , Periodicals as Topic , United States
14.
BMC Bioinformatics ; 12 Suppl 8: S6, 2011 Oct 03.
Article in English | MEDLINE | ID: mdl-22152021

ABSTRACT

BACKGROUND: Previously, gene normalization (GN) systems have mostly focused on disambiguation using contextual information. An effective gene mention tagger was deemed unnecessary because the subsequent steps would filter out false positives and high recall was sufficient. However, unlike similar tasks in the past BioCreative challenges, the BioCreative III GN task is particularly challenging because it is not species-specific. When required to process full-length articles, an ineffective gene mention tagger may produce a huge number of ambiguous false positives that overwhelm subsequent filtering steps while still missing many true positives. RESULTS: We present our GN system, which participated in the BioCreative III GN task. Our system applies a typical 2-stage approach to GN but features a soft tagging gene mention tagger that generates a set of overlapping gene mention variants with nearly perfect recall. The overlapping gene mention variants increase the chance of a precise match in the dictionary and alleviate the need for disambiguation. Our GN system achieved a precision of 0.9 (F-score 0.63) on the BioCreative III GN test corpus with the silver annotation of 507 articles. Its TAP-k scores are competitive with the best results among all participants. CONCLUSIONS: We show that despite the lack of clever disambiguation in our gene normalization system, effective soft tagging of gene mention variants can indeed contribute to performance in cross-species and full-text gene normalization.
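The idea of overlapping mention variants matched against a dictionary can be illustrated with a toy sketch. This is not the authors' tagger: here the "variants" are simply all contiguous token sub-spans up to a hypothetical `max_len`, the lookup is an exact case-insensitive match, and the dictionary entries and identifiers are placeholders invented for the example.

```python
def mention_variants(tokens, max_len=4):
    """Return every contiguous sub-span of `tokens` up to `max_len` tokens."""
    variants = []
    for start in range(len(tokens)):
        stop = min(start + max_len, len(tokens))
        for end in range(start + 1, stop + 1):
            variants.append(" ".join(tokens[start:end]))
    return variants


def normalize(tokens, dictionary):
    """Map each variant that exactly matches a dictionary key to its ID."""
    hits = {}
    for variant in mention_variants(tokens):
        gene_id = dictionary.get(variant.lower())
        if gene_id is not None:
            hits[variant] = gene_id
    return hits


# Toy dictionary: the identifiers are placeholders, not real database IDs.
TOY_DICTIONARY = {"unc-89": "GENE:0001", "hlh-30": "GENE:0002"}

hits = normalize("the unc-89 mutant allele".split(), TOY_DICTIONARY)
print(hits)
```

Generating many overlapping candidates trades precision at the tagging stage for recall; the exact dictionary match is what filters the candidates in this simplified version.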


Subject(s)
Data Mining , Genes , Species Specificity , Data Mining/methods , Natural Language Processing , Periodicals as Topic , Software , Terminology as Topic
15.
J Biomed Semantics ; 2 Suppl 5: S11, 2011 Oct 06.
Article in English | MEDLINE | ID: mdl-22166494

ABSTRACT

BACKGROUND: Competitions in text mining have been used to measure the performance of automatic text processing solutions against a manually annotated gold standard corpus (GSC). The preparation of the GSC is time-consuming and costly, and the final corpus consists at most of a few thousand documents annotated with a limited set of semantic groups. To overcome these shortcomings, the CALBC project partners (PPs) have produced a large-scale annotated biomedical corpus with four different semantic groups through the harmonisation of annotations from automatic text mining solutions, the first version of the Silver Standard Corpus (SSC-I). The four semantic groups are chemical entities and drugs (CHED), genes and proteins (PRGE), diseases and disorders (DISO) and species (SPE). This corpus has been used for the First CALBC Challenge, which asked the participants to annotate the corpus with their text processing solutions. RESULTS: All four PPs from the CALBC project and, in addition, 12 challenge participants (CPs) contributed annotated data sets for an evaluation against the SSC-I. CPs could ignore the training data and deliver the annotations from their genuine annotation system, or could train a machine-learning approach on the provided pre-annotated data. In general, the performances of the annotation solutions were lower for entities from the categories CHED and PRGE in comparison to the identification of entities categorized as DISO and SPE. The best performance over all semantic groups was achieved by two annotation solutions that had been trained on the SSC-I. The data sets from participants were used to generate the harmonised Silver Standard Corpus II (SSC-II), if the participant did not make use of the annotated data set from the SSC-I for training purposes. The performances of the participants' solutions were again measured against the SSC-II. 
The performances of the annotation solutions showed again better results for DISO and SPE in comparison to CHED and PRGE. CONCLUSIONS: The SSC-I delivers a large set of annotations (1,121,705) for a large number of documents (100,000 Medline abstracts). The annotations cover four different semantic groups and are sufficiently homogeneous to be reproduced with a trained classifier leading to an average F-measure of 85%. Benchmarking the annotation solutions against the SSC-II leads to better performance for the CPs' annotation solutions in comparison to the SSC-I.

16.
BMC Bioinformatics ; 10 Suppl 15: S7, 2009 Dec 03.
Article in English | MEDLINE | ID: mdl-19958517

ABSTRACT

BACKGROUND: To automatically process large quantities of biological literature for knowledge discovery and information curation, text mining tools are becoming essential. Abbreviation recognition is related to named entity recognition (NER) and can be considered as the task of recognizing pairs of a term and its corresponding abbreviation in free text. The successful identification of an abbreviation and its corresponding definition is not only a prerequisite for indexing the terms of text databases to retrieve articles of related interest, but also a building block for improving existing gene mention tagging and gene normalization tools. RESULTS: Our approach to abbreviation recognition (AR) is based on machine learning, which exploits a novel set of rich features to learn rules from training data. Tested on the AB3P corpus, our system demonstrated an F-score of 89.90% with 95.86% precision at 84.64% recall, higher than the result achieved by the best-performing existing AR system. We also annotated a new corpus of 1200 PubMed abstracts, which was derived from the BioCreative II gene normalization corpus. On our annotated corpus, our system achieved an F-score of 86.20% with 93.52% precision at 79.95% recall, which also outperforms all tested systems. CONCLUSION: By applying our system to extract all short form-long form pairs from all available PubMed abstracts, we have constructed BIOADI. Mining BIOADI reveals many interesting trends in biomedical research. In addition, we provide off-line AR software in the download section on http://bioagent.iis.sinica.edu.tw/BIOADI/.


Subject(s)
Artificial Intelligence , Computational Biology/methods , Software , Algorithms , Data Mining/methods , Natural Language Processing , PubMed
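
Note on the figures in entry 16: the reported F-scores can be cross-checked against the quoted precision and recall. The sketch below assumes the abstract uses the balanced F-score (the harmonic mean of precision and recall), which the abstract does not state explicitly. The function name `f_score` and the dictionary `reported` are illustrative and are not taken from the paper or its software.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: the harmonic mean of precision and recall.

    Assumption: this is the F-score variant used in the abstract.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision/recall pairs quoted in the abstract, in percent.
# The dictionary keys are labels chosen here for readability.
reported = {
    "AB3P corpus": (95.86, 84.64),
    "1200-abstract corpus": (93.52, 79.95),
}

for corpus, (p, r) in reported.items():
    print(f"{corpus}: F = {f_score(p, r):.2f}%")
```

Under that assumption, both computed values agree with the abstract's reported F-scores (89.90% and 86.20%) to two decimal places.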
17.
Genome Biol ; 9 Suppl 2: S2, 2008.
Article in English | MEDLINE | ID: mdl-18834493

ABSTRACT

Nineteen teams presented results for the Gene Mention Task at the BioCreative II Workshop. In this task participants designed systems to identify substrings in sentences corresponding to gene name mentions. A variety of different methods were used and the results varied with a highest achieved F1 score of 0.8721. Here we present brief descriptions of all the methods used and a statistical analysis of the results. We also demonstrate that, by combining the results from all submissions, an F score of 0.9066 is feasible, and furthermore that the best result makes use of the lowest scoring submissions.


Subject(s)
Computational Biology/methods , Genes , Societies, Scientific , Congresses as Topic
18.
Genome Biol ; 9 Suppl 2: S6, 2008.
Article in English | MEDLINE | ID: mdl-18834497

ABSTRACT

We introduce the first meta-service for information extraction in molecular biology, the BioCreative MetaServer (BCMS; http://bcms.bioinfo.cnio.es/). This prototype platform is a joint effort of 13 research groups and provides automatically generated annotations for PubMed/Medline abstracts. Annotation types cover gene names, gene IDs, species, and protein-protein interactions. The annotations are distributed by the meta-server in both human- and machine-readable formats (HTML/XML). This service is intended to be used by biomedical researchers and database annotators, and in biomedical language processing. The platform allows direct comparison, unified access, and result aggregation of the annotations.


Subject(s)
Biomedical Research/methods , Computational Biology/methods , Information Storage and Retrieval , Internet , Humans
19.
Bioinformatics ; 24(13): i286-94, 2008 Jul 01.
Article in English | MEDLINE | ID: mdl-18586726

ABSTRACT

MOTIVATION: Tagging gene and gene product mentions in scientific text is an important initial step of literature mining. In this article, we describe in detail the gene mention tagger with which we participated in the BioCreative 2 challenge and analyze what contributes to its good performance. Our tagger is based on the conditional random fields (CRF) model, the most prevalent method for the gene mention tagging task in BioCreative 2. Our tagger is interesting because it achieved the highest F-score among CRF-based methods and the second highest overall. Moreover, we obtained our results mostly by applying open source packages, making it easy to reproduce our results. RESULTS: We first describe in detail how we developed our CRF-based tagger. We designed a very high-dimensional feature set that includes most of the information that may be relevant. We trained bi-directional CRF models with the same set of features, one applying forward parsing and the other backward parsing, and integrated the two models based on the output scores and dictionary filtering. One of the most prominent factors contributing to the good performance of our tagger is the integration of an additional backward parsing model. However, from the definition of CRF, it appears that a CRF model is symmetric and that bi-directional parsing models will produce the same results. We show that, due to different feature settings, a CRF model can be asymmetric, and that the feature setting for our tagger in BioCreative 2 not only produces different results but also gives backward parsing models a slight but consistent advantage over the forward parsing model. To fully explore the potential of integrating bi-directional parsing models, we applied different asymmetric feature settings to generate many bi-directional parsing models and integrated them based on the output scores. Experimental results show that this integrated model can achieve an even higher F-score based solely on the training corpus for gene mention tagging.
AVAILABILITY: Data sets, programs and an on-line service of our gene mention tagger can be accessed at http://aiia.iis.sinica.edu.tw/biocreative2.htm.


Subject(s)
Genes/genetics , Genetic Markers/genetics , Models, Genetic , Natural Language Processing , Periodicals as Topic , Vocabulary, Controlled , Artificial Intelligence , Computer Simulation , Systems Integration