Results 1 - 20 of 23
1.
Nat Commun ; 15(1): 5545, 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38956024

ABSTRACT

Epithelial cells are the first point of contact for bacteria entering the respiratory tract. Streptococcus pneumoniae is an obligate human pathobiont of the nasal mucosa, carried asymptomatically but also the cause of severe pneumonia. The role of the epithelium in maintaining homeostatic interactions or mounting an inflammatory response to invasive S. pneumoniae is currently poorly understood. However, studies have shown that chromatin modifications, at the histone level, induced by bacterial pathogens interfere with the host transcriptional program and promote infection. Here, we uncover a histone modification induced by S. pneumoniae infection maintained for at least 9 days upon clearance of bacteria with antibiotics. Di-methylation of histone H3 on lysine 4 (H3K4me2) is induced in an active manner by bacterial attachment to host cells. We show that infection establishes a unique epigenetic program affecting the transcriptional response of epithelial cells, rendering them more permissive upon secondary infection. Our results establish H3K4me2 as a unique modification induced by infection, distinct from H3K4me3 or me1, which localizes to enhancer regions genome-wide. Therefore, this study reveals evidence that bacterial infection leaves a memory in epithelial cells after bacterial clearance, in an epigenomic mark, thereby altering cellular responses to subsequent infections and promoting infection.


Subject(s)
Epithelial Cells , Histones , Pneumococcal Infections , Streptococcus pneumoniae , Histones/metabolism , Streptococcus pneumoniae/metabolism , Streptococcus pneumoniae/physiology , Epithelial Cells/microbiology , Epithelial Cells/metabolism , Methylation , Humans , Pneumococcal Infections/microbiology , Pneumococcal Infections/metabolism , Epigenesis, Genetic , Animals , Mice , Lysine/metabolism , Mice, Inbred C57BL
2.
Transl Oncol ; 44: 101940, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38537326

ABSTRACT

Precision Medicine (PM) is being increasingly used in the developed world to improve health care. While several PM initiatives have been launched worldwide, their implementation has proven more challenging, particularly in low- and middle-income countries. To address this issue, the "Personalized Medicine in North Africa" initiative (PerMediNA) was launched in three North African countries, namely Tunisia, Algeria and Morocco. PerMediNA is coordinated by Institut Pasteur de Tunis together with the French Ministry for Europe and Foreign Affairs, with the support of Institut Pasteur in France. The project is carried out along with Institut Pasteur d'Algérie and Institut Pasteur du Maroc in collaboration with national and international leading institutions in the field of PM, including Institut Gustave Roussy in Paris. PerMediNA aims to assess the readiness level of PM implementation in North Africa, to strengthen PM infrastructure, to provide workforce training, to generate genomic data on North African populations, to implement cost-effective, affordable and sustainable genetic testing for cancer patients and to inform policy makers on how to translate research knowledge into health products and services. Gender equity and involvement of young scientists in this implementation process are other key goals of the PerMediNA project. In this paper, we describe PerMediNA as the first PM implementation initiative in North Africa. Such initiatives contribute significantly to reducing existing health disparities and inequities between developed and developing countries and accelerate access to innovative treatments for global health.

3.
Aging Cell ; 22(10): e13959, 2023 10.
Article in English | MEDLINE | ID: mdl-37688320

ABSTRACT

Cockayne syndrome (CS) and UV-sensitive syndrome (UVSS) are rare genetic disorders caused by mutation of the DNA repair and multifunctional CSA or CSB protein, but only CS patients display a progeroid and neurodegenerative phenotype, providing a unique conceptual and experimental paradigm. As DNA methylation (DNAm) remodelling is a major ageing marker, we performed genome-wide analysis of DNAm of fibroblasts from healthy, UVSS and CS individuals. Differential analysis highlighted a CS-specific epigenomic signature (progeroid-related; not present in UVSS) enriched in three categories: developmental transcription factors, ion/neurotransmitter membrane transporters and synaptic neuro-developmental genes. A large fraction of CS-specific DNAm changes were associated with expression changes in CS samples, including in previously reported post-mortem cerebella. The progeroid phenotype of CS was further supported by epigenomic hallmarks of ageing: the prediction of DNAm of repetitive elements suggested hypomethylation of Alu sequences in CS, and the epigenetic clock returned a marked increase in CS biological age with respect to healthy and UVSS cells. The epigenomic remodelling of accelerated ageing in CS displayed both commonalities and differences with other progeroid diseases and regular ageing. CS shared DNAm changes with normal ageing more than other progeroid diseases do, and included genes functionally validated for regular ageing. Collectively, our results support the existence of an epigenomic basis of accelerated ageing in CS and unveil new genes and pathways that are potentially associated with the progeroid/degenerative phenotype.


Subject(s)
Cockayne Syndrome , Humans , Cockayne Syndrome/genetics , Cockayne Syndrome/metabolism , Epigenomics , DNA Repair Enzymes/genetics , DNA Repair Enzymes/metabolism , DNA Repair , Aging/genetics , Mutation
4.
NAR Genom Bioinform ; 4(2): lqac041, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35664802

ABSTRACT

We present ePeak, a Snakemake-based pipeline for the identification and quantification of reproducible peaks from raw data produced by the ChIP-seq, CUT&RUN and CUT&Tag epigenomic profiling techniques. It also includes a statistical module to perform tailored differential marking and binding analysis with state-of-the-art methods. ePeak streamlines critical steps like the quality assessment of the immunoprecipitation, spike-in calibration and the selection of reproducible peaks between replicates for both narrow and broad peaks. It generates complete reports for data quality control assessment and optimal interpretation of the results. We advocate for a differential analysis that accounts for the biological dynamics of each chromatin factor. Thus, ePeak provides linear and nonlinear methods for normalisation as well as conservative and stringent models for variance estimation and significance testing of the observed marking/binding differences. Using a published ChIP-seq dataset, we show that distinct populations of differentially marked/bound peaks can be identified. We study their dynamics in terms of read coverage and summit position, as well as the expression of the neighbouring genes. We propose that ePeak can be used to measure the richness of the epigenomic landscape underlying a biological process by identifying diverse regulatory regimes.

5.
J Immunol Methods ; 499: 113176, 2021 12.
Article in English | MEDLINE | ID: mdl-34742775

ABSTRACT

Single-cell RNA-sequencing (scRNAseq) experiments are becoming a standard tool for bench-scientists to explore the cellular diversity present in all tissues. Data produced by scRNAseq is technically complex and requires analytical workflows that are an active field of bioinformatics research, whereas a wealth of biological background knowledge is needed to guide the investigation. Thus, there is an increasing need to develop applications geared towards bench-scientists to help them abstract the technical challenges of the analysis so that they can focus on the science at play. It is also expected that such applications should support closer collaboration between bioinformaticians and bench-scientists by providing reproducible science tools. We present SCHNAPPs, a Graphical User Interface (GUI), designed to enable bench-scientists to autonomously explore and interpret scRNAseq data and associated annotations. The R/Shiny-based application allows following different steps of scRNAseq analysis workflows from Seurat or Scran packages: performing quality control on cells and genes, normalizing the expression matrix, integrating different samples, dimension reduction, clustering, and differential gene expression analysis. Visualization tools for exploring each step of the process include violin plots, 2D projections, Box-plots, alluvial plots, and histograms. An R-markdown report can be generated that tracks modifications and selected visualizations. The modular design of the tool allows it to easily integrate new visualizations and analyses by bioinformaticians. We illustrate the main features of the tool by applying it to the characterization of T cells in a scRNAseq and Cellular Indexing of Transcriptomes and Epitopes by Sequencing (CITE-Seq) experiment of two healthy individuals.


Subject(s)
Leukocytes, Mononuclear/cytology , Sequence Analysis, RNA , Single-Cell Analysis , Software , Humans , Leukocytes, Mononuclear/immunology
6.
PLoS Genet ; 17(9): e1009761, 2021 09.
Article in English | MEDLINE | ID: mdl-34491998

ABSTRACT

Virulence of the neonatal pathogen Group B Streptococcus is under the control of the master regulator CovR. Inactivation of CovR is associated with large-scale transcriptome remodeling and impairs almost every step of the interaction between the pathogen and the host. However, transcriptome analyses suggested a plasticity of the CovR signaling pathway in clinical isolates leading to phenotypic heterogeneity in the bacterial population. In this study, we characterized the CovR regulatory network in a strain representative of the CC-17 hypervirulent lineage responsible for the majority of neonatal meningitis cases. Transcriptome and genome-wide binding analyses reveal the architecture of the CovR network, characterized by the direct repression of a large array of virulence-associated genes and the extent of co-regulation at specific loci. Comparative functional analysis of the signaling network links strain-specificities to the regulation of the pan-genome, including the two specific hypervirulent adhesins and horizontally acquired genes, to mutations in CovR-regulated promoters, and to variability in CovR activation by phosphorylation. This regulatory adaptation occurs at the level of genes, promoters, and of CovR itself, and allows the expression of virulence genes to be globally reshaped. Overall, our results reveal the direct, coordinated, and strain-specific regulation of virulence genes by the master regulator CovR and suggest that the intra-species evolution of the signaling network is as important as the expression of specific virulence factors in the emergence of clones associated with specific diseases.


Subject(s)
Bacterial Proteins/physiology , Gene Regulatory Networks , Streptococcus agalactiae/pathogenicity , Virulence Factors/physiology , Virulence/genetics , Bacterial Proteins/genetics , Chromosomes, Bacterial , Genes, Bacterial , Host-Pathogen Interactions , Humans , Promoter Regions, Genetic , Prophages/genetics , Streptococcus agalactiae/genetics , Transcription, Genetic/physiology , Virulence Factors/genetics
7.
Medicine (Baltimore) ; 98(11): e14879, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30882696

ABSTRACT

RATIONALE: Nocardia species are not commonly referred to as primary infectious entities but rather as opportunistic pathogens. Infectious cases of Nocardia spp. in immunocompetent individuals are rare. PATIENT CONCERNS: An immunocompetent 58-year-old patient presented with recurrent headaches. DIAGNOSIS: A brain abscess was found and surgically drained. Matrix-assisted laser desorption ionization-time-of-flight mass spectrometry and heat shock protein 65/16S-23S rRNA gene intergenic spacer genotyping from the sample revealed the etiological agent as Nocardia beijingensis. INTERVENTIONS: Meropenem, amikacin and trimethoprim-sulfamethoxazole were administered. OUTCOMES: The infection persisted, leading to the patient's death. LESSONS: Here we present the first case of N. beijingensis infection of the central nervous system in an immunocompetent patient from Latin America. Further inquiry is needed to establish whether this species is more virulent than other Nocardia isolates.


Subject(s)
Brain Abscess/diagnosis , Nocardia Infections/complications , Anti-Bacterial Agents/therapeutic use , Brain Abscess/etiology , Humans , Latin America , Male , Meropenem/therapeutic use , Middle Aged , Nocardia/pathogenicity , Nocardia Infections/diagnosis , Tomography, X-Ray Computed/methods , Trimethoprim, Sulfamethoxazole Drug Combination/therapeutic use
8.
BMC Infect Dis ; 19(1): 258, 2019 Mar 15.
Article in English | MEDLINE | ID: mdl-30876395

ABSTRACT

BACKGROUND: Enterococcus faecium is ranked worldwide as one of the top ten pathogens identified in healthcare-associated infections (HAI) and is classified as one of the high priority pathogens for research and development of new antibiotics worldwide. Because of the higher cost of molecular biology techniques, the approach for identifying and controlling infectious diseases in developing countries has been based on clinical and epidemiological perspectives. Nevertheless, after an abrupt dissemination of vancomycin-resistant Enterococcus faecium (VREfm) in the Méderi teaching hospital resulted in an outbreak, further measures needed to be taken into consideration. The present study describes the VREfm pattern within Colombia's largest hospital by installed bed capacity in 2016. METHODS: Thirty-three VREfm isolates were recovered during a 5-month period in 2016. Multilocus variable-number tandem-repeat analysis was used for molecular typing to determine clonality amongst strains. A modified time-place-sequence algorithm was used to trace VREfm spread patterns during the outbreak period and estimate transmission routes. RESULTS: Four clonal profiles were identified. Chronological clonal profile follow-up suggested a transitional spread from profile "A" to profile "B", returning to a higher prevalence of "A" by the end of the study. Antibiotic susceptibility testing indicated high-level vancomycin resistance in most isolates, frequently matching vanA gene identification. DISCUSSION: Transmission analysis suggested cross-contamination via healthcare workers. Despite epidemiological control of the outbreak, post-outbreak isolates were still being identified as having the outbreak-related clonal profile (A), indicating reduction but not eradication of this clonality. This study supports the use of combined molecular and epidemiological strategies in an approach to controlling infectious diseases. It contributes towards a more accurate evaluation of the effectiveness of the epidemiological measures taken regarding outbreak control and estimates the main cause related to the spread of this microorganism.


Subject(s)
Disease Outbreaks , Enterococcus faecium/genetics , Gram-Positive Bacterial Infections/epidemiology , Gram-Positive Bacterial Infections/microbiology , Vancomycin-Resistant Enterococci/genetics , Anti-Bacterial Agents/pharmacology , Bacterial Proteins/genetics , Bacterial Typing Techniques , Colombia/epidemiology , Enterococcus faecium/classification , Enterococcus faecium/drug effects , Enterococcus faecium/isolation & purification , Gram-Positive Bacterial Infections/transmission , Hospitals, Teaching , Humans , Microbial Sensitivity Tests , Molecular Epidemiology , Multilocus Sequence Typing , Vancomycin/pharmacology , Vancomycin-Resistant Enterococci/classification , Vancomycin-Resistant Enterococci/drug effects , Vancomycin-Resistant Enterococci/isolation & purification
9.
Cell Stem Cell ; 23(5): 742-757.e8, 2018 11 01.
Article in English | MEDLINE | ID: mdl-30401455

ABSTRACT

Understanding general principles that safeguard cellular identity should reveal critical insights into common mechanisms underlying specification of varied cell types. Here, we show that SUMO modification acts to stabilize cell fate in a variety of contexts. Hyposumoylation enhances pluripotency reprogramming in vitro and in vivo, increases lineage transdifferentiation, and facilitates leukemic cell differentiation. Suppressing sumoylation in embryonic stem cells (ESCs) promotes their conversion into 2-cell-embryo-like (2C-like) cells. During reprogramming to pluripotency, SUMO functions on fibroblastic enhancers to retain somatic transcription factors together with Oct4, Sox2, and Klf4, thus impeding somatic enhancer inactivation. In contrast, in ESCs, SUMO functions on heterochromatin to silence the 2C program, maintaining both proper H3K9me3 levels genome-wide and repression of the Dux locus by triggering recruitment of the sumoylated PRC1.6 and Kap/Setdb1 repressive complexes. Together, these studies show that SUMO acts on chromatin as a glue to stabilize key determinants of somatic and pluripotent states.


Subject(s)
Chromatin/metabolism , Mouse Embryonic Stem Cells/cytology , Mouse Embryonic Stem Cells/metabolism , Small Ubiquitin-Related Modifier Proteins/metabolism , Animals , Cells, Cultured , Cellular Reprogramming , Kruppel-Like Factor 4 , Mice , Mice, Inbred C57BL , Transcription Factors/metabolism
10.
Genome Biol ; 18(1): 207, 2017 10 31.
Article in English | MEDLINE | ID: mdl-29084582

ABSTRACT

BACKGROUND: Polycomb Repressive Complexes 2 (PRC2) are multi-protein chromatin modifiers that are evolutionarily conserved among eukaryotes and play key roles in the regulation of gene expression, notably through the trimethylation of lysine 27 of histone H3 (H3K27me3). Although PRC2-mediated gene regulation has been studied in many organisms, few studies have explored in depth the evolutionary conservation of PRC2 targets. RESULTS: Here, we compare the H3K27me3 epigenomic profiles for the two closely related species Arabidopsis thaliana and Arabidopsis lyrata and the more distant species Arabis alpina, three Brassicaceae that diverged from each other within the past 24 million years. Using a robust set of gene orthologs present in the three species, we identify two classes of evolutionarily conserved PRC2 targets, which are characterized by either developmentally plastic or developmentally constrained H3K27me3 marking across species. Constrained H3K27me3 marking is associated with higher conservation of promoter sequence information content and higher nucleosome occupancy compared to plastic H3K27me3 marking. Moreover, gene orthologs with constrained H3K27me3 marking exhibit a higher degree of tissue specificity and tend to be involved in developmental functions, whereas gene orthologs with plastic H3K27me3 marking preferentially encode proteins associated with metabolism and stress responses. In addition, gene orthologs with constrained H3K27me3 marking are the predominant contributors to higher-order chromosome organization. CONCLUSIONS: Our findings indicate that developmentally plastic and constrained H3K27me3 marking define two evolutionarily conserved modes of PRC2-mediated gene regulation that are associated with distinct selective pressures operating at multiple scales, from DNA sequence to gene function and chromosome architecture.


Subject(s)
Brassicaceae/genetics , Epigenesis, Genetic , Evolution, Molecular , Gene Expression Regulation, Plant , Histone Code , Polycomb Repressive Complex 2/metabolism , Arabidopsis/genetics , Arabis/genetics , Base Sequence , Chromosomes, Plant , Conserved Sequence , Gene Duplication , Promoter Regions, Genetic , Transcriptome
11.
Nat Plants ; 1: 14023, 2015 Feb 02.
Article in English | MEDLINE | ID: mdl-27246759

ABSTRACT

Despite evolutionarily conserved mechanisms to silence transposable element activity, there are drastic differences in the abundance of transposable elements even among closely related plant species. We conducted a de novo assembly for the 375 Mb genome of the perennial model plant, Arabis alpina. Analysing this genome revealed long-lasting and recent transposable element activity predominantly driven by Gypsy long terminal repeat retrotransposons, which extended the low-recombining pericentromeres and transformed large formerly euchromatic regions into repeat-rich pericentromeric regions. This reduced capacity for long terminal repeat retrotransposon silencing and removal in A. alpina co-occurs with unexpectedly low levels of DNA methylation. Most remarkably, the striking reduction of symmetrical CG and CHG methylation suggests weakened DNA methylation maintenance in A. alpina compared with Arabidopsis thaliana. Phylogenetic analyses indicate a highly dynamic evolution of some components of methylation maintenance machinery that might be related to the unique methylation in A. alpina.

12.
Nucleic Acids Res ; 40(Database issue): D242-51, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22110040

ABSTRACT

Linear motifs are short, evolutionarily plastic components of regulatory proteins and provide low-affinity interaction interfaces. These compact modules play central roles in mediating every aspect of the regulatory functionality of the cell. They are particularly prominent in mediating cell signaling, controlling protein turnover and directing protein localization. Given their importance, our understanding of motifs is surprisingly limited, largely as a result of the difficulty of discovery, both experimentally and computationally. The Eukaryotic Linear Motif (ELM) resource at http://elm.eu.org provides the biological community with a comprehensive database of known experimentally validated motifs, and an exploratory tool to discover putative linear motifs in user-submitted protein sequences. The current update of the ELM database comprises 1800 annotated motif instances representing 170 distinct functional classes, including approximately 500 novel instances and 24 novel classes. Several older motif class entries have also been revisited, improving annotation and adding novel instances. Furthermore, addition of full-text search capabilities, an enhanced interface and simplified batch download has improved the overall accessibility of the ELM data. The motif discovery portion of the ELM resource has added conservation, and structural attributes have been incorporated to help users discriminate biologically relevant motifs from stochastically occurring non-functional instances.


Subject(s)
Amino Acid Motifs , Databases, Protein , Computer Graphics , Disease/genetics , Eukaryota , Sequence Analysis, Protein , User-Computer Interface , Viral Proteins/chemistry
13.
Nucleic Acids Res ; 39(Database issue): D261-7, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21062810

ABSTRACT

The Phospho.ELM resource (http://phospho.elm.eu.org) is a relational database designed to store in vivo and in vitro phosphorylation data extracted from the scientific literature and phosphoproteomic analyses. The resource has been actively developed for more than 7 years and currently comprises 42,574 serine, threonine and tyrosine non-redundant phosphorylation sites. Several new features have been implemented, such as structural disorder/order and accessibility information and a conservation score. Additionally, the conservation of the phosphosites can now be visualized directly on the multiple sequence alignment used for the score calculation. Finally, special emphasis has been put on linking to external resources such as interaction networks and other databases.


Subject(s)
Databases, Protein , Phosphoproteins/chemistry , Amino Acid Sequence , Animals , Conserved Sequence , Humans , Mice , Phosphorylation , Protein Conformation , Sequence Analysis, Protein , Serine/metabolism , Threonine/metabolism , Tyrosine/metabolism
14.
Nucleic Acids Res ; 38(Database issue): D167-80, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19920119

ABSTRACT

Linear motifs are short segments of multidomain proteins that provide regulatory functions independently of protein tertiary structure. Much of intracellular signalling passes through protein modifications at linear motifs. Many thousands of linear motif instances, most notably phosphorylation sites, have now been reported. Although clearly very abundant, linear motifs are difficult to predict de novo in protein sequences due to the difficulty of obtaining robust statistical assessments. The ELM resource at http://elm.eu.org/ provides an expanding knowledge base, currently covering 146 known motifs, with annotation that includes >1300 experimentally reported instances. ELM is also an exploratory tool for suggesting new candidates of known linear motifs in proteins of interest. Information about protein domains, protein structure and native disorder, cellular and taxonomic contexts is used to reduce or deprecate false positive matches. Results are graphically displayed in a 'Bar Code' format, which also displays known instances from homologous proteins through a novel 'Instance Mapper' protocol based on PHI-BLAST. ELM server output provides links to the ELM annotation as well as to a number of remote resources. Using the links, researchers can explore the motifs, proteins, complex structures and associated literature to evaluate whether candidate motifs might be worth experimental investigation.


Subject(s)
Amino Acid Motifs/genetics , Computational Biology/methods , Databases, Genetic , Databases, Nucleic Acid , Eukaryotic Cells/chemistry , Amino Acid Sequence , Animals , Computational Biology/trends , Databases, Protein , Humans , Information Storage and Retrieval/methods , Internet , Molecular Sequence Data , Protein Structure, Tertiary , Sequence Homology, Amino Acid , Software
15.
PLoS One ; 4(7): e6052, 2009 Jul 08.
Article in English | MEDLINE | ID: mdl-19584925

ABSTRACT

BACKGROUND: Linear motifs are short modules of protein sequences that play a crucial role in mediating and regulating many protein-protein interactions. The function of linear motifs strongly depends on the context, e.g. functional instances mainly occur inside flexible regions that are accessible for interaction. Sometimes linear motifs appear as isolated islands of conservation in multiple sequence alignments. However, they also occur in larger blocks of sequence conservation, suggesting an active role for the neighbouring amino acids. RESULTS: The evolution of regions flanking 116 functional linear motif instances was studied. The conservation of the amino acid sequence and order/disorder tendency of those regions was related to presence/absence of the instance. For the majority of the analysed instances, the pairs of sequences conserving the linear motif were also observed to maintain a similar local structural tendency and/or to have higher local sequence conservation when compared to pairs of sequences where one is missing the linear motif. Furthermore, those instances have a higher chance to co-evolve with the neighbouring residues in comparison to the distant ones. Those findings are supported by examples where the regulation of the linear motif-mediated interaction has been shown to depend on the modifications (e.g. phosphorylation) at neighbouring positions or is thought to benefit from the binding versatility of disordered regions. CONCLUSION: The results suggest that flanking regions are relevant for linear motif-mediated interactions, both at the structural and sequence level. More interestingly, they indicate that the prediction of linear motif instances can be enriched with contextual information by performing a sequence analysis similar to the one presented here. This can facilitate the understanding of the role of these predicted instances in determining the protein function inside the broader context of the cellular network where they arise.


Subject(s)
Amino Acid Motifs , Evolution, Molecular , Proteins/genetics , Gene Expression Profiling , Proteins/chemistry , Sequence Alignment
16.
Bioinformatics ; 25(1): 1-5, 2009 Jan 01.
Article in English | MEDLINE | ID: mdl-19033273

ABSTRACT

MOTIVATION: We noted that the sumoylation site in C/EBP homologues is conserved beyond the canonical consensus sequence for sumoylation. Therefore, we investigated whether this pattern might define a more general protein motif. RESULTS: We undertook a survey of the human proteome using a regular expression based on the C/EBP motif. This revealed significant enrichment of the motif using different Gene Ontology terms (e.g. 'transcription') that pertain to the nucleus. When considering requirements for the motif to be functional (evolutionary conservation, structural accessibility of the motif and proper cell localization of the protein), more than 130 human proteins were retrieved from the UniProt/Swiss-Prot database. These candidates were particularly enriched in transcription factors, including FOS, JUN, Hif-1alpha, MLL2 and members of the KLF, MAF and NFATC families; chromatin modifiers like CHD-8, HDAC4 and DNA Top1; and the transcriptional regulatory kinases HIPK1 and HIPK2. The KEPE motif appears to be restricted to the metazoan lineage and has three length variants (short, medium and long), which do not appear to interchange.


Subject(s)
Chromatin/metabolism , Nucleoproteins/chemistry , Small Ubiquitin-Related Modifier Proteins/chemistry , Transcription Factors/chemistry , Amino Acid Motifs , Amino Acid Sequence , Animals , Conserved Sequence , Databases, Protein , Gene Expression Regulation , Humans , Molecular Sequence Data , Mutation/genetics , Proteome/chemistry , Saccharomyces cerevisiae/chemistry , Transcription Factors/metabolism
17.
BMC Bioinformatics ; 9: 229, 2008 May 06.
Article in English | MEDLINE | ID: mdl-18460207

ABSTRACT

BACKGROUND: The structure of many eukaryotic cell regulatory proteins is highly modular. They are assembled from globular domains, segments of natively disordered polypeptides and short linear motifs. The latter are involved in protein interactions and formation of regulatory complexes. The function of such proteins, which may be difficult to define, is the aggregate of the subfunctions of the modules. It is therefore desirable to efficiently predict linear motifs with some degree of accuracy, yet sequence database searches return results that are not significant. RESULTS: We have developed a method for scoring the conservation of linear motif instances. It requires only primary sequence-derived information (e.g. multiple alignment and sequence tree) and takes into account the degenerate nature of linear motif patterns. On our benchmarking, the method accurately scores 86% of the known positive instances, while distinguishing them from random matches in 78% of the cases. The conservation score is implemented as a real time application designed to be integrated into other tools. It is currently accessible via a Web Service or through a graphical interface. CONCLUSION: The conservation score improves the prediction of linear motifs, by discarding those matches that are unlikely to be functional because they have not been conserved during the evolution of the protein sequences. It is especially useful for instances in non-structured regions of the proteins, where a domain masking filtering strategy is not applicable.


Subject(s)
Algorithms , Conserved Sequence , Proteins/chemistry , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Amino Acid Motifs , Amino Acid Sequence , Linear Models , Models, Chemical , Molecular Sequence Data
18.
Front Biosci ; 13: 6580-603, 2008 May 01.
Article in English | MEDLINE | ID: mdl-18508681

ABSTRACT

It is now clear that a detailed picture of cell regulation requires a comprehensive understanding of the abundant short protein motifs through which signaling is channeled. The current body of knowledge has slowly accumulated through piecemeal experimental investigation of individual motifs in signaling. Computational methods contributed little to this process. A new generation of bioinformatics tools will aid the future investigation of motifs in regulatory proteins, and the disordered polypeptide regions in which they frequently reside. Allied to high throughput methods such as phosphoproteomics, signaling networks are becoming amenable to experimental deconstruction. In this review, we summarise the current state of linear motif biology, which uses low affinity interactions to create cooperative, combinatorial and highly dynamic regulatory protein complexes. The discrete deterministic properties implicit to these assemblies suggest that models for cell regulatory networks in systems biology should neither be overly dependent on stochastic nor on smooth deterministic approximations.


Subject(s)
Cell Physiological Phenomena , Signal Transduction , Animals , Endoplasmic Reticulum/physiology , Homeostasis , Mammals , Models, Biological , Proteins/physiology , Reproducibility of Results
19.
BMC Bioinformatics ; 9: 213, 2008 Apr 25.
Article in English | MEDLINE | ID: mdl-18439277

ABSTRACT

BACKGROUND: Linear motifs (LMs) are abundant short regulatory sites used for modulating the functions of many eukaryotic proteins. They play important roles in post-translational modification, cell compartment targeting, docking sites for regulatory complex assembly and protein processing and cleavage. Methods for LM detection are now being developed that are strongly dependent on scores for motif conservation in homologous proteins. However, most LMs are found in natively disordered polypeptide segments that evolve rapidly, unhindered by structural constraints on the sequence. These regions of modular proteins are difficult to align using classical multiple sequence alignment programs that are specifically optimised to align the globular domains. As a consequence, poor motif alignment quality is hindering efforts to detect new LMs. RESULTS: We have developed a new benchmark, as part of the BAliBASE suite, designed to assess the ability of standard multiple alignment methods to detect and align LMs. The reference alignments are organised into different test sets representing real alignment problems and contain examples of experimentally verified functional motifs, extracted from the Eukaryotic Linear Motif (ELM) database. The benchmark has been used to evaluate and compare a number of multiple alignment programs. With distantly related proteins, the worst alignment program correctly aligns 48% of LMs compared to 73% for the best program. However, the performance of all the programs is adversely affected by the introduction of other sequences containing false positive motifs. The ranking of the alignment programs based on LM alignment quality is similar to that observed when considering full-length protein alignments; however, little correlation was observed between LM and overall alignment quality for individual alignment test cases. CONCLUSION: We have shown that none of the programs currently available is capable of reliably aligning LMs in distantly related sequences and we have highlighted a number of specific problems. The results of the tests suggest possible ways to improve program accuracy for difficult, divergent sequences.


Subject(s)
Amino Acid Motifs , Sequence Alignment/standards , Software Validation , User-Computer Interface , Artificial Intelligence , Pattern Recognition, Automated/methods , Pattern Recognition, Automated/standards , Proteins/analysis , Proteins/ultrastructure , Proteomics/methods , Quality Control , Reproducibility of Results , Sequence Alignment/methods , Sequence Homology, Amino Acid
20.
Bioinformatics ; 24(4): 453-7, 2008 Feb 15.
Article in English | MEDLINE | ID: mdl-18184688

ABSTRACT

MOTIVATION: KEN-box-mediated target selection is one of the mechanisms used in the proteasomal destruction of mitotic cell cycle proteins via the APC/C complex. While annotating the Eukaryotic Linear Motif resource (ELM, http://elm.eu.org/), we found that KEN motifs were significantly enriched in human protein entries with cell cycle keywords in the UniProt/Swiss-Prot database, implying that KEN-boxes might be more common than reported. RESULTS: Matches to short linear motifs in protein database searches are not, per se, significant. KEN-box enrichment with cell cycle Gene Ontology terms suggests that collectively these motifs are functional but does not prove that any given instance is so. Candidates were surveyed for native disorder prediction using GlobPlot and IUPred and for motif conservation in homologues. Among >25 strong new candidates, the most notable are human HIPK2, CHFR, CDC27, Dab2, Upf2, kinesin Eg5, DNA Topoisomerase 1 and yeast Cdc5 and Swi5. A similar number of weaker candidates were present. These proteins have yet to be tested for APC/C targeted destruction, providing potential new avenues of research.


Subject(s)
Cell Cycle Proteins/chemistry , Conserved Sequence , Amino Acid Motifs , Amino Acid Sequence , Humans , Molecular Sequence Data