ABSTRACT
X-linked mental retardation (XLMR) is a complex human disease that causes intellectual disability. Causal mutations have been found in approximately 90 X-linked genes; however, molecular and biological functions of many of these genetically defined XLMR genes remain unknown. PHF8 (PHD (plant homeo domain) finger protein 8) is a JmjC domain-containing protein and its mutations have been found in patients with XLMR and craniofacial deformities. Here we provide multiple lines of evidence establishing PHF8 as the first mono-methyl histone H4 lysine 20 (H4K20me1) demethylase, with additional activities towards histone H3K9me1 and me2. PHF8 is located around the transcription start sites (TSS) of approximately 7,000 RefSeq genes and in gene bodies and intergenic regions (non-TSS). PHF8 depletion resulted in upregulation of H4K20me1 and H3K9me1 at the TSS and H3K9me2 in the non-TSS sites, respectively, demonstrating differential substrate specificities at different target locations. PHF8 positively regulates gene expression, which is dependent on its H3K4me3-binding PHD and catalytic domains. Importantly, patient mutations significantly compromised PHF8 catalytic function. PHF8 regulates cell survival in the zebrafish brain and jaw development, thus providing a potentially relevant biological context for understanding the clinical symptoms associated with PHF8 patients. Lastly, genetic and molecular evidence supports a model whereby PHF8 regulates zebrafish neuronal cell survival and jaw development in part by directly regulating the expression of the homeodomain transcription factor MSX1/MSXB, which functions downstream of multiple signalling and developmental pathways. Our findings indicate that an imbalance of histone methylation dynamics has a critical role in XLMR.
Subjects
Brain/embryology , Brain/enzymology , Head/embryology , Histone Demethylases/metabolism , Histones/metabolism , Transcription Factors/metabolism , Zebrafish Proteins/metabolism , Zebrafish/embryology , Animals , Biocatalysis , Brain/cytology , Catalytic Domain , Cell Cycle , Cell Line, Tumor , Cell Survival , DNA, Intergenic/genetics , Gene Expression Regulation , Histone Demethylases/genetics , Histones/chemistry , Homeodomain Proteins/genetics , Humans , Jaw/cytology , Jaw/embryology , Lysine/metabolism , Mental Retardation, X-Linked/enzymology , Mental Retardation, X-Linked/genetics , Methylation , Neurons/cytology , Neurons/enzymology , Promoter Regions, Genetic , Transcription Factors/deficiency , Transcription Factors/genetics , Transcription Initiation Site , Zebrafish/metabolism , Zebrafish Proteins/genetics
ABSTRACT
The use of DNA microarrays to identify nucleotide variation is almost 20 years old. A variety of improvements in probe design and experimental conditions have brought this technology to the point that single-nucleotide differences can be efficiently detected in unmixed samples, although developing reliable methods for detection of mixed sequences (e.g., heterozygotes) remains challenging. Surprisingly, a comprehensive study of the probe design parameters and experimental conditions that optimize discrimination of single-nucleotide polymorphisms (SNPs) has yet to be reported, so the limits of this technology remain uncertain. By targeting 24,549 SNPs that differ between two Saccharomyces cerevisiae strains, we studied the effect of SNPs on hybridization efficiency to DNA microarray probes of different lengths under different hybridization conditions. We found that the critical parameter for optimization of sequence discrimination is the relationship between probe melting temperature (T(m)) and the temperature at which the hybridization reaction is performed. This relationship can be exploited through the design of microarrays containing probes of equal T(m) by varying the length of probes. We demonstrate using such a microarray that we detect >90% homozygous SNPs and >80% heterozygous SNPs using the SNPScanner algorithm. The optimized design and experimental parameters determined in this study should guide DNA microarray designs for applications that require sequence discrimination such as mutation detection, genotyping of unmixed and mixed samples, and allele-specific gene expression. Moreover, designing microarray probes with optimized sensitivity to mismatches should increase the accuracy of standard microarray applications such as copy-number variation detection and gene expression analysis.
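The equal-T(m) design idea in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not say how T(m) was computed, so the Wallace rule is used here as a stand-in, and the function names, length bounds, and target temperature are hypothetical.

```python
from typing import Optional


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in deg C by the Wallace rule: 2*(A+T) + 4*(G+C).

    Assumption: this rule is only a crude stand-in for whatever T(m) model
    the study actually used.
    """
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


def equal_tm_probe(template: str, start: int, target_tm: int,
                   min_len: int = 15, max_len: int = 35) -> Optional[str]:
    """Return the shortest probe starting at `start` whose T(m) reaches target_tm.

    Probe length is varied so that AT-rich and GC-rich regions end up with
    similar melting temperatures. Returns None if no length in range works.
    """
    for length in range(min_len, max_len + 1):
        probe = template[start:start + length]
        if len(probe) < length:  # ran off the end of the template
            break
        if wallace_tm(probe) >= target_tm:
            return probe
    return None


# AT-rich sequence needs a longer probe than GC-rich sequence for the same target.
at_probe = equal_tm_probe("AT" * 30, 0, 60)
gc_probe = equal_tm_probe("GC" * 30, 0, 60)
print(len(at_probe), len(gc_probe))
```

Under these assumptions, the AT-rich template yields a longer probe than the GC-rich one, which is the length-for-T(m) trade the abstract describes.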
Subjects
DNA Probes/analysis , Oligonucleotide Array Sequence Analysis/methods , Polymorphism, Single Nucleotide , Sequence Analysis, DNA/methods , Algorithms , Nucleic Acid Denaturation , Saccharomyces cerevisiae/genetics
ABSTRACT
DNA methylation stabilizes developmentally programmed gene expression states. Aberrant methylation is associated with disease progression and is a common feature of cancer genomes. Presently, few methods enable quantitative, large-scale, single-base resolution mapping of DNA methylation states in desired regions of a complex mammalian genome. Here, we present an approach that combines array-based hybrid selection and massively parallel bisulfite sequencing to profile DNA methylation in genomic regions spanning hundreds of thousands of bases. This single molecule strategy enables methylation variable positions to be quantitatively examined with high sampling precision. Using bisulfite capture, we assessed methylation patterns across 324 randomly selected CpG islands (CGI) representing more than 25,000 CpG sites. A single lane of Illumina sequencing permitted methylation states to be definitively called for >90% of target sites. The accuracy of the hybrid-selection approach was verified using conventional bisulfite capillary sequencing of cloned PCR products amplified from a subset of the selected regions. This confirmed that even partially methylated states could be successfully called. A comparison of human primary and cancer cells revealed multiple differentially methylated regions. More than 25% of islands showed complex methylation patterns either with partial methylation states defining the entire CGI or with contrasting methylation states appearing in specific regional blocks within the island. We observed that transitions in methylation state often correlate with genomic landmarks, including transcriptional start sites and intron-exon junctions. Methylation, along with specific histone marks, was enriched in exonic regions, suggesting that chromatin states can foreshadow the content of mature mRNAs.
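How per-site methylation states, including partial ones, can be called from bisulfite reads may be easier to follow with a small sketch. It rests on the standard bisulfite principle (unmethylated C reads as T, methylated C stays C) and is not the authors' pipeline: the function name, the ungapped forward-strand read format, and the `min_depth` parameter are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple


def methylation_levels(reference: str,
                       reads: List[Tuple[int, str]],
                       min_depth: int = 1) -> Dict[int, float]:
    """Estimate the methylated fraction at each CpG cytosine.

    reference: unconverted forward-strand sequence.
    reads: (start, sequence) pairs, assumed ungapped and forward-strand.
    A 'C' over a CpG counts as methylated, a 'T' as unmethylated;
    any other base is ignored.
    """
    cpg_sites = [i for i in range(len(reference) - 1)
                 if reference[i:i + 2] == "CG"]
    methylated: Counter = Counter()
    covered: Counter = Counter()
    for start, seq in reads:
        for pos in cpg_sites:
            offset = pos - start
            if 0 <= offset < len(seq):
                base = seq[offset]
                if base == "C":
                    methylated[pos] += 1
                    covered[pos] += 1
                elif base == "T":
                    covered[pos] += 1
    depth_floor = max(min_depth, 1)  # never divide by zero
    return {pos: methylated[pos] / covered[pos]
            for pos in cpg_sites if covered[pos] >= depth_floor}


reference = "ACGTTCGA"  # CpG cytosines at positions 1 and 5
reads = [(0, "ACGTTTGA"), (0, "ATGTTTGA"), (0, "ACGTTCGA")]
print(methylation_levels(reference, reads))
```

Because each read is counted as one molecule, a site covered by both C and T reads comes out as a fraction between 0 and 1, which is how a partially methylated state would show up in this simplified model.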
Subjects
CpG Islands/genetics , DNA Methylation , Oligonucleotide Array Sequence Analysis/methods , Sequence Analysis, DNA/methods , Sulfites/chemistry , Breast Neoplasms/genetics , Cell Line, Tumor , Female , Gene Expression Profiling , Genome, Human , Humans , Polymorphism, Single Nucleotide , Skin Neoplasms/genetics
ABSTRACT
Thanks to the results of the multiple completed and ongoing genome sequencing projects and to the newly available recombination-based cloning techniques, it is now possible to build gene repositories with no precedent in their composition, formatting, and potential. This new type of gene repository is necessary to address the challenges imposed by the post-genomic era, i.e., experimentation on a genome-wide scale. We are building the FLEXGene (Full Length EXpression-ready) repository. This unique resource will contain clones representing the complete ORFeome of different organisms, including Homo sapiens as well as several pathogens and model organisms. It will consist of a comprehensive, characterized (sequence-verified), and arrayed gene repository. This resource will allow full exploitation of the genomic information by enabling genome-wide scale experimentation at the level of functional/phenotypic assays as well as at the level of protein expression, purification, and analysis. Here we describe the rationale and construction of this resource and focus on the data obtained from the Saccharomyces cerevisiae project.
Subjects
Cloning, Molecular , Genome , Open Reading Frames , Proteomics , Animals , Computational Biology , Databases, Genetic , Genome, Human , Genomics , Humans
ABSTRACT
Immune cells are somewhat unique in that activation responses can alter quantitative phenotypes upwards of 100,000-fold. To date little is known about the metabolic adaptations necessary to mount such dramatic phenotypic shifts. Screening for novel regulators of macrophage activation, we found nonprotein kinases of glucose metabolism among the most enriched classes of candidate immune modulators. We find that one of these, the carbohydrate kinase-like protein CARKL, is rapidly downregulated in vitro and in vivo upon LPS stimulation in both mice and humans. Interestingly, CARKL catalyzes an orphan reaction in the pentose phosphate pathway, refocusing cellular metabolism to a high-redox state upon physiological or artificial downregulation. We find that CARKL-dependent metabolic reprogramming is required for proper M1- and M2-like macrophage polarization and uncover a rate-limiting requirement for appropriate glucose flux in macrophage polarization.
Subjects
Cell Polarity , Glucose/metabolism , Macrophages/enzymology , Phosphotransferases (Alcohol Group Acceptor)/metabolism , Transcription Factors/metabolism , Amino Acid Sequence , Animals , Carbohydrate Metabolism , Cell Line , Double-Blind Method , Down-Regulation , Endotoxemia/enzymology , Endotoxemia/immunology , Energy Metabolism , Gene Expression Regulation, Enzymologic , Humans , Interleukin-6/genetics , Interleukin-6/metabolism , Lipopolysaccharides/pharmacology , Macrophages/immunology , Macrophages/metabolism , Macrophages/physiology , Male , Mice , Mice, Inbred C57BL , Models, Molecular , Phenotype , Phosphotransferases (Alcohol Group Acceptor)/chemistry , Phosphotransferases (Alcohol Group Acceptor)/genetics , Protein Structure, Tertiary , Receptors, Immunologic/genetics , Receptors, Immunologic/metabolism , Transcription Factors/chemistry , Transcription Factors/genetics , Tumor Necrosis Factor-alpha/genetics , Tumor Necrosis Factor-alpha/metabolism
ABSTRACT
BACKGROUND: The classical candidate-gene approach has failed to identify novel breast cancer susceptibility genes. Nowadays, massive parallel sequencing technology allows the development of studies unaffordable a few years ago. However, analysis protocols are not yet sufficiently developed to extract all information from the huge amount of data obtained. METHODOLOGY/PRINCIPAL FINDINGS: In this study, we performed high throughput sequencing in two regions located on chromosomes 3 and 6, recently identified by linkage studies by our group as candidate regions for harbouring breast cancer susceptibility genes. In order to enrich for the coding regions of all described genes located in both candidate regions, a hybrid-selection method on tiling microarrays was performed. CONCLUSIONS/SIGNIFICANCE: We developed an analysis pipeline based on SOAP aligner to identify candidate variants with a high real positive confirmation rate (0.89), with which we identified eight variants considered candidates for functional studies. The results suggest that the present strategy might be a valid second step for identifying high penetrance genes.
Subjects
Breast Neoplasms/genetics , Genetic Linkage , Sequence Analysis, DNA/methods , Breast Neoplasms/epidemiology , Chromosomes, Human, Pair 3 , Chromosomes, Human, Pair 6 , Family Health , Female , Genetic Predisposition to Disease , Genetic Variation , Humans , Penetrance , Polymorphism, Single Nucleotide
ABSTRACT
It is now possible to perform whole-genome shotgun sequencing as well as capture of specific genomic regions for extinct organisms. However, targeted resequencing of large parts of nuclear genomes has yet to be demonstrated for ancient DNA. Here we show that hybridization capture on microarrays can successfully recover more than a megabase of target regions from Neandertal DNA even in the presence of approximately 99.8% microbial DNA. Using this approach, we have sequenced approximately 14,000 protein-coding positions inferred to have changed on the human lineage since the last common ancestor shared with chimpanzees. By generating the sequence of one Neandertal and 50 present-day humans at these positions, we have identified 88 amino acid substitutions that have become fixed in humans since our divergence from the Neandertals.
Subjects
Genome, Human , Genome , Hominidae/genetics , Oligonucleotide Array Sequence Analysis/methods , Sequence Analysis, DNA/methods , Amino Acid Substitution , Animals , Fossils , Genes , Humans , Nucleic Acid Hybridization , Pan troglodytes/genetics , Proteins/chemistry , Proteins/genetics , Sequence Alignment
ABSTRACT
Complementary techniques that deepen information content and minimize reagent costs are required to realize the full potential of massively parallel sequencing. Here, we describe a resequencing approach that directs focus to genomic regions of high interest by combining hybridization-based purification of multi-megabase regions with sequencing on the Illumina Genome Analyzer (GA). The capture matrix is created by a microarray on which probes can be programmed as desired to target any non-repeat portion of the genome, while the method requires only a basic familiarity with microarray hybridization. We present a detailed protocol suitable for 1-2 µg of input genomic DNA and highlight key design tips with which high specificity (>65% of reads stem from enriched exons) and high sensitivity (98% targeted base pair coverage) can be achieved. We have successfully applied this to the enrichment of coding regions, in both human and mouse, ranging from 0.5 to 4 Mb in length. From genomic DNA library production to base-called sequences, this procedure takes approximately 9-10 days, inclusive of array captures and one Illumina flow cell run.
Subjects
Oligonucleotide Array Sequence Analysis/methods , Sequence Analysis, DNA/methods , Animals , Base Sequence , DNA Primers , Gene Library , Genomics/instrumentation , Genomics/methods , Genomics/statistics & numerical data , Humans , Mice , Nucleic Acid Hybridization/methods , Oligonucleotide Array Sequence Analysis/instrumentation , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Polymerase Chain Reaction , Sensitivity and Specificity , Sequence Analysis, DNA/instrumentation , Sequence Analysis, DNA/statistics & numerical data
ABSTRACT
The most widely used method for detecting genome-wide protein-DNA interactions is chromatin immunoprecipitation on tiling microarrays, commonly known as ChIP-chip. Here, we conducted the first objective analysis of tiling array platforms, amplification procedures, and signal detection algorithms in a simulated ChIP-chip experiment. Mixtures of human genomic DNA and "spike-ins" comprised of nearly 100 human sequences at various concentrations were hybridized to four tiling array platforms by eight independent groups. Blind to the number of spike-ins, their locations, and the range of concentrations, each group made predictions of the spike-in locations. We found that microarray platform choice is not the primary determinant of overall performance. In fact, variation in performance between labs, protocols, and algorithms within the same array platform was greater than the variation in performance between array platforms. However, each array platform had unique performance characteristics that varied with tiling resolution and the number of replicates, which have implications for cost versus detection power. Long oligonucleotide arrays were slightly more sensitive at detecting very low enrichment. On all platforms, simple sequence repeats and genome redundancy tended to result in false positives. LM-PCR and WGA, the most popular sample amplification techniques, reproduced relative enrichment levels with high fidelity. Performance among signal detection algorithms was heavily dependent on array platform. The spike-in DNA samples and the data presented here provide a stable benchmark against which future ChIP platforms, protocol improvements, and analysis methods can be evaluated.
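One way to score spike-in predictions of the kind described in this abstract is a ROC curve, sketched below. This is an illustration only and not the benchmark's actual scoring code: the function names, the per-region score and label inputs, and the toy values are assumptions, and tied scores are not treated specially.

```python
from typing import List, Tuple


def roc_points(scores: List[float], labels: List[bool]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) points.

    scores: detection score per candidate region (higher = more confident).
    labels: True where the region is a real spike-in.
    Assumes at least one True and one False label and distinct scores.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, is_spike in ranked:
        if is_spike:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points: List[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy example: four candidate regions, two of them true spike-ins.
pts = roc_points([0.9, 0.8, 0.7, 0.6], [True, False, True, False])
print(auc(pts))
```

Computing one such curve per lab, protocol, and algorithm would allow the within-platform and between-platform variation mentioned in the abstract to be compared on a common scale.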
Subjects
Chromatin Immunoprecipitation/methods , Oligonucleotide Array Sequence Analysis/methods , Algorithms , Chromosome Aberrations , DNA/chemistry , Genome, Human , Humans , Oligonucleotide Probes , Polymerase Chain Reaction , ROC Curve , Reproducibility of Results , Tandem Repeat Sequences
ABSTRACT
The rapid development of new technologies for the high throughput (HT) study of proteins has increased the demand for comprehensive plasmid clone resources that support protein expression. These clones must be full-length, sequence-verified and in a flexible format. The generation of these resources requires automated pipelines supported by software management systems. Although the availability of clone resources is growing, current collections are either not complete or not fully sequence-verified. We report an automated pipeline, supported by several software applications that enabled the construction of the first comprehensive sequence-verified plasmid clone resource for more than 96% of protein coding sequences of the genome of F. tularensis, a highly virulent human pathogen and the causative agent of tularemia. This clone resource was applied to a HT protein purification pipeline successfully producing recombinant proteins for 72% of the genes. These methods and resources represent significant technological steps towards exploiting the genomic information of F. tularensis in discovery applications.
Subjects
Bacterial Proteins/genetics , DNA, Complementary/genetics , Francisella tularensis/genetics , Genes/genetics , Genome, Bacterial , Open Reading Frames/genetics , Tularemia/genetics , Bacterial Proteins/isolation & purification , Bacterial Proteins/metabolism , Cloning, Molecular , Francisella tularensis/growth & development , Humans , Tularemia/microbiology , Tularemia/pathology
ABSTRACT
Kinases catalyze the phosphorylation of proteins, lipids, sugars, nucleosides, and other important cellular metabolites and play key regulatory roles in all aspects of eukaryotic cell physiology. Here, we describe the mining of public databases to collect the sequence information of all identified human kinase genes and the cloning of the corresponding ORFs. We identified 663 genes, 511 encoding protein kinases, and 152 encoding nonprotein kinases. We describe the successful cloning and sequence verification of 270 of these genes. Subcloning of this gene set in mammalian expression vectors and their use in high-throughput cell-based screens allowed the validation of the clones at the level of expression and the identification of previously uncharacterized modulators of the survivin promoter. Moreover, expressions of the kinase genes in bacteria, followed by autophosphorylation assays, identified 21 protein kinases that showed autocatalytic activity. The work described here will facilitate the functional assaying of this important gene family in phenotypic screens and their use in biochemical and structural studies.
Subjects
Cloning, Molecular , Computational Biology , Databases, Genetic , Protein Kinases/genetics , Protein Kinases/metabolism , Animals , Biological Assay , Catalysis , Cell Line , Cells/metabolism , Gene Expression Regulation , Genetic Vectors/genetics , Humans , Molecular Sequence Data , Phenotype , Phosphorylation , Plasmids/genetics , Promoter Regions, Genetic/genetics , Protein Kinases/isolation & purification , Recombinant Proteins/genetics , Recombinant Proteins/isolation & purification , Recombinant Proteins/metabolism , Reproducibility of Results
ABSTRACT
Cyclin-dependent kinases (CDKs) play a key role in regulating the cell cycle. The cyclins, their activating agents, and endogenous CDK inhibitors are frequently mutated in human cancers, making CDKs interesting targets for cancer chemotherapy. Our aim is the discovery of selective CDK4/cyclin D1 inhibitors. An ATP-competitive pyrazolopyrimidinone CDK inhibitor was identified by HTS and docked into a CDK4 homology model. The resulting binding model was consistent with available SAR and was validated by a subsequent CDK2/inhibitor crystal structure. An iterative cycle of chemistry and modeling led to a 70-fold improvement in potency. Small substituent changes resulted in large CDK4/CDK2 selectivity changes. The modeling revealed that selectivity is largely due to hydrogen-bonded interactions with only two kinase residues. This demonstrates that small differences between enzymes can efficiently be exploited in the design of selective inhibitors.
Subjects
CDC2-CDC28 Kinases/antagonists & inhibitors , Cyclin A/antagonists & inhibitors , Cyclin D1/antagonists & inhibitors , Cyclin-Dependent Kinases/antagonists & inhibitors , Enzyme Inhibitors/pharmacology , Proto-Oncogene Proteins/antagonists & inhibitors , Pyrimidinones/pharmacology , Amino Acid Sequence , CDC2-CDC28 Kinases/chemistry , Cyclin-Dependent Kinase 2 , Cyclin-Dependent Kinase 4 , Cyclin-Dependent Kinases/chemistry , Drug Evaluation, Preclinical , Enzyme Inhibitors/chemistry , Hydrogen Bonding , Models, Molecular , Molecular Sequence Data , Proto-Oncogene Proteins/chemistry , Pyrimidinones/chemistry , Sequence Homology, Amino Acid , Substrate Specificity
ABSTRACT
Sixty-three proteins of Pseudomonas aeruginosa in the size range of 18-159 kDa were tested for expression in a bacterial cell-free system. Fifty-one of the 63 proteins could be expressed and partially purified under denaturing conditions. Most of the expressed proteins showed yields greater than 500 ng after a single affinity purification step from 50 µl in vitro protein synthesis reactions. The in vitro protein expression plus purification in a 96-well format and analysis of the proteins by SDS-PAGE were performed by one person in 4 h. A comparison of in vitro and in vivo expression suggests that despite lower yields and less pure protein preparations, bacterial in vitro protein expression coupled with single-step affinity purification offers a rapid, efficient alternative for the high-throughput screening of clones for protein expression and solubility.
Subjects
Bacterial Proteins/chemistry , Bacterial Proteins/isolation & purification , Cell-Free System/chemistry , Escherichia coli/chemistry , Gene Expression , Pseudomonas aeruginosa/chemistry , Bacterial Proteins/genetics , Chromatography, Affinity , Electrophoresis, Polyacrylamide Gel , Escherichia coli/genetics , Pseudomonas aeruginosa/genetics
ABSTRACT
Pseudomonas aeruginosa, a common inhabitant of soil and water, is an opportunistic pathogen of growing clinical relevance. Its genome, one of the largest among bacteria [5570 open reading frames (ORFs)] approaches that of simple eukaryotes. We have constructed a comprehensive gene collection for this organism utilizing the annotated genome of P. aeruginosa PA01 and a highly automated and laboratory information management system (LIMS)-supported production line. All the individual ORFs have been successfully PCR-amplified and cloned into a recombination-based cloning system. We have isolated and archived four independent isolates of each individual ORF. Full sequence analysis of the first isolate for one-third of the ORFs in the collection has been completed. We used two sets of genes from this repository for high-throughput expression and purification of recombinant proteins in different systems. The purified proteins have been used to set up biochemical and immunological assays directed towards characterization of histidine kinases and identification of bacterial proteins involved in the immune response of cystic fibrosis patients. This gene repository provides a powerful tool for proteome- and genome-scale research of this organism, and the strategies adopted to generate this repository serve as a model for building clone sets for other bacteria.
Subjects
Bacterial Proteins , Computational Biology , Genes, Bacterial/physiology , Genome, Bacterial , Open Reading Frames/physiology , Pseudomonas aeruginosa/chemistry , Pseudomonas aeruginosa/genetics , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Cloning, Molecular , Computational Biology/methods , DNA, Bacterial , DNA, Complementary/genetics , DNA, Complementary/metabolism , Gene Expression , Genetic Vectors , Polymerase Chain Reaction , Recombinant Proteins/genetics , Recombinant Proteins/metabolism , Sequence Analysis, DNA
ABSTRACT
Large-scale functional genomics studies for malaria vaccine and drug development will depend on the generation of molecular tools to study protein expression. We examined the feasibility of a high-throughput cloning approach using the Gateway system to create a large set of expression clones encoding Plasmodium falciparum single-exon genes. Master clones and their ORFs were transferred en masse to multiple expression vectors. Target genes (n = 303) were selected using specific sets of criteria, including stage expression and secondary structure. Upon screening four colonies per capture reaction, we achieved 84% cloning efficiency. The genes were subcloned in parallel into three expression vectors: a DNA vaccine vector and two protein expression vectors. These transfers yielded a 100% success rate without any observed recombination based on single colony screening. The functional expression of 95 genes was evaluated in mice with DNA vaccine constructs to generate antibody against various stages of the parasite. Of these, 19 induced antibody titers against the erythrocytic stages and three against sporozoite stages. We have overcome the potential limitation of producing large P. falciparum clone sets in multiple expression vectors. This approach represents a powerful technique for the production of molecular reagents for genome-wide functional analysis of the P. falciparum genome and will provide a resource for the malaria research community, distributed through public repositories.