1.
Proc Natl Acad Sci U S A ; 118(7)2021 02 16.
Article in English | MEDLINE | ID: mdl-33579822

ABSTRACT

Polycistronic gene expression, common in prokaryotes, was thought to be extremely rare in eukaryotes. The development of long-read sequencing of full-length transcript isomers (Iso-Seq) has facilitated a reexamination of that dogma. Using Iso-Seq, we discovered hundreds of examples of polycistronic expression of nuclear genes in two divergent species of green algae: Chlamydomonas reinhardtii and Chromochloris zofingiensis. Here, we employ a range of independent approaches to validate that multiple proteins are translated from a common transcript for hundreds of loci. A chromatin immunoprecipitation analysis using trimethylation of lysine 4 on histone H3 marks confirmed that transcription begins exclusively at the upstream gene. Quantification of polyadenylated [poly(A)] tails and poly(A) signal sequences confirmed that transcription ends exclusively after the downstream gene. Coexpression analysis found nearly perfect correlation for open reading frames (ORFs) within polycistronic loci, consistent with expression in a shared transcript. For many polycistronic loci, terminal peptides from both ORFs were identified from proteomics datasets, consistent with independent translation. Synthetic polycistronic gene pairs were transcribed and translated in vitro to recapitulate the production of two distinct proteins from a common transcript. The relative abundance of these two proteins can be modified by altering the Kozak-like sequence of the upstream gene. Replacement of the ORFs with selectable markers or reporters allows production of such heterologous proteins, speaking to utility in synthetic biology approaches. Conservation of a significant number of polycistronic gene pairs between C. reinhardtii, C. zofingiensis, and five other species suggests that this mechanism may be evolutionarily ancient and biologically important in the green algal lineage.


Subjects
Chlorophyta/genetics; Gene Expression Regulation, Bacterial; Gene Expression Regulation, Plant; Plant Proteins/genetics; Open Reading Frames; Plant Proteins/metabolism; RNA, Messenger/genetics; Transcription, Genetic
2.
Nucleic Acids Res ; 49(D1): D575-D588, 2021 01 08.
Article in English | MEDLINE | ID: mdl-32986834

ABSTRACT

For over 10 years, ModelSEED has been a primary resource for the construction of draft genome-scale metabolic models based on annotated microbial or plant genomes. Now being released, the biochemistry database serves as the foundation of biochemical data underlying ModelSEED and KBase. The biochemistry database embodies several properties that, taken together, distinguish it from other published biochemistry resources by: (i) including compartmentalization, transport reactions, charged molecules and proton balancing on reactions; (ii) being extensible by the user community, with all data stored in GitHub; and (iii) being designed as a biochemical 'Rosetta Stone' to facilitate comparison and integration of annotations from many different tools and databases. The database was constructed by combining chemical data from many resources, applying standard transformations, identifying redundancies and computing thermodynamic properties. The ModelSEED biochemistry is continually tested using flux balance analysis to ensure the biochemical network is modeling-ready and capable of simulating diverse phenotypes. Ontologies can be designed to aid in comparing and reconciling metabolic reconstructions that differ in how they represent various metabolic pathways. ModelSEED now includes 33,978 compounds and 36,645 reactions, available as a set of extensible files on GitHub, and available to search at https://modelseed.org/biochem and KBase.
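
The mass and proton balancing mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the ModelSEED implementation: the compound table, the identifiers and the function name `imbalance` are hypothetical, and the data layout (a formula dictionary plus a net charge per compound) is an assumption made only for this example. ATP hydrolysis is used as the test reaction.

```python
from collections import Counter

# Hypothetical toy compound table: elemental formula and net charge.
# Identifiers and layout are illustrative, not ModelSEED's own schema.
COMPOUNDS = {
    "atp": ({"C": 10, "H": 12, "N": 5, "O": 13, "P": 3}, -4),
    "h2o": ({"H": 2, "O": 1}, 0),
    "adp": ({"C": 10, "H": 12, "N": 5, "O": 10, "P": 2}, -3),
    "pi":  ({"H": 1, "O": 4, "P": 1}, -2),
    "h":   ({"H": 1}, 1),
}

# Stoichiometry: negative coefficients are reactants, positive are products.
ATP_HYDROLYSIS = {"atp": -1, "h2o": -1, "adp": 1, "pi": 1, "h": 1}


def imbalance(reaction, compounds):
    """Return (element_imbalance, charge_imbalance) for one reaction.

    An empty dict and a charge of 0 mean the reaction is balanced.
    """
    elements = Counter()
    charge = 0
    for compound_id, coeff in reaction.items():
        formula, net_charge = compounds[compound_id]
        for element, count in formula.items():
            elements[element] += coeff * count
        charge += coeff * net_charge
    # Keep only elements whose net count is non-zero.
    return {el: n for el, n in elements.items() if n != 0}, charge


print(imbalance(ATP_HYDROLYSIS, COMPOUNDS))  # ({}, 0)

# Dropping the proton leaves the reaction one H and one charge unit short.
unbalanced = {k: v for k, v in ATP_HYDROLYSIS.items() if k != "h"}
print(imbalance(unbalanced, COMPOUNDS))  # ({'H': -1}, -1)
```

A check of this kind flags reactions that need a proton added or removed before the network is used for flux balance analysis.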


Subjects
Bacteria/metabolism; Databases, Factual; Fungi/metabolism; Metabolic Networks and Pathways; Molecular Sequence Annotation; Plants/metabolism; Bacteria/genetics; Genome, Bacterial; Thermodynamics
4.
Nucleic Acids Res ; 38(6): 1997-2005, 2010 Apr.
Article in English | MEDLINE | ID: mdl-20015968

ABSTRACT

The Escherichia coli McrA protein, a putative C(5)-methylcytosine/C(5)-hydroxyl methylcytosine-specific nuclease, binds DNA with symmetrically methylated HpaII sequences (Cm5CGG), but its precise recognition sequence remains undefined. To determine McrA's binding specificity, we cloned and expressed recombinant McrA with a C-terminal StrepII tag (rMcrA-S) to facilitate protein purification and affinity capture of human DNA fragments with m5C residues. Sequence analysis of a subset of these fragments and electrophoretic mobility shift assays with model methylated and unmethylated oligonucleotides suggest that N(Y > R) m5CGR is the canonical binding site for rMcrA-S. In addition to binding HpaII-methylated double-stranded DNA, rMcrA-S binds DNA containing a single, hemimethylated HpaII site; however, it does not bind if A, C, T or U is placed across from the m5C residue, but does if I is opposite the m5C. These results provide the first systematic analysis of McrA's in vitro binding specificity.


Subjects
CpG Islands; DNA Methylation; DNA Restriction Enzymes/metabolism; Escherichia coli Proteins/metabolism; 5-Methylcytosine/analysis; Base Sequence; Binding Sites; DNA/chemistry; DNA/metabolism; Humans
5.
J Neurosci ; 27(25): 6729-39, 2007 Jun 20.
Article in English | MEDLINE | ID: mdl-17581960

ABSTRACT

The repressor element 1 (RE1) silencing transcription factor (REST) helps preserve the identity of nervous tissue by silencing neuronal genes in non-neural tissues. Moreover, in an epithelial model of tumorigenesis, loss of REST function is associated with loss of adhesion, suggesting the aberrant expression of REST-controlled genes encoding this property. To date, no adhesion molecules under REST control have been identified. Here, we used serial analysis of chromatin occupancy to perform genome-wide identification of REST-occupied target sequences (RE1 sites) in a kidney cell line. We discovered novel REST-binding motifs and found that the number of RE1 sites far exceeded previous estimates. A large family of targets encoding adhesion proteins was identified, as were genes encoding signature proteins of neuroendocrine tumors. Unexpectedly, genes considered exclusively non-neuronal also contained an RE1 motif and were expressed in neurons. This supports the model that REST binding is a critical determinant of neuronal phenotype.


Subjects
Gene Regulatory Networks/physiology; Neurons/physiology; Repressor Proteins/genetics; Repressor Proteins/metabolism; Transcription Factors/genetics; Transcription Factors/metabolism; Amino Acid Motifs; Animals; Binding Sites/physiology; Cell Line; Gene Expression Profiling; Mice; Neurons/metabolism; Repressor Proteins/biosynthesis; Transcription Factors/biosynthesis
6.
Nucleic Acids Res ; 34(8): 2238-46, 2006.
Article in English | MEDLINE | ID: mdl-16670430

ABSTRACT

Transcription factor binding sites (TFBSs) are short DNA sequences interacting with transcription factors (TFs), which regulate gene expression. Due to the relatively short length of such binding sites, it is largely unclear how the specificity of protein-DNA interaction is achieved. Here, we have performed a genome-wide analysis of TFBS-like sequences for the transcriptional repressor, RE1 Silencing Transcription Factor (REST), as well as for several other representative mammalian TFs (c-myc, p53, HNF-1 and CREB). We find a nonrandom distribution of inexact sites for these TFs, referred to as highly-degenerate TFBSs, that are enriched around the cognate binding sites. Comparisons among human, mouse and rat orthologous promoters reveal that these highly-degenerate sites are conserved significantly more than expected by random chance, suggesting their positive selection during evolution. We propose that this arrangement provides a favorable genomic landscape for functional target site selection.
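
To make the notion of inexact, TFBS-like sites concrete, here is a minimal Python sketch that scans one strand of a sequence for windows within a given number of mismatches of a consensus. It is an illustration only, not the authors' method: the function names are hypothetical, the toy sequence is invented, and the 8-bp palindromic cAMP-response element TGACGTCA is used simply as a familiar consensus.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_sites(seq, consensus, max_mismatches):
    """Return (start, window, mismatches) for every window on the given
    strand that differs from the consensus at no more than
    max_mismatches positions. Start positions are 0-based."""
    seq = seq.upper()
    k = len(consensus)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        d = hamming(window, consensus)
        if d <= max_mismatches:
            hits.append((i, window, d))
    return hits


# Invented toy sequence: one exact site and one site with a single mismatch.
TOY_SEQUENCE = "AATGACGTCATTTGACGTAA"
print(find_sites(TOY_SEQUENCE, "TGACGTCA", 1))
# [(2, 'TGACGTCA', 0), (12, 'TGACGTAA', 1)]
```

Raising `max_mismatches` admits progressively more degenerate sites; a real analysis would also scan the reverse strand and compare counts against a background expectation.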


Subjects
Promoter Regions, Genetic; Transcription Factors/metabolism; Animals; Base Sequence; Binding Sites; Conserved Sequence; Genomics; Humans; Mice; Rats; Repressor Proteins/metabolism
7.
Genet Eng (N Y) ; 28: 159-73, 2007.
Article in English | MEDLINE | ID: mdl-17153938

ABSTRACT

Because paired-end genomic signature tags are sequence-based, they have the potential to become an alternative to tiled microarray hybridization as a method for genome-wide localization of transcription factors and other sequence-specific DNA binding proteins. As outlined here, the method can also be used for global analysis of DNA methylation. One advantage of this approach is the ability to easily switch between different genome types without having to fabricate a new microarray for each and every DNA type. However, the method does have some disadvantages. Among the most rate-limiting steps of our PE-GST protocol are the need to concatemerize the diTAGs, size fractionate them and then clone them prior to sequencing. This is usually followed by additional steps to amplify and size select for long (≥ 500) concatemer inserts prior to sequencing. These time-consuming steps are important for standard DNA sequencing as they increase efficiency approximately 20-30-fold since each amplified concatemer can now provide information on multiple tags; the limitation on data acquisition is read length during sequencing. However, the development of new sequencing methods such as Life Sciences' 454 new nanotechnology-based sequencing instrument (41) could increase tag sequencing efficiency by several orders of magnitude (≥ 100,000 diTAG reads/run), which is sufficient to provide in-depth global analysis of all ChIP PE-GSTs in a single run. This is because the lengths of our paired-end diTAGs (approximately 60 bp) fall well within the region of high accuracy for read lengths on this instrument. In principle, sequence analysis of diTAGs could begin as soon as they are generated, thereby completely bypassing the need for the concatemerization, sizing, downstream cloning steps and sequencing template purification.
In addition, our protocol places any one of several unique four-base-long nucleotide sequences, such as GATC, between each and every diTAG pair, which could be used to help the instrument's software keep base register and also provide a well-located peak height indicator in the middle of every sequence run. This additional feature could permit multiplexing of the data by simultaneous sequencing of several pooled libraries if each used a different linker sequence during diTAG formation (Figure 4).


Subjects
Genomics/methods; Base Sequence; Chromatin Immunoprecipitation; CpG Islands; DNA/chemistry; DNA/genetics; DNA Methylation; DNA Restriction Enzymes; Epigenesis, Genetic; Genetic Engineering; Genome; Molecular Sequence Data
8.
BMC Bioinformatics ; 6: 284, 2005 Nov 30.
Article in English | MEDLINE | ID: mdl-16318626

ABSTRACT

BACKGROUND: Families of homologous enzymes evolved from common progenitors. The availability of multiple sequences representing each activity presents an opportunity for extracting information specifying the functionality of individual homologs. We present a straightforward method for the identification of residues likely to determine class-specific functionality in which multiple sequence alignments are converted to an annotated graphical form by the Conserved Property Difference Locator (CPDL) program. RESULTS: Three test cases, each comprising two groups of functionally distinct homologs, are presented. Of the test cases, one is a membrane and two are soluble enzyme families. The desaturase/hydroxylase data were used to design and test the CPDL algorithm because a comparative sequence approach had been successfully applied to manipulate the specificity of these enzymes. The other two cases, ATP/GTP cyclases and MurD/MurE synthases, were chosen because they are well characterized structurally and biochemically. For the desaturase/hydroxylase enzymes, the ATP/GTP cyclases and the MurD/MurE synthases, groups of 8 (of approximately 400), 4 (of approximately 150) and 10 (of >400) residues of interest, respectively, were identified that contain empirically defined specificity-determining positions. CONCLUSION: CPDL consistently identifies positions near enzyme active sites that include those predicted from structural and/or biochemical studies to be important for specificity and/or function. This suggests that CPDL will have broad utility for the identification of potential class-determining residues based on multiple sequence analysis of groups of homologous proteins. Because the method is sequence-based rather than structure-based, it is equally well suited for designing structure-function experiments to investigate membrane and soluble proteins.
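
The core idea of locating class-determining positions in an alignment can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the CPDL algorithm itself: it requires strict residue identity within each group, whereas CPDL works with conserved properties, and the function name and toy alignments are hypothetical.

```python
def class_specific_positions(group_a, group_b):
    """Return (position, residue_a, residue_b) for alignment columns that
    are fully conserved within each group but differ between the groups.

    Positions are 1-based. All sequences are assumed to come from the same
    alignment and to have equal length; columns containing a gap ('-')
    are skipped.
    """
    length = len(group_a[0])
    hits = []
    for i in range(length):
        column_a = {seq[i] for seq in group_a}
        column_b = {seq[i] for seq in group_b}
        if "-" in column_a or "-" in column_b:
            continue
        if len(column_a) == 1 and len(column_b) == 1 and column_a != column_b:
            hits.append((i + 1, next(iter(column_a)), next(iter(column_b))))
    return hits


# Invented toy alignment of two functional classes.
CLASS_A = ["MKTAL", "MKTAL", "MRTAL"]
CLASS_B = ["MKSAV", "MKSAV", "MRSAV"]
print(class_specific_positions(CLASS_A, CLASS_B))
# [(3, 'T', 'S'), (5, 'L', 'V')]
```

Position 2 varies within both groups (K or R) and is therefore not reported, while positions 3 and 5 separate the two classes cleanly.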


Subjects
Computational Biology/methods; Genomics/methods; Algorithms; Amino Acid Sequence; Animals; Arabidopsis/enzymology; Binding Sites; Dipeptides/chemistry; Models, Biological; Models, Molecular; Molecular Sequence Data; Peptidoglycan/chemistry; Programming Languages; Protein Structure, Tertiary; Software; Structure-Activity Relationship; Uridine Diphosphate N-Acetylmuramic Acid/chemistry
9.
PLoS One ; 9(11): e113492, 2014.
Article in English | MEDLINE | ID: mdl-25415302

ABSTRACT

The ability of p53 to elicit stress-specific and cell type-specific responses is well recognized, but how that specificity is established remains to be defined. Whether, upon activation, p53 binds to its genomic targets in a cell type- and stress type-dependent manner is still an open question. Here we show that p53 binding to the human genome is selective and cell context-dependent. We mapped the genomic binding sites for the endogenous wild-type p53 protein in the human cancer cell line HCT116 and compared them to those we previously determined in the normal cell line IMR90. We report distinct p53 genome-wide binding landscapes in two different cell lines, analyzed under the same treatment and experimental conditions, using the same ChIP-seq approach. This is evidence for cell context-dependent p53 genomic binding. The observed differences affect the distribution of p53 binding sites with respect to major genomic and epigenomic elements (promoter regions, CpG islands and repeats). We correlated the positions of the high-confidence p53 ChIP-seq peaks with the annotated human repeats (UCSC Human Genome Browser) and observed both common and cell line-specific trends. In HCT116, p53 binding was specifically enriched at LINE repeats, compared to IMR90 cells. The p53 genome-wide binding patterns in HCT116 and IMR90 likely reflect the different epigenetic landscapes in these two cell lines, resulting from cancer-associated changes (accumulated in HCT116) superimposed on tissue-specific differences (HCT116 is of epithelial origin, whereas IMR90 is of mesenchymal origin). Our data support the model for p53 binding to the human genome in a highly selective manner, mobilizing distinct sets of genes, contributing to distinct pathways.


Subjects
DNA/metabolism; Genome, Human; Long Interspersed Nucleotide Elements; Tumor Suppressor Protein p53/metabolism; Binding Sites; Cell Line; Chromatin Immunoprecipitation; Epigenesis, Genetic; HCT116 Cells; Humans; Organ Specificity; Tumor Suppressor Protein p53/genetics
10.
Cell Cycle ; 10(24): 4237-49, 2011 Dec 15.
Article in English | MEDLINE | ID: mdl-22127205

ABSTRACT

We report here a genome-wide analysis of the tumor suppressor p53 binding sites in normal human cells. A total of 743 high-confidence ChIP-seq peaks representing putative genomic binding sites were identified in normal IMR90 fibroblasts using a reference chromatin sample. More than 40% were located within 2 kb of a transcription start site (TSS), a distribution similar to that documented for individually studied, functional p53 binding sites and, to date, not observed by previous p53 genome-wide studies. Nearly half of the high-confidence binding sites in the IMR90 cells reside in CpG islands, in marked contrast to sites reported in cancer-derived cells. The distinct genomic features of the IMR90 binding sites do not reflect a distinct preference for specific sequences, since the de novo developed p53 motif based on our study is similar to those reported by genome-wide studies of cancer cells. More likely, the different chromatin landscape in normal, compared with cancer-derived cells, influences p53 binding via modulating availability of the sites. We compared the IMR90 ChIP-seq peaks to the recently published IMR90 methylome and demonstrated that they are enriched at hypomethylated DNA. Our study represents the first genome-wide, de novo mapping of p53 binding sites in normal human cells and reveals that p53 binding sites reside in distinct genomic landscapes in normal and cancer-derived human cells.
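
A statistic such as "more than 40% within 2 kb of a TSS" can be computed with a short Python sketch like the one below. It rests on simplifying assumptions that are not taken from the paper: peaks are reduced to single summit coordinates, all coordinates lie on one chromosome, and strand is ignored. The function name and the toy coordinates are hypothetical.

```python
from bisect import bisect_left


def fraction_near_tss(peaks, tss_positions, window=2000):
    """Fraction of peak positions lying within `window` bp of any TSS.

    Assumes every coordinate is on the same chromosome.
    """
    if not peaks:
        return 0.0
    tss = sorted(tss_positions)
    near = 0
    for peak in peaks:
        i = bisect_left(tss, peak)
        # Only the closest TSS on each side can be the nearest one.
        neighbours = tss[max(i - 1, 0):i + 1]
        if neighbours and min(abs(peak - t) for t in neighbours) <= window:
            near += 1
    return near / len(peaks)


# Invented toy coordinates.
TOY_TSS = [10_000, 50_000]
TOY_PEAKS = [9_000, 12_500, 48_500, 80_000]
print(fraction_near_tss(TOY_PEAKS, TOY_TSS))  # 0.5
```

Sorting the TSS list once and using a binary search keeps the lookup fast even for genome-scale annotation; a real analysis would repeat it per chromosome.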


Subjects
DNA/genetics; Tumor Suppressor Protein p53/metabolism; Base Sequence; Binding Sites/genetics; Chromatin Immunoprecipitation; CpG Islands/genetics; DNA/metabolism; DNA Methylation/genetics; Fibroblasts; Genomics/methods; Humans; Molecular Sequence Data; Sequence Analysis, DNA; Tumor Suppressor Protein p53/genetics
11.
Biotechnol Biofuels ; 2: 10, 2009 May 18.
Article in English | MEDLINE | ID: mdl-19450243

ABSTRACT

Throughout immeasurable time, microorganisms evolved and accumulated remarkable physiological and functional heterogeneity, and now constitute the major reserve for genetic diversity on earth. Using metagenomics, namely genetic material recovered directly from environmental samples, this biogenetic diversification can be accessed without the need to cultivate cells. Accordingly, microbial communities and their metagenomes, isolated from biotopes with high turnover rates of recalcitrant biomass, such as lignocellulosic plant cell walls, have become a major resource for bioprospecting; furthermore, this material is a major asset in the search for new biocatalytics (enzymes) for various industrial processes, including the production of biofuels from plant feedstocks. However, despite the contributions from metagenomics technologies consequent upon the discovery of novel enzymes, this relatively new enterprise requires major improvements. In this review, we compare function-based metagenome screening and sequence-based metagenome data mining, discussing the advantages and limitations of both methods. We also describe the unusual enzymes discovered via metagenomics approaches, and discuss the future prospects for metagenome technologies.

12.
Blood ; 101(6): 2285-93, 2003 Mar 15.
Article in English | MEDLINE | ID: mdl-12433680

ABSTRACT

Human platelets are anucleate blood cells that retain cytoplasmic mRNA and maintain functionally intact protein translational capabilities. We have adapted complementary techniques of microarray and serial analysis of gene expression (SAGE) for genetic profiling of highly purified human blood platelets. Microarray analysis using the Affymetrix HG-U95Av2 set of approximately 12,600 probes maximally identified the expression of 2147 (range, 13%-17%) platelet-expressed transcripts, with approximately 22% collectively involved in metabolism and receptor/signaling, and an overrepresentation of genes with unassigned function (32%). In contrast, a modified SAGE protocol using the Type IIS restriction enzyme MmeI (generating 21-base pair [bp] or 22-bp tags) demonstrated that 89% of tags represented mitochondrial (mt) transcripts (enriched in 16S and 12S ribosomal RNAs), presumably related to persistent mt-transcription in the absence of nuclear-derived transcripts. The frequency of non-mt SAGE tags paralleled average difference values (relative expression) for the most "abundant" transcripts as determined by microarray analysis, establishing the concordance of both techniques for platelet profiling. Quantitative reverse transcription-polymerase chain reaction (PCR) confirmed the highest frequency of mt-derived transcripts, along with the mRNAs for neurogranin (NGN, a protein kinase C substrate) and the complement lysis inhibitor clusterin among the top 5 most abundant transcripts. For confirmatory characterization, immunoblots and flow cytometric analyses were performed, establishing abundant cell-surface expression of clusterin and intracellular expression of NGN. These observations demonstrate a strong correlation between high transcript abundance and protein expression, and they establish the validity of transcript analysis as a tool for identifying novel platelet proteins that may regulate normal and pathologic platelet (and/or megakaryocyte) functions.


Subjects
Blood Platelets/chemistry; Gene Expression Profiling; Oligonucleotide Array Sequence Analysis; RNA, Messenger/blood; Base Sequence; Blood Platelets/metabolism; Calmodulin-Binding Proteins/blood; Calmodulin-Binding Proteins/genetics; Cell Separation; Clusterin; Deoxyribonucleases, Type II Site-Specific/metabolism; Gene Library; Glycoproteins/blood; Glycoproteins/genetics; Humans; Mitochondria/chemistry; Molecular Chaperones/blood; Molecular Chaperones/genetics; Nerve Tissue Proteins/blood; Nerve Tissue Proteins/genetics; Neurogranin; Reverse Transcriptase Polymerase Chain Reaction
13.
Genome Res ; 12(11): 1756-65, 2002 Nov.
Article in English | MEDLINE | ID: mdl-12421763

ABSTRACT

Genomic signature tags (GSTs) are the products of a method we have developed for identifying and quantitatively analyzing genomic DNAs. The DNA is initially fragmented with a type II restriction enzyme. An oligonucleotide adaptor containing a recognition site for MmeI, a type IIS restriction enzyme, is then used to release 21-bp tags from fixed positions in the DNA relative to the sites recognized by the fragmenting enzyme. These tags are PCR-amplified, purified, concatenated, and then cloned and sequenced. The tag sequences and abundances are used to create a high-resolution GST sequence profile of the genomic DNA. GSTs are shown to be long enough for use as oligonucleotide primers to amplify adjacent segments of the DNA, which can then be sequenced to provide additional nucleotide information or used as probes to identify specific clones in metagenomic libraries. GST analysis of the 4.7-Mb Yersinia pestis EV766 genome using BamHI as the fragmenting enzyme and NlaIII as the tagging enzyme validated the precision of our approach. The GST profile predicts that this strain has several changes relative to the archetype CO92 strain, including deletion of a 57-kb region of the chromosome known to be an unstable pathogenicity island.
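
The step in which tag sequences and abundances are turned into a profile, and the profile is compared with a reference strain, can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' pipeline: the expected tags are assumed to have been predicted from the reference genome beforehand, reads of any other length are simply discarded, and all names and sequences are hypothetical.

```python
from collections import Counter

TAG_LENGTH = 21  # tag length stated in the abstract


def gst_profile(reads):
    """Count how often each 21-bp tag occurs among the sequenced reads.

    Reads of any other length are dropped (an assumption of this sketch).
    """
    return Counter(r.upper() for r in reads if len(r) == TAG_LENGTH)


def missing_tags(expected_tags, profile):
    """Tags predicted from the reference genome but never observed,
    which may point to a deletion or a sequence change at that locus."""
    return sorted(t for t in expected_tags if profile[t] == 0)


# Invented toy data: three expected tags, two of them observed.
EXPECTED = ["A" * 21, "C" * 21, "G" * 21]
READS = ["A" * 21, "a" * 21, "C" * 21, "ACGT"]

profile = gst_profile(READS)
print(profile["A" * 21], profile["C" * 21])  # 2 1
print(missing_tags(EXPECTED, profile))       # ['GGGGGGGGGGGGGGGGGGGGG']
```

A run of consecutive missing tags along the reference would be the kind of signal consistent with a deleted region, such as the one the abstract reports for this strain.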


Subjects
DNA Fingerprinting/methods; DNA, Bacterial/analysis; Binding Sites/genetics; DNA Fragmentation/genetics; DNA, Bacterial/metabolism; Deoxyribonuclease BamHI/metabolism; Deoxyribonucleases, Type II Site-Specific/genetics; Gene Library; Genome, Bacterial; Ligases/metabolism; Nucleic Acid Amplification Techniques/methods; Oligonucleotides/genetics; Polymerase Chain Reaction/methods; Yersinia pestis/genetics
14.
Cell ; 119(7): 1041-54, 2004 Dec 29.
Article in English | MEDLINE | ID: mdl-15620361

ABSTRACT

The CREB transcription factor regulates differentiation, survival, and synaptic plasticity. The complement of CREB targets responsible for these responses has not been identified, however. We developed a novel approach to identify CREB targets, termed serial analysis of chromatin occupancy (SACO), by combining chromatin immunoprecipitation (ChIP) with a modification of SAGE. Using a SACO library derived from rat PC12 cells, we identified approximately 41,000 genomic signature tags (GSTs) that mapped to unique genomic loci. CREB binding was confirmed for all loci supported by multiple GSTs. Of the 6302 loci identified by multiple GSTs, 40% were within 2 kb of the transcriptional start of an annotated gene, 49% were within 1 kb of a CpG island, and 72% were within 1 kb of a putative cAMP-response element (CRE). A large fraction of the SACO loci delineated bidirectional promoters and novel antisense transcripts. This study represents the most comprehensive definition of transcription factor binding sites in a metazoan species.


Subjects
Cyclic AMP Response Element-Binding Protein/metabolism; Genomics; Regulon/genetics; Response Elements/genetics; Transcription Factors/metabolism; Animals; Binding Sites; Chromatin Immunoprecipitation/methods; CpG Islands/genetics; Cyclic AMP/metabolism; Cyclic AMP Response Element-Binding Protein/genetics; DNA/genetics; DNA/metabolism; Gene Expression Regulation/drug effects; Gene Expression Regulation/radiation effects; Gene Library; Genome; Oligonucleotide Array Sequence Analysis; PC12 Cells; RNA, Antisense/genetics; RNA, Antisense/metabolism; Rats; Reproducibility of Results; Transcription Factors/genetics; Transcription, Genetic/genetics