1.
Nat Commun ; 14(1): 2351, 2023 04 26.
Article in English | MEDLINE | ID: mdl-37100781

ABSTRACT

For the past half-century, structural biologists relied on the notion that similar protein sequences give rise to similar structures and functions. While this assumption has driven research to explore certain parts of the protein universe, it disregards spaces that don't rely on this assumption. Here we explore areas of the protein universe where similar protein functions can be achieved by different sequences and different structures. We predict ~200,000 structures for diverse protein sequences from 1,003 representative genomes across the microbial tree of life and annotate them functionally on a per-residue basis. Structure prediction is accomplished using the World Community Grid, a large-scale citizen science initiative. The resulting database of structural models is complementary to the AlphaFold database, with regard to domains of life as well as sequence diversity and sequence length. We identify 148 novel folds and describe examples where we map specific functions to structural motifs. We also show that the structural space is continuous and largely saturated, highlighting the need for a shift in focus across all branches of biology, from obtaining structures to putting them into context and from sequence-based to sequence-structure-function based meta-omics analyses.


Subject(s)
Protein Folding , Proteins , Proteins/metabolism , Amino Acid Sequence , Structure-Activity Relationship , Databases, Protein
2.
Genome Res ; 29(3): 449-463, 2019 03.
Article in English | MEDLINE | ID: mdl-30696696

ABSTRACT

Transcriptional regulatory networks (TRNs) provide insight into cellular behavior by describing interactions between transcription factors (TFs) and their gene targets. The assay for transposase-accessible chromatin (ATAC)-seq, coupled with TF motif analysis, provides indirect evidence of chromatin binding for hundreds of TFs genome-wide. Here, we propose methods for TRN inference in a mammalian setting, using ATAC-seq data to improve gene expression modeling. We test our methods in the context of T Helper Cell Type 17 (Th17) differentiation, generating new ATAC-seq data to complement existing Th17 genomic resources. In this resource-rich mammalian setting, our extensive benchmarking provides quantitative, genome-scale evaluation of TRN inference, combining ATAC-seq and RNA-seq data. We refine and extend our previous Th17 TRN, using our new TRN inference methods to integrate all Th17 data (gene expression, ATAC-seq, TF knockouts, and ChIP-seq). We highlight newly discovered roles for individual TFs and groups of TFs ("TF-TF modules") in Th17 gene regulation. Given the popularity of ATAC-seq, which provides high resolution with low sample input requirements, we anticipate that our methods will improve TRN inference in new mammalian systems, especially in vivo, for cells directly from humans and animal models.


Subject(s)
Chromatin/genetics , Gene Regulatory Networks , Th17 Cells/metabolism , Transcription Factors/metabolism , Cell Differentiation , Chromatin/chemistry , Chromatin Assembly and Disassembly , Humans , Protein Binding , Software , Th17 Cells/cytology
3.
J Mol Biol ; 349(1): 27-45, 2005 May 27.
Article in English | MEDLINE | ID: mdl-15876366

ABSTRACT

Pseudogenes are inheritable genetic elements formally defined by two properties: their similarity to functioning genes and their presumed lack of activity. However, their precise characterization, particularly with respect to the latter quality, has proven elusive. An opportunity to explore this issue arises from the recent emergence of tiling-microarray data showing that intergenic regions (containing pseudogenes) are transcribed to a great degree. Here we focus on the transcriptional activity of pseudogenes on human chromosome 22. First, we integrated several sets of annotation to define a unified list of 525 pseudogenes on the chromosome. To characterize these further, we developed a comprehensive list of genomic features based on conservation in related organisms, expression evidence, and the presence of upstream regulatory sites. Of the 525 unified pseudogenes we could confidently classify 154 as processed and 49 as duplicated. Using data from tiling microarrays, especially from recent high-resolution oligonucleotide arrays, we found some evidence that up to a fifth of the 525 pseudogenes are potentially transcribed. Expressed sequence tag (EST) comparison further validated a number of these, and overall we found 17 pseudogenes with strong support for transcription. In particular, one of the pseudogenes with both EST and microarray evidence for transcription turned out to be a duplicated pseudogene in the cat eye syndrome critical region. Although we could not identify a meaningful number of transcription factor-binding sites (based on chromatin immunoprecipitation-chip data) near pseudogenes, we did find that approximately 12% of the pseudogenes had upstream CpG islands. Finally, analysis of corresponding syntenic regions in the mouse, rat and chimp genomes indicates, as previously suggested, that pseudogenes are less conserved than genes, but more preserved than the intergenic background (all annotation is available from http://www.pseudogene.org).


Subject(s)
Chromosomes, Human, Pair 22 , Pseudogenes/physiology , Transcription, Genetic/physiology , Animals , Base Sequence , Binding Sites , Chromosome Mapping , Expressed Sequence Tags , Humans , Mice , Molecular Sequence Data , Oligonucleotide Array Sequence Analysis , Pan troglodytes/genetics , Point Mutation , Rats , Transcription Factors/metabolism
4.
Trends Genet ; 20(2): 62-7, 2004 Feb.
Article in English | MEDLINE | ID: mdl-14746985

ABSTRACT

Pseudogenes are important resources in evolutionary and comparative genomics because they provide molecular records of the ancient genes that existed in the genome millions of years ago. We have systematically identified approximately 5000 processed pseudogenes in the mouse genome, and estimated that approximately 60% are lineage specific, created after the mouse and human diverged. In both mouse and human genomes, similar types of genes give rise to many processed pseudogenes. These tend to be housekeeping genes, which are highly expressed in the germ line. Ribosomal-protein genes, in particular, form the largest sub-group. The processed pseudogenes in the mouse occur with a distinctly different chromosomal distribution than LINEs or SINEs - preferentially in GC-poor regions. Finally, the age distribution of mouse-processed pseudogenes closely resembles that of LINEs, in contrast to human, where the age distribution closely follows Alus (SINEs).


Subject(s)
Evolution, Molecular , Genome , Pseudogenes/genetics , Alu Elements , Animals , Chromosome Mapping , Computational Biology/methods , Databases, Genetic , Humans , Long Interspersed Nucleotide Elements , Mice , Short Interspersed Nucleotide Elements