Results 1 - 20 of 24
1.
Nature ; 630(8015): 158-165, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38693268

ABSTRACT

The liver has a unique ability to regenerate [1,2]; however, in the setting of acute liver failure (ALF), this regenerative capacity is often overwhelmed, leaving emergency liver transplantation as the only curative option [3-5]. Here, to advance understanding of human liver regeneration, we use paired single-nucleus RNA sequencing combined with spatial profiling of healthy and ALF explant human livers to generate a single-cell, pan-lineage atlas of human liver regeneration. We uncover a novel ANXA2+ migratory hepatocyte subpopulation, which emerges during human liver regeneration, and a corollary subpopulation in a mouse model of acetaminophen (APAP)-induced liver regeneration. Interrogation of necrotic wound closure and hepatocyte proliferation across multiple timepoints following APAP-induced liver injury in mice demonstrates that wound closure precedes hepatocyte proliferation. Four-dimensional intravital imaging of APAP-induced mouse liver injury identifies motile hepatocytes at the edge of the necrotic area, enabling collective migration of the hepatocyte sheet to effect wound closure. Depletion of hepatocyte ANXA2 reduces hepatocyte growth factor-induced human and mouse hepatocyte migration in vitro, and abrogates necrotic wound closure following APAP-induced mouse liver injury. Together, our work dissects unanticipated aspects of liver regeneration, demonstrating an uncoupling of wound closure and hepatocyte proliferation and uncovering a novel migratory hepatocyte subpopulation that mediates wound closure following liver injury. Therapies designed to promote rapid reconstitution of normal hepatic microarchitecture and reparation of the gut-liver barrier may advance new areas of therapeutic discovery in regenerative medicine.


Subject(s)
Liver Failure, Acute; Liver Regeneration; Animals; Female; Humans; Male; Mice; Acetaminophen/pharmacology; Cell Lineage; Cell Movement/drug effects; Cell Proliferation/drug effects; Chemical and Drug Induced Liver Injury/pathology; Disease Models, Animal; Hepatocyte Growth Factor/metabolism; Hepatocyte Growth Factor/pharmacology; Hepatocytes/cytology; Hepatocytes/drug effects; Hepatocytes/metabolism; Hepatocytes/pathology; Liver/cytology; Liver/drug effects; Liver/pathology; Liver Failure, Acute/pathology; Liver Failure, Acute/chemically induced; Liver Regeneration/drug effects; Mice, Inbred C57BL; Necrosis/chemically induced; Regenerative Medicine; Single-Cell Gene Expression Analysis; Wound Healing
2.
Nature ; 575(7783): 512-518, 2019 11.
Article in English | MEDLINE | ID: mdl-31597160

ABSTRACT

Liver cirrhosis is a major cause of death worldwide and is characterized by extensive fibrosis. There are currently no effective antifibrotic therapies available. To obtain a better understanding of the cellular and molecular mechanisms involved in disease pathogenesis and enable the discovery of therapeutic targets, here we profile the transcriptomes of more than 100,000 single human cells, yielding molecular definitions for non-parenchymal cell types that are found in healthy and cirrhotic human liver. We identify a scar-associated TREM2+CD9+ subpopulation of macrophages, which expands in liver fibrosis, differentiates from circulating monocytes and is pro-fibrogenic. We also define ACKR1+ and PLVAP+ endothelial cells that expand in cirrhosis, are topographically restricted to the fibrotic niche and enhance the transmigration of leucocytes. Multi-lineage modelling of ligand and receptor interactions between the scar-associated macrophages, endothelial cells and PDGFRα+ collagen-producing mesenchymal cells reveals intra-scar activity of several pro-fibrogenic pathways including TNFRSF12A, PDGFR and NOTCH signalling. Our work dissects unanticipated aspects of the cellular and molecular basis of human organ fibrosis at a single-cell level, and provides a conceptual framework for the discovery of rational therapeutic targets in liver cirrhosis.


Subject(s)
Endothelial Cells/pathology; Liver Cirrhosis/pathology; Liver/pathology; Macrophages/pathology; Single-Cell Analysis; Animals; Case-Control Studies; Cell Lineage; Duffy Blood-Group System/metabolism; Endothelial Cells/metabolism; Female; Hepatic Stellate Cells/cytology; Hepatic Stellate Cells/metabolism; Hepatic Stellate Cells/pathology; Hepatocytes/cytology; Hepatocytes/metabolism; Hepatocytes/pathology; Humans; Liver/cytology; Liver Cirrhosis/genetics; Macrophages/metabolism; Male; Membrane Glycoproteins/metabolism; Membrane Proteins/metabolism; Mice; Phenotype; Receptor, Platelet-Derived Growth Factor alpha/metabolism; Receptors, Cell Surface/metabolism; Receptors, Immunologic/metabolism; Tetraspanin 29/metabolism; Transcriptome; Transendothelial and Transepithelial Migration
3.
Science ; 376(6594): eabl5197, 2022 05 13.
Article in English | MEDLINE | ID: mdl-35549406

ABSTRACT

Despite their crucial role in health and disease, our knowledge of immune cells within human tissues remains limited. We surveyed the immune compartment of 16 tissues from 12 adult donors by single-cell RNA sequencing and VDJ sequencing generating a dataset of ~360,000 cells. To systematically resolve immune cell heterogeneity across tissues, we developed CellTypist, a machine learning tool for rapid and precise cell type annotation. Using this approach, combined with detailed curation, we determined the tissue distribution of finely phenotyped immune cell types, revealing hitherto unappreciated tissue-specific features and clonal architecture of T and B cells. Our multitissue approach lays the foundation for identifying highly resolved immune cell types by leveraging a common reference dataset, tissue-integrated expression analysis, and antigen receptor sequencing.


Subject(s)
B-Lymphocytes; Machine Learning; Sequence Analysis, RNA; Single-Cell Analysis; T-Lymphocytes; Transcriptome; Cells, Cultured; Humans; Organ Specificity
4.
Curr Opin Syst Biol ; 18: 87-94, 2019 Dec.
Article in English | MEDLINE | ID: mdl-32984660

ABSTRACT

Single-cell RNA-sequencing has uncovered immune heterogeneity, including novel cell types, states and lineages that have expanded our understanding of the immune system as a whole. More recently, studies involving both immune and non-immune cells have demonstrated the importance of the immune microenvironment in development, homeostasis and disease. This review focuses on single-cell studies mapping cell-cell interactions for a variety of tissues in development, health and disease. In addition, we address the need to generate a comprehensive interaction map to answer fundamental questions in immunology, as well as the experimental and computational strategies required for this purpose.

5.
Genome Biol ; 21(1): 1, 2019 12 31.
Article in English | MEDLINE | ID: mdl-31892341

ABSTRACT

BACKGROUND: The Human Cell Atlas is a large international collaborative effort to map all cell types of the human body. Single-cell RNA sequencing can generate high-quality data for the delivery of such an atlas. However, delays between fresh sample collection and processing may lead to poor data and difficulties in experimental design. RESULTS: This study assesses the effect of cold storage on fresh healthy spleen, esophagus, and lung from ≥ 5 donors over 72 h. We collect 240,000 high-quality single-cell transcriptomes with detailed cell type annotations and whole genome sequences of donors, enabling future eQTL studies. Our data provide a valuable resource for the study of these 3 organs and will allow cross-organ comparison of cell types. We see little effect of cold ischemic time on cell yield, total number of reads per cell, and other quality control metrics in any of the tissues within the first 24 h. However, we observe a decrease in the proportions of lung T cells at 72 h, higher percentage of mitochondrial reads, and increased contamination by background ambient RNA reads in the 72-h samples in the spleen, which is cell type specific. CONCLUSIONS: In conclusion, we present robust protocols for tissue preservation for up to 24 h prior to scRNA-seq analysis. This greatly facilitates the logistics of sample collection for Human Cell Atlas or clinical studies since it increases the time frames for sample processing.


Subject(s)
Sequence Analysis, RNA; Single-Cell Analysis; Tissue Preservation/methods; Cold Temperature; Esophagus/cytology; Humans; Lung/cytology; Refrigeration; Spleen/cytology
7.
Curr Opin Struct Biol ; 9(3): 390-9, 1999 Jun.
Article in English | MEDLINE | ID: mdl-10361097

ABSTRACT

New computational techniques have allowed protein folds to be assigned to all or parts of between a quarter (Caenorhabditis elegans) and a half (Mycoplasma genitalium) of the individual protein sequences in different genomes. These assignments give a new perspective on domain structures, gene duplications, protein families and protein folds in genome sequences.


Subject(s)
Computational Biology/methods; Computational Biology/trends; Genome; Proteins/chemistry; Proteins/genetics; Animals; Protein Conformation
8.
Curr Opin Struct Biol ; 11(3): 354-63, 2001 Jun.
Article in English | MEDLINE | ID: mdl-11406387

ABSTRACT

The genome sequencing projects and knowledge of the entire protein repertoires of many organisms have prompted new procedures and techniques for the large-scale determination of protein structure, function and interactions. Recently, new work has been carried out on the determination of the function and evolutionary relationships of proteins by experimental structural genomics, and the discovery of protein-protein interactions by computational structural genomics.


Subject(s)
Evolution, Molecular; Genomics/methods; Proteins/physiology; Gene Order; Phylogeny; Protein Structure, Tertiary; Proteins/chemistry
9.
Curr Opin Struct Biol ; 9(1): 56-65, 1999 Feb.
Article in English | MEDLINE | ID: mdl-10047586

ABSTRACT

Telomerases are RNA-dependent polymerases that catalyse the synthesis of the telomeric DNA at the tips of eukaryotic chromosomes. The recent identification of the catalytic subunit of telomerases from several different species suggests that the core of the telomerase is conserved. The proposed sequence and structural homology between the telomerase catalytic subunit and reverse transcriptases, together with a wealth of genetic and biochemical information, has led to significant advances in our understanding of the mechanism by which telomerases synthesise telomeric DNA.


Subject(s)
Telomerase/chemistry; Telomerase/metabolism; Amino Acid Sequence; Animals; Catalytic Domain; DNA/biosynthesis; DNA-Binding Proteins; Humans; Models, Molecular; Molecular Sequence Data; Protein Conformation; RNA/chemistry; RNA/metabolism; RNA-Directed DNA Polymerase/chemistry; RNA-Directed DNA Polymerase/genetics; RNA-Directed DNA Polymerase/metabolism; Sequence Homology, Amino Acid; Telomerase/genetics
10.
Nucleic Acids Res ; 29(8): 1750-64, 2001 Apr 15.
Article in English | MEDLINE | ID: mdl-11292848

ABSTRACT

As the number of protein folds is quite limited, a mode of analysis that will be increasingly common in the future, especially with the advent of structural genomics, is to survey and re-survey the finite parts list of folds from an expanding number of perspectives. We have developed a new resource, called PartsList, that lets one dynamically perform these comparative fold surveys. It is available on the web at http://bioinfo.mbb.yale.edu/partslist and http://www.partslist.org. The system is based on the existing fold classifications and functions as a form of companion annotation for them, providing 'global views' of many already completed fold surveys. The central idea in the system is that of comparison through ranking; PartsList will rank the approximately 420 folds based on more than 180 attributes. These include: (i) occurrence in a number of completely sequenced genomes (e.g. it will show the most common folds in the worm versus yeast); (ii) occurrence in the structure databank (e.g. most common folds in the PDB); (iii) both absolute and relative gene expression information (e.g. most changing folds in expression over the cell cycle); (iv) protein-protein interactions, based on experimental data in yeast and comprehensive PDB surveys (e.g. most interacting fold); (v) sensitivity to inserted transposons; (vi) the number of functions associated with the fold (e.g. most multi-functional folds); (vii) amino acid composition (e.g. most Cys-rich folds); (viii) protein motions (e.g. most mobile folds); and (ix) the level of similarity based on a comprehensive set of structural alignments (e.g. most structurally variable folds). The integration of whole-genome expression and protein-protein interaction data with structural information is a particularly novel feature of our system. 
We provide three ways of visualizing the rankings: a profiler emphasizing the progression of high and low ranks across many pre-selected attributes, a dynamic comparer for custom comparisons and a numerical rankings correlator. These allow one to directly compare very different attributes of a fold (e.g. expression level, genome occurrence and maximum motion) in the uniform numerical format of ranks. This uniform framework, in turn, highlights the way that the frequency of many of the attributes falls off with approximate power-law behavior (i.e. according to V^(-b), for attribute value V and constant exponent b), with a few folds having large values and most having small values.
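
The ranking idea and the power-law falloff described in this abstract can be illustrated with a short Python sketch. It is not PartsList's implementation: the fold names and values are invented, `rank_by` and `fit_power_law_exponent` are hypothetical helper names, and the exponent is estimated with an ordinary least-squares fit on log-log axes, which is only a simple stand-in for whatever fitting the authors used.

```python
import math


def rank_by(values):
    """Map each fold name to its rank for one attribute (1 = largest value)."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}


def fit_power_law_exponent(xs, ys):
    """Estimate b in y ~ x^(-b) from the slope of log(y) against log(x)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    mean_x = sum(log_x) / len(log_x)
    mean_y = sum(log_y) / len(log_y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    variance = sum((a - mean_x) ** 2 for a in log_x)
    return -covariance / variance  # slope is -b, so negate it


# Invented attribute values for three made-up folds.
genome_occurrence = {"fold_a": 120, "fold_b": 45, "fold_c": 7}
expression_level = {"fold_a": 3.1, "fold_b": 9.8, "fold_c": 0.4}

# Ranks put very different attributes on one uniform scale.
occurrence_ranks = rank_by(genome_occurrence)
expression_ranks = rank_by(expression_level)
for fold in genome_occurrence:
    print(fold, occurrence_ranks[fold], expression_ranks[fold])
```

Comparing ranks instead of raw values is what lets attributes with different units sit side by side; the log-log fit is sensitive to noise in the tail, so a real analysis would likely need a more careful estimator.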


Subject(s)
Gene Expression Profiling; Genome; Internet; Protein Folding; Proteins/chemistry; Software; Cysteine/analysis; DNA Transposable Elements/genetics; Databases as Topic; Motion; Protein Binding; Proteins/classification; Proteins/metabolism; Proteome; Research Design; Sequence Alignment
11.
J Mol Biol ; 296(5): 1367-83, 2000 Mar 10.
Article in English | MEDLINE | ID: mdl-10698639

ABSTRACT

The predicted proteins of the genome of Caenorhabditis elegans were analysed by various sequence comparison methods to identify the repertoire of proteins that are members of the immunoglobulin superfamily (IgSF). The IgSF is one of the largest families of protein domain in this genome and likely to be one of the major families in other multicellular eukaryotes too. This is because members of the superfamily are involved in a variety of functions including cell-cell recognition, cell-surface receptors, muscle structure and, in higher organisms, the immune system. Sixty-four proteins with 488 I set IgSF domains were identified largely by using Hidden Markov models. The domain architectures of the protein products of these 64 genes are described. Twenty-one of these had been characterised previously. We show that another 25 are related to proteins of known function. The C. elegans IgSF proteins can be classified into five broad categories: muscle proteins, protein kinases and phosphatases, three categories of proteins involved in the development of the nervous system, leucine-rich repeat containing proteins and proteins without homologues of known function, of which there are 18. The 19 proteins involved in nervous system development that are not kinases or phosphatases are homologues of neuroglian, axonin, NCAM, wrapper, klingon, ICCR and nephrin or belong to the recently identified zig gene family. Out of the set of 64 genes, 22 are on the X chromosome. This study should be seen as an initial description of the IgSF repertoire in C. elegans, because the current gene definitions may contain a number of errors, especially in the case of long sequences, and there may be IgSF genes that have not yet been detected. However, the proteins described here do provide an overview of the bulk of the repertoire of immunoglobulin superfamily members in C. elegans, a framework for refinement and extension of the repertoire as gene and protein definitions improve, and the basis for investigations of their function and for comparisons with the repertoires of other organisms.


Subject(s)
Caenorhabditis elegans/chemistry; Computational Biology; Helminth Proteins/chemistry; Immunoglobulins/chemistry; Multigene Family; Sequence Homology; Animals; Caenorhabditis elegans/enzymology; Caenorhabditis elegans/genetics; Cell Adhesion Molecules, Neuronal/chemistry; Cell Adhesion Molecules, Neuronal/genetics; Genes, Helminth/genetics; Helminth Proteins/genetics; Humans; Immunoglobulins/genetics; Leucine/genetics; Leucine/metabolism; Markov Chains; Multigene Family/genetics; Muscle Proteins/chemistry; Muscle Proteins/genetics; Nerve Tissue Proteins/chemistry; Nerve Tissue Proteins/genetics; Physical Chromosome Mapping; Protein Structure, Tertiary; Protein Tyrosine Phosphatases/chemistry; Protein Tyrosine Phosphatases/genetics; Protein-Tyrosine Kinases/chemistry; Protein-Tyrosine Kinases/genetics; Sequence Alignment; X Chromosome/genetics
12.
J Mol Biol ; 307(3): 929-38, 2001 Mar 30.
Article in English | MEDLINE | ID: mdl-11273711

ABSTRACT

In the postgenomic era, one of the most interesting and important challenges is to understand protein interactions on a large scale. The physical interactions between protein domains are fundamental to the workings of a cell: in multi-domain polypeptide chains, in multi-subunit proteins and in transient complexes between proteins that also exist independently. To study the large-scale patterns and evolution of interactions between protein domains, we view interactions between protein domains in terms of the interactions between structural families of evolutionarily related domains. This allows us to classify 8151 interactions between individual domains in the Protein Data Bank and the yeast Saccharomyces cerevisiae in terms of 664 types of interactions, between protein families. At least 51 interactions do not occur in the Protein Data Bank and can only be derived from the yeast data. The map of interactions between protein families has the form of a scale-free network, meaning that most protein families only interact with one or two other families, while a few families are extremely versatile in their interactions and are connected to many families. We observe that almost half of all known families engage in interactions with domains from their own family. We also see that the repertoires of interactions of domains within and between polypeptide chains overlap mostly for two specific types of protein families: enzymes and same-family interactions. This suggests that different types of protein interaction repertoires exist for structural, functional and regulatory reasons.
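
The family-level view of interactions described here can be sketched in Python as follows. This is a toy illustration, not the authors' pipeline: the family names and interaction list are invented, and `family_degrees`, `degree_histogram` and `self_interacting_fraction` are hypothetical helper names. A same-family interaction is counted as one partner, which is an assumption of this sketch.

```python
from collections import Counter


def family_degrees(interactions):
    """Count the distinct partner families of each family."""
    partners = {}
    for first, second in interactions:
        partners.setdefault(first, set()).add(second)
        partners.setdefault(second, set()).add(first)
    return {family: len(found) for family, found in partners.items()}


def degree_histogram(degrees):
    """Return {number of partners: number of families with that many}."""
    return dict(sorted(Counter(degrees.values()).items()))


def self_interacting_fraction(interactions):
    """Fraction of families that interact with a member of their own family."""
    families = {family for pair in interactions for family in pair}
    self_binders = {first for first, second in interactions if first == second}
    return len(self_binders) / len(families)


# Invented family-level interactions: one versatile "hub" family and
# several families with only one or two partners.
toy_interactions = [
    ("hub", "fam1"),
    ("hub", "fam2"),
    ("hub", "fam3"),
    ("hub", "hub"),
    ("fam1", "fam1"),
]

toy_degrees = family_degrees(toy_interactions)
print(degree_histogram(toy_degrees))
print(self_interacting_fraction(toy_interactions))
```

In a scale-free network the histogram is dominated by families with one or two partners, with a long tail of a few highly connected families; five toy interactions can only hint at that shape.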


Subject(s)
Databases as Topic; Proteins/chemistry; Proteins/metabolism; Yeasts/chemistry; Binding Sites; Evolution, Molecular; Fungal Proteins/chemistry; Fungal Proteins/metabolism; Genome, Fungal; Genomics; Models, Molecular; Protein Binding; Protein Structure, Tertiary; Sequence Homology, Amino Acid; Yeasts/genetics
13.
J Mol Biol ; 310(2): 311-25, 2001 Jul 06.
Article in English | MEDLINE | ID: mdl-11428892

ABSTRACT

There is a limited repertoire of domain families that are duplicated and combined in different ways to form the set of proteins in a genome. Proteins are gene products, and at the level of genes, duplication, recombination, fusion and fission are the processes that produce new genes. We attempt to gain an overview of these processes by studying the evolutionary units in proteins, domains, in the protein sequences of 40 genomes. The domain and superfamily definitions in the Structural Classification of Proteins Database are used, so that we can view all pairs of adjacent domains in genome sequences in terms of their superfamily combinations. We find 783 out of the 859 superfamilies in SCOP in these genomes, and the 783 families occur in 1307 pairwise combinations. Most families are observed in combination with one or two other families, while a few families are very versatile in their combinatorial behaviour; 209 families do not make combinations with other families. This type of pattern can be described as a scale-free network. We also study the N to C-terminal orientation of domain pairs and domain repeats. The phylogenetic distribution of domain combinations is surveyed, to establish the extent of common and kingdom-specific combinations. Of the kingdom-specific combinations, significantly more combinations consist of families present in all three kingdoms than of families present in one or two kingdoms. Hence, we are led to conclude that recombination between common families, as compared to the invention of new families and recombination among these, has also been a major contribution to the evolution of kingdom-specific and species-specific functions in organisms in all three kingdoms. Finally, we compare the set of the domain combinations in the genomes to those in the RCSB Protein Data Bank, and discuss the implications for structural genomics.
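
A minimal Python sketch of how pairs of adjacent domains might be collected from domain architectures is given below. It is an illustration under assumptions rather than the method used in the paper: the protein names and architectures are invented, each architecture is written as a list of superfamily labels in N-to-C-terminal order, and `adjacent_pairs` and `partner_counts` are hypothetical helper names. Repeats of the same family are treated here as domain repeats, not as combinations.

```python
def adjacent_pairs(architecture):
    """Ordered (N-terminal, C-terminal) pairs of neighbouring domains."""
    return list(zip(architecture, architecture[1:]))


def partner_counts(proteins):
    """Number of distinct other families each family is found next to."""
    partners = {family: set() for arch in proteins.values() for family in arch}
    for arch in proteins.values():
        for left, right in adjacent_pairs(arch):
            if left != right:  # same-family neighbours are repeats, skip them
                partners[left].add(right)
                partners[right].add(left)
    return {family: len(found) for family, found in partners.items()}


# Invented architectures, N-terminus first.
toy_proteins = {
    "p1": ["kinase", "SH2"],
    "p2": ["SH2", "kinase"],
    "p3": ["Ig", "Ig", "kinase"],
    "p4": ["globin"],
}

toy_counts = partner_counts(toy_proteins)
print(toy_counts)
```

Keeping the pairs ordered preserves the N-to-C orientation the abstract mentions (`p1` and `p2` give opposite orders of the same two families), while `partner_counts` ignores orientation and shows which families never combine, such as the single-domain `globin` here.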


Subject(s)
Archaea; Eubacterium; Eukaryotic Cells; Evolution, Molecular; Proteome/chemistry; Proteome/genetics; Repetitive Sequences, Amino Acid/genetics; Animals; Archaea/chemistry; Archaea/genetics; Conserved Sequence/genetics; Databases as Topic; Eubacterium/chemistry; Eubacterium/genetics; Eukaryotic Cells/chemistry; Eukaryotic Cells/metabolism; Gene Duplication; Genome; Genomics; Humans; Multigene Family/genetics; Mutation/genetics; Phylogeny; Protein Structure, Tertiary; Proteome/classification; Recombination, Genetic/genetics; Tandem Repeat Sequences/genetics; Yeasts/chemistry; Yeasts/genetics
14.
J Mol Biol ; 273(1): 349-54, 1997 Oct 17.
Article in English | MEDLINE | ID: mdl-9367767

ABSTRACT

Two homologous sequences, which have diverged beyond the point where their homology can be recognised by a simple direct comparison, can be related through a third sequence that is suitably intermediate between the two. High scores for a sequence match between the first and third sequences and between the second and third sequences imply that the first and second sequences are related even though their own match score is low. We have tested the usefulness of this idea using a database that contains the sequences of 971 protein domains whose structures are known and whose residue identities with each other are some 40% or less (PDB40D). On the basis of sequence and structural information, 2143 pairs of these sequences are known to have an evolutionary relationship. FASTA, in an all-against-all comparison of the sequences in the database, detected 320 (15%) of these relationships as well as three false positives (i.e. 1% error rate). Using intermediate sequences found by FASTA matches of PDB40D sequences to those in the large non-redundant OWL database we could detect 550 evolutionary relationships with an error rate of 1%. This means the intermediate sequence procedure increases the ability to recognise the evolutionary relationships amongst the PDB40D sequences by 70%.
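
The logic of relating two sequences through an intermediate can be sketched in Python. This is a toy illustration only: the scores are invented numbers, not FASTA output, the threshold is arbitrary, and `direct_hits` and `intermediate_hits` are hypothetical helper names. Scores are stored per unordered pair, keyed by a `frozenset` of the two sequence names.

```python
from itertools import combinations


def direct_hits(queries, scores, threshold):
    """Pairs of query sequences whose own match score passes the threshold."""
    return {
        frozenset((first, second))
        for first, second in combinations(sorted(queries), 2)
        if scores.get(frozenset((first, second)), 0.0) >= threshold
    }


def intermediate_hits(queries, intermediates, scores, threshold):
    """Pairs of queries that both match the same intermediate sequence."""
    found = set()
    for middle in intermediates:
        matched = [
            query
            for query in sorted(queries)
            if scores.get(frozenset((query, middle)), 0.0) >= threshold
        ]
        for first, second in combinations(matched, 2):
            found.add(frozenset((first, second)))
    return found


# Invented scores: A and B match each other poorly, but both match X well.
toy_scores = {
    frozenset(("A", "B")): 10.0,
    frozenset(("A", "X")): 90.0,
    frozenset(("B", "X")): 85.0,
    frozenset(("C", "X")): 5.0,
}
toy_queries = {"A", "B", "C"}

print(direct_hits(toy_queries, toy_scores, threshold=50.0))
print(intermediate_hits(toy_queries, {"X"}, toy_scores, threshold=50.0))
```

In this toy case the direct comparison finds nothing, while the intermediate `X` links `A` and `B`. A real search must also control the false positives that chaining can introduce, which is why the abstract reports an error rate alongside the gain.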


Subject(s)
Proteins/chemistry; Sequence Homology, Amino Acid; Amino Acid Sequence; Ascorbate Oxidase/chemistry; Bacterial Proteins/chemistry; Databases as Topic; Evolution, Molecular; Molecular Sequence Data; Plastocyanin/chemistry; Software
15.
J Mol Biol ; 311(4): 693-708, 2001 Aug 24.
Article in English | MEDLINE | ID: mdl-11518524

ABSTRACT

The 106 small molecule metabolic (SMM) pathways in Escherichia coli are formed by the protein products of 581 genes. We can define 722 domains, nearly all of which are homologous to proteins of known structure, that form all or part of 510 of these proteins. This information allows us to answer general questions on the structural anatomy of the SMM pathway proteins and to trace family relationships and recruitment events within and across pathways. Half the gene products contain a single domain and half are formed by combinations of between two and six domains. The 722 domains belong to one of 213 families that have between one and 51 members. Family members usually conserve their catalytic or cofactor binding properties; substrate recognition is rarely conserved. Of the 213 families, members of only a quarter occur in isolation, i.e. they form single-domain proteins. Most members of the other families combine with domains from just one or two other families and a few more versatile families can combine with several different partners. Excluding isoenzymes, more than twice as many homologues are distributed across pathways as within pathways. However, serial recruitment, with two consecutive enzymes both being recruited to another pathway, is rare and recruitment of three consecutive enzymes is not observed. Only eight of the 106 pathways have a high number of homologues. Homology between consecutive pairs of enzymes with conservation of the main substrate-binding site but change in catalytic mechanism (which would support a simple model of retrograde pathway evolution) occurs only six times in the whole set of enzymes. Most of the domains that form SMM pathways have homologues in non-SMM pathways. Taken together, these results imply a pervasive "mosaic" model for the formation of protein repertoires and pathways.


Subject(s)
Bacterial Proteins/chemistry; Bacterial Proteins/metabolism; Escherichia coli/chemistry; Escherichia coli/metabolism; Evolution, Molecular; Binding Sites; Conserved Sequence; Genes, Duplicate; Gluconeogenesis; Glycogen/metabolism; Histidine/biosynthesis; Markov Chains; Multigene Family; Nucleotides/metabolism; Phosphatidic Acids/biosynthesis; Polysaccharides/biosynthesis; Protein Structure, Tertiary; Proteome; Purines/biosynthesis; Pyrimidines/biosynthesis; Sequence Homology, Amino Acid
16.
Trends Biotechnol ; 19(12): 482-6, 2001 Dec.
Article in English | MEDLINE | ID: mdl-11711174

ABSTRACT

Escherichia coli has been a popular organism for studying metabolic pathways. In an attempt to find out more about how these pathways are constructed, the enzymes were analysed by defining their protein domains. Structural assignments and sequence comparisons were used to show that 213 domain families constitute approximately 90% of the enzymes in the small-molecule metabolic pathways. Catalytic or cofactor-binding properties between family members are often conserved, while recognition of the main substrate with change in catalytic mechanism is only observed in a few cases of consecutive enzymes in a pathway. Recruitment of domains across pathways is very common, but there is little regularity in the pattern of domains in metabolic pathways. This is analogous to a mosaic in which a stone of a certain colour is selected to fill a position in the picture.


Subject(s)
Enzymes/chemistry; Enzymes/metabolism; Escherichia coli/enzymology; Binding Sites/physiology; Coenzymes/metabolism; Escherichia coli/metabolism; Evolution, Molecular; Fucose/metabolism; Nucleosides/metabolism; Nucleotides/metabolism; Protein Structure, Tertiary/physiology; Purines/biosynthesis; Pyrimidines/biosynthesis; Pyruvic Acid/metabolism; Sequence Homology; Substrate Specificity/physiology; Tryptophan/biosynthesis
17.
Protein Sci ; 7(6): 1477-80, 1998 Jun.
Article in English | MEDLINE | ID: mdl-9655353

ABSTRACT

We report the discovery of a novel family of proteins in which each member contains tandem pentapeptide (five-residue) repeats described by the motif A(D/N)LXX. Members of this family are both membrane bound and cytoplasmic. The function of these repeats is uncertain, but they may have a targeting or structural function rather than enzymatic activity. This family is most common in cyanobacteria, suggesting a function related to cyanobacterial-specific metabolism. Although no experimental information is available for the structure of this family, it is predicted that the tandem pentapeptide repeats will form a right-handed beta-helical structure. A structural model of the pentapeptide repeats is presented.
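
As a rough illustration of what the motif A(D/N)LXX means in practice, the Python sketch below scans a sequence for runs of consecutive five-residue units that fit it exactly. The sequence is invented and `find_tandem_repeats` is a hypothetical helper name. Real family members are likely to deviate from the consensus at some positions, so a strict regular expression like this one should be read as a simplification, not as the detection method used by the authors.

```python
import re


def find_tandem_repeats(sequence, min_repeats=2):
    """Find runs of at least `min_repeats` consecutive A(D/N)LXX units.

    Returns a list of (start index, matched text, number of units).
    """
    # One unit: A, then D or N, then L, then any two residues (X X).
    pattern = re.compile(r"(?:A[DN]L[A-Z]{2}){%d,}" % min_repeats)
    return [
        (match.start(), match.group(), len(match.group()) // 5)
        for match in pattern.finditer(sequence)
    ]


# Invented sequence: MK + ADLSG + ANLTR + ADLQQ + GG
toy_sequence = "MKADLSGANLTRADLQQGG"
toy_hits = find_tandem_repeats(toy_sequence)
print(toy_hits)
```

Raising `min_repeats` makes the scan stricter, which helps separate genuine tandem arrays from isolated chance matches to such a short motif.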


Subject(s)
Bacterial Proteins/chemistry; Cyanobacteria/chemistry; Oligopeptides/chemistry; Repetitive Sequences, Nucleic Acid; Amino Acid Sequence; HSP70 Heat-Shock Proteins/chemistry; Models, Molecular; Molecular Sequence Data; Protein Kinases/chemistry; Protein Structure, Secondary
19.
Cell Mol Life Sci ; 62(4): 435-45, 2005 Feb.
Article in English | MEDLINE | ID: mdl-15719170

ABSTRACT

Proteins are composed of domains, which are conserved evolutionary units that often also correspond to functional units and can frequently be detected with reasonable reliability using computational methods. Most proteins consist of two or more domains, giving rise to a variety of combinations of domains. Another level of complexity arises because proteins themselves can form complexes with small molecules, nucleic acids and other proteins. The networks of both domain combinations and protein interactions can be conceptualised as graphs, and these graphs can be analysed conveniently by computational methods. In this review we summarise facts and hypotheses about the evolution of domains in multi-domain proteins and protein complexes, and the tools and data resources available to study them.


Subject(s)
Evolution, Molecular; Protein Structure, Tertiary/genetics; Proteins/genetics; Amino Acid Sequence; Animals; Computational Biology; Conserved Sequence/genetics; Conserved Sequence/physiology; Genetic Variation; Humans; Multiprotein Complexes/chemistry; Multiprotein Complexes/genetics; Protein Structure, Tertiary/physiology; Proteins/physiology
20.
Bioinformatics ; 14(2): 144-50, 1998.
Article in English | MEDLINE | ID: mdl-9545446

ABSTRACT

MOTIVATION: Large-scale determination of relationships between the proteins produced by genome sequences is now common. All protein sequences are matched and those that have high match scores are clustered into families. In cases where the proteins are built of several domains or duplication modules, this can lead to misleading results. Consider the very simple example of three proteins: 1, formed by duplication modules A and B; 2, formed by duplication modules B' and C; and 3, formed by duplication modules C' and D. Duplication modules B and B' are homologous, as are C and C'. Matching the sequences of 1, 2 and 3 followed by simple single-linkage clustering would put all three in the same family, even though proteins 1 and 3 are not related. This is because the different parts of 2 match 1 and 3. This paper describes a procedure, DIVCLUS, that divides such complex clusters of partially related sequences into simple clusters that contain only related duplication modules. In the example just given, it would produce two groups of sequences: the first with domains B of sequence 1 and B' of sequence 2, and the second with domains C of sequence 2 and C' of sequence 3. DIVCLUS is part of a package called GEANFAMMER, for GEnome ANalysis and protein FAMily MakER. The package automates the detection of families of duplication modules from a protein sequence database. RESULTS: DIVCLUS has been applied to the division of single-linkage clusters generated from the protein sequences of six completely sequenced bacterial genomes. Out of 12 013 genes in these six genomes, 4563 single- and multi-domain sequences formed 1071 complex clusters. Application of the DIVCLUS program resolved these clusters into 2113 clusters corresponding to single duplication modules. AVAILABILITY: The perl5 program and its documentation are available at the following address: http://www.mrc-lmb.cam.ac.uk/genomes/ and by anonymous ftp at ftp.mrc-lmb.cam.ac.uk in the directory /pub/genomes/Software/.
CONTACT: sat@mrc-lmb.cam.ac.uk; jong@mrc-lmb.cam.ac.uk
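
The three-protein example in this abstract can be reproduced with a small Python sketch. It shows only why single-linkage clustering of whole proteins over-merges and how clustering the matched modules avoids that. It is not the DIVCLUS algorithm, which the abstract does not spell out. The labels `P1` to `P3`, the hand-written match list and the `single_linkage` helper are assumptions of this sketch.

```python
def single_linkage(items, links):
    """Group items into clusters: any chain of links joins a cluster."""
    parent = {item: item for item in items}

    def find(item):
        # Follow parent pointers to the cluster representative,
        # shortening the path along the way.
        while parent[item] != item:
            parent[item] = parent[parent[item]]
            item = parent[item]
        return item

    for first, second in links:
        parent[find(first)] = find(second)

    clusters = {}
    for item in items:
        clusters.setdefault(find(item), set()).add(item)
    return sorted(sorted(cluster) for cluster in clusters.values())


# Modules as (protein, module) pairs, following the example in the abstract.
modules = [
    ("P1", "A"), ("P1", "B"),
    ("P2", "B'"), ("P2", "C"),
    ("P3", "C'"), ("P3", "D"),
]
# Homologous module pairs: B with B', and C with C'.
module_matches = [
    (("P1", "B"), ("P2", "B'")),
    (("P2", "C"), ("P3", "C'")),
]

# Whole-protein view: a match between any two modules links their proteins.
protein_links = [(first[0], second[0]) for first, second in module_matches]
protein_clusters = single_linkage(["P1", "P2", "P3"], protein_links)

# Module view: cluster the modules themselves and keep the multi-member groups.
module_groups = [
    group for group in single_linkage(modules, module_matches) if len(group) > 1
]

print(protein_clusters)
print(module_groups)
```

The whole-protein clustering places `P1` and `P3` in one family although they share no module, whereas the module-level groups recover the two families described in the abstract.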


Subject(s)
Proteins/chemistry; Proteins/genetics; Software; Algorithms; Amino Acid Sequence; Cluster Analysis; Computational Biology; Databases, Factual; Molecular Sequence Data; Multigene Family; Sequence Alignment; Sequence Homology, Amino Acid; Software Design