1.
Nature ; 630(8015): 158-165, 2024 Jun.
Article En | MEDLINE | ID: mdl-38693268

The liver has a unique ability to regenerate1,2; however, in the setting of acute liver failure (ALF), this regenerative capacity is often overwhelmed, leaving emergency liver transplantation as the only curative option3-5. Here, to advance understanding of human liver regeneration, we use paired single-nucleus RNA sequencing combined with spatial profiling of healthy and ALF explant human livers to generate a single-cell, pan-lineage atlas of human liver regeneration. We uncover a novel ANXA2+ migratory hepatocyte subpopulation, which emerges during human liver regeneration, and a corollary subpopulation in a mouse model of acetaminophen (APAP)-induced liver regeneration. Interrogation of necrotic wound closure and hepatocyte proliferation across multiple timepoints following APAP-induced liver injury in mice demonstrates that wound closure precedes hepatocyte proliferation. Four-dimensional intravital imaging of APAP-induced mouse liver injury identifies motile hepatocytes at the edge of the necrotic area, enabling collective migration of the hepatocyte sheet to effect wound closure. Depletion of hepatocyte ANXA2 reduces hepatocyte growth factor-induced human and mouse hepatocyte migration in vitro, and abrogates necrotic wound closure following APAP-induced mouse liver injury. Together, our work dissects unanticipated aspects of liver regeneration, demonstrating an uncoupling of wound closure and hepatocyte proliferation and uncovering a novel migratory hepatocyte subpopulation that mediates wound closure following liver injury. Therapies designed to promote rapid reconstitution of normal hepatic microarchitecture and reparation of the gut-liver barrier may advance new areas of therapeutic discovery in regenerative medicine.


Acetaminophen , Cell Movement , Cell Proliferation , Hepatocytes , Liver Failure, Acute , Liver Regeneration , Liver , Liver Regeneration/physiology , Humans , Animals , Acetaminophen/pharmacology , Mice , Hepatocytes/cytology , Male , Liver Failure, Acute/pathology , Liver Failure, Acute/chemically induced , Liver/cytology , Necrosis , Disease Models, Animal , Wound Healing , Female , Single-Cell Analysis , Hepatocyte Growth Factor/metabolism , Hepatocyte Growth Factor/pharmacology , Chemical and Drug Induced Liver Injury/pathology , Cell Lineage , Mice, Inbred C57BL
2.
Science ; 376(6594): eabl5197, 2022 05 13.
Article En | MEDLINE | ID: mdl-35549406

Despite their crucial role in health and disease, our knowledge of immune cells within human tissues remains limited. We surveyed the immune compartment of 16 tissues from 12 adult donors by single-cell RNA sequencing and VDJ sequencing generating a dataset of ~360,000 cells. To systematically resolve immune cell heterogeneity across tissues, we developed CellTypist, a machine learning tool for rapid and precise cell type annotation. Using this approach, combined with detailed curation, we determined the tissue distribution of finely phenotyped immune cell types, revealing hitherto unappreciated tissue-specific features and clonal architecture of T and B cells. Our multitissue approach lays the foundation for identifying highly resolved immune cell types by leveraging a common reference dataset, tissue-integrated expression analysis, and antigen receptor sequencing.


B-Lymphocytes , Machine Learning , Sequence Analysis, RNA , Single-Cell Analysis , T-Lymphocytes , Transcriptome , Cells, Cultured , Humans , Organ Specificity
3.
Nature ; 575(7783): 512-518, 2019 11.
Article En | MEDLINE | ID: mdl-31597160

Liver cirrhosis is a major cause of death worldwide and is characterized by extensive fibrosis. There are currently no effective antifibrotic therapies available. To obtain a better understanding of the cellular and molecular mechanisms involved in disease pathogenesis and enable the discovery of therapeutic targets, here we profile the transcriptomes of more than 100,000 single human cells, yielding molecular definitions for non-parenchymal cell types that are found in healthy and cirrhotic human liver. We identify a scar-associated TREM2+CD9+ subpopulation of macrophages, which expands in liver fibrosis, differentiates from circulating monocytes and is pro-fibrogenic. We also define ACKR1+ and PLVAP+ endothelial cells that expand in cirrhosis, are topographically restricted to the fibrotic niche and enhance the transmigration of leucocytes. Multi-lineage modelling of ligand and receptor interactions between the scar-associated macrophages, endothelial cells and PDGFRα+ collagen-producing mesenchymal cells reveals intra-scar activity of several pro-fibrogenic pathways including TNFRSF12A, PDGFR and NOTCH signalling. Our work dissects unanticipated aspects of the cellular and molecular basis of human organ fibrosis at a single-cell level, and provides a conceptual framework for the discovery of rational therapeutic targets in liver cirrhosis.


Endothelial Cells/pathology , Liver Cirrhosis/pathology , Liver/pathology , Macrophages/pathology , Single-Cell Analysis , Animals , Case-Control Studies , Cell Lineage , Duffy Blood-Group System/metabolism , Endothelial Cells/metabolism , Female , Hepatic Stellate Cells/cytology , Hepatic Stellate Cells/metabolism , Hepatic Stellate Cells/pathology , Hepatocytes/cytology , Hepatocytes/metabolism , Hepatocytes/pathology , Humans , Liver/cytology , Liver Cirrhosis/genetics , Macrophages/metabolism , Male , Membrane Glycoproteins/metabolism , Membrane Proteins/metabolism , Mice , Phenotype , Receptor, Platelet-Derived Growth Factor alpha/metabolism , Receptors, Cell Surface/metabolism , Receptors, Immunologic/metabolism , Tetraspanin 29/metabolism , Transcriptome , Transendothelial and Transepithelial Migration
4.
Curr Opin Syst Biol ; 18: 87-94, 2019 Dec.
Article En | MEDLINE | ID: mdl-32984660

Single-cell RNA-sequencing has uncovered immune heterogeneity, including novel cell types, states and lineages that have expanded our understanding of the immune system as a whole. More recently, studies involving both immune and non-immune cells have demonstrated the importance of the immune microenvironment in development, homeostasis and disease. This review focuses on single-cell studies that map cell-cell interactions for a variety of tissues in development, health and disease. In addition, we address the need to generate a comprehensive interaction map to answer fundamental questions in immunology, as well as the experimental and computational strategies required for this purpose.

5.
Genome Biol ; 21(1): 1, 2019 12 31.
Article En | MEDLINE | ID: mdl-31892341

BACKGROUND: The Human Cell Atlas is a large international collaborative effort to map all cell types of the human body. Single-cell RNA sequencing can generate high-quality data for the delivery of such an atlas. However, delays between fresh sample collection and processing may lead to poor data and difficulties in experimental design. RESULTS: This study assesses the effect of cold storage on fresh healthy spleen, esophagus, and lung from ≥ 5 donors over 72 h. We collect 240,000 high-quality single-cell transcriptomes with detailed cell type annotations and whole genome sequences of donors, enabling future eQTL studies. Our data provide a valuable resource for the study of these 3 organs and will allow cross-organ comparison of cell types. We see little effect of cold ischemic time on cell yield, total number of reads per cell, and other quality control metrics in any of the tissues within the first 24 h. However, we observe a decrease in the proportions of lung T cells at 72 h, higher percentage of mitochondrial reads, and increased contamination by background ambient RNA reads in the 72-h samples in the spleen, which is cell type specific. CONCLUSIONS: In conclusion, we present robust protocols for tissue preservation for up to 24 h prior to scRNA-seq analysis. This greatly facilitates the logistics of sample collection for Human Cell Atlas or clinical studies since it increases the time frames for sample processing.


Sequence Analysis, RNA , Single-Cell Analysis , Tissue Preservation/methods , Cold Temperature , Esophagus/cytology , Humans , Lung/cytology , Refrigeration , Spleen/cytology
6.
Cell Mol Life Sci ; 62(4): 435-45, 2005 Feb.
Article En | MEDLINE | ID: mdl-15719170

Proteins are composed of domains, which are conserved evolutionary units that often also correspond to functional units and can frequently be detected with reasonable reliability using computational methods. Most proteins consist of two or more domains, giving rise to a variety of combinations of domains. Another level of complexity arises because proteins themselves can form complexes with small molecules, nucleic acids and other proteins. The networks of both domain combinations and protein interactions can be conceptualised as graphs, and these graphs can be analysed conveniently by computational methods. In this review we summarise facts and hypotheses about the evolution of domains in multi-domain proteins and protein complexes, and the tools and data resources available to study them.


Evolution, Molecular , Protein Structure, Tertiary/genetics , Proteins/genetics , Amino Acid Sequence , Animals , Computational Biology , Conserved Sequence/genetics , Conserved Sequence/physiology , Genetic Variation , Humans , Multiprotein Complexes/chemistry , Multiprotein Complexes/genetics , Protein Structure, Tertiary/physiology , Proteins/physiology
7.
Trends Biotechnol ; 19(12): 482-6, 2001 Dec.
Article En | MEDLINE | ID: mdl-11711174

Escherichia coli has been a popular organism for studying metabolic pathways. In an attempt to find out more about how these pathways are constructed, the enzymes were analysed by defining their protein domains. Structural assignments and sequence comparisons were used to show that 213 domain families constitute approximately 90% of the enzymes in the small-molecule metabolic pathways. Catalytic or cofactor-binding properties between family members are often conserved, while recognition of the main substrate with change in catalytic mechanism is only observed in a few cases of consecutive enzymes in a pathway. Recruitment of domains across pathways is very common, but there is little regularity in the pattern of domains in metabolic pathways. This is analogous to a mosaic in which a stone of a certain colour is selected to fill a position in the picture.


Enzymes/chemistry , Enzymes/metabolism , Escherichia coli/enzymology , Binding Sites/physiology , Coenzymes/metabolism , Escherichia coli/metabolism , Evolution, Molecular , Fucose/metabolism , Nucleosides/metabolism , Nucleotides/metabolism , Protein Structure, Tertiary/physiology , Purines/biosynthesis , Pyrimidines/biosynthesis , Pyruvic Acid/metabolism , Sequence Homology , Substrate Specificity/physiology , Tryptophan/biosynthesis
8.
J Mol Biol ; 311(4): 693-708, 2001 Aug 24.
Article En | MEDLINE | ID: mdl-11518524

The 106 small molecule metabolic (SMM) pathways in Escherichia coli are formed by the protein products of 581 genes. We can define 722 domains, nearly all of which are homologous to proteins of known structure, that form all or part of 510 of these proteins. This information allows us to answer general questions on the structural anatomy of the SMM pathway proteins and to trace family relationships and recruitment events within and across pathways. Half the gene products contain a single domain and half are formed by combinations of between two and six domains. The 722 domains belong to one of 213 families that have between one and 51 members. Family members usually conserve their catalytic or cofactor binding properties; substrate recognition is rarely conserved. Of the 213 families, members of only a quarter occur in isolation, i.e. they form single-domain proteins. Most members of the other families combine with domains from just one or two other families and a few more versatile families can combine with several different partners. Excluding isoenzymes, more than twice as many homologues are distributed across pathways as within pathways. However, serial recruitment, with two consecutive enzymes both being recruited to another pathway, is rare and recruitment of three consecutive enzymes is not observed. Only eight of the 106 pathways have a high number of homologues. Homology between consecutive pairs of enzymes with conservation of the main substrate-binding site but change in catalytic mechanism (which would support a simple model of retrograde pathway evolution) occurs only six times in the whole set of enzymes. Most of the domains that form SMM pathways have homologues in non-SMM pathways. Taken together, these results imply a pervasive "mosaic" model for the formation of protein repertoires and pathways.


Bacterial Proteins/chemistry , Bacterial Proteins/metabolism , Escherichia coli/chemistry , Escherichia coli/metabolism , Evolution, Molecular , Binding Sites , Conserved Sequence , Genes, Duplicate , Gluconeogenesis , Glycogen/metabolism , Histidine/biosynthesis , Markov Chains , Multigene Family , Nucleotides/metabolism , Phosphatidic Acids/biosynthesis , Polysaccharides/biosynthesis , Protein Structure, Tertiary , Proteome , Purines/biosynthesis , Pyrimidines/biosynthesis , Sequence Homology, Amino Acid
9.
Bioinformatics ; 17 Suppl 1: S83-9, 2001.
Article En | MEDLINE | ID: mdl-11472996

Domains are the building blocks of all globular proteins, and are units of compact three-dimensional structure as well as evolutionary units. There is a limited repertoire of domain families, so that these domain families are duplicated and combined in different ways to form the set of proteins in a genome. Proteins are gene products. The processes that produce new genes are duplication and recombination as well as gene fusion and fission. We attempt to gain an overview of these processes by studying the structural domains in the proteins of seven genomes from the three kingdoms of life: Eubacteria, Archaea and Eukaryota. We use here the domain and superfamily definitions in Structural Classification of Proteins Database (SCOP) in order to map pairs of adjacent domains in genome sequences in terms of their superfamily combinations. We find 624 out of the 764 superfamilies in SCOP in these genomes, and the 624 families occur in 585 pairwise combinations. Most families are observed in combination with one or two other families, while a few families are very versatile in their combinatorial behaviour. This type of pattern can be described by a scale-free network. Finally, we study domain repeats and we compare the set of the domain combinations in the genomes to those in PDB, and discuss the implications for structural genomics.


Databases, Protein , Protein Structure, Tertiary , Computational Biology , Genome , Peptides/chemistry , Peptides/genetics , Phylogeny , Protein Structure, Tertiary/genetics
10.
J Mol Biol ; 310(2): 311-25, 2001 Jul 06.
Article En | MEDLINE | ID: mdl-11428892

There is a limited repertoire of domain families that are duplicated and combined in different ways to form the set of proteins in a genome. Proteins are gene products, and at the level of genes, duplication, recombination, fusion and fission are the processes that produce new genes. We attempt to gain an overview of these processes by studying the evolutionary units in proteins, domains, in the protein sequences of 40 genomes. The domain and superfamily definitions in the Structural Classification of Proteins Database are used, so that we can view all pairs of adjacent domains in genome sequences in terms of their superfamily combinations. We find 783 out of the 859 superfamilies in SCOP in these genomes, and the 783 families occur in 1307 pairwise combinations. Most families are observed in combination with one or two other families, while a few families are very versatile in their combinatorial behaviour; 209 families do not make combinations with other families. This type of pattern can be described as a scale-free network. We also study the N to C-terminal orientation of domain pairs and domain repeats. The phylogenetic distribution of domain combinations is surveyed, to establish the extent of common and kingdom-specific combinations. Of the kingdom-specific combinations, significantly more combinations consist of families present in all three kingdoms than of families present in one or two kingdoms. Hence, we are led to conclude that recombination between common families, as compared to the invention of new families and recombination among these, has also been a major contribution to the evolution of kingdom-specific and species-specific functions in organisms in all three kingdoms. Finally, we compare the set of the domain combinations in the genomes to those in the RCSB Protein Data Bank, and discuss the implications for structural genomics.


Archaea , Eubacterium , Eukaryotic Cells , Evolution, Molecular , Proteome/chemistry , Proteome/genetics , Repetitive Sequences, Amino Acid/genetics , Animals , Archaea/chemistry , Archaea/genetics , Conserved Sequence/genetics , Databases as Topic , Eubacterium/chemistry , Eubacterium/genetics , Eukaryotic Cells/chemistry , Eukaryotic Cells/metabolism , Gene Duplication , Genome , Genomics , Humans , Multigene Family/genetics , Mutation/genetics , Phylogeny , Protein Structure, Tertiary , Proteome/classification , Recombination, Genetic/genetics , Tandem Repeat Sequences/genetics , Yeasts/chemistry , Yeasts/genetics
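
The counting of pairwise superfamily combinations described in the abstract above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy domain architectures and the family labels are hypothetical, and the sketch assumes that combinations are unordered and that a tandem repeat of one family does not count as a combination with another family.

```python
from collections import Counter


def adjacent_pairs(architecture):
    """Return the set of (N-terminal, C-terminal) pairs of adjacent domains."""
    return {(a, b) for a, b in zip(architecture, architecture[1:])}


def combination_degrees(proteins):
    """For each family, count the distinct other families it is adjacent to."""
    partners = {}
    for architecture in proteins:
        # Register every family, so families with no partners get degree 0.
        for family in architecture:
            partners.setdefault(family, set())
        for a, b in adjacent_pairs(architecture):
            if a != b:  # assumption: tandem repeats are not combinations
                partners[a].add(b)
                partners[b].add(a)
    return {family: len(others) for family, others in partners.items()}


# Hypothetical toy data: each protein is a list of superfamily labels,
# ordered from the N terminus to the C terminus.
proteins = [
    ["P-loop", "Rossmann"],
    ["P-loop", "SH3"],
    ["P-loop", "Kinase"],
    ["SH3", "Kinase"],
    ["Rossmann"],
    ["TIM"],
    ["Ig", "Ig"],
]

degrees = combination_degrees(proteins)
# Histogram of how many families have 0, 1, 2, ... distinct partners.
degree_histogram = Counter(degrees.values())
```

In a scale-free pattern of the kind the abstract reports, such a histogram is dominated by families with zero to two partners, with a small number of highly connected families.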
11.
Curr Opin Struct Biol ; 11(3): 354-63, 2001 Jun.
Article En | MEDLINE | ID: mdl-11406387

The genome sequencing projects and knowledge of the entire protein repertoires of many organisms have prompted new procedures and techniques for the large-scale determination of protein structure, function and interactions. Recently, new work has been carried out on the determination of the function and evolutionary relationships of proteins by experimental structural genomics, and the discovery of protein-protein interactions by computational structural genomics.


Evolution, Molecular , Genomics/methods , Proteins/physiology , Gene Order , Phylogeny , Protein Structure, Tertiary , Proteins/chemistry
12.
Nucleic Acids Res ; 29(8): 1750-64, 2001 Apr 15.
Article En | MEDLINE | ID: mdl-11292848

As the number of protein folds is quite limited, a mode of analysis that will be increasingly common in the future, especially with the advent of structural genomics, is to survey and re-survey the finite parts list of folds from an expanding number of perspectives. We have developed a new resource, called PartsList, that lets one dynamically perform these comparative fold surveys. It is available on the web at http://bioinfo.mbb.yale.edu/partslist and http://www.partslist.org. The system is based on the existing fold classifications and functions as a form of companion annotation for them, providing 'global views' of many already completed fold surveys. The central idea in the system is that of comparison through ranking; PartsList will rank the approximately 420 folds based on more than 180 attributes. These include: (i) occurrence in a number of completely sequenced genomes (e.g. it will show the most common folds in the worm versus yeast); (ii) occurrence in the structure databank (e.g. most common folds in the PDB); (iii) both absolute and relative gene expression information (e.g. most changing folds in expression over the cell cycle); (iv) protein-protein interactions, based on experimental data in yeast and comprehensive PDB surveys (e.g. most interacting fold); (v) sensitivity to inserted transposons; (vi) the number of functions associated with the fold (e.g. most multi-functional folds); (vii) amino acid composition (e.g. most Cys-rich folds); (viii) protein motions (e.g. most mobile folds); and (ix) the level of similarity based on a comprehensive set of structural alignments (e.g. most structurally variable folds). The integration of whole-genome expression and protein-protein interaction data with structural information is a particularly novel feature of our system. 
We provide three ways of visualizing the rankings: a profiler emphasizing the progression of high and low ranks across many pre-selected attributes, a dynamic comparer for custom comparisons and a numerical rankings correlator. These allow one to directly compare very different attributes of a fold (e.g. expression level, genome occurrence and maximum motion) in the uniform numerical format of ranks. This uniform framework, in turn, highlights the way that the frequency of many of the attributes falls off with approximate power-law behavior (i.e. according to $V^{-b}$, for attribute value V and constant exponent b), with a few folds having large values and most having small values.


Gene Expression Profiling , Genome , Internet , Protein Folding , Proteins/chemistry , Software , Cysteine/analysis , DNA Transposable Elements/genetics , Databases as Topic , Motion , Protein Binding , Proteins/classification , Proteins/metabolism , Proteome , Research Design , Sequence Alignment
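
The power-law fall-off mentioned in the abstract above (frequency proportional to $V^{-b}$) can be illustrated with a short sketch that estimates the exponent b. This is not part of PartsList: the function name and the synthetic data are hypothetical, and the least-squares fit in log-log space is only one simple estimator, chosen here for brevity.

```python
import math


def fit_power_law_exponent(values, counts):
    """Fit log(count) = log(a) - b * log(value) by least squares.

    Returns (a, b) for the model count = a * value ** (-b).
    """
    xs = [math.log(v) for v in values]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic data (an assumption for illustration): an exact power law
# with a = 1000 and b = 1.5, sampled at a few attribute values.
values = [1, 2, 4, 8, 16]
counts = [1000 * v ** -1.5 for v in values]

a, b = fit_power_law_exponent(values, counts)
```

On real, noisy counts a log-log regression can be biased, so a maximum-likelihood estimate may be preferable; the sketch is meant only to make the relationship concrete.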
13.
J Mol Biol ; 307(3): 929-38, 2001 Mar 30.
Article En | MEDLINE | ID: mdl-11273711

In the postgenomic era, one of the most interesting and important challenges is to understand protein interactions on a large scale. The physical interactions between protein domains are fundamental to the workings of a cell: in multi-domain polypeptide chains, in multi-subunit proteins and in transient complexes between proteins that also exist independently. To study the large-scale patterns and evolution of interactions between protein domains, we view interactions between protein domains in terms of the interactions between structural families of evolutionarily related domains. This allows us to classify 8151 interactions between individual domains in the Protein Data Bank and the yeast Saccharomyces cerevisiae in terms of 664 types of interactions, between protein families. At least 51 interactions do not occur in the Protein Data Bank and can only be derived from the yeast data. The map of interactions between protein families has the form of a scale-free network, meaning that most protein families only interact with one or two other families, while a few families are extremely versatile in their interactions and are connected to many families. We observe that almost half of all known families engage in interactions with domains from their own family. We also see that the repertoires of interactions of domains within and between polypeptide chains overlap mostly for two specific types of protein families: enzymes and same-family interactions. This suggests that different types of protein interaction repertoires exist for structural, functional and regulatory reasons.


Databases as Topic , Proteins/chemistry , Proteins/metabolism , Yeasts/chemistry , Binding Sites , Evolution, Molecular , Fungal Proteins/chemistry , Fungal Proteins/metabolism , Genome, Fungal , Genomics , Models, Molecular , Protein Binding , Protein Structure, Tertiary , Sequence Homology, Amino Acid , Yeasts/genetics
14.
Bioinformatics ; 16(2): 117-24, 2000 Feb.
Article En | MEDLINE | ID: mdl-10842732

MOTIVATION: For large-scale structural assignment to sequences, as in computational structural genomics, a fast yet sensitive sequence search procedure is essential. A new approach using intermediate sequences was tested as a shortcut to iterative multiple sequence search methods such as PSI-BLAST. RESULTS: A library containing potential intermediate sequences for proteins of known structure (PDB-ISL) was constructed. The sequences in the library were collected from a large sequence database using the sequences of the domains of proteins of known structure as the query sequences and the program PSI-BLAST. Sequences of proteins of unknown structure can be matched to distantly related proteins of known structure by using pairwise sequence comparison methods to find homologues in PDB-ISL. Searches of PDB-ISL were calibrated, and the number of correct matches found at a given error rate was the same as that found by PSI-BLAST. The advantage of this library is that it uses pairwise sequence comparison methods, such as FASTA or BLAST2, and can, therefore, be searched easily and, in many cases, much more quickly than an iterative multiple sequence comparison method. The procedure is roughly 20 times faster than PSI-BLAST for small genomes and several hundred times for large genomes. AVAILABILITY: Sequences can be submitted to the PDB-ISL servers at http://stash.mrc-lmb.cam.ac.uk/PDB_ISL/ or http://cyrah.ebi.ac.uk:1111/Serv/PDB_ISL/ and can be downloaded from ftp://ftp.ebi.ac.uk/pub/contrib/jong/PDB_+ ++ISL/ CONTACT: sat@mrc-lmb.cam.ac.uk and jong@ebi.ac.uk


Proteins/analysis , Sequence Analysis/methods , Peptide Library , Time Factors
15.
J Mol Biol ; 296(5): 1367-83, 2000 Mar 10.
Article En | MEDLINE | ID: mdl-10698639

The predicted proteins of the genome of Caenorhabditis elegans were analysed by various sequence comparison methods to identify the repertoire of proteins that are members of the immunoglobulin superfamily (IgSF). The IgSF is one of the largest families of protein domain in this genome and likely to be one of the major families in other multicellular eukaryotes too. This is because members of the superfamily are involved in a variety of functions including cell-cell recognition, cell-surface receptors, muscle structure and, in higher organisms, the immune system. Sixty-four proteins with 488 I set IgSF domains were identified largely by using Hidden Markov models. The domain architectures of the protein products of these 64 genes are described. Twenty-one of these had been characterised previously. We show that another 25 are related to proteins of known function. The C. elegans IgSF proteins can be classified into five broad categories: muscle proteins, protein kinases and phosphatases, three categories of proteins involved in the development of the nervous system, leucine-rich repeat containing proteins and proteins without homologues of known function, of which there are 18. The 19 proteins involved in nervous system development that are not kinases or phosphatases are homologues of neuroglian, axonin, NCAM, wrapper, klingon, ICCR and nephrin or belong to the recently identified zig gene family. Out of the set of 64 genes, 22 are on the X chromosome. This study should be seen as an initial description of the IgSF repertoire in C. elegans, because the current gene definitions may contain a number of errors, especially in the case of long sequences, and there may be IgSF genes that have not yet been detected. However, the proteins described here do provide an overview of the bulk of the repertoire of immunoglobulin superfamily members in C. elegans, a framework for refinement and extension of the repertoire as gene and protein definitions improve, and the basis for investigations of their function and for comparisons with the repertoires of other organisms.


Caenorhabditis elegans/chemistry , Computational Biology , Helminth Proteins/chemistry , Immunoglobulins/chemistry , Multigene Family , Sequence Homology , Animals , Caenorhabditis elegans/enzymology , Caenorhabditis elegans/genetics , Cell Adhesion Molecules, Neuronal/chemistry , Cell Adhesion Molecules, Neuronal/genetics , Genes, Helminth/genetics , Helminth Proteins/genetics , Humans , Immunoglobulins/genetics , Leucine/genetics , Leucine/metabolism , Markov Chains , Multigene Family/genetics , Muscle Proteins/chemistry , Muscle Proteins/genetics , Nerve Tissue Proteins/chemistry , Nerve Tissue Proteins/genetics , Physical Chromosome Mapping , Protein Structure, Tertiary , Protein Tyrosine Phosphatases/chemistry , Protein Tyrosine Phosphatases/genetics , Protein-Tyrosine Kinases/chemistry , Protein-Tyrosine Kinases/genetics , Sequence Alignment , X Chromosome/genetics
17.
J Mol Evol ; 49(1): 98-107, 1999 Jul.
Article En | MEDLINE | ID: mdl-10368438

Using the sequence information from nine completely sequenced bacterial genomes, we extract 32 protein families that are thought to contain orthologous proteins from each genome. The alignments of these 32 families are used to construct a phylogeny with the neighbor-joining algorithm. This tree has several topological features that are different from the conventional phylogeny, yet it is highly reliable according to its bootstrap values. Upon closer study of the individual families used, it is clear that the strong phylogenetic signal comes from three families, at least two of which are good candidates for horizontal transfer. The tree from the remaining 29 families consists almost entirely of noise at the level of bacterial phylum divisions, indicating that, even with large amounts of data, it may not be possible to reconstruct the prokaryote phylogeny using standard sequence-based methods.


Bacterial Proteins/genetics , Phylogeny , Arginine-tRNA Ligase/genetics , Genome, Bacterial , Models, Biological , Phenylalanine-tRNA Ligase/genetics , Phosphoglycerate Kinase/genetics , RNA, Ribosomal/genetics , RNA, Ribosomal, 16S/genetics
18.
Curr Opin Struct Biol ; 9(3): 390-9, 1999 Jun.
Article En | MEDLINE | ID: mdl-10361097

New computational techniques have allowed protein folds to be assigned to all or parts of between a quarter (Caenorhabditis elegans) and a half (Mycoplasma genitalium) of the individual protein sequences in different genomes. These assignments give a new perspective on domain structures, gene duplications, protein families and protein folds in genome sequences.


Computational Biology/methods , Computational Biology/trends , Genome , Proteins/chemistry , Proteins/genetics , Animals , Protein Conformation
19.
Curr Opin Struct Biol ; 9(1): 56-65, 1999 Feb.
Article En | MEDLINE | ID: mdl-10047586

Telomerases are RNA-dependent polymerases that catalyse the synthesis of the telomeric DNA at the tips of eukaryotic chromosomes. The recent identification of the catalytic subunit of telomerases from several different species suggests that the core of the telomerase is conserved. The proposed sequence and structural homology between the telomerase catalytic subunit and reverse transcriptases, together with a wealth of genetic and biochemical information, has led to significant advances in our understanding of the mechanism by which telomerases synthesise telomeric DNA.


Telomerase/chemistry , Telomerase/metabolism , Amino Acid Sequence , Animals , Catalytic Domain , DNA/biosynthesis , DNA-Binding Proteins , Humans , Models, Molecular , Molecular Sequence Data , Protein Conformation , RNA/chemistry , RNA/metabolism , RNA-Directed DNA Polymerase/chemistry , RNA-Directed DNA Polymerase/genetics , RNA-Directed DNA Polymerase/metabolism , Sequence Homology, Amino Acid , Telomerase/genetics
...