2.
Proc Natl Acad Sci U S A ; 101(26): 9734-9, 2004 Jun 29.
Article in English | MEDLINE | ID: mdl-15210992

ABSTRACT

Investigation of sequence variation in common inbred mouse strains has revealed a segmented pattern in which regions of high and low variant density are intermixed. Furthermore, it has been suggested that allelic strain distribution patterns also occur in well defined blocks and consequently could be used to map quantitative trait loci (QTL) in comparisons between inbred strains. We report a detailed analysis of polymorphism distribution in multiple inbred mouse strains over a 4.8-megabase region containing a QTL influencing anxiety. Our analysis indicates that it is only partly true that the genomes of inbred strains exist as a patchwork of segments of sequence identity and difference. We show that the definition of haplotype blocks is not robust and that methods for QTL mapping may fail if they assume a simple block-like structure.


Subjects
Genetic Variation/genetics; Haplotypes/genetics; Mice, Inbred Strains/genetics; Alleles; Animals; Anxiety/genetics; Mice; Microsatellite Repeats/genetics; Polymorphism, Single Nucleotide/genetics; Quantitative Trait Loci/genetics; Sequence Analysis, DNA
3.
Genome Res ; 11(12): 1996-2008, 2001 Dec.
Article in English | MEDLINE | ID: mdl-11731489

ABSTRACT

Sequence database searching methods, such as BLAST, are invaluable for predicting molecular function on the basis of sequence similarities among single regions of proteins. Searches of whole databases, however, are not optimized to detect multiple homologous regions within a single polypeptide. Here we have used the prospero algorithm to perform self-comparisons of all predicted Drosophila melanogaster gene products. Predicted repeats, and their homologs from all species, were analyzed further to detect hitherto unappreciated evolutionary relationships. Results included the identification of novel tandem repeats in the human X-linked retinitis pigmentosa type-2 gene product, repeated segments in cystinosin, associated with a defect in cystine transport, and 'nested' homologous domains in dysferlin, whose gene is mutated in limb girdle muscular dystrophy. Novel signaling domain families were found that may regulate the microtubule-based cytoskeleton and ubiquitin-mediated proteolysis, respectively. Two families of glycosyl hydrolases were shown to contain internal repetitions that hint at their evolution via a piecemeal, modular approach. In addition, three examples of fruit fly genes were detected with tandem exons that appear to have arisen via internal duplication. These findings demonstrate how completely sequenced genomes can be exploited to further understand the relationships between molecular structure, function, and evolution.


Subjects
Drosophila Proteins/chemistry; Drosophila Proteins/physiology; Drosophila melanogaster/chemistry; Evolution, Molecular; Eye Proteins; Glycoproteins; Repetitive Sequences, Amino Acid; Amino Acid Sequence/genetics; Amino Acid Transport Systems, Neutral; Animals; Antigens, Differentiation, B-Lymphocyte/chemistry; Antigens, Differentiation, B-Lymphocyte/genetics; Antigens, Differentiation, B-Lymphocyte/physiology; Aspartate-tRNA Ligase/chemistry; Aspartate-tRNA Ligase/genetics; Aspartate-tRNA Ligase/physiology; Cystinosis/genetics; Drosophila Proteins/genetics; Drosophila melanogaster/enzymology; Drosophila melanogaster/genetics; Exons/genetics; GTP-Binding Proteins; Gene Duplication; Glycoside Hydrolases/chemistry; Glycoside Hydrolases/genetics; Glycoside Hydrolases/physiology; Histocompatibility Antigens Class II/chemistry; Histocompatibility Antigens Class II/genetics; Histocompatibility Antigens Class II/physiology; Humans; Insect Proteins/chemistry; Insect Proteins/genetics; Insect Proteins/physiology; Intracellular Signaling Peptides and Proteins; Membrane Proteins/chemistry; Membrane Proteins/genetics; Membrane Proteins/physiology; Membrane Transport Proteins; Molecular Sequence Data; Muscular Dystrophies/genetics; Protein Structure, Secondary; Protein Structure, Tertiary; Proteins/chemistry; Proteins/genetics; Proteins/physiology; Retinitis Pigmentosa/genetics; Signal Transduction/genetics; Species Specificity; Tandem Repeat Sequences
4.
Protein Sci ; 10(2): 285-92, 2001 Feb.
Article in English | MEDLINE | ID: mdl-11266614

ABSTRACT

Sequence similarity is the most common measure currently used to infer homology between proteins. Typically, homologous protein domains show sequence similarity over their entire lengths. Here we identify Asp box motifs, initially found as repeats in sialidases and neuraminidases, in new structural and sequence contexts. These motifs represent significantly similar sequences, localized to beta hairpins within proteins that are otherwise different in sequence and three-dimensional structure. By performing a combined sequence- and structure-based analysis we detect Asp boxes in more than nine protein families, including bacterial ribonucleases, sulfite oxidases, reelin, netrins, some lipoprotein receptors, and a variety of glycosyl hydrolases. Although the function common to each of these proteins, if any, remains unclear, we discuss possible functions of Asp boxes on the basis of previously determined experimental results and discuss different evolutionary scenarios for the origin of Asp-box containing proteins.


Subjects
Aspartic Acid/chemistry; Neuraminidase/chemistry; Acetylglucosaminidase/chemistry; Amino Acid Motifs; Amino Acid Sequence; Databases, Factual; Evolution, Molecular; Models, Molecular; Molecular Sequence Data; Protein Folding; Protein Structure, Tertiary; Ribonucleases/chemistry; Sequence Homology, Amino Acid; Water/chemistry
5.
Nature ; 408(6810): 331-6, 2000 Nov 16.
Article in English | MEDLINE | ID: mdl-11099034

ABSTRACT

Genome sequencing projects generate a wealth of information; however, the ultimate goal of such projects is to accelerate the identification of the biological function of genes. This creates a need for comprehensive studies to fill the gap between sequence and function. Here we report the results of a functional genomic screen to identify genes required for cell division in Caenorhabditis elegans. We inhibited the expression of approximately 96% of the approximately 2,300 predicted open reading frames on chromosome III using RNA-mediated interference (RNAi). By using an in vivo time-lapse differential interference contrast microscopy assay, we identified 133 genes (approximately 6%) necessary for distinct cellular processes in early embryos. Our results indicate that these genes represent most of the genes on chromosome III that are required for proper cell division in C. elegans embryos. The complete data set, including sample time-lapse recordings, has been deposited in an open access database. We found that approximately 47% of the genes associated with a differential interference contrast phenotype have clear orthologues in other eukaryotes, indicating that this screen provides putative gene functions for other species as well.
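The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check, not part of the published analysis: it uses the rounded values from the abstract, and the function and variable names are illustrative assumptions.

```python
# Rounded figures taken from the abstract; all names here are illustrative.
predicted_orfs = 2300        # approximate number of predicted ORFs on chromosome III
fraction_targeted = 0.96     # approximate fraction of ORFs inhibited by RNAi
genes_with_phenotype = 133   # genes found necessary in early embryos


def screen_summary(orfs, targeted_fraction, hits):
    """Return (number of ORFs targeted, hit rate among the targeted ORFs)."""
    targeted = orfs * targeted_fraction
    return targeted, hits / targeted


targeted, hit_rate = screen_summary(
    predicted_orfs, fraction_targeted, genes_with_phenotype
)
print(f"ORFs targeted: about {targeted:.0f}")
print(f"Hit rate among targeted ORFs: about {hit_rate:.1%}")
```

With these rounded inputs the hit rate comes out close to the "approximately 6%" stated in the abstract.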


Subjects
Caenorhabditis elegans/genetics; Cell Division/genetics; Genes, Helminth; RNA, Helminth; Animals; Caenorhabditis elegans/physiology; Chromosomes; Genomics; Open Reading Frames
6.
J Mol Biol ; 303(4): 627-41, 2000 Nov 03.
Article in English | MEDLINE | ID: mdl-11054297

ABSTRACT

We provide statistically reliable sequence evidence indicating that at least 12 of 23 SCOP (βα)₈ (TIM) barrel superfamilies share a common origin. This includes all but one of the known and predicted TIM barrels found in central metabolism. The statistical evidence is complemented by an examination of the details of protein structure, with certain structural locations favouring catalytic residues even though the nature of their molecular function may change. The combined analysis of sequence, structure and function also enables us to propose a phylogeny of TIM barrels. Based on these data, we are able to examine differing theories of pathway and enzyme evolution, by mapping known TIM barrel folds to the pathways of central metabolism. The results favour widespread recruitment of enzymes between pathways, rather than a "backwards evolution" model, and support the idea that modern proteins may have arisen from common ancestors that bound key metabolites.


Subjects
Enzymes/chemistry; Enzymes/metabolism; Evolution, Molecular; Protein Structure, Tertiary; Aldehyde-Lyases/chemistry; Aldehyde-Lyases/metabolism; Amino Acid Sequence; Animals; Binding Sites; Computational Biology; Databases as Topic; Humans; Models, Molecular; Molecular Sequence Data; Multigene Family; Phosphates/metabolism; Phosphopyruvate Hydratase/chemistry; Phosphopyruvate Hydratase/metabolism; Phylogeny; Protein Structure, Secondary; Pyruvate Kinase/chemistry; Pyruvate Kinase/metabolism; Sequence Alignment; Sequence Homology, Amino Acid
7.
Nat Genet ; 25(2): 201-4, 2000 Jun.
Article in English | MEDLINE | ID: mdl-10835637

ABSTRACT

Cloning procedures aided by homology searches of EST databases have accelerated the pace of discovery of new genes, but EST database searching remains an involved and onerous task. More than 1.6 million human EST sequences have been deposited in public databases, making it difficult to identify ESTs that represent new genes. Compounding the problems of scale are difficulties in detection associated with a high sequencing error rate and low sequence similarity between distant homologues. We have developed a new method, coupling BLAST-based searches with a domain identification protocol, that filters candidate homologues. Application of this method in a large-scale analysis of 100 signalling domain families has led to the identification of ESTs representing more than 1,000 novel human signalling genes. The 4,206 publicly available ESTs representing these genes are a valuable resource for rapid cloning of novel human signalling proteins. For example, we were able to identify ESTs of at least 106 new small GTPases, of which 6 are likely to belong to new subfamilies. In some cases, further analyses of genomic DNA led to the discovery of previously unidentified full-length protein sequences. This is exemplified by the in silico cloning (prediction of a gene product sequence using only genomic and EST sequence data) of a new type of GTPase with two catalytic domains.


Subjects
Computational Biology/methods; Expressed Sequence Tags; Proteins/genetics; Proteins/metabolism; Signal Transduction; Amino Acid Sequence; Automation; Catalytic Domain; Cloning, Molecular/methods; Databases, Factual; Genome, Human; Humans; Internet; Molecular Sequence Data; Monomeric GTP-Binding Proteins/chemistry; Monomeric GTP-Binding Proteins/genetics; Monomeric GTP-Binding Proteins/metabolism; Protein Structure, Tertiary; Proteins/chemistry; Sequence Alignment; Sequence Homology, Amino Acid; Software
9.
Nucleic Acids Res ; 28(1): 231-4, 2000 Jan 01.
Article in English | MEDLINE | ID: mdl-10592234

ABSTRACT

SMART (a Simple Modular Architecture Research Tool) allows the identification and annotation of genetically mobile domains and the analysis of domain architectures (http://SMART.embl-heidelberg.de). More than 400 domain families found in signalling, extra-cellular and chromatin-associated proteins are detectable. These domains are extensively annotated with respect to phyletic distributions, functional class, tertiary structures and functionally important residues. Each domain found in a non-redundant protein database as well as search parameters and taxonomic information are stored in a relational database system. User interfaces to this database allow searches for proteins containing specific combinations of domains in defined taxa.


Subjects
Database Management Systems; Internet; Sequence Alignment; Information Storage and Retrieval; Proteins/chemistry
10.
Proteins ; Suppl 3: 141-8, 1999.
Article in English | MEDLINE | ID: mdl-10526363

ABSTRACT

We applied a succession of sequence search and structure prediction methods to the targets in the fold recognition part of the CASP3 experiment. For each target, we expanded an initial sequence space, obtained through PSI-BLAST, by searching for statistically significant relationships to low-scoring sequences and then by searching for conserved sequence patterns. We then divided the proteins in the sequence space into families and built an alignment hierarchically, using the multiple alignment program MACAW. If no significant similarity to a protein of known structure was apparent at this point, we submitted the alignment to the Jpred server for consensus secondary structure prediction and searched the structure space using the secondary structure mapping program MAP. Failing this, we compared the structural properties that we believed we recognized in the aligned proteins to the folds in the SCOP database, using visual inspection. If all these methods failed to uncover a plausible match, we predicted that the target would adopt a novel fold. This procedure yielded correct answers for seven of twenty-one targets and a partly correct answer for one. A retrospective analysis shows that automating the sequence search procedures would have represented a significant improvement, with at least three additional correct predictions.


Subjects
Protein Folding; Protein Structure, Secondary; Proteins/chemistry; Algorithms; Amino Acid Sequence; Bacterial Proteins/chemistry; Models, Molecular; Molecular Sequence Data; Sequence Alignment
13.
Curr Opin Struct Biol ; 9(3): 408-15, 1999 Jun.
Article in English | MEDLINE | ID: mdl-10361098

ABSTRACT

The complete sequence of the nematode worm Caenorhabditis elegans contains the genetic machinery that is required to undertake the core biological processes of single cells. However, the genome also encodes proteins that are associated with multicellularity, as well as others that are lineage-specific expansions of phylogenetically widespread families and yet more that are absent in non-nematodes. Ongoing analysis is beginning to illuminate the similarities and differences among human proteins and proteins that are encoded by the genomes of the multicellular worm and the unicellular yeast, and will be essential in determining the reliability of transferring experimental data among phylogenetically distant species.


Subjects
Multigene Family; Proteins/chemistry; Proteins/genetics; Animals; Conserved Sequence; Genome; Humans; Intracellular Fluid/physiology; Phylogeny; Proteins/physiology; Signal Transduction/genetics
15.
J Mol Biol ; 259(3): 349-65, 1996 Jun 14.
Article in English | MEDLINE | ID: mdl-8676374

ABSTRACT

A strategy is presented for protein fold recognition from secondary structure assignments (alpha-helix and beta-strand). The method can detect similarities between protein folds in the absence of sequence similarity. Secondary structure mapping first identifies all possible matches (maps) between a query string of secondary structures and the secondary structures of protein domains of known three-dimensional structure. The maps are then passed through a series of structural filters to remove those that do not obey simple rules of protein structure. The surviving maps are ranked by scores from the alignment of predicted and experimental accessibilities. Searches made with secondary structure assignments for a test set of 11 fold-families put the correct sequence-dissimilar fold in the first rank 8/11 times. With cross-validated predictions of secondary structure this drops to 4/11, which compares favourably with the widely used THREADER program (1/11). The structural class is correctly predicted 10/11 times by the method, in contrast to 5/11 for THREADER. The new technique obtains comparable accuracy in the alignment of amino acid residues and secondary structure elements. Searches are also performed with published secondary structure predictions for the von-Willebrand factor type A domain, the proteasome 20S alpha subunit and the phosphotyrosine interaction domain. These searches demonstrate how the method can find the correct fold for a protein from a carefully constructed secondary structure prediction, multiple sequence alignment and distant restraints. Scans with experimentally determined secondary structures and accessibility recognise the correct fold with high alignment accuracy (86% on secondary structures). This suggests that the accuracy of mapping will improve alongside any improvements in the prediction of secondary structure or accessibility. Application to NMR structure determination is also discussed.
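The enumerate, filter and rank sequence described in this abstract can be illustrated with a toy sketch. It is not the authors' implementation: the string encoding ('H' for helix, 'E' for strand), the gap-based filter and the ranking key are all assumptions made for illustration. In particular, the ranking below is a stand-in for the accessibility-based scoring the abstract mentions.

```python
from itertools import combinations


def enumerate_maps(query, template):
    """Yield every order-preserving assignment of query elements to
    template elements of the same type ('H' helix, 'E' strand)."""
    for positions in combinations(range(len(template)), len(query)):
        if all(template[p] == q for p, q in zip(positions, query)):
            yield positions


def passes_filter(positions, max_skipped=1):
    """Toy structural filter (an assumption, not a rule from the paper):
    reject maps that skip more than max_skipped template elements
    between two consecutive matched elements."""
    return all(b - a - 1 <= max_skipped
               for a, b in zip(positions, positions[1:]))


def rank_maps(query, template, max_skipped=1):
    """Return the surviving maps, fewest skipped template elements first.
    This ordering is a placeholder for a real scoring function."""
    survivors = [m for m in enumerate_maps(query, template)
                 if passes_filter(m, max_skipped)]
    return sorted(survivors, key=lambda m: (m[-1] - m[0] + 1 - len(m), m))


# Hypothetical query and template secondary-structure strings.
print(rank_maps("EHE", "EHEHE"))
```

Each map is a tuple of template indices, one per query element. Enumerating all combinations is exponential in the worst case, which is acceptable here because secondary-structure strings for single domains are short.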


Subjects
Models, Molecular; Protein Folding; Protein Structure, Secondary; Algorithms; Amino Acid Sequence; Cysteine Endopeptidases/chemistry; Molecular Sequence Data; Multienzyme Complexes/chemistry; Phosphotyrosine/metabolism; Proteasome Endopeptidase Complex; Proteins/chemistry; Proteins/metabolism; Sequence Alignment/methods; Software; von Willebrand Factor/chemistry
16.
J Mol Biol ; 242(4): 321-9, 1994 Sep 30.
Article in English | MEDLINE | ID: mdl-7932692

ABSTRACT

The high resolution X-ray structures of 38 proteins that bind phosphate containing groups and 36 proteins binding sulphate ions were analysed to characterise the structural features of anion binding sites in proteins. 34 of the 66 phosphates found were in close proximity to the amino terminus of an alpha-helix. 27% of phosphate groups bind to only one amino acid, but there is a wide distribution, with 3% of phosphates binding to seven residues. Similarly, there is a large variability in the number of contacts each phosphate group makes to the protein. This ranges from none (3% of phosphates) to nine (3% of phosphates). The most common number of contacts is two (23% of phosphates). The most commonly found residue at helix-type binding sites is glycine, followed by Arg, Thr, Ser and Lys. At non-helix binding sites, the most commonly found residue is Arg followed by Tyr, His, Lys and Ser. There is no typical phosphate binding site. There are marked differences between propensities for phosphate binding at helix and non-helix type binding sites. Non-helix binding sites show more discrimination between the types of residues involved in binding when compared to the helix set. The propensities for binding of the amino acids reveal the expected trend of positively charged and polar residues being good at binding (although that for lysine is unexpectedly low) with the bulky non-polar residues being poor at binding. Bulky residues are less likely to bind with the amide nitrogen. Sulphate binding sites show similar trends. Analysis of multiple sequence alignments that include phosphate and sulphate binding proteins reveals the degree of conservation at the binding site residues compared to the average conservation of residues in the protein. Phosphate binding site residues are more conserved than sulphate binding sites.


Subjects
Phosphates/metabolism; Protein Binding; Sulfates/metabolism; Crystallography, X-Ray; Protein Conformation