Results 1 - 20 of 29
1.
Mol Biol Evol ; 36(6): 1281-1293, 2019 06 01.
Article in English | MEDLINE | ID: mdl-30912801

ABSTRACT

In species with chromosomal sex determination, X chromosomes are predicted to evolve faster than autosomes because of positive selection on recessive alleles or weak purifying selection. We investigated X chromosome evolution in Stegodyphus spiders that differ in mating system, sex ratio, and population dynamics. We assigned scaffolds to X chromosomes and autosomes using a novel method based on flow cytometry of sperm cells and reduced representation sequencing. We estimated coding substitution patterns (dN/dS) in a subsocial outcrossing species (S. africanus) and its social inbreeding and female-biased sister species (S. mimosarum), and found evidence for faster-X evolution in both species. X chromosome-to-autosome diversity (piX/piA) ratios were estimated in multiple populations. The average piX/piA estimate for S. africanus (0.57 [95% CI: 0.55-0.60]) was lower than the neutral expectation of 0.75, consistent with more hitchhiking events on X-linked loci and/or a lower X chromosome mutation rate, and we provide evidence in support of both. The social species S. mimosarum has a significantly higher piX/piA ratio (0.72 [95% CI: 0.65-0.79]), in agreement with its female-biased sex ratio. Stegodyphus mimosarum also has different piX/piA estimates among populations, which we interpret as evidence for recurrent founder events. Simulations show that recurrent founder events are expected to decrease the piX/piA estimates in S. mimosarum, thus underestimating the true effect of female-biased sex ratios. Finally, we found lower synonymous divergence on X chromosomes in both species, and the male-to-female substitution ratio to be higher than 1, indicating a higher mutation rate in males.


Subjects
Biological Evolution; Spiders/genetics; X Chromosome/genetics; Animals; Genetic Variation; Male; Population Dynamics; Sex Ratio
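The neutral expectation of 0.75 quoted in the abstract above follows from the standard effective-population-size formulas for X-linked and autosomal loci. The Python sketch below is illustrative only and is not taken from the paper: the function name `neutral_x_to_a_ratio` is hypothetical, and the calculation assumes random mating, Poisson-distributed offspring numbers, constant population size, and equal mutation rates on the X chromosome and the autosomes.

```python
def neutral_x_to_a_ratio(n_females: float, n_males: float) -> float:
    """Expected neutral piX/piA for a given number of breeding females and males.

    Assumption (not from the paper): textbook effective sizes
      Ne_A = 4*Nf*Nm / (Nf + Nm)
      Ne_X = 9*Nf*Nm / (4*Nm + 2*Nf)
    and identical mutation rates on X and autosomes, so that the diversity
    ratio equals the ratio of effective sizes.
    """
    if n_females <= 0 or n_males <= 0:
        raise ValueError("both sexes must have a positive breeding number")
    ne_autosome = 4.0 * n_females * n_males / (n_females + n_males)
    ne_x = 9.0 * n_females * n_males / (4.0 * n_males + 2.0 * n_females)
    return ne_x / ne_autosome


if __name__ == "__main__":
    # Equal sex ratio gives the familiar 3/4 expectation.
    print(neutral_x_to_a_ratio(1, 1))
    # A female-biased sex ratio pushes the expectation upwards (limit 9/8).
    print(neutral_x_to_a_ratio(9, 1))
```

Under these assumptions an even sex ratio returns 0.75, and a female-biased ratio raises the expectation towards its upper limit of 9/8, which is the direction of the effect the abstract reports for S. mimosarum.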
2.
PLoS One ; 9(5): e98187, 2014.
Article in English | MEDLINE | ID: mdl-24878701

ABSTRACT

Formalin-fixed, paraffin-embedded (FFPE) tissues are an invaluable resource for clinical research. However, nucleic acids extracted from FFPE tissues are fragmented and chemically modified, making them challenging to use in molecular studies. We analysed 23 fresh-frozen (FF), 35 FFPE and 38 paired FF/FFPE specimens, representing six different human tissue types (bladder, prostate and colon carcinoma; liver and colon normal tissue; reactive tonsil) in order to examine the potential use of FFPE samples in next-generation sequencing (NGS) based retrospective and prospective clinical studies. Two methods for DNA and three methods for RNA extraction from FFPE tissues were compared and were found to affect nucleic acid quantity and quality. DNA and RNA from selected FFPE and paired FF/FFPE specimens were used for exome and transcriptome analysis. Preparation of DNA Exome-Seq libraries was more challenging (29.5% success) than that of RNA-Seq libraries, presumably because of modifications to FFPE tissue-derived DNA. Libraries could still be prepared from RNA isolated from two-decade-old FFPE tissues. Data were analysed using the CLC Bio Genomics Workbench and revealed systematic differences between FF and FFPE tissue-derived nucleic acid libraries. In spite of this, pairwise analysis of DNA Exome-Seq data showed concordance for 70-80% of variants in FF and FFPE samples stored for fewer than three years. RNA-Seq data showed high correlation of expression profiles in FF/FFPE pairs (Pearson correlations of 0.90 +/- 0.05), irrespective of storage time (up to 244 months) and tissue type. A common set of 1,494 genes was identified with expression profiles that were significantly different between paired FF and FFPE samples irrespective of tissue type. Our results are promising and suggest that NGS can be used to study FFPE specimens in both prospective and retrospective archive-based studies in which FF specimens are not available.


Subjects
High-Throughput Nucleotide Sequencing/methods; Neoplasms/genetics; Neoplasms/pathology; Paraffin Embedding; Sequence Analysis, DNA/methods; Sequence Analysis, RNA/methods; Tissue Fixation; Cryopreservation; DNA/genetics; DNA/isolation & purification; Exome/genetics; Formaldehyde/pharmacology; Gene Expression Profiling; Humans; Proto-Oncogene Proteins/genetics; Proto-Oncogene Proteins B-raf/genetics; Proto-Oncogene Proteins p21(ras); RNA/genetics; RNA/isolation & purification; ras Proteins/genetics
3.
BMC Genomics ; 15: 439, 2014 Jun 06.
Article in English | MEDLINE | ID: mdl-24906298

ABSTRACT

BACKGROUND: Sampling genomes with Fosmid vectors and sequencing of pooled Fosmid libraries on the Illumina platform for massive parallel sequencing is a novel and promising approach to optimizing the trade-off between sequencing costs and assembly quality. RESULTS: In order to sequence the genome of Norway spruce, which is of great size and complexity, we developed and applied a new technology based on the massive production, sequencing, and assembly of Fosmid pools (FP). The spruce chromosomes were sampled with ~40,000 bp Fosmid inserts to obtain around two-fold genome coverage, in parallel with traditional whole genome shotgun sequencing (WGS) of haploid and diploid genomes. Compared to the WGS results, the contiguity and quality of the FP assemblies were high, and they allowed us to fill WGS gaps resulting from repeats, low coverage, and allelic differences. The FP contig sets were further merged with WGS data using a novel software package GAM-NGS. CONCLUSIONS: By exploiting FP technology, the first published assembly of a conifer genome was sequenced entirely with massively parallel sequencing. Here we provide a comprehensive report on the different features of the approach and the optimization of the process. We have made public the input data (FASTQ format) for the set of pools used in this study: ftp://congenie.org/congenie/Nystedt_2013/Assembly/ProcessedData/FosmidPools/ (alternatively accessible via http://congenie.org/downloads). The software used for running the assembly process is available at http://research.scilifelab.se/andrej_alexeyenko/downloads/fpools/.


Subjects
Genetic Vectors; High-Throughput Nucleotide Sequencing/methods; Picea/genetics; Cloning, Molecular; Genome, Plant; High-Throughput Nucleotide Sequencing/economics; Software
4.
BMC Genomics ; 14: 75, 2013 Feb 02.
Article in English | MEDLINE | ID: mdl-23375136

ABSTRACT

BACKGROUND: Hevea brasiliensis, a member of the Euphorbiaceae family, is the major commercial source of natural rubber (NR). NR is a latex polymer with high elasticity, flexibility, and resilience that has played a critical role in the world economy since 1876. RESULTS: Here, we report the draft genome sequence of H. brasiliensis. The assembly spans ~1.1 Gb of the estimated 2.15 Gb haploid genome. Overall, ~78% of the genome was identified as repetitive DNA. Gene prediction shows 68,955 gene models, of which 12.7% are unique to Hevea. Most of the key genes associated with rubber biosynthesis, rubberwood formation, disease resistance, and allergenicity have been identified. CONCLUSIONS: The knowledge gained from this genome sequence will aid in the future development of high-yielding clones to keep up with the ever increasing need for natural rubber.


Subjects
Genomics; Hevea/genetics; Sequence Analysis; Allergens/genetics; Disease Resistance/genetics; Evolution, Molecular; F-Box Proteins/genetics; Genome, Plant/genetics; Haploidy; Hevea/immunology; Hevea/metabolism; Latex/metabolism; Molecular Sequence Annotation; Phylogeny; Plant Growth Regulators/genetics; Rubber/metabolism; Signal Transduction/genetics; Transcription Factors/genetics; Wood/metabolism
5.
BMC Bioinformatics ; 14 Suppl 2: S22, 2013.
Article in English | MEDLINE | ID: mdl-23368905

ABSTRACT

Comparative methods for RNA secondary structure prediction use evolutionary information from RNA alignments to increase prediction accuracy. The model is often described in terms of stochastic context-free grammars (SCFGs), which generate a probability distribution over secondary structures. It is, however, unclear how this probability distribution changes as a function of the input alignment. As prediction programs typically only return a single secondary structure, better characterisation of the underlying probability space of RNA secondary structures is of great interest. In this work, we show how to efficiently compute the information entropy of the probability distribution over RNA secondary structures produced for RNA alignments by a phylo-SCFG, and implement it for the PPfold model. We also discuss interpretations and applications of this quantity, including how it can clarify reasons for low prediction reliability scores. PPfold and its source code are available from http://birc.au.dk/software/ppfold/.


Subjects
Algorithms; Models, Theoretical; Nucleic Acid Conformation; RNA/chemistry; Base Sequence; Computational Biology/methods; Entropy; Probability; Software
6.
Gene ; 511(2): 195-201, 2012 Dec 15.
Article in English | MEDLINE | ID: mdl-23026207

ABSTRACT

The Japanese eel is a much appreciated research object and very important for Asian aquaculture; however, its genomic resources are still limited. We have used a streamlined bioinformatics pipeline for the de novo assembly of the genome sequence of the Japanese eel from raw Illumina sequence reads. The total assembled genome has a size of 1.15 Gbp, which is divided over 323,776 scaffolds with an N50 of 52,849 bp, a minimum scaffold size of 200 bp and a maximum scaffold size of 1.14 Mbp. Direct comparison of a representative set of scaffolds revealed that all the Hox genes and their intergenic distances are almost perfectly conserved between the European and the Japanese eel. The first draft genome sequence of an organism strongly catalyzes research progress in multiple fields. Therefore, the Japanese eel genome sequence will provide a rich resource of data for all scientists working on this important fish species.


Subjects
Anguilla/genetics; Genome; Animals; Computational Biology
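The abstract above summarises the assembly by its scaffold N50. As a reminder of what that statistic measures, here is a minimal Python sketch of the usual definition: the length of the scaffold at which the running total, taken from longest to shortest, first reaches half of the assembly size. The function name `n50` and the toy lengths are hypothetical and are not data from the paper.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no scaffold lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that takes the running total to half the
        # assembly size defines the N50.
        if running >= half_total:
            return length
    raise AssertionError("unreachable: running total always reaches half")


if __name__ == "__main__":
    toy_scaffolds = [2, 3, 4, 5, 6, 7, 8, 9, 10]  # made-up lengths
    print(n50(toy_scaffolds))
```

Because N50 depends on the minimum scaffold length kept in the assembly, it is normally reported together with that cutoff, as the abstract does.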
7.
Bioinformatics ; 28(20): 2691-2, 2012 Oct 15.
Article in English | MEDLINE | ID: mdl-22877864

ABSTRACT

PPfold is a multi-threaded implementation of the Pfold algorithm for RNA secondary structure prediction. Here we present a new version of PPfold, which extends the evolutionary analysis with a flexible probabilistic model for incorporating auxiliary data, such as data from structure probing experiments. Our tests show that the accuracy of single-sequence secondary structure prediction using experimental data in PPfold 3.0 is comparable to RNAstructure. Furthermore, alignment structure prediction quality is improved even further by the addition of experimental data. PPfold 3.0 therefore has the potential to produce more accurate predictions than was previously possible. AVAILABILITY AND IMPLEMENTATION: PPfold 3.0 is available as a platform-independent Java application and can be downloaded from http://birc.au.dk/software/ppfold.


Subjects
RNA/chemistry; Software; Algorithms; Models, Statistical; Nucleic Acid Conformation; Phylogeny
8.
BMC Bioinformatics ; 12: 103, 2011 Apr 18.
Article in English | MEDLINE | ID: mdl-21501497

ABSTRACT

BACKGROUND: The prediction of the structure of large RNAs remains a particular challenge in bioinformatics, due to the computational complexity and low levels of accuracy of state-of-the-art algorithms. The pfold model couples a stochastic context-free grammar to phylogenetic analysis for a high accuracy in predictions, but the time complexity of the algorithm and underflow errors have prevented its use for long alignments. Here we present PPfold, a multithreaded version of pfold, which is capable of predicting the structure of large RNA alignments accurately on practical timescales. RESULTS: We have distributed both the phylogenetic calculations and the inside-outside algorithm in PPfold, resulting in a significant reduction of runtime on multicore machines. We have addressed the floating-point underflow problems of pfold by implementing an extended-exponent datatype, enabling PPfold to be used for large-scale RNA structure predictions. We have also improved the user interface and portability: alongside standalone executable and Java source code of the program, PPfold is also available as a free plugin to the CLC Workbenches. We have evaluated the accuracy of PPfold using BRaliBase I tests, and demonstrated its practical use by predicting the secondary structure of an alignment of 24 complete HIV-1 genomes in 65 minutes on an 8-core machine and identifying several known structural elements in the prediction. CONCLUSIONS: PPfold is the first parallelized comparative RNA structure prediction algorithm to date. Based on the pfold model, PPfold is capable of fast, high-quality predictions of large RNA secondary structures, such as the genomes of RNA viruses or long genomic transcripts. The techniques used in the parallelization of this algorithm may be of general applicability to other bioinformatics algorithms.


Subjects
Algorithms; Nucleic Acid Conformation; RNA/chemistry; Sequence Alignment/methods; Computational Biology/methods; Genome, Viral; HIV-1/genetics; Phylogeny; Sequence Analysis, RNA/methods; Stochastic Processes
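The abstract above attributes PPfold's ability to handle long alignments partly to an extended-exponent datatype that avoids floating-point underflow. The paper's Java implementation is not reproduced here. The Python sketch below only illustrates the general idea under stated assumptions: the class name `ExtFloat` and its methods are hypothetical, values are assumed to be non-negative probabilities, and the number is stored as a double-precision mantissa plus an unbounded integer exponent.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtFloat:
    """Value = mantissa * 2**exponent, with the exponent kept as a Python int."""

    mantissa: float  # 0.0, or a magnitude in [0.5, 1) after normalisation
    exponent: int

    @staticmethod
    def from_float(x: float) -> "ExtFloat":
        m, e = math.frexp(x)  # x == m * 2**e
        return ExtFloat(m, e)

    def __mul__(self, other: "ExtFloat") -> "ExtFloat":
        # Multiply mantissas, add exponents, then renormalise the mantissa.
        m, e = math.frexp(self.mantissa * other.mantissa)
        return ExtFloat(m, e + self.exponent + other.exponent)

    def __add__(self, other: "ExtFloat") -> "ExtFloat":
        if self.mantissa == 0.0:
            return other
        if other.mantissa == 0.0:
            return self
        big, small = (self, other) if self.exponent >= other.exponent else (other, self)
        # Scale the smaller term to the larger term's exponent; if it is far
        # smaller, ldexp returns 0.0 and the larger term dominates.
        shifted = math.ldexp(small.mantissa, small.exponent - big.exponent)
        m, e = math.frexp(big.mantissa + shifted)
        return ExtFloat(m, e + big.exponent)

    def log10(self) -> float:
        """Base-10 logarithm of a strictly positive value."""
        return math.log10(self.mantissa) + self.exponent * math.log10(2.0)


if __name__ == "__main__":
    tiny = ExtFloat.from_float(1e-200)
    print(1e-200 * 1e-200)        # plain doubles underflow to 0.0
    print((tiny * tiny).log10())  # the extended value is still representable
```

Working with log-probabilities is a common alternative, but the sums required by an inside-outside style recursion are simpler to express with a mantissa-exponent pair like this than with repeated log-sum-exp operations.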
9.
ACS Nano ; 4(3): 1367-76, 2010 Mar 23.
Article in English | MEDLINE | ID: mdl-20146442

ABSTRACT

The assembly, structure, and stability of DNA nanocages with the shape of truncated octahedra have been studied. The cages are composed of 12 double-stranded B-DNA helices interrupted by single-stranded linkers of thymidines of varying length that constitute the truncated corners of the structure. The structures assemble with a high efficiency in a one-step procedure, compared to previously published structures of similar complexity. The structures of the cages were determined by small-angle X-ray scattering. With increasing linker length, there is a systematic increase of the cage size and decrease of the twist angle of the double helices with respect to the symmetry planes of the cage structure. In the present study, we demonstrate the length of the single-stranded linker regions, which impose a certain degree of flexibility to the structure, to be the important determinant for efficient assembly. The linker length can be decreased to three thymidines without affecting assembly yield or the overall structural characteristics of the DNA cages. A linker length of two thymidines represents a sharp cutoff abolishing cage assembly. This is supported by energy minimization calculations suggesting substantial hydrogen bond deformation in a cage with linkers of two thymidines.


Subjects
DNA, Single-Stranded/chemistry; Nanostructures/chemistry; Base Sequence; DNA, Single-Stranded/genetics; Electrophoresis, Polyacrylamide Gel; Hydrogen Bonding; Models, Molecular; Nucleic Acid Conformation; Scattering, Small Angle; Thermodynamics; Thymidine/chemistry; X-Ray Diffraction
10.
Genes (Basel) ; 1(2): 263-82, 2010 Sep 13.
Article in English | MEDLINE | ID: mdl-24710045

ABSTRACT

This study presents a new computer program for assessing the effects of different factors and sequencing strategies on de novo sequence assembly. The program uses reads from actual sequencing studies or from simulations with a reference genome that may also be real or simulated. The simulated reads can be created with our read simulator. They can be of differing length and coverage, consist of paired reads with varying distance, and include sequencing errors such as color space miscalls to imitate SOLiD data. The simulated or real reads are mapped to their reference genome and our assembly simulator is then used to obtain optimal assemblies that are limited only by the distribution of repeats. By way of this mapping, the assembly simulator determines which contigs are theoretically possible, or conversely (and perhaps more importantly), which are not. We illustrate the application and utility of our new simulation tools with several experiments that test the effects of genome complexity (repeats), read length and coverage, word size in De Bruijn graph assembly, and alternative sequencing strategies (e.g., BAC pooling) on sequence assemblies. These experiments highlight just some of the uses of our simulators in the experimental design of sequencing projects and in the further development of assembly algorithms.

11.
BMC Bioinformatics ; 10: 247, 2009 Aug 11.
Article in English | MEDLINE | ID: mdl-19671163

ABSTRACT

BACKGROUND: The population mutation rate (theta) remains one of the most fundamental parameters in genetics, ecology, and evolutionary biology. However, its accurate estimation can be seriously compromised when working with error-prone data such as expressed sequence tags, low-coverage draft sequences, and other such unfinished products. This study is premised on the simple idea that a random sequence error due to a chance accident during data collection or recording will be distributed within a population dataset as a singleton (i.e., as a polymorphic site where one sampled sequence exhibits a unique base relative to the common nucleotide of the others). Thus, one can avoid these random errors by ignoring the singletons within a dataset. RESULTS: This strategy is implemented under an infinite sites model that focuses on only the internal branches of the sample genealogy where a shared polymorphism can arise (i.e., a variable site where each alternative base is represented by at least two sequences). This approach is first used to derive independently the same new Watterson and Tajima estimators of theta, as recently reported by Achaz [1] for error-prone sequences. It is then used to modify the recent, full, maximum-likelihood model of Knudsen and Miyamoto [2], which incorporates various factors for experimental error and design with those for coalescence and mutation. These new methods are all accurate and fast according to evolutionary simulations and analyses of a real complex population dataset for the California sea hare. CONCLUSION: In light of these results, we recommend the use of these three new methods for the determination of theta from error-prone sequences. In particular, we advocate the new maximum likelihood model as a starting point for the further development of more complex coalescent/mutation models that also account for experimental error and design.


Subjects
Computational Biology/methods; Genetics, Population; Mutation; Algorithms; Population Density; Sequence Alignment
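The idea in the abstract above, estimating theta from shared polymorphisms only, can be illustrated with a Watterson-style calculation. The Python sketch below is an assumption-laden illustration rather than the estimator published in the paper or by Achaz: it relies on the standard neutral infinite-sites expectation that the number of sites with a derived allele in i of n sequences is theta/i, treats singletons in the folded sense (one sequence differing from all others), and uses hypothetical function names.

```python
def harmonic_a(n: int) -> float:
    """a_n = sum_{i=1}^{n-1} 1/i, the denominator of Watterson's estimator."""
    return sum(1.0 / i for i in range(1, n))


def watterson_theta(segregating_sites: int, n: int) -> float:
    """Classical Watterson estimator: theta_W = S / a_n."""
    if n < 2:
        raise ValueError("need at least two sequences")
    return segregating_sites / harmonic_a(n)


def watterson_theta_no_singletons(shared_sites: int, n: int) -> float:
    """Watterson-style estimate using only shared polymorphisms.

    A folded singleton corresponds to derived-allele counts 1 or n-1, whose
    expected number is theta * (1 + 1/(n-1)) = theta * n/(n-1). Removing them
    leaves an expected theta * (a_n - n/(n-1)) shared polymorphic sites.
    """
    if n < 4:
        raise ValueError("a shared polymorphism needs at least four sequences")
    return shared_sites / (harmonic_a(n) - n / (n - 1))


if __name__ == "__main__":
    # Made-up counts: 12 segregating sites, 5 of them shared, in 10 sequences.
    print(watterson_theta(12, 10))
    print(watterson_theta_no_singletons(5, 10))
```

Discarding singletons removes the site class in which random sequencing errors are expected to concentrate, at the cost of using fewer sites and therefore a noisier estimate.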
12.
Nucleic Acids Res ; 36(4): 1113-9, 2008 Mar.
Article in English | MEDLINE | ID: mdl-18096620

ABSTRACT

The inherent properties of DNA as a stable polymer with unique affinity for partner molecules determined by the specific Watson-Crick base pairing make it an ideal component in self-assembling structures. This has been exploited for decades in the design of a variety of artificial substrates for investigations of DNA-interacting enzymes. More recently, strategies for synthesis of more complex two-dimensional (2D) and 3D DNA structures have emerged. However, the building of such structures is still in progress and more experience from different research groups and different fields of expertise is necessary before complex DNA structures can be routinely designed for use in basic science and/or biotechnology. Here we present the design, construction and structural analysis of a covalently closed and stable 3D DNA structure with the connectivity of an octahedron, as defined by the double-stranded DNA helices, which assembles from eight oligonucleotides with a yield of approximately 30%. As demonstrated by Small Angle X-ray Scattering and cryo-Transmission Electron Microscopy analyses, the eight-stranded DNA structure has a central cavity larger than the apertures in the surrounding DNA lattice and can be described as a nano-scale DNA cage. Hence, in theory it could hold proteins or other bio-molecules to enable their investigation in certain harmful environments or even allow their organization into higher order structures.


Subjects
DNA/chemistry; Nanostructures/chemistry; Electrophoresis, Polyacrylamide Gel; Microscopy, Electron, Transmission; Models, Molecular; Nucleic Acid Conformation; Oligonucleotides/chemistry; Scattering, Small Angle; X-Ray Diffraction
13.
RNA ; 13(11): 1850-9, 2007 Nov.
Article in English | MEDLINE | ID: mdl-17804647

ABSTRACT

We have developed a semiautomated RNA sequence editor (SARSE) that integrates tools for analyzing RNA alignments. The editor highlights different properties of the alignment by color, and its integrated analysis tools prevent the introduction of errors when doing alignment editing. SARSE readily connects to external tools to provide a flexible semiautomatic editing environment. A new method, Pcluster, is introduced for dividing the sequences of an RNA alignment into subgroups with secondary structure differences. Pcluster was used to evaluate 574 seed alignments obtained from the Rfam database and we identified 71 alignments with significant prediction of inconsistent base pairs and 102 alignments with significant prediction of novel base pairs. Four RNA families were used to illustrate how SARSE can be used to manually or automatically correct the inconsistent base pairs detected by Pcluster: the mir-399 RNA, vertebrate telomerase RNA (vert-TR), bacterial transfer-messenger RNA (tmRNA), and the signal recognition particle (SRP) RNA. The general use of the method is illustrated by the ability to accommodate pseudoknots and handle even large and divergent RNA families. The open architecture of the SARSE editor makes it a flexible tool to improve all RNA alignments with relatively little human intervention. Online documentation and software are available at http://sarse.ku.dk.


Subjects
Sequence Alignment/methods; Sequence Analysis, RNA; Software; Computational Biology; Databases, Genetic; Nucleic Acid Conformation; RNA/chemistry; Sequence Homology, Nucleic Acid; User-Computer Interface
14.
Genetics ; 176(4): 2335-42, 2007 Aug.
Article in English | MEDLINE | ID: mdl-17565962

ABSTRACT

Coalescent theory provides a powerful framework for estimating the evolutionary, demographic, and genetic parameters of a population from a small sample of individuals. Current coalescent models have largely focused on population genetic factors (e.g., mutation, population growth, and migration) rather than on the effects of experimental design and error. This study develops a new coalescent/mutation model that accounts for unobserved polymorphisms due to missing data, sequence errors, and multiple reads for diploid individuals. The importance of accommodating these effects of experimental design and error is illustrated with evolutionary simulations and a real data set from a population of the California sea hare. In particular, a failure to account for sequence errors can lead to overestimated mutation rates, inflated coalescent times, and inappropriate conclusions about the population. This current model can now serve as a starting point for the development of newer models with additional experimental and population genetic factors. It is currently implemented as a maximum-likelihood method, but this model may also serve as the basis for the development of Bayesian approaches that incorporate experimental design and error.


Subjects
Genetics, Population/statistics & numerical data; Models, Genetic; Mutation; Algorithms; Animals; Aplysia/genetics; Bayes Theorem; Databases, Genetic; Evolution, Molecular; Likelihood Functions; Polymorphism, Genetic
15.
Cell ; 127(7): 1453-67, 2006 Dec 29.
Article in English | MEDLINE | ID: mdl-17190607

ABSTRACT

Molecular analyses of Aplysia, a well-established model organism for cellular and systems neural science, have been seriously handicapped by a lack of adequate genomic information. By sequencing cDNA libraries from the central nervous system (CNS), we have identified over 175,000 expressed sequence tags (ESTs), of which 19,814 are unique neuronal gene products and represent 50%-70% of the total Aplysia neuronal transcriptome. We have characterized the transcriptome at three levels: (1) the central nervous system, (2) the elementary components of a simple behavior, the gill-withdrawal reflex, by analyzing sensory, motor, and serotonergic modulatory neurons, and (3) processes of individual neurons. In addition to increasing the amount of available gene sequences of Aplysia by two orders of magnitude, this collection represents the largest database available for any member of the Lophotrochozoa and therefore provides additional insights into evolutionary strategies used by this highly successful diversified lineage, one of the three proposed superclades of bilateral animals.


Subjects
Aplysia/genetics; Central Nervous System/metabolism; Neurons/physiology; Synaptic Transmission; Transcription, Genetic; Animals; Aplysia/anatomy & histology; Databases, Genetic; Evolution, Molecular; Expressed Sequence Tags; Ganglia, Invertebrate; Gills/innervation; Nerve Net; Phylogeny
16.
Mol Phylogenet Evol ; 38(2): 459-69, 2006 Feb.
Article in English | MEDLINE | ID: mdl-16230032

ABSTRACT

We have sequenced and characterized the complete mitochondrial genome of the sea slug, Aplysia californica, an important model organism in experimental biology and a representative of Anaspidea (Opisthobranchia, Gastropoda). The mitochondrial genome of Aplysia is in the small end of the observed sizes of animal mitochondrial genomes (14,117 bp, NCBI Accession No. NC_005827). The Aplysia genome, like most other mitochondrial genomes, encodes genes for 2 ribosomal subunit RNAs (small and large rRNAs), 22 tRNAs, and 13 protein subunits (cytochrome c oxidase subunits 1-3, cytochrome b apoenzyme, ATP synthase subunits 6 and 8, and NADH dehydrogenase subunits 1-6 and 4L). The gene order is virtually identical between opisthobranchs and pulmonates, with the majority of differences arising from tRNA translocations. In contrast, the gene order from representatives of basal gastropods and other molluscan classes is significantly different from opisthobranchs and pulmonates. The Aplysia genome was compared to all other published molluscan mitochondrial genomes and phylogenetic analyses were carried out using a concatenated protein alignment. Phylogenetic analyses using maximum likelihood based analyses of the well aligned regions of the protein sequences support both monophyly of Euthyneura (a group including both the pulmonates and opisthobranchs) and Opisthobranchia (as a more derived group). The Aplysia mitochondrial genome sequenced here will serve as an important platform in both comparative and neurobiological studies using this model organism.


Subjects
Aplysia/classification; DNA, Mitochondrial/genetics; Gene Order; Genes, Mitochondrial/genetics; RNA, Transfer/genetics; Animals; Aplysia/genetics; Base Sequence; Gene Rearrangement; Genes, rRNA/genetics; Mitochondrial Proton-Translocating ATPases/genetics; Molecular Sequence Data; NADH, NADPH Oxidoreductases/genetics; Nucleic Acid Conformation; Phylogeny
17.
Genetica ; 124(2-3): 247-54, 2005 Jul.
Article in English | MEDLINE | ID: mdl-16134337

ABSTRACT

Cyanobacteria are the only prokaryotes known so far to have a circadian system. It may be based either on two (kaiB and kaiC) or three (kaiA, kaiB and kaiC) circadian genes. The homologs of two circadian proteins, KaiB and KaiC, form four major subfamilies (K1-K4) and also occur in some other prokaryotes. Using likelihood-ratio tests, we studied rate shifts at the functional divergence of the proteins from the different subfamilies. It appears that only two of the subfamilies (K1 and K2) perform circadian functions. We identified in total 92 sites that have significantly different rates of evolution between the clades K1/K2 and K3/K4; 67 sites (15 in KaiB and 52 in KaiC) have been evolving significantly slower in K1/K2 than the overall average for the entire sequence. Many critical sites are located in the identified functionally important motifs and regions, e.g. one of the Walker's motif As, the DXXG motif, and two KaiA-binding domains of KaiC. There are also 36 sites (approximately 5%) with a rate shift between K1 and K2. The rate shift at these sites may be related to the interaction with KaiA. Rate shift analyses have identified residues whose manipulation in the Kai proteins may lead to better understanding of their functions in the two different types of the cyanobacterial circadian system.


Subjects
Bacterial Proteins/genetics; Bacterial Proteins/physiology; Circadian Rhythm/genetics; Amino Acid Motifs; Amino Acid Sequence; Bacterial Proteins/chemistry; Circadian Rhythm/physiology; Circadian Rhythm Signaling Peptides and Proteins; Cyanobacteria/genetics; Cyanobacteria/physiology; Evolution, Molecular; Genes, Bacterial; Molecular Sequence Data; Phylogeny; Prokaryotic Cells; Protein Structure, Tertiary; Sequence Homology, Amino Acid
18.
BMC Evol Biol ; 5: 21, 2005 Mar 02.
Article in English | MEDLINE | ID: mdl-15743518

ABSTRACT

BACKGROUND: The f factor is a new parameter for accommodating the influence of both the starting and ending states in the rate matrices of "generalized weighted frequencies" (+gwF) models for sequence evolution. In this study, we derive an expected value for f, starting from a nearly neutral model of weak selection, and then assess the biological interpretation of this factor with evolutionary simulations. RESULTS: An expected value of f = 0.5 (i.e., equal dependency on the starting and ending states) is derived for sequences that are evolving under the nearly neutral model of this study. However, this expectation is sensitive to violations of its underlying assumptions as illustrated with the evolutionary simulations. CONCLUSION: This study illustrates how selection, drift, and mutation at the population level can be linked to the rate matrices of models for sequence evolution to derive an expected value of f. However, as f is affected by a number of factors that limit its biological interpretation, this factor should normally be estimated as a free parameter rather than fixed a priori in a +gwF analysis.


Subjects
Evolution, Molecular; Biological Evolution; Data Interpretation, Statistical; Gene Frequency; Genetic Drift; Models, Biological; Models, Genetic; Models, Statistical; Models, Theoretical; Mutation; Selection, Genetic
19.
Biochemistry ; 44(2): 726-33, 2005 Jan 18.
Artigo em Inglês | MEDLINE | ID: mdl-15641799

RESUMO

The natural resistance-associated macrophage protein (Nramp) family is functionally conserved in bacteria and eukarya; Nramp homologues function as proton-dependent membrane transporters of divalent metals. Sequence analyses indicate that five phylogenetic groups comprise the Nramp family, three bacterial and two eukaryotic, which are distinct from a more distantly related group of microbial sequences (Nramp outgroup). The Nramp family and outgroup share many conserved residues, suggesting they derived from a common ancestor and raising the possibility that the residues invariant in the Nramp family that correspond to residues which are different but also conserved in the outgroup represent candidate sites of functional divergence of the Nramp family. Four Nramp family-specific residues were identified within transmembrane domains 1, 6, and 11, and replaced by the corresponding invariant outgroup residues in the Escherichia coli Nramp ortholog (the proton-dependent manganese transporter, MntH of group A, EcoliA). The resulting mutants (Asp(34)Gly, Asn(37)Thr, His(211)Tyr, and Asn(401)Gly) were tested for both divalent metal uptake and proton transport; quasi-simultaneous analyses of uptake of metals and protons revealed for the first time protons and metals cotransport by a bacterial Nramp homologue. Additional mutations were studied for comparison (Asp(34)Asn, Asn(37)Asp and Asn(37)Val, Asn(401)Thr, His(211)Ala, His(216)Ala, and His(216)Arg). EcoliA activity was impaired after each of the Nramp/outgroup substitutions, as well as after more conservative replacements, showing that the tested sites are all important for metal uptake and metal-dependent H(+) transport. It is proposed that co-occurrence of these four Nramp-specific transmembrane residues may have contributed to the emergence of this family of metal and proton cotransporters.


Subjects
Amino Acids/chemistry , Cation Transport Proteins/chemistry , Cation Transport Proteins/metabolism , Escherichia coli Proteins/chemistry , Escherichia coli Proteins/metabolism , Molecular Evolution , Heavy Metals/metabolism , Protons , Amino Acids/genetics , Asparagine/chemistry , Asparagine/genetics , Aspartic Acid/chemistry , Aspartic Acid/genetics , Cadmium/metabolism , Cation Transport Proteins/genetics , Cell Membrane Permeability/genetics , Cobalt/metabolism , Escherichia coli Proteins/genetics , Ferrous Compounds/metabolism , Histidine/chemistry , Histidine/genetics , Manganese/metabolism , Site-Directed Mutagenesis , Protein Transport/genetics , Proto-Oncogene Proteins c-myc/genetics
20.
Mol Genet Metab ; 81(4): 322-34, 2004 Apr.
Article in English | MEDLINE | ID: mdl-15059620

ABSTRACT

The availability of 18 thyrotropin receptor (TSHR) sequences, including two recent entries for primates and seven from fish, has allowed us to investigate diversification of residues or domains during evolution. We used a likelihood ratio test for evolutionary rate shifts [Proc. Natl. Acad. Sci. 98 (2001) 14512] using LH/CGR sequences as an out-group. At each residue in the alignment, a statistical test was performed for a rate shift at the divergence between mammals and fish. Eighty-two rate shift sites were found, significantly more than was expected (p < 0.0001). The occurrence of rate shifts was highest in the intracellular tail, lowest in the transmembrane serpentine and intermediate in the ectodomain. In 52 mammalian sites, the rates were significantly faster than for the corresponding sites in fish. We have identified rate shifts at sites important to TSHR function or in intimate proximity to such regions. The former category includes residues 53 and 55 (of LLR1 beta strand) and 253 and 255 (of LLR9 beta strand), crucial to TSH thyrotropic activity; residue 113, the site of N-linked glycosylation limited to humans; residue 310, an important switch in the hinge region for receptor binding and constitutive activity; and residue 382, which centres a motif important for TSH-mediated receptor activation. The rate shift positions close to functional regions include a site proximal to a TSHR-specific motif on LLR3 beta strand, sites important in TM helix structure and homodimerization and, in the case of the third intracellular loop, sites important in TSHR/G protein coupling. Rate shift analyses have identified residues whose manipulation in the human TSHR may lead to better understanding of receptor functions and help in the creation of designer analogues.
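
The per-site likelihood ratio test mentioned in this abstract can be illustrated with a much simpler stand-in. The sketch below is not the phylogenetic test from the cited PNAS paper: it assumes, purely for illustration, that substitutions at one site follow a Poisson process in each of two clades, and asks whether a single shared rate fits as well as two separate rates. The function names, the input format (substitution counts and total branch lengths per clade) and the example numbers are all hypothetical.

```python
import math


def poisson_loglik(count, rate, length):
    """Poisson log-likelihood of `count` events when rate * length are expected."""
    mean = rate * length
    if mean == 0.0:
        # A zero mean can only produce zero events.
        return 0.0 if count == 0 else float("-inf")
    return count * math.log(mean) - mean - math.lgamma(count + 1)


def rate_shift_lrt(count_a, length_a, count_b, length_b):
    """Likelihood ratio test for a rate difference between clades A and B.

    Null model: one rate shared by both clades.
    Alternative model: each clade has its own rate.
    Returns (test statistic, p-value from a chi-square with 1 degree of freedom).
    """
    # Maximum-likelihood rate under the null: pooled count over pooled length.
    shared = (count_a + count_b) / (length_a + length_b)
    loglik_null = (poisson_loglik(count_a, shared, length_a)
                   + poisson_loglik(count_b, shared, length_b))

    # Maximum-likelihood rates under the alternative: one per clade.
    loglik_alt = (poisson_loglik(count_a, count_a / length_a, length_a)
                  + poisson_loglik(count_b, count_b / length_b, length_b))

    # Clamp at zero to guard against tiny negative values from rounding.
    statistic = max(0.0, 2.0 * (loglik_alt - loglik_null))

    # Survival function of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical site: no substitutions in clade A, 20 in clade B,
    # with equal total branch length in both clades.
    stat, p = rate_shift_lrt(0, 1.0, 20, 1.0)
    print(f"statistic={stat:.2f}, p={p:.2e}")
```

Running such a test at every alignment column yields one p-value per site, so a multiple-testing correction would be needed before counting significant sites; how the original study handled this is not stated in the abstract.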


Subjects
Molecular Evolution , Genetic Variation , Thyrotropin Receptors/genetics , Amino Acid Sequence , Animals , Humans , Likelihood Functions , Molecular Sequence Data , Phylogeny , Sequence Alignment