Results 1 - 15 of 15
1.
Nat Genet ; 29(4): 412-7, 2001 Dec.
Article in English | MEDLINE | ID: mdl-11726928

ABSTRACT

The identification of promoters and first exons has been one of the most difficult problems in gene-finding. We present a set of discriminant functions that can recognize structural and compositional features such as CpG islands, promoter regions and first splice-donor sites. We explain the implementation of the discriminant functions into a decision tree that constitutes a new program called FirstEF. By using different models to predict CpG-related and non-CpG-related first exons, we showed by cross-validation that the program could predict 86% of the first exons with 17% false positives. We also demonstrated the prediction accuracy of FirstEF at the genome level by applying it to the finished sequences of human chromosomes 21 and 22 as well as by comparing the predictions with the locations of the experimentally verified first exons. Finally, we present the analysis of the predicted first exons for all of the 24 chromosomes of the human genome.


Subjects
Exons, Human Genome, Genetic Promoter Regions, Human Chromosome Pair 21, Human Chromosome Pair 22, CpG Islands, Humans
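
The CpG-island feature named in the abstract is commonly quantified by the G+C fraction and the observed/expected CpG ratio of a sequence window. The sketch below computes both. The function name `cpg_features` and the choice of the widely used ratio (#CpG × N) / (#C × #G) are assumptions for illustration; the abstract does not specify FirstEF's actual discriminant functions.

```python
def cpg_features(window):
    """Return (G+C fraction, observed/expected CpG ratio) for a DNA window."""
    seq = window.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so this counts every CpG
    gc_fraction = (c + g) / n
    expected = c * g / n  # expected CpG count if C and G were independent
    ratio = cpg / expected if expected > 0 else 0.0
    return gc_fraction, ratio
```

Features like these could serve as inputs to a discriminant function; any thresholds applied to them would be a separate modeling choice.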
2.
J Exp Biol ; 213(11): 1844-51, 2010 Jun 01.
Article in English | MEDLINE | ID: mdl-20472771

ABSTRACT

Bite force is a measure of whole-organism performance that is often used to investigate the relationships between performance, morphology and fitness. When in vivo measurements of bite force are unavailable, researchers often turn to lever models to predict bite forces. This study demonstrates that bite force predictions based on two-dimensional (2-D) lever models can be improved by including three-dimensional (3-D) geometry and realistic physiological cross-sectional areas derived from dissections. Widely used, the 2-D method does a reasonable job of predicting bite force. However, it does so by overpredicting physiological cross-sectional areas for the masseter and pterygoid muscles and underpredicting physiological cross-sectional areas for the temporalis muscle. We found that lever models that include the three-dimensional structure of the skull and mandible and physiological cross-sectional areas calculated from dissected muscles provide the best predictions of bite force. Models that accurately represent the biting mechanics strengthen our understanding of which variables are functionally relevant and how they are relevant to feeding performance.


Subjects
Chiroptera/physiology, Mastication, Animals, Biomechanical Phenomena, Chiroptera/anatomy & histology, Female, Male, Masticatory Muscles/physiology, Biological Models, Skull/anatomy & histology
3.
J Theor Biol ; 256(1): 96-103, 2009 Jan 07.
Article in English | MEDLINE | ID: mdl-18834892

ABSTRACT

The widespread availability of three-dimensional imaging and computational power has fostered a rapid increase in the number of biologists using finite element analysis (FEA) to investigate the mechanical function of living and extinct organisms. The inevitable rise of studies that compare finite element models brings to the fore two critical questions about how such comparative analyses can and should be conducted: (1) what metrics are appropriate for assessing the performance of biological structures using finite element modeling? and, (2) how can performance be compared such that the effects of size and shape are disentangled? With respect to performance, we argue that energy efficiency is a reasonable optimality criterion for biological structures and we show that the total strain energy (a measure of work expended deforming a structure) is a robust metric for comparing the mechanical efficiency of structures modeled with finite elements. Results of finite element analyses can be interpreted with confidence when model input parameters (muscle forces, detailed material properties) and/or output parameters (reaction forces, strains) are well-documented by studies of living animals. However, many researchers wish to compare species for which these input and validation data are difficult or impossible to acquire. In these cases, researchers can still compare the performance of structures that differ in shape if variation in size is controlled. We offer a theoretical framework and empirical data demonstrating that scaling finite element models to equal force:surface area ratios removes the effects of model size and provides a comparison of stress-strength performance based solely on shape. Further, models scaled to have equal applied force:volume ratios provide the basis for strain energy comparison. Thus, although finite element analyses of biological structures should be validated experimentally whenever possible, this study demonstrates that the relative performance of un-validated models can be compared so long as they are scaled properly.


Subjects
Anatomy/statistics & numerical data, Computer Simulation, Finite Element Analysis, Animals, Species Specificity
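
The force:surface area scaling described in the abstract amounts to choosing the applied force of one model so that its ratio to surface area matches that of a reference model. A minimal sketch follows; the function name, parameter names, and the numbers are hypothetical and serve only to show the arithmetic.

```python
def scaled_force(ref_force, ref_surface_area, target_surface_area):
    """Force to apply to the target model so that force / surface area
    equals that of the reference model."""
    if ref_surface_area <= 0 or target_surface_area <= 0:
        raise ValueError("surface areas must be positive")
    return ref_force * target_surface_area / ref_surface_area


# Illustrative values only: a reference model loaded with 10 N over 200 mm^2.
force_b = scaled_force(10.0, 200.0, 350.0)  # 17.5 N for a 350 mm^2 model
```

Passing volumes instead of surface areas gives the analogous equal force:volume scaling that the abstract mentions for strain energy comparisons.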
4.
Phys Rev E Stat Nonlin Soft Matter Phys ; 77(5 Pt 2): 056102, 2008 May.
Article in English | MEDLINE | ID: mdl-18643131

ABSTRACT

We study annual logarithmic growth rates R of various economic variables such as exports, imports, and foreign debt. For each of these variables we find that the distributions of R can be approximated by double exponential (Laplace) distributions in the central parts and power-law distributions in the tails. For each of these variables we further find a power-law dependence of the standard deviation σ(R) on the average size of the economic variable with a scaling exponent surprisingly close to that found for the gross domestic product (GDP) [Phys. Rev. Lett. 81, 3275 (1998)]. By analyzing annual logarithmic growth rates R of wages of 161 different occupations, we find a power-law dependence of the standard deviation σ(R) on the average value of the wages with a scaling exponent β ≈ 0.14, close to those found for the growth of exports, imports, debt, and the growth of the GDP. In contrast to these findings, we observe for payroll data collected from 50 states of the USA that the standard deviation σ(R) of the annual logarithmic growth rate R increases monotonically with the average value of payroll. However, also in this case we observe a power-law dependence of σ(R) on the average payroll with a scaling exponent β ≈ -0.08. Based on these observations we propose a stochastic process for multiple cross-correlated variables where for each variable (i) the distribution of logarithmic growth rates decays exponentially in the central part, (ii) the distribution of the logarithmic growth rate decays algebraically in the far tails, and (iii) the standard deviation of the logarithmic growth rate depends algebraically on the average size of the stochastic variable.

5.
Nucleic Acids Res ; 33(Database issue): D619-21, 2005 Jan 01.
Article in English | MEDLINE | ID: mdl-15608274

ABSTRACT

The crop expressed sequence tag database, CR-EST (http://pgrc.ipk-gatersleben.de/cr-est/), is a publicly available online resource providing access to sequence, classification, clustering and annotation data of crop EST projects. CR-EST currently holds more than 200,000 sequences derived from 41 cDNA libraries of four species: barley, wheat, pea and potato. The barley section comprises approximately one-third of all publicly available ESTs. CR-EST deploys an automatic EST preparation pipeline that includes the identification of chimeric clones in order to transparently display the data quality. Sequences are clustered in species-specific projects to currently generate a non-redundant set of approximately 22,600 consensus sequences and approximately 17,200 singletons, which form the basis of the provided set of unigenes. A web application allows the user to compute BLAST alignments of query sequences against the CR-EST database, query data from Gene Ontology and metabolic pathway annotations and query sequence similarities from stored BLAST results. CR-EST also features interactive JAVA-based tools, allowing the visualization of open reading frames and the explorative analysis of Gene Ontology mappings applied to ESTs.


Subjects
Agricultural Crops/genetics, Nucleic Acid Databases, Expressed Sequence Tags/chemistry, Plant Genes, Database Management Systems, Hordeum/genetics, Pisum sativum/genetics, DNA Sequence Analysis, Solanum tuberosum/genetics, Triticum/genetics, User-Computer Interface
6.
Sci Rep ; 7(1): 2489, 2017 May 30.
Article in English | MEDLINE | ID: mdl-28559568

ABSTRACT

Auxin plays a pivotal role in virtually every aspect of plant morphogenesis. It simultaneously orchestrates a diverse variety of processes such as cell wall biogenesis, transition through the cell cycle, or metabolism of a wide range of chemical substances. The coordination principles for such a complex orchestration are poorly understood at the systems level. Here, we perform an RNA-seq experiment to study the transcriptional response to auxin treatment within gene groups of different biological processes, molecular functions, or cell components in a quantitative fold-change-specific manner. We find for Arabidopsis thaliana roots treated with auxin for 6 h that (i) there are functional groups within which genes respond to auxin with surprisingly similar fold changes and that (ii) these fold changes vary from one group to another. These findings make it tempting to conjecture the existence of some transcriptional logic orchestrating the coordinated expression of genes within functional groups in a fold-change-specific manner. To obtain some initial insight about this coordinated expression, we performed a motif enrichment analysis and found cis-regulatory elements TBX1-3, SBX, REG, and TCP/site2 as the candidates conferring fold-change-specific responses to auxin in Arabidopsis thaliana.


Subjects
Arabidopsis/genetics, Indoleacetic Acids/metabolism, Plant Roots/genetics, Arabidopsis/drug effects, Arabidopsis/metabolism, Arabidopsis Proteins/chemistry, Arabidopsis Proteins/genetics, Plant Gene Expression Regulation/genetics, Indoleacetic Acids/pharmacology, Plant Roots/growth & development, Plant Roots/metabolism, Protein Folding/drug effects, Signal Transduction/drug effects, Signal Transduction/genetics
7.
Phys Rev E Stat Nonlin Soft Matter Phys ; 64(4 Pt 1): 041917, 2001 Oct.
Article in English | MEDLINE | ID: mdl-11690062

ABSTRACT

We study statistical patterns in the DNA sequence of human chromosome 22, the first completely sequenced human chromosome. We find that (i) the 33.4 × 10^6-nucleotide-long human chromosome exhibits long-range power-law correlations over more than four orders of magnitude, (ii) the entropies H(n) of the frequency distribution of oligonucleotides of length n (n-mers) grow sublinearly with increasing n, indicating the presence of higher-order correlations for all of the studied lengths 1

Subjects
Human Chromosome Pair 22/ultrastructure, DNA/ultrastructure, Algorithms, Alu Elements, Entropy, Human Genome, Humans, Statistical Models, Oligonucleotides/chemistry, Nucleic Acid Repetitive Sequences, Thermodynamics
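
The n-mer entropies H(n) mentioned in the abstract can be estimated from observed n-mer frequencies. The sketch below is a plain plug-in estimate over overlapping n-mers; the function name `nmer_entropy` is hypothetical, and the abstract does not say which estimator or bias correction the authors used, so this is an assumption.

```python
import math
from collections import Counter


def nmer_entropy(seq, n):
    """Shannon entropy (bits) of the frequency distribution of overlapping n-mers."""
    total = len(seq) - n + 1
    if total <= 0:
        return 0.0
    counts = Counter(seq[i:i + n] for i in range(total))
    return -sum(c / total * math.log2(c / total) for c in counts.values())
```

For a sequence of independent, identically distributed letters H(n) would grow linearly, as n × H(1); growth slower than that indicates correlations between positions. Plug-in estimates become unreliable once the number of possible n-mers approaches the sequence length.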
8.
Theor Appl Genet ; 113(2): 239-50, 2006 Jul.
Article in English | MEDLINE | ID: mdl-16791690

ABSTRACT

A set of 111,090 barley expressed sequence tags (ESTs) was searched for the presence of microsatellite motifs [simple sequence repeats (SSRs)] and yielded 2,823 non-redundant SSR-containing ESTs (SSR-ESTs). From this, a set of 754 primer pairs was designed, of which 525 primer pairs yielded an amplicon, and as a result, 185 EST-derived microsatellite loci (EST-SSRs) were placed onto a genetic map of barley. The markers show a uniform distribution along all seven linkage groups ranging from 21 (7H) to 35 (3H) markers. Polymorphism information content values ranged from 0.24 to 0.78 (average 0.48). To further investigate the physical distribution of the EST-SSRs in the barley genome, a bacterial artificial chromosome (BAC) library was screened. Out of 129 markers tested, BAC addresses were obtained for 127 EST-SSR markers. Twenty-seven BACs, forming eight contigs, were hit by two or three EST-SSRs each. This unexpectedly high incidence of EST-SSRs physically linked at the sub-megabase level provides additional evidence of an uneven distribution of genes and the segmentation of the barley genome in gene-rich and gene-poor regions.


Subjects
Bacterial Artificial Chromosomes, Expressed Sequence Tags, Genetic Markers, Plant Genome, Hordeum/genetics, Microsatellite Repeats/genetics, Polymerase Chain Reaction, Genetic Polymorphism
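
Searching ESTs for microsatellite motifs, as described in the abstract, can be illustrated with a regular expression that finds perfect tandem repeats. The function name `find_ssrs`, the unit lengths (2 to 4 bases) and the minimum of five repeats are assumptions chosen for the example; the abstract does not state the search criteria the authors used.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=4, min_repeats=5):
    """Return (start, unit, repeat_count) for each perfect tandem repeat."""
    # The lazy quantifier tries the shortest repeat unit first.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        if len(set(unit)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAA
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits
```

Start positions are 0-based. Imperfect or compound repeats are not detected by this pattern.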
9.
Bioinformatics ; 21(11): 2657-66, 2005 Jun 01.
Article in English | MEDLINE | ID: mdl-15797905

ABSTRACT

MOTIVATION: We propose a new class of variable-order Bayesian network (VOBN) models for the identification of transcription factor binding sites (TFBSs). The proposed models generalize the widely used position weight matrix (PWM) models, Markov models and Bayesian network models. In contrast to these models, where for each position a fixed subset of the remaining positions is used to model dependencies, in VOBN models, these subsets may vary based on the specific nucleotides observed, which are called the context. This flexibility turns out to be of advantage for the classification and analysis of TFBSs, as statistical dependencies between nucleotides in different TFBS positions (not necessarily adjacent) may be taken into account efficiently--in a position-specific and context-specific manner. RESULTS: We apply the VOBN model to a set of 238 experimentally verified sigma-70 binding sites in Escherichia coli. We find that the VOBN model can distinguish these 238 sites from a set of 472 intergenic 'non-promoter' sequences with a higher accuracy than fixed-order Markov models or Bayesian trees. We use a replicated stratified-holdout experiment having a fixed true-negative rate of 99.9%. We find that for a foreground inhomogeneous VOBN model of order 1 and a background homogeneous variable-order Markov (VOM) model of order 5, the obtained mean true-positive (TP) rate is 47.56%. In comparison, the best TP rate for the conventional models is 44.39%, obtained from a foreground PWM model and a background 2nd-order Markov model. As the standard deviation of the estimated TP rate is approximately 0.01%, this improvement is highly significant.


Subjects
Algorithms, Chemical Models, Molecular Models, Sequence Alignment/methods, Protein Sequence Analysis/methods, Transcription Factors/chemistry, Artificial Intelligence, Bayes Theorem, Binding Sites, Computer Simulation, Protein Databases, Markov Chains, Statistical Models, Protein Binding, Structure-Activity Relationship, Transcription Factors/analysis, Transcription Factors/classification
10.
J Mol Evol ; 51(4): 353-62, 2000 Oct.
Article in English | MEDLINE | ID: mdl-11040286

ABSTRACT

It has been hypothesized that a large fraction of the 24% noncoding DNA in R. prowazekii consists of degraded genes. This hypothesis has been based on the relatively high G+C content of noncoding DNA. However, a comparison with other genomes also having a low overall G+C content shows that this argument would also apply to other bacteria. To test this hypothesis, we study the coding potential in sets of genes, pseudogenes, and intergenic regions. We find that the correlation function and the χ²-measure are clearly indicative of the coding function of genes and pseudogenes. However, both coding potentials give almost no indication of a preexisting reading frame in the remaining 23% of noncoding DNA. We simulate the degradation of genes due to single-nucleotide substitutions and insertions/deletions and quantify the number of mutations required to remove indications of the reading frame. We discuss a reduced selection pressure as another possible origin of this comparatively large fraction of noncoding sequences.


Subjects
Intergenic DNA, Bacterial Genes, Rickettsia prowazekii/genetics, Genetic Models, Point Mutation, Single Nucleotide Polymorphism
11.
J Theor Biol ; 206(4): 525-37, 2000 Oct 21.
Article in English | MEDLINE | ID: mdl-11013113

ABSTRACT

We study the coding potential of human DNA sequences, using the positional asymmetry function D(p) and the positional information function I(q). Both D(p) and I(q) are based on the positional dependence of single nucleotide frequencies. We investigate the accuracy of D(p) and I(q) in distinguishing coding and non-coding DNA as a function of the parameters p and q, respectively, and explore at which parameters p(opt) and q(opt) both D(p) and I(q) distinguish coding and non-coding DNA most accurately. We compare our findings with classically used parameter values and find that optimized coding potentials yield accuracies comparable to those of classical frame-independent coding potentials trained on prior data. We find that p(opt) and q(opt) vary only slightly with the sequence length.


Subjects
Codon, Human Genome, Genetic Models, DNA Sequence Analysis, Humans
12.
Pac Symp Biocomput ; : 614-23, 2000.
Article in English | MEDLINE | ID: mdl-10902209

ABSTRACT

One basic problem in the analysis of DNA sequences is the recognition of protein-coding genes. Computer algorithms to facilitate gene identification have become important as genome sequencing projects have turned from mapping to large-scale sequencing, resulting in an exponentially growing number of sequenced nucleotides that await their annotation. Many statistical patterns have been discovered that are different in coding and noncoding DNA, but most of them vary from species to species, and hence require prior training on organism-specific data sets. Here, we investigate if there exist species-independent statistical patterns that are different in coding and noncoding DNA. We introduce an information-theoretic quantity, the average mutual information (AMI), and we find that the probability distribution functions of the AMI are significantly different in coding and noncoding DNA, while they are almost identical for different species. This finding suggests that the AMI might be useful for the recognition of protein-coding regions in genomes for which training sets do not exist.


Subjects
DNA/genetics, Genetic Models, Algorithms, Animals, Codon/genetics, Computer Simulation, Proteins/genetics, Species Specificity
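
To make the mutual information idea in the abstract concrete, the sketch below estimates the mutual information between nucleotides separated by k positions and then averages it over a range of distances. The abstract does not define the AMI precisely; treating it as the mean of I(k) over k = 1 to 6 is an assumption made here for illustration and may differ from the authors' definition. Function names are hypothetical.

```python
import math
from collections import Counter


def mutual_information(seq, k):
    """Mutual information (bits) between nucleotides separated by k positions."""
    pairs = [(seq[i], seq[i + k]) for i in range(len(seq) - k)]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((left[a] / n) * (right[b] / n)))
    return mi


def average_mutual_information(seq, k_min=1, k_max=6):
    """Mean of I(k) over k_min..k_max (an assumed definition, see lead-in)."""
    ks = range(k_min, k_max + 1)
    return sum(mutual_information(seq, k) for k in ks) / len(ks)
```

A period-3 signal in I(k) is what would be expected from codon structure, which is one reason a quantity of this kind can separate coding from noncoding sequence without organism-specific training.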
13.
Article in English | MEDLINE | ID: mdl-11031617

ABSTRACT

We explore if there exist universal statistical patterns that are different in coding and noncoding DNA and can be found in all living organisms, regardless of their phylogenetic origin. We find that (i) the mutual information function [symbol: see text] has a significantly different functional form in coding and noncoding DNA. We further find that (ii) the probability distributions of the average mutual information [symbol: see text] are significantly different in coding and noncoding DNA, while (iii) they are almost the same for organisms of all taxonomic classes. Surprisingly, we find that [symbol: see text] is capable of predicting coding regions as accurately as organism-specific coding measures.


Subjects
DNA/genetics, Genetic Code, Genetic Models, DNA/chemistry, Humans, Statistical Models
14.
Article in English | MEDLINE | ID: mdl-11046521

ABSTRACT

We derive exact statistical properties of a recursive fragmentation process. We show that introducing a fragmentation probability 0

15.
Phys Rev Lett ; 85(6): 1342-5, 2000 Aug 07.
Article in English | MEDLINE | ID: mdl-10991547

ABSTRACT

We present a new computational approach to finding borders between coding and noncoding DNA. This approach has two features: (i) DNA sequences are described by a 12-letter alphabet that captures the differential base composition at each codon position, and (ii) the search for the borders is carried out by means of an entropic segmentation method which uses only the general statistical properties of coding DNA. We find that this method is highly accurate in finding borders between coding and noncoding regions and requires no "prior training" on known data sets. Our results appear to be more accurate than those obtained with moving windows in the discrimination of coding from noncoding DNA.


Subjects
DNA/chemistry, DNA/genetics, Entropy, Genetic Code, Theoretical Models
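
The two ingredients named in the abstract, a 12-letter alphabet and an entropic search for borders, can be sketched as follows. Each base is labeled with its position modulo 3, and a single cut point is chosen to maximize the Jensen-Shannon divergence between the two resulting segments. The abstract does not name the divergence measure or the stopping rule, so the use of Jensen-Shannon divergence, the single-cut search, and all function names are assumptions for illustration.

```python
import math
from collections import Counter


def to_12_letter(seq):
    """Label each base with its position modulo 3, giving up to 12 symbols."""
    return [(base, i % 3) for i, base in enumerate(seq.upper())]


def entropy(counts, n):
    """Shannon entropy (bits) of a Counter holding n observations."""
    return -sum(c / n * math.log2(c / n) for c in counts.values() if c > 0)


def best_cut(symbols):
    """Return (position, divergence) of the cut maximizing the JS divergence."""
    n = len(symbols)
    total = Counter(symbols)
    h_total = entropy(total, n)
    left = Counter()
    best_pos, best_js = None, -1.0
    for i in range(1, n):
        left[symbols[i - 1]] += 1
        right = total - left  # Counter subtraction keeps positive counts only
        js = (h_total
              - (i / n) * entropy(left, i)
              - ((n - i) / n) * entropy(right, n - i))
        if js > best_js:
            best_pos, best_js = i, js
    return best_pos, best_js
```

A full segmentation would apply `best_cut` recursively to each segment and stop when the divergence is no longer statistically significant; that significance test is not shown here. The frame label is taken relative to the start of the input sequence, which is also an assumption.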