Results 1 - 11 of 11
1.
J Math Biol ; 82(1-2): 6, 2021 01 22.
Article in English | MEDLINE | ID: mdl-33483865

ABSTRACT

The coupled Wright-Fisher diffusion is a multi-dimensional Wright-Fisher diffusion for multi-locus and multi-allelic genetic frequencies, expressed as the strong solution to a system of stochastic differential equations that are coupled in the drift, where the pairwise interaction among loci is modelled by an inter-locus selection. In this paper, an ancestral process, which is dual to the coupled Wright-Fisher diffusion, is derived. The dual process corresponds to the block counting process of coupled ancestral selection graphs, one for each locus. Jumps of the dual process arise from coalescence, mutation, and single-branching, which occur at one locus at a time, and from double-branching, which occurs simultaneously at two loci. The coalescence and mutation rates have the typical structure of the transition rates of the Kingman coalescent process. The single-branching rate not only contains the one-locus selection parameters in a form that generalises the rates of an ancestral selection graph, but it also contains the two-locus selection parameters to include the effect of the pairwise interaction on the single loci. The double-branching rate reflects the particular structure of pairwise selection interactions of the coupled Wright-Fisher diffusion. Moreover, in the special case of two loci, two alleles, with selection and parent-independent mutation, the stationary density for the coupled Wright-Fisher diffusion and the transition rates of the dual process are obtained in an explicit form.


Subjects
Genetic Models , Mutation Rate , Alleles , Gene Frequency , Population Genetics , Mutation , Genetic Selection
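
The abstract of entry 1 refers to the "typical structure" of Kingman coalescent rates and to the branching rates of an ancestral selection graph. As a rough illustration only, the sketch below lists the standard single-locus textbook rates for n ancestral lineages. The function name and the parameters theta (mutation) and sigma (selection) are assumptions made for this example; scaling conventions vary between authors, and the coupled multi-locus rates derived in the paper are not reproduced here.

```python
def single_locus_rates(n, theta, sigma):
    """Total jump rates for n ancestral lineages at ONE locus.

    Textbook scaling (an assumption, not the paper's coupled rates):
      coalescence: each of the n*(n-1)/2 pairs merges at rate 1
      mutation:    each lineage mutates at rate theta/2
      branching:   each lineage branches at rate sigma/2
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if theta < 0 or sigma < 0:
        raise ValueError("theta and sigma must be non-negative")
    return {
        "coalescence": n * (n - 1) / 2,
        "mutation": n * theta / 2,
        "branching": n * sigma / 2,
    }


if __name__ == "__main__":
    # Rates for 4 lineages with illustrative parameter values.
    print(single_locus_rates(4, theta=1.0, sigma=2.0))
```

With a single lineage the coalescence rate is zero, so only mutation and branching events remain possible.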
3.
Eur J Hum Genet ; 23(5): 688-92, 2015 May.
Article in English | MEDLINE | ID: mdl-25159868

ABSTRACT

In an attempt to map chromosomal regions carrying rare gene variants contributing to the risk of multiple sclerosis (MS), we identified segments shared identical-by-descent (IBD) using the software BEAGLE 4.0's refined IBD analysis. IBD mapping aims at identifying segments inherited from a common ancestor and shared more frequently in case-case pairs. A total of 2106 MS patients of Nordic origin and 624 matched controls were genotyped on the Illumina Human Quad 660 chip, and an additional 1352 ethnically matched controls typed on Illumina HumanHap 550 and Illumina 1M were added. The quality control left a total of 441 731 markers for the analysis. After identification of segments shared by descent and significance testing, a filter function for markers with low IBD sharing was applied. Four regions on chromosomes 5, 9, 14 and 19 were found to be significantly associated with the risk for MS. However, all markers but one were located telomerically, including the very distal markers. For methodological reasons, such segments have a low sharing of IBD signals and are prone to be false positives. One marker on chromosome 19 reached genome-wide significance and was not one of the distal markers. This marker was located within the GNA11 gene, which has no previously reported association with MS. We conclude that IBD mapping is not sufficiently powered to identify MS risk loci even in ethnically relatively homogeneous populations, or that alternatively rare variants are not adequately present.


Subjects
Chromosome Mapping , Genome-Wide Association Study , Multiple Sclerosis/genetics , Cohort Studies , Genetic Markers , Humans , Mutation , Scandinavian and Nordic Countries
4.
Bull Math Biol ; 69(3): 797-815, 2007 Apr.
Article in English | MEDLINE | ID: mdl-17086368

ABSTRACT

We introduce a Bayesian theoretical formulation of the statistical learning problem concerning the genetic structure of populations. The two key concepts in our derivation are exchangeability in its various forms and random allocation models. Implications of our results for the empirical investigation of population structure are discussed.


Subjects
Bayes Theorem , Population Genetics , Genetic Models , Animals , Biological Evolution , Drosophila melanogaster/genetics
5.
J Bioinform Comput Biol ; 3(4): 861-90, 2005 Aug.
Article in English | MEDLINE | ID: mdl-16078365

ABSTRACT

A molecular interaction library modeling favorable non-bonded interactions between atoms and molecular fragments is considered. In this paper, we represent the structure of the interaction library by a network diagram, which demonstrates that the underlying prediction model obtained for a molecular fragment is multi-layered. We clustered the molecular fragments into four groups by analyzing the pairwise distances between the molecular fragments. The distances are represented as an unrooted tree, in which the molecular fragments fall into four groups according to their function. For each fragment group, we modeled a group-specific a priori distribution with a Dirichlet distribution. The group-specific Dirichlet distributions enable us to derive a large population of similar molecular fragments that vary only in their contact preferences. Bayes' theorem then leads to a population distribution of the posterior probability vectors referred to as a "Dickey-Savage"-density. Two known methods for approximating multivariate integrals are applied to obtain marginal distributions of the Dickey-Savage density. The results of the numerical integration methods are compared with the simulated marginal distributions. By studying interactions between the protein structure of cyclohydrolase and its ligand guanosine-5'-triphosphate, we show that the marginal distributions of the posterior probabilities are more informative than the corresponding point estimates.


Subjects
Algorithms , Chemical Models , Molecular Models , Protein Interaction Mapping/methods , Proteins/chemistry , Sequence Alignment/methods , Protein Sequence Analysis/methods , Amino Acid Sequence , Binding Sites , Computer Simulation , Statistical Models , Molecular Sequence Data , Protein Binding , Proteins/analysis
6.
Int J Syst Evol Microbiol ; 55(Pt 1): 57-66, 2005 Jan.
Article in English | MEDLINE | ID: mdl-15653853

ABSTRACT

Minimization of stochastic complexity (SC) was used as a method for classification of genotypic fingerprints. The method was applied to fluorescent amplified fragment length polymorphism (fAFLP) fingerprint patterns of 507 Vibrionaceae representatives. As the current BinClass implementation of the optimization algorithm for classification only works on binary vectors, the original fingerprints were discretized in a preliminary step using the sliding-window band-matching method, in order to maximally preserve the information content of the original band patterns. The novel classification generated using the BinClass software package was subjected to an in-depth comparison with a hierarchical classification of the same dataset, in order to acknowledge the applicability of the new classification method as a more objective algorithm for the classification of genotyping fingerprint patterns. Recent DNA-DNA hybridization and 16S rRNA gene sequence experiments proved that the classification based on SC-minimization forms separate clusters that contain the fAFLP patterns for all representatives of the species Enterovibrio norvegicus, Vibrio fortis, Vibrio diazotrophicus or Vibrio campbellii, while previous hierarchical cluster analysis had suggested more heterogeneity within the fAFLP patterns by splitting the representatives of the above-mentioned species into multiple distant clusters. As a result, the new classification methodology has highlighted some previously unseen relationships within the biodiversity of the family Vibrionaceae.


Subjects
Bacterial Typing Techniques , DNA Fingerprinting/methods , Restriction Fragment Length Polymorphism , Vibrionaceae/classification , Algorithms , Genotype , Software , Stochastic Processes , Vibrionaceae/genetics
7.
Bull Math Biol ; 66(6): 1575-96, 2004 Nov.
Article in English | MEDLINE | ID: mdl-15522346

ABSTRACT

Microbiologists have traditionally applied hierarchical clustering algorithms as their mathematical tool of choice to unravel the taxonomic relationships between micro-organisms. However, the interpretation of such hierarchical classifications suffers from being subjective, in that a variety of ad hoc choices must be made during their construction. On the other hand, the application of more profound and objective mathematical methods--such as the minimization of stochastic complexity--for the classification of bacterial genotyping fingerprints data is hampered by the prerequisite that such methods only act upon vectorized data. In this paper we introduce a new method, coined sliding window discretization, for the transformation of genotypic fingerprint patterns into binary vector format. In the context of an extensive amplified fragment length polymorphism (AFLP) data set of 507 strains from the Vibrionaceae family that has previously been analysed, we demonstrate by comparison with a number of other discretization methods that this new discretization method results in minimal loss of the original information content captured in the banding patterns. Finally, we investigate the implications of the different discretization methods on the classification of bacterial genotyping fingerprints by minimization of stochastic complexity, as it is implemented in the BinClass software package for probabilistic clustering of binary vectors. The new taxonomic insights learned from the resulting classification of the AFLP patterns will prove the value of combining sliding window discretization with minimization of stochastic complexity, as an alternative classification algorithm for bacterial genotyping fingerprints.


Subjects
Bacteria/genetics , DNA Fingerprinting/methods , Bacterial DNA/genetics , Bacteria/classification , Mathematical Computing , Genetic Models , Vibrio/classification , Vibrio/genetics
8.
J Comput Aided Mol Des ; 17(7): 435-61, 2003 Jul.
Article in English | MEDLINE | ID: mdl-14677639

ABSTRACT

We describe a library of molecular fragments designed to model and predict non-bonded interactions between atoms. We apply the Bayesian approach, whereby prior knowledge and uncertainty of the mathematical model are incorporated into the estimated model and its parameters. The molecular interaction data are strengthened by narrowing the atom classification to 14 atom types, focusing on independent molecular contacts that lie within a short cutoff distance, and symmetrizing the interaction data for the molecular fragments. Furthermore, the locations of atoms in contact with a molecular fragment are modeled by Gaussian mixture densities whose maximum a posteriori estimates are obtained by applying a version of the expectation-maximization algorithm that incorporates hyperparameters for the components of the Gaussian mixtures. A routine is introduced providing the hyperparameters and the initial values of the parameters of the Gaussian mixture densities. A model selection criterion, based on the concept of a 'minimum message length', is used to automatically select the optimal complexity of a mixture model and the most suitable orientation of a reference frame for a fragment in a coordinate system. The type of atom interacting with a molecular fragment is predicted by values of the posterior probability function, and the accuracy of these predictions is evaluated by comparing the predicted atom type with the actual atom type seen in crystal structures. The fact that an atom will simultaneously interact with several molecular fragments forming a cohesive network of interactions is exploited by introducing two strategies that combine the predictions of atom types given by multiple fragments. The accuracy of these combined predictions is compared with those based on an individual fragment. Exhaustive validation analyses and qualitative examples (e.g., the ligand-binding domain of glutamate receptors) demonstrate that these improvements lead to effective modeling and prediction of molecular interactions.


Subjects
Bayes Theorem , Peptide Library , Algorithms , Automation , Drug Design , Ligands , Molecular Models , Normal Distribution , Probability
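
The abstract of entry 8 describes predicting an atom type from values of a posterior probability function. The following is a minimal sketch of that general idea under strong simplifying assumptions: each class is given a single one-dimensional Gaussian density rather than a fitted three-dimensional Gaussian mixture, and the class names, priors, means and standard deviations are hypothetical values chosen for illustration. It is not the authors' implementation.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posterior(x, classes):
    """Posterior class probabilities at x by Bayes' theorem.

    classes maps a class name to (prior, mean, std); all values here are
    hypothetical. The posterior is prior * likelihood, normalized over classes.
    """
    joint = {
        name: prior * gaussian_pdf(x, mean, std)
        for name, (prior, mean, std) in classes.items()
    }
    total = sum(joint.values())
    return {name: value / total for name, value in joint.items()}


if __name__ == "__main__":
    # Two made-up atom classes with equal priors.
    atom_classes = {"N": (0.5, -1.0, 1.0), "O": (0.5, 1.0, 1.0)}
    probs = posterior(0.8, atom_classes)
    print(max(probs, key=probs.get), probs)
```

The predicted class is the one with the largest posterior probability; replacing `gaussian_pdf` with a mixture density would bring the sketch closer to the setting described in the abstract.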
9.
Syst Appl Microbiol ; 25(3): 403-15, 2002 Oct.
Article in English | MEDLINE | ID: mdl-12421078

ABSTRACT

We apply minimization of stochastic complexity and the closely related method of cumulative classification to analyse the extensively studied BIOLOG GN data of Vibrio spp. Minimization of stochastic complexity provides an objective tool of bacterial taxonomy as it produces classifications that are optimal from the point of view of information theory. We compare the outcome of our results with previously published classifications of the same data set. Our results both confirm earlier detected relationships between species and discover new ones.


Subjects
Classification/methods , Computational Biology , Vibrio/classification , Algorithms , Bacterial Typing Techniques , Stochastic Processes , Vibrio/genetics , Vibrio/physiology
10.
Bioinformatics ; 18(9): 1257-63, 2002 Sep.
Article in English | MEDLINE | ID: mdl-12217918

ABSTRACT

MOTIVATION: Previously, Rantanen et al. (2001; J. Mol. Biol., 313, 197-214) constructed a protein atom-ligand fragment interaction library embodying experimentally solved, high-resolution three-dimensional (3D) structural data from the Protein Data Bank (PDB). The spatial locations of protein atoms that surround ligand fragments were modeled with Gaussian mixture models, the parameters of which were estimated with the expectation-maximization (EM) algorithm. In the validation analysis of this library, there was strong indication that the protein atom classification, 24 classes, was too large and that a reduction in the classes would lead to improved predictions. RESULTS: Here, a dissimilarity (distance) matrix that is suitable for comparison and fusion of 24 pre-defined protein atom classes has been derived. Jeffreys' distances between Gaussian mixture models are used as a basis to estimate dissimilarities between protein atom classes. The dissimilarity data are analyzed both with a hierarchical clustering method and independently by using multidimensional scaling analysis. The results provide additional insight into the relationships between different protein atom classes, giving us guidance on, for example, how to readjust protein atom classification and, thus, they will help us to improve protein-ligand interaction predictions. CONTACT: vira@utu.fi


Subjects
Protein Databases , Molecular Models , Statistical Models , Normal Distribution , Proteins/chemistry , Proteins/classification , Cluster Analysis , Ligands , Protein Conformation , Reproducibility of Results , Sensitivity and Specificity
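
The abstract of entry 10 uses Jeffreys' distances between Gaussian mixture models as dissimilarities. For mixtures this quantity generally has no closed form, so the sketch below shows only the simpler closed-form case of two univariate Gaussians, where Jeffreys' divergence is the sum of the two Kullback-Leibler divergences. The function name and argument names are assumptions for this example, and how the paper evaluates the distance for mixtures is not reproduced here.

```python
def jeffreys_gaussian(mu1, std1, mu2, std2):
    """Jeffreys' divergence J = KL(p||q) + KL(q||p) for two univariate Gaussians.

    Closed form: with d = mu1 - mu2,
      J = (std1^2 + d^2) / (2 * std2^2) + (std2^2 + d^2) / (2 * std1^2) - 1
    (the log terms of the two KL divergences cancel).
    """
    if std1 <= 0 or std2 <= 0:
        raise ValueError("standard deviations must be positive")
    d2 = (mu1 - mu2) ** 2
    v1, v2 = std1 ** 2, std2 ** 2
    return (v1 + d2) / (2.0 * v2) + (v2 + d2) / (2.0 * v1) - 1.0


if __name__ == "__main__":
    # Symmetric by construction, and zero for identical distributions.
    print(jeffreys_gaussian(0.0, 1.0, 1.0, 1.0))
    print(jeffreys_gaussian(0.0, 1.0, 0.0, 1.0))
```

A matrix of such pairwise values could then be passed to a hierarchical clustering or multidimensional scaling routine, which is the kind of analysis the abstract mentions.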
11.
Math Biosci ; 177-178: 161-84, 2002.
Article in English | MEDLINE | ID: mdl-11965254

ABSTRACT

We present a theory of classification and predictive identification of bacteria. Bacterial strains are characterized by a binary vector and the taxonomy is specified by attaching a label to each vector. The theory is developed from only two basic assumptions, viz. that the sequence of pairs of feature vectors and the attached labels is judged (infinitely) exchangeable and predictively sufficient. We derive expressions for the training error and the probability of identification error and show that the latter is an affine function of the former. We prove the law of large numbers for identification matrices, which contain the fundamental information of bacterial data. We prove the Bayesian risk consistency of the predictive identification rule given by the theory and show that the training error is a consistent estimate of the generalization error.


Subjects
Bacteria/classification , Bayes Theorem , Classification/methods , Biological Models