Results 1 - 20 of 31
1.
Epidemics ; 32: 100393, 2020 09.
Article in English | MEDLINE | ID: mdl-32674025

ABSTRACT

Modern data and computational resources, coupled with algorithmic and theoretical advances to exploit these, allow disease dynamic models to be parameterised with increasing detail and accuracy. While this enhances models' usefulness in prediction and policy, major challenges remain. In particular, lack of identifiability of a model's parameters may limit the usefulness of the model. While lack of parameter identifiability may be resolved through incorporation into an inference procedure of prior knowledge, formulating such knowledge is often difficult. Furthermore, there are practical challenges associated with acquiring data of sufficient quantity and quality. Here, we discuss recent progress on these issues.


Subjects
Communicable Diseases/epidemiology , Health Policy , Models, Theoretical , Public Health/statistics & numerical data , Bayes Theorem , Humans , Models, Biological
2.
R Soc Open Sci ; 7(3): 191315, 2020 Mar.
Article in English | MEDLINE | ID: mdl-32269786

ABSTRACT

The behaviour of many processes in science and engineering can be accurately described by dynamical system models consisting of a set of ordinary differential equations (ODEs). Often these models have several unknown parameters that are difficult to estimate from experimental data, in which case Bayesian inference can be a useful tool. In principle, exact Bayesian inference using Markov chain Monte Carlo (MCMC) techniques is possible; however, in practice, such methods may suffer from slow convergence and poor mixing. To address this problem, several approaches based on approximate Bayesian computation (ABC) have been introduced, including Markov chain Monte Carlo ABC (MCMC ABC) and sequential Monte Carlo ABC (SMC ABC). While the system of ODEs describes the underlying process that generates the data, the observed measurements invariably include errors. In this paper, we argue that several popular ABC approaches fail to adequately model these errors because the acceptance probability depends on the choice of the discrepancy function and the tolerance without any consideration of the error term. We observe that the so-called posterior distributions derived from such methods do not accurately reflect the epistemic uncertainties in parameter values. Moreover, we demonstrate that these methods provide minimal computational advantages over exact Bayesian methods when applied to two ODE epidemiological models with simulated data and one with real data concerning malaria transmission in Afghanistan.
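
The abstract's central point, that acceptance in rejection-style ABC depends only on the discrepancy function and the tolerance, can be seen in a minimal Python sketch. Everything here is an illustrative assumption rather than the paper's setup: the toy ODE dx/dt = rate * x stands in for the epidemiological models, the prior is Uniform(0, 1), and all function names are hypothetical.

```python
import math
import random

def simulate(rate, times, x0=1.0):
    """Closed-form solution of the toy ODE dx/dt = rate * x (hypothetical stand-in model)."""
    return [x0 * math.exp(rate * t) for t in times]

def discrepancy(sim, obs):
    """Euclidean distance between simulated and observed trajectories."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)))

def abc_rejection(obs, times, tolerance, n_draws, rng):
    """Keep prior draws whose noise-free simulation lies within `tolerance` of the data.

    No measurement-error model appears anywhere: acceptance depends only on
    the discrepancy function and the tolerance.
    """
    accepted = []
    for _ in range(n_draws):
        rate = rng.uniform(0.0, 1.0)  # assumed Uniform(0, 1) prior
        if discrepancy(simulate(rate, times), obs) <= tolerance:
            accepted.append(rate)
    return accepted

rng = random.Random(1)
times = [0.0, 1.0, 2.0, 3.0, 4.0]
# Synthetic observations: true rate 0.5 plus Gaussian measurement noise (sd 0.2).
obs = [x + rng.gauss(0.0, 0.2) for x in simulate(0.5, times)]

tight = abc_rejection(obs, times, tolerance=1.0, n_draws=20000, rng=rng)
loose = abc_rejection(obs, times, tolerance=3.0, n_draws=20000, rng=rng)
```

In this sketch the spread of the accepted rates is set by the chosen tolerance, not by the noise level used to generate the data, which is consistent with the abstract's argument that such "posteriors" need not reflect the actual measurement uncertainty.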

3.
Sci Rep ; 9(1): 8938, 2019 06 20.
Article in English | MEDLINE | ID: mdl-31222114

ABSTRACT

Accurate delimitation of the geographic range of a species is important for control of biological invasions, conservation of threatened species, and understanding species range dynamics under environmental change. However, estimating range boundaries is challenging because monitoring methods are imperfect, the area that might contain individuals is often incompletely surveyed, and species may have patchy distributions. In these circumstances, large areas can be surveyed without finding individuals despite occupancy extending beyond surveyed areas, resulting in underestimation of range limits. We developed a delimitation method that can be applied with imperfect survey data and patchy distributions. The approach is to construct polygons indicative of the geographic range of a species. Each polygon is associated with a specific probability such that each interior point of the polygon has at least that posterior probability of being interior to the true boundary according to a Bayesian model. The method uses the posterior distribution of latent quantities derived from an agent-based Bayesian model and calculates the posterior distribution of the range as a derived quantity from Markov chain Monte Carlo samples. An application of this method described here informed the Australian campaign to eradicate red imported fire ants (Solenopsis invicta).
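
The idea of attaching a posterior probability to each point of a candidate range can be sketched as follows. This is a deliberately simplified Python illustration: each posterior sample of the range is represented as a disc (centre and radius), which is an assumption made for brevity, whereas the paper derives ranges from an agent-based Bayesian model. The function names are hypothetical.

```python
def inclusion_probability(samples, points):
    """Fraction of posterior samples whose range contains each point.

    `samples` is a list of (cx, cy, radius) discs standing in for sampled
    ranges (a hypothetical simplification of the MCMC output).
    """
    probs = {}
    for (x, y) in points:
        inside = sum(
            1 for (cx, cy, r) in samples
            if (x - cx) ** 2 + (y - cy) ** 2 <= r ** 2
        )
        probs[(x, y)] = inside / len(samples)
    return probs

def credible_region(probs, level):
    """Points whose posterior probability of lying inside the range is at least `level`."""
    return {p for p, pr in probs.items() if pr >= level}

# Four toy posterior samples: concentric discs of increasing radius.
samples = [(0.0, 0.0, 1.0), (0.0, 0.0, 2.0), (0.0, 0.0, 3.0), (0.0, 0.0, 4.0)]
points = [(x, 0) for x in range(0, 6)]

probs = inclusion_probability(samples, points)
region_50 = credible_region(probs, 0.5)    # {(0, 0), (1, 0), (2, 0), (3, 0)}
region_100 = credible_region(probs, 1.0)   # {(0, 0), (1, 0)}
```

Higher probability levels give smaller, nested regions; a polygon drawn around each region would play the role of the probability-labelled polygons described in the abstract.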

4.
Malar J ; 17(1): 299, 2018 Aug 17.
Article in English | MEDLINE | ID: mdl-30119664

ABSTRACT

BACKGROUND: Much of the extensive research regarding transmission of malaria is underpinned by mathematical modelling. Compartmental models, which focus on interactions and transitions between population strata, have been a mainstay of such modelling for more than a century. However, modellers are increasingly adopting agent-based approaches, which model hosts, vectors and/or their interactions on an individual level. One reason for the increasing popularity of such models is their potential to provide enhanced realism by allowing system-level behaviours to emerge as a consequence of accumulated individual-level interactions, as occurs in real populations. METHODS: A systematic review of 90 articles published between 1998 and May 2018 was performed, characterizing agent-based models (ABMs) relevant to malaria transmission. The review provides an overview of approaches used to date, determines the advantages of these approaches, and proposes ideas for progressing the field. RESULTS: The rationale for ABM use over other modelling approaches centres around three points: the need to accurately represent increased stochasticity in low-transmission settings; the benefits of high-resolution spatial simulations; and heterogeneities in drug and vaccine efficacies due to individual patient characteristics. The success of these approaches provides avenues for further exploration of agent-based techniques for modelling malaria transmission. Potential extensions include varying elimination strategies across spatial landscapes, extending the size of spatial models, incorporating human movement dynamics, and developing increasingly comprehensive parameter estimation and optimization techniques. CONCLUSION: Collectively, the literature covers an extensive array of topics, including the full spectrum of transmission and intervention regimes. Bringing these elements together under a common framework may enhance knowledge of, and guide policies towards, malaria elimination. 
However, because of the diversity of available models, endorsing a standardized approach to ABM implementation may not be possible. Instead it is recommended that model frameworks be contextually appropriate and sufficiently described. One key recommendation is to develop enhanced parameter estimation and optimization techniques. Extensions of current techniques will provide the robust results required to enhance current elimination efforts.


Subjects
Disease Transmission, Infectious , Host-Parasite Interactions , Malaria/transmission , Models, Statistical , Mosquito Vectors/physiology , Animals , Humans
5.
PLoS One ; 13(12): e0208927, 2018.
Article in English | MEDLINE | ID: mdl-30596668

ABSTRACT

Time series segmentation aims to identify segment boundary points in a time series, and to determine the dynamical properties corresponding to each segment. To segment time series data, this article presents a Bayesian change-point model in which the data within segments follows an autoregressive moving average (ARMA) model. A prior distribution is defined for the number of change-points, their positions, segment means and error terms. To quantify uncertainty about the location of change-points, the resulting posterior probability distributions are sampled using the Generalized Gibbs sampler Markov chain Monte Carlo technique. This methodology is illustrated by applying it to simulated data and to real data known as the well-log time series data. This well-log data records the measurements of nuclear magnetic response of underground rocks during the drilling of a well. Our approach has high sensitivity, and detects a larger number of change-points than have been identified by comparable methods in the existing literature.
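
To make the notion of a posterior over change-point locations concrete, here is a much-reduced Python sketch. It is not the paper's method: it assumes exactly one change in mean, independent Gaussian noise with known standard deviation (no ARMA structure), a uniform prior on the change position, and flat priors on the two segment means, which are integrated out analytically. No Generalized Gibbs sampling is needed in this toy case because the posterior can be enumerated.

```python
import math

def segment_ss(xs):
    """Sum of squared deviations of a segment from its own mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)

def changepoint_posterior(data, sigma):
    """Posterior over the index k at which the mean changes (data[:k] vs data[k:]).

    With flat priors on the segment means and known sigma, the marginal
    likelihood of a split is proportional to
    (n_left * n_right) ** -0.5 * exp(-(SS_left + SS_right) / (2 * sigma**2)).
    """
    n = len(data)
    log_post = {}
    for k in range(1, n):
        left, right = data[:k], data[k:]
        log_post[k] = (
            -0.5 * math.log(len(left))
            - 0.5 * math.log(len(right))
            - (segment_ss(left) + segment_ss(right)) / (2 * sigma ** 2)
        )
    top = max(log_post.values())               # subtract the max for numerical stability
    weights = {k: math.exp(v - top) for k, v in log_post.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

data = [0.1, -0.2, 0.0, 0.2, -0.1, 3.1, 2.9, 3.0, 3.2, 2.8]
post = changepoint_posterior(data, sigma=0.5)
best = max(post, key=post.get)   # most probable change position
```

For models with an unknown number of change-points, as in the abstract, the posterior can no longer be enumerated this way, which is why sampling techniques are used there.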


Subjects
Bayes Theorem , Magnetic Resonance Spectroscopy/statistics & numerical data , Models, Statistical , Humans , Markov Chains , Monte Carlo Method
6.
PLoS One ; 12(3): e0173331, 2017.
Article in English | MEDLINE | ID: mdl-28288164

ABSTRACT

Small molecule inhibitors, such as lapatinib, are effective against breast cancer in clinical trials, but tumor cells ultimately acquire resistance to the drug. Maintaining sensitization to drug action is essential for durable growth inhibition. Recently, adaptive reprogramming of signaling circuitry has been identified as a major cause of acquired resistance. We developed a computational framework using a Bayesian statistical approach to model signal rewiring in acquired resistance. We used the p1-model to infer potential aberrant gene-pairs with differential posterior probabilities of appearing in resistant-vs-parental networks. Results were obtained using matched gene expression profiles under resistant and parental conditions. Using two lapatinib-treated ErbB2-positive breast cancer cell-lines: SKBR3 and BT474, our method identified similar dysregulated signaling pathways including EGFR-related pathways as well as other receptor-related pathways, many of which were reported previously as compensatory pathways of EGFR-inhibition via signaling cross-talk. A manual literature survey provided strong evidence that aberrant signaling activities in dysregulated pathways are closely related to acquired resistance in EGFR tyrosine kinase inhibitors. Our approach predicted literature-supported dysregulated pathways complementary to both node-centric (SPIA, DAVID, and GATHER) and edge-centric (ESEA and PAGI) methods. Moreover, by proposing a novel pattern of aberrant signaling called V-structures, we observed that genes were dysregulated in resistant-vs-sensitive conditions when they were involved in the switch of dependencies from targeted to bypass signaling events. 
A literature survey of some important V-structures suggested they play a role in breast cancer metastasis and/or acquired resistance to EGFR-TKIs, where the mRNA changes of TGFBR2, LEF1 and TP53 in resistant-vs-sensitive conditions were related to the dependency switch from targeted to bypass signaling links. Our results suggest many signaling pathway structures are compromised in acquired resistance, and V-structures of aberrant signaling within/among those pathways may provide further insights into the bypass mechanism of targeted inhibition.


Subjects
Antineoplastic Agents/therapeutic use , Bayes Theorem , Breast Neoplasms/drug therapy , Drug Resistance, Neoplasm/genetics , Gene Expression Regulation, Neoplastic , Breast Neoplasms/genetics , Female , Humans , Probability
7.
BMC Genomics ; 18(1): 259, 2017 03 27.
Article in English | MEDLINE | ID: mdl-28347272

ABSTRACT

BACKGROUND: Computational identification of non-coding RNAs (ncRNAs) is a challenging problem. We describe a genome-wide analysis using Bayesian segmentation to identify intronic elements highly conserved between three evolutionarily distant vertebrate species: human, mouse and zebrafish. We investigate the extent to which these elements include ncRNAs (or conserved domains of ncRNAs) and regulatory sequences. RESULTS: We identified 655 deeply conserved intronic sequences in a genome-wide analysis. We also performed a pathway-focussed analysis on genes involved in muscle development, detecting 27 intronic elements, of which 22 were not detected in the genome-wide analysis. At least 87% of the genome-wide and 70% of the pathway-focussed elements have existing annotations indicative of conserved RNA secondary structure. The expression of 26 of the pathway-focused elements was examined using RT-PCR, providing confirmation that they include expressed ncRNAs. Consistent with previous studies, these elements are significantly over-represented in the introns of transcription factors. CONCLUSIONS: This study demonstrates a novel, highly effective, Bayesian approach to identifying conserved non-coding sequences. Our results complement previous findings that these sequences are enriched in transcription factors. However, in contrast to previous studies which suggest the majority of conserved sequences are regulatory factor binding sites, the majority of conserved sequences identified using our approach contain evidence of conserved RNA secondary structures, and our laboratory results suggest most are expressed. Functional roles at DNA and RNA levels are not mutually exclusive, and many of our elements possess evidence of both. Moreover, ncRNAs play roles in transcriptional and post-transcriptional regulation, and this may contribute to the over-representation of these elements in introns of transcription factors. 
We attribute the higher sensitivity of the pathway-focussed analysis compared to the genome-wide analysis to improved alignment quality, suggesting that enhanced genomic alignments may reveal many more conserved intronic sequences.


Subjects
Genome , RNA, Untranslated/metabolism , Animals , Bayes Theorem , Binding Sites , Conserved Sequence , Humans , Introns , Mice , Muscle Development/genetics , Nucleic Acid Conformation , RNA, Untranslated/chemistry , RNA, Untranslated/genetics , Transcription Factors/genetics , Transcription Factors/metabolism , User-Computer Interface , Zebrafish/genetics
8.
Methods Mol Biol ; 1525: E1, 2017.
Article in English | MEDLINE | ID: mdl-28220404
9.
Methods Mol Biol ; 1525: 293-312, 2017.
Article in English | MEDLINE | ID: mdl-27896726

ABSTRACT

Many biological sequences have a segmental structure that can provide valuable clues to their content, structure, and function. The program changept is a tool for investigating the segmental structure of a sequence, and can also be applied to multiple sequences in parallel to identify a common segmental structure, thus providing a method for integrating multiple data types to identify functional elements in genomes. In the previous edition of this book, a command line interface for changept is described. Here we present a graphical user interface for this package, called changeptGUI. This interface also includes tools for pre- and post-processing of data and results to facilitate investigation of the number and characteristics of segment classes.


Subjects
Computational Biology/methods , Genome/genetics , Software , User-Computer Interface
10.
PLoS One ; 10(3): e0118595, 2015.
Article in English | MEDLINE | ID: mdl-25739023

ABSTRACT

Wolbachia pipientis is an endosymbiotic bacterium that induces a wide range of effects in its insect hosts, including manipulation of reproduction and protection against pathogens. Little is known of the molecular mechanisms underlying the insect-Wolbachia interaction, though it is likely to be mediated via the secretion of proteins or other factors. There is an increasing amount of evidence that bacteria regulate many cellular processes, including secretion of virulence factors, using small non-coding RNAs (sRNAs), but sRNAs have not previously been described from Wolbachia. We have used two independent approaches, one based on comparative genomics and the other using RNA-Seq data generated for gene expression studies, to identify candidate sRNAs in Wolbachia. We experimentally characterized the expression of one of these candidates in four Wolbachia strains, and showed that it is differentially regulated in different host tissues and sexes. Given the roles played by sRNAs in other host-associated bacteria, the conservation of the candidate sRNAs between different Wolbachia strains, and the sex- and tissue-specific differential regulation we have identified, we hypothesise that sRNAs may play a significant role in the biology of Wolbachia, and in particular in its interactions with its host.


Subjects
Intracellular Space/microbiology , RNA, Small Untranslated/genetics , Wolbachia/genetics , Wolbachia/physiology , Animals , Computational Biology , Conserved Sequence , Drosophila melanogaster/microbiology , Female , Host Specificity , Male , Organ Specificity , RNA, Messenger/genetics , RNA, Messenger/metabolism , Sequence Analysis, RNA , Transcription, Genetic
11.
BMC Syst Biol ; 9: 2, 2015 Jan 20.
Article in English | MEDLINE | ID: mdl-25599599

ABSTRACT

BACKGROUND: Initial success of inhibitors targeting oncogenes is often followed by tumor relapse due to acquired resistance. In addition to mutations in targeted oncogenes, signaling cross-talks among pathways play a vital role in such drug inefficacy. These include activation of compensatory pathways and altered activities of key effectors in other cell survival and growth-associated pathways. RESULTS: We propose a computational framework using Bayesian modeling to systematically characterize potential cross-talks among breast cancer signaling pathways. We employed a fully Bayesian approach known as the p1-model to infer posterior probabilities of gene-pairs in networks derived from the gene expression datasets of ErbB2-positive breast cancer cell-lines (parental, lapatinib-sensitive cell-line SKBR3 and the lapatinib-resistant cell-line SKBR3-R, derived from SKBR3). Using this computational framework, we searched for cross-talks between EGFR/ErbB and other signaling pathways from Reactome, KEGG and WikiPathway databases that contribute to lapatinib resistance. We identified 104, 188 and 299 gene-pairs as putative drug-resistant cross-talks, respectively, each comprised of a gene in the EGFR/ErbB signaling pathway and a gene from another signaling pathway, that appear to be interacting in resistant cells but not in parental cells. In 168 of these (distinct) gene-pairs, both of the interacting partners are up-regulated in resistant conditions relative to parental conditions. These gene-pairs are prime candidates for novel cross-talks contributing to lapatinib resistance. They associate EGFR/ErbB signaling with six other signaling pathways: Notch, Wnt, GPCR, hedgehog, insulin receptor/IGF1R and TGF-β receptor signaling. We conducted a literature survey to validate these cross-talks, and found evidence supporting a role for many of them in contributing to drug resistance.
We also analyzed an independent study of lapatinib resistance in the BT474 breast cancer cell-line and found the same signaling pathways making cross-talks with the EGFR/ErbB signaling pathway as in the primary dataset. CONCLUSIONS: Our results indicate that the activation of compensatory pathways can potentially cause up-regulation of EGFR/ErbB pathway genes (counteracting the inhibiting effect of lapatinib) via signaling cross-talk. Thus, the up-regulated members of these compensatory pathways along with the members of the EGFR/ErbB signaling pathway are interesting as potential targets for designing novel anti-cancer therapeutics.


Subjects
Breast Neoplasms/pathology , Computational Biology/methods , Drug Resistance, Neoplasm , Models, Statistical , Signal Transduction/drug effects , Bayes Theorem , Cell Line, Tumor , ErbB Receptors/metabolism , Humans , Receptor, IGF Type 1 , Receptor, Insulin/metabolism , Receptors, Notch/metabolism , Receptors, Somatomedin/metabolism , Wnt Signaling Pathway/drug effects
12.
Cladistics ; 31(4): 438-440, 2015 Aug.
Article in English | MEDLINE | ID: mdl-34772263

ABSTRACT

A recent article published in Cladistics is critical of a number of heuristic methods for phylogenetic inference based on parsimony scores. One of my papers is among those criticized, and I would appreciate the opportunity to make a public response. The specific criticism is that I have re-invented an algorithm for economizing parsimony calculations on trees that differ by a subtree pruning and regrafting (SPR) rearrangement. This criticism is justified, and I apologize for incorrectly claiming originality for my presentation of this algorithm. However, I would like to clarify the intent of my paper, if I can do so without detracting from the sincerity of my apology. My paper is not about that algorithm, nor even primarily about parsimony. Rather, it is about a novel strategy for Markov chain Monte Carlo (MCMC) sampling in a state space consisting of trees. The sampler involves drawing from conditional distributions over sets of trees: a Gibbs-like strategy that had not previously been used to sample tree-space. I would like to see this technique incorporated into MCMC samplers for phylogenetics, as it may have advantages over commonly used Metropolis-like strategies. I have recently used it to sample phylogenies of a biological invasion, and I am finding many applications for it in agent-based Bayesian ecological modelling. It is thus my contention that my 2005 paper retains substantial value.

13.
Comput Struct Biotechnol J ; 10(17): 107-15, 2014 Jul.
Article in English | MEDLINE | ID: mdl-25349679

ABSTRACT

Genomes are composed of a wide variety of elements with distinct roles and characteristics. Some of these elements are well-characterised functional components such as protein-coding exons. Other elements play regulatory or structural roles, encode functional non-protein-coding RNAs, or perform some other function yet to be characterised. Still others may have no functional importance, though they may nevertheless be of interest to biologists. One technique for investigating the composition of genomes is to segment sequences into compositionally homogenous blocks. This technique, known as 'sequence segmentation' or 'change-point analysis', is used to identify patterns of variation across genomes such as GC-rich and GC-poor regions, coding and non-coding regions, slowly evolving and rapidly evolving regions and many other types of variation. In this mini-review we outline many of the genome segmentation methods currently available and then focus on a Bayesian DNA segmentation algorithm, with examples of its various applications.

14.
PLoS One ; 9(5): e97336, 2014.
Article in English | MEDLINE | ID: mdl-24824035

ABSTRACT

The 3' UTRs of eukaryotic genes participate in a variety of post-transcriptional (and some transcriptional) regulatory interactions. Some of these interactions are well characterised, but an undetermined number remain to be discovered. While some regulatory sequences in 3' UTRs may be conserved over long evolutionary time scales, others may have only ephemeral functional significance as regulatory profiles respond to changing selective pressures. Here we propose a sensitive segmentation methodology for investigating patterns of composition and conservation in 3' UTRs based on comparison of closely related species. We describe encodings of pairwise and three-way alignments integrating information about conservation, GC content and transition/transversion ratios and apply the method to three closely related Drosophila species: D. melanogaster, D. simulans and D. yakuba. Incorporating multiple data types greatly increased the number of segment classes identified compared to similar methods based on conservation or GC content alone. We propose that the number of segments and number of types of segment identified by the method can be used as proxies for functional complexity. Our main finding is that the number of segments and segment classes identified in 3' UTRs is greater than in the same length of protein-coding sequence, suggesting greater functional complexity in 3' UTRs. There is thus a need for sustained and extensive efforts by bioinformaticians to delineate functional elements in this important genomic fraction. C code, data and results are available upon request.
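
One way to picture an encoding of a pairwise alignment that combines conservation, GC content and transition/transversion information is the Python sketch below. The five-symbol alphabet (W, S, T, V, I) is a hypothetical scheme invented for illustration and is not the encoding used in the paper; it also assumes the input contains only A, C, G, T and the gap character.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def encode_column(x, y):
    """Map one alignment column to a single symbol (hypothetical alphabet).

    I = indel, S = conserved G/C, W = conserved A/T,
    T = transition (purine<->purine or pyrimidine<->pyrimidine),
    V = transversion (purine<->pyrimidine).
    """
    if x == "-" or y == "-":
        return "I"
    if x == y:
        return "S" if x in ("G", "C") else "W"
    same_class = ({x, y} <= PURINES) or ({x, y} <= PYRIMIDINES)
    return "T" if same_class else "V"

def encode_alignment(seq1, seq2):
    """Encode two aligned sequences of equal length, column by column."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    return "".join(encode_column(a, b) for a, b in zip(seq1.upper(), seq2.upper()))

encoded = encode_alignment("ACGT-ACG", "ACATTCCA")
# columns: A/A C/C G/A T/T -/T A/C C/C G/A  ->  "WSTWIVST"
```

A segmentation method would then be run on the encoded string, so that segment classes can differ in conservation, base composition and substitution type at the same time.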


Subjects
3' Untranslated Regions/genetics , Drosophila/genetics , Genetic Variation , Models, Genetic , Animals , Base Sequence , Computational Biology , Molecular Sequence Data , Open Reading Frames/genetics , Species Specificity
15.
Proc Natl Acad Sci U S A ; 110(33): 13428-33, 2013 Aug 13.
Article in English | MEDLINE | ID: mdl-23878210

ABSTRACT

Eradication of an invasive species can provide significant environmental, economic, and social benefits, but eradication programs often fail. Constant and careful monitoring improves the chance of success, but an invasion may seem to be in decline even when it is expanding in abundance or spatial extent. Determining whether an invasion is in decline is a challenging inference problem for two reasons. First, it is typically infeasible to regularly survey the entire infested region owing to high cost. Second, surveillance methods are imperfect and fail to detect some individuals. These two factors also make it difficult to determine why an eradication program is failing. Agent-based methods enable inferences to be made about the locations of undiscovered individuals over time to identify trends in invader abundance and spatial extent. We develop an agent-based Bayesian method and apply it to Australia's largest eradication program: the campaign to eradicate the red imported fire ant (Solenopsis invicta) from Brisbane. The invasion was deemed to be almost eradicated in 2004 but our analyses indicate that its geographic range continued to expand despite a sharp decline in number of nests. We also show that eradication would probably have been achieved with a relatively small increase in the area searched and treated. Our results demonstrate the importance of inferring temporal and spatial trends in ongoing invasions. The method can handle incomplete observations and takes into account the effects of human intervention. It has the potential to transform eradication practices.


Subjects
Ants/physiology , Conservation of Natural Resources/methods , Environmental Monitoring/methods , Insect Control/methods , Introduced Species/statistics & numerical data , Models, Biological , Animals , Bayes Theorem , Population Dynamics , Queensland
16.
BMC Bioinformatics ; 13: 179, 2012 Jul 27.
Article in English | MEDLINE | ID: mdl-22838505

ABSTRACT

BACKGROUND: Many problems in bioinformatics involve classification based on features such as sequence, structure or morphology. Given multiple classifiers, two crucial questions arise: how does their performance compare, and how can they best be combined to produce a better classifier? A classifier can be evaluated in terms of sensitivity and specificity using benchmark, or gold standard, data, that is, data for which the true classification is known. However, a gold standard is not always available. Here we demonstrate that a Bayesian model for comparing medical diagnostics without a gold standard can be successfully applied in the bioinformatics domain, to genomic scale data sets. We present a new implementation, which unlike previous implementations is applicable to any number of classifiers. We apply this model, for the first time, to the problem of finding the globally optimal logical combination of classifiers. RESULTS: We compared three classifiers of protein subcellular localisation, and evaluated our estimates of sensitivity and specificity against estimates obtained using a gold standard. The method overestimated sensitivity and specificity with only a small discrepancy, and correctly ranked the classifiers. Diagnostic tests for swine flu were then compared on a small data set. Lastly, classifiers for a genome-wide association study of macular degeneration with 541094 SNPs were analysed. In all cases, run times were feasible, and results precise. The optimal logical combination of classifiers was also determined for all three data sets. Code and data are available from http://bioinformatics.monash.edu.au/downloads/. CONCLUSIONS: The examples demonstrate the methods are suitable for both small and large data sets, applicable to the wide range of bioinformatics classification problems, and robust to dependence between classifiers. 
In all three test cases, the globally optimal logical combination of the classifiers was found to be their union, according to three out of four ranking criteria. We propose as a general rule of thumb that the union of classifiers will be close to optimal.
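
The "union of classifiers" rule of thumb can be illustrated with a small Python sketch. Note the key difference from the paper: here sensitivity and specificity are computed against a known toy truth vector, whereas the paper estimates them with a Bayesian model when no gold standard is available. The data and function names are made up for illustration.

```python
def union(*predictions):
    """Logical OR of several binary classifiers: positive if any classifier says positive."""
    return [int(any(column)) for column in zip(*predictions)]

def sensitivity_specificity(pred, truth):
    """Sensitivity (true positive rate) and specificity (true negative rate)."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    positives = sum(truth)
    negatives = len(truth) - positives
    return tp / positives, tn / negatives

# Toy data: 4 true positives followed by 4 true negatives.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
clf_a = [1, 1, 0, 0, 0, 0, 0, 1]
clf_b = [0, 1, 1, 0, 0, 0, 0, 0]

combined = union(clf_a, clf_b)                           # [1, 1, 1, 0, 0, 0, 0, 1]
sens_a, spec_a = sensitivity_specificity(clf_a, truth)   # (0.5, 0.75)
sens_u, spec_u = sensitivity_specificity(combined, truth)  # (0.75, 0.75)
```

Taking the union can only raise sensitivity and can only lower specificity relative to each component, so whether it is close to optimal depends on the ranking criterion, as the abstract notes.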


Subjects
Computational Biology/methods , Algorithms , Bayes Theorem , Classification/methods , Genome-Wide Association Study , Humans , Macular Degeneration/genetics , Polymorphism, Single Nucleotide , Proteins/analysis , Sensitivity and Specificity
17.
PLoS One ; 7(3): e33565, 2012.
Article in English | MEDLINE | ID: mdl-22479412

ABSTRACT

Glial fibrillary acidic protein (GFAP) is an intermediate filament (IF) protein specific to central nervous system (CNS) astrocytes. It has been the subject of intense interest due to its association with neurodegenerative diseases, and because of growing evidence that IF proteins not only modulate cellular structure, but also cellular function. Moreover, GFAP has a family of splicing isoforms apparently more complex than that of other CNS IF proteins, consistent with it possessing a range of functional and structural roles. The gene consists of 9 exons, and to date all isoforms associated with 3' end splicing have been identified from modifications within intron 7, resulting in the generation of exon 7a (GFAPδ/ε) and 7b (GFAPκ). To better understand the nature and functional significance of variation in this region, we used a Bayesian multiple change-point approach to identify conserved regions. This is the first successful application of this method to a single gene--it has previously only been used in whole-genome analyses. We identified several highly or moderately conserved regions throughout the intron 7/7a/7b regions, including untranslated regions and regulatory features, consistent with the biology of GFAP. Several putative unconfirmed features were also identified, including a possible new isoform. We then integrated multiple computational analyses on both the DNA and protein sequences from the mouse, rat and human, showing that the major isoform, GFAPα, has highly conserved structure and features across the three species, whereas the minor isoforms GFAPδ/ε and GFAPκ have low conservation of structure and features at the distal 3' end, both relative to each other and relative to GFAPα. The overall picture suggests distinct and tightly regulated functions for the 3' end isoforms, consistent with complex astrocyte biology. The results illustrate a computational approach for characterising splicing isoform families, using both DNA and protein sequences.


Subjects
Computational Biology/methods , Glial Fibrillary Acidic Protein/chemistry , Alternative Splicing , Amino Acid Sequence , Animals , Base Sequence , Conserved Sequence , Exons , Glial Fibrillary Acidic Protein/genetics , Humans , Hydrophobic and Hydrophilic Interactions , Mice , Molecular Sequence Data , Phosphorylation , Protein Isoforms/chemistry , Protein Isoforms/genetics , RNA Splice Sites , Rats , Regulatory Elements, Transcriptional
18.
Bioinformatics ; 27(5): 604-10, 2011 Mar 01.
Article in English | MEDLINE | ID: mdl-21208984

ABSTRACT

MOTIVATION: The analysis of multiple sequence alignments is allowing researchers to glean valuable insights into evolution, as well as identify genomic regions that may be functional, or discover novel classes of functional elements. Understanding the distribution of conservation levels that constitutes the evolutionary landscape is crucial to distinguishing functional regions from non-functional. Recent evidence suggests that a binary classification of evolutionary rates is inappropriate for this purpose and finds only highly conserved functional elements. Given that the distribution of evolutionary rates is multi-modal, determining the number of modes is of paramount concern. Through simulation, we evaluate the performance of a number of information criterion approaches derived from MCMC simulations in determining the dimension of a model. RESULTS: We utilize a deviance information criterion (DIC) approximation that is more robust than the approximations from other information criteria, and show our information criteria approximations do not produce superfluous modes when estimating conservation distributions under a variety of circumstances. We analyse the distribution of conservation for a multiple alignment comprising four primate species and mouse, and repeat this on two additional multiple alignments of similar species. We find evidence of six distinct classes of evolutionary rates that appear to be robust to the species used. AVAILABILITY: Source code and data are available at http://dl.dropbox.com/u/477240/changept.zip.
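
For readers unfamiliar with the deviance information criterion, the Python sketch below computes the standard form, DIC = mean deviance + p_D with p_D = mean deviance minus the deviance at the posterior mean, from a set of MCMC samples. It uses a toy Gaussian model with known standard deviation as an assumption; the abstract refers to a DIC approximation whose details are not given here, so this should not be read as the authors' formula.

```python
import math

def deviance(mu, data, sigma=1.0):
    """Deviance D(mu) = -2 * log-likelihood for i.i.d. Gaussian data with known sigma."""
    log_lik = sum(
        -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)
        for x in data
    )
    return -2.0 * log_lik

def dic(samples, data, sigma=1.0):
    """Return (DIC, p_D) from posterior samples of the mean.

    p_D, the effective number of parameters, is the mean deviance minus the
    deviance evaluated at the posterior mean of the parameter.
    """
    mean_dev = sum(deviance(m, data, sigma) for m in samples) / len(samples)
    mu_bar = sum(samples) / len(samples)
    p_d = mean_dev - deviance(mu_bar, data, sigma)
    return mean_dev + p_d, p_d

data = [0.0, 1.0, 2.0]
samples = [0.5, 1.0, 1.5]        # stand-in for MCMC draws of the mean
dic_value, p_d = dic(samples, data)
```

When comparing models of different dimension, such as mixtures with different numbers of rate classes, the model with the lower DIC is preferred, with p_D acting as the complexity penalty.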


Subjects
Evolution, Molecular; Models, Statistical; Sequence Alignment/methods; Sequence Analysis, DNA/methods; Algorithms; Animals; Bayes Theorem; Computational Biology/methods; Computer Simulation; DNA/analysis; Genomics/methods; Mice; Primates
19.
Mol Biol Evol ; 27(4): 942-53, 2010 Apr.
Article in English | MEDLINE | ID: mdl-19955480

ABSTRACT

The proportion of functional sequence in the human genome is currently a subject of debate. The most widely accepted figure is that approximately 5% is under purifying selection. In Drosophila, estimates are an order of magnitude higher, though this corresponds to a similar quantity of sequence. These estimates depend on the difference between the distribution of genomewide evolutionary rates and that observed in a subset of sequences presumed to be neutrally evolving. Motivated by the widening gap between these estimates and experimental evidence of genome function, especially in mammals, we developed a sensitive technique for evaluating such distributions and found that they are much more complex than previously apparent. We found strong evidence for at least nine well-resolved evolutionary rate classes in an alignment of four Drosophila species and at least seven classes in an alignment of four mammals, including human. We also identified at least three rate classes in human ancestral repeats. By positing that the largest of these ancestral repeat classes is neutrally evolving, we estimate that the proportion of nonneutrally evolving sequence is 30% of human ancestral repeats and 45% of the aligned portion of the genome. However, we also question whether any of the classes represent neutrally evolving sequences and argue that a plausible alternative is that they reflect variable structure-function constraints operating throughout the genomes of complex organisms.


Subjects
Drosophila/genetics; Mammals/genetics; Animals; Conserved Sequence; Evolution, Molecular; Genome, Human; Humans; Recombination, Genetic; Sequence Alignment
20.
Hum Genet ; 126(2): 277-88, 2009 Aug.
Article in English | MEDLINE | ID: mdl-19390863

ABSTRACT

Definition of disease phenotype is a necessary preliminary to research into genetic causes of a complex disease. Clinical diagnosis of migraine is currently based on diagnostic criteria developed by the International Headache Society. Previously, we examined the natural clustering of these diagnostic symptoms using latent class analysis (LCA) and found that a four-class model was preferred. However, the classes can be ordered such that all symptoms progressively intensify, suggesting that a single continuous variable representing disease severity may provide a better model. Here, we compare two models: item response theory and LCA, each constructed within a Bayesian context. A deviance information criterion is used to assess model fit. We phenotyped our population sample using these models, estimated heritability and conducted genome-wide linkage analysis using Merlin-qtl. LCA with four classes was again preferred. After transformation, phenotypic trait values derived from both models are highly correlated (correlation = 0.99) and consequently results from subsequent genetic analyses were similar. Heritability was estimated at 0.37, while multipoint linkage analysis produced genome-wide significant linkage to chromosome 7q31-q33 and suggestive linkage to chromosomes 1 and 2. We argue that such continuous measures are a powerful tool for identifying genes contributing to migraine susceptibility.


Subjects
Migraine Disorders/diagnosis; Migraine Disorders/genetics; Adult; Aged; Aged, 80 and over; Bayes Theorem; Cluster Analysis; Diseases in Twins; Female; Genetic Linkage; Genetic Predisposition to Disease; Humans; Lod Score; Male; Middle Aged; Migraine Disorders/physiopathology; Phenotype
...