Results 1 - 13 of 13
1.
BMC Bioinformatics ; 18(1): 199, 2017 Mar 31.
Article in English | MEDLINE | ID: mdl-28359297

ABSTRACT

BACKGROUND: Factor graphs provide a flexible and general framework for specifying probability distributions. They can capture a range of popular and recent models for the analysis of genomics data as well as data from other scientific fields. Owing to the ever larger data sets encountered in genomics and the multiple-testing issues accompanying them, accurate significance evaluation is of great importance. Here we address the problem of evaluating the statistical significance of observations from factor graph models. RESULTS: Two novel numerical approximations for evaluating statistical significance are presented: first, a method using importance sampling; second, a method based on the saddlepoint approximation. We develop algorithms to compute the approximations efficiently and compare them to naive sampling and the normal approximation. The individual merits of the methods are analysed both from a theoretical viewpoint and with simulations. A guideline for choosing between the normal approximation, the saddlepoint approximation and importance sampling is also provided. Finally, the applicability of the methods is demonstrated with examples from cancer genomics, motif analysis and phylogenetics. CONCLUSIONS: The applicability of the saddlepoint approximation and importance sampling is demonstrated on known models in the factor graph framework. Using the two methods we can substantially reduce computational cost without compromising accuracy. This contribution allows analyses of large datasets in the general factor graph framework.
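As a rough illustration of why importance sampling can beat naive sampling for small p-values, the sketch below estimates a Gaussian tail probability both ways. It is a toy example under stated assumptions, not the factor graph algorithm of the paper: the standard normal target, the shifted proposal N(threshold, 1), and all function names are hypothetical choices made here for illustration.

```python
import math
import random


def naive_tail_estimate(threshold, n_samples, rng):
    """Estimate P(Z > threshold) for Z ~ N(0, 1) by direct sampling."""
    hits = sum(1 for _ in range(n_samples) if rng.gauss(0.0, 1.0) > threshold)
    return hits / n_samples


def importance_tail_estimate(threshold, n_samples, rng):
    """Estimate P(Z > threshold) by sampling from N(threshold, 1) and reweighting."""
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(threshold, 1.0)
        if x > threshold:
            # Density ratio N(0, 1) / N(threshold, 1) evaluated at x.
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n_samples


def exact_tail(threshold):
    """Exact upper tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


rng = random.Random(0)
exact = exact_tail(4.0)
naive = naive_tail_estimate(4.0, 20_000, rng)
weighted = importance_tail_estimate(4.0, 20_000, rng)
print(f"exact={exact:.3e} naive={naive:.3e} importance={weighted:.3e}")
```

With a tail probability of order 1e-5, a naive run of this size will typically see no exceedances at all, whereas the reweighted estimator draws roughly half of its samples from the region of interest.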


Subjects
Algorithms; Computational Biology/methods; Models, Theoretical; Amino Acid Sequence; CCCTC-Binding Factor; Genomics; Humans; MCF-7 Cells; Neoplasms/diagnosis; Neoplasms/genetics; Phylogeny; Probability; Protein Interaction Domains and Motifs; Repressor Proteins; Sequence Alignment
2.
Theor Popul Biol ; 98: 48-58, 2014 Dec.
Article in English | MEDLINE | ID: mdl-24486389

ABSTRACT

The coalescent with recombination process was initially formulated backwards in time, but simulation algorithms and inference procedures often apply along sequences. It is therefore of major interest to approximate the coalescent with recombination process by a Markov chain along sequences. We consider the finite loci case and two or more sequences. We formulate a natural Markovian approximation for the tree building process along the sequences, and derive simple and analytically tractable formulae for the distribution of the tree at the next locus conditioned on the tree at the present locus. We compare our Markov approximation to other sequential Markov chains and discuss various applications.


Subjects
Genetics, Population/methods; Models, Genetic; Population Density; Algorithms; Computer Simulation; Genetic Loci; Humans; Markov Chains; Models, Theoretical; Probability; Recombination, Genetic
3.
Nat Genet ; 33(1): 90-6, 2003 Jan.
Article in English | MEDLINE | ID: mdl-12469123

ABSTRACT

Bladder cancer is a common malignant disease characterized by frequent recurrences. The stage of disease at diagnosis and the presence of surrounding carcinoma in situ are important in determining the disease course of an affected individual. Despite considerable effort, no accepted immunohistological or molecular markers have been identified to define clinically relevant subsets of bladder cancer. Here we report the identification of clinically relevant subclasses of bladder carcinoma using expression microarray analysis of 40 well characterized bladder tumors. Hierarchical cluster analysis identified three major stages, Ta, T1 and T2-4, with the Ta tumors further classified into subgroups. We built a 32-gene molecular classifier using a cross-validation approach that was able to classify benign and muscle-invasive tumors with close correlation to pathological staging in an independent test set of 68 tumors. The classifier provided new predictive information on disease progression in Ta tumors compared with conventional staging (P < 0.005). To delineate non-recurring Ta tumors from frequently recurring Ta tumors, we analyzed expression patterns in 31 tumors by applying a supervised learning classification methodology, which classified 75% of the samples correctly (P < 0.006). Furthermore, gene expression profiles characterizing each stage and subtype identified their biological properties, producing new potential targets for therapy.


Subjects
Gene Expression Profiling; Gene Expression Regulation, Neoplastic; Oligonucleotide Array Sequence Analysis/methods; Urinary Bladder Neoplasms/classification; Urinary Bladder Neoplasms/genetics; Disease Progression; Genes, Neoplasm; Humans; Immunohistochemistry; Neoplasm Staging; Prognosis; RNA, Messenger/genetics; RNA, Messenger/metabolism; RNA, Neoplasm/analysis; RNA, Neoplasm/genetics; Reproducibility of Results; Urinary Bladder Neoplasms/diagnosis; Urinary Bladder Neoplasms/pathology
4.
BMC Endocr Disord ; 9: 7, 2009 Feb 25.
Article in English | MEDLINE | ID: mdl-19243577

ABSTRACT

BACKGROUND: Beta-cells are extremely rich in zinc, and zinc homeostasis is regulated by zinc transporter proteins. Beta-cells are sensitive to cytokines; interleukin-1beta (IL-1beta) has been associated with beta-cell dysfunction and death in both type 1 and type 2 diabetes. This study explores the regulation of zinc transporters following cytokine exposure. METHODS: The effects of the cytokines IL-1beta, interferon-gamma (IFN-gamma), and tumor necrosis factor-alpha (TNF-alpha) on zinc transporter gene expression were measured in INS-1 cells and rat pancreatic islets. Because ZnT8 (Slc30A8) was the most sensitive transporter, we explored it further: the effect of ZnT8 overexpression on cytokine-induced apoptosis was investigated, as well as the expression of the insulin gene and two apoptosis-associated genes, BAX and BCL2. RESULTS: Our results showed a dynamic response to cytokines of the genes responsible for beta-cell zinc homeostasis: IL-1beta downregulated a number of zinc transporters, most strikingly ZnT8, in both islets and INS-1 cells. The effect was even more pronounced when the cytokines were mixed. TNF-alpha had little effect on zinc transporter expression. IFN-gamma downregulated a number of zinc transporters. Insulin expression was downregulated by all cytokines. ZnT8-overexpressing cells were more sensitive to IL-1beta-induced apoptosis, whereas no differences were observed with IFN-gamma, TNF-alpha, or a mixture of cytokines. CONCLUSION: The zinc transporting system in beta-cells is influenced by exposure to cytokines. In particular ZnT8, which has been associated with the development of diabetes, seems to be cytokine sensitive.

5.
Psychometrika ; 84(2): 484-510, 2019 Jun.
Article in English | MEDLINE | ID: mdl-29951971

ABSTRACT

In educational and psychological measurement, researchers and/or practitioners are often interested in examining whether the ability of an examinee is the same over two sets of items. Such problems can arise in measurement of change, detection of cheating on unproctored tests, erasure analysis, detection of item preknowledge, etc. Traditional frequentist approaches that are used in such problems include the Wald test, the likelihood ratio test, and the score test (e.g., Fischer, Appl Psychol Meas 27:3-26, 2003; Finkelman, Weiss, & Kim-Kang, Appl Psychol Meas 34:238-254, 2010; Glas & Dagohoy, Psychometrika 72:159-180, 2007; Guo & Drasgow, Int J Sel Assess 18:351-364, 2010; Klauer & Rettig, Br J Math Stat Psychol 43:193-206, 1990; Sinharay, J Educ Behav Stat 42:46-68, 2017). This paper shows that approaches based on higher-order asymptotics (e.g., Barndorff-Nielsen & Cox, Inference and asymptotics. Springer, London, 1994; Ghosh, Higher order asymptotics. Institute of Mathematical Statistics, Hayward, 1994) can also be used to test for the equality of the examinee ability over two sets of items. The modified signed likelihood ratio test (e.g., Barndorff-Nielsen, Biometrika 73:307-322, 1986) and the Lugannani-Rice approximation (Lugannani & Rice, Adv Appl Prob 12:475-490, 1980), both of which are based on higher-order asymptotics, are shown to provide some improvement over the traditional frequentist approaches in three simulations. Two real data examples are also provided.
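For reference, the Lugannani-Rice tail approximation mentioned in this abstract can be written as follows. The notation is generic and assumed here rather than taken from the paper: $X$ is a continuous statistic with cumulant generating function $K(s)$, $x$ is the observed value (with $x \neq \mathrm{E}[X]$), and $\hat{s}$ is the saddlepoint.

```latex
% Saddlepoint equation: \hat{s} solves K'(\hat{s}) = x.
\[
  P(X \ge x) \;\approx\; 1 - \Phi(\hat{w})
    + \phi(\hat{w}) \left( \frac{1}{\hat{u}} - \frac{1}{\hat{w}} \right),
\]
\[
  \hat{w} = \operatorname{sgn}(\hat{s}) \sqrt{2 \left[ \hat{s}\,x - K(\hat{s}) \right]},
  \qquad
  \hat{u} = \hat{s} \sqrt{K''(\hat{s})},
\]
% \Phi and \phi denote the standard normal distribution and density functions.
```

Dropping the correction term $\phi(\hat{w})(1/\hat{u} - 1/\hat{w})$ leaves a first-order normal approximation based on the signed root $\hat{w}$, which is one way to see the formula as a higher-order refinement.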


Subjects
Deception; Educational Measurement; Models, Statistical; Humans; Psychometrics
6.
Hum Mutat ; 27(2): 187-94, 2006 Feb.
Article in English | MEDLINE | ID: mdl-16395669

ABSTRACT

Colorectal cancer (CRC) is a multifactorial disease that involves both lifestyle and genetic factors. To identify single nucleotide polymorphisms (SNPs) associated with sporadic CRC, we used pooled DNA samples representing 230 cases with sporadic CRC and 540 controls. The allele frequency of the SNPs was estimated in the two pools using a genotyping method based on primer extension and capillary electrophoresis (CE). The sensitivity of the method was high, which permitted the detection of an odds ratio (OR) of 1.5. Validation of the method showed that it is robust, linear, sensitive, and reproducible. Of the 224 SNPs investigated, 20 potential candidates associated with CRC were identified, including IL6 -174G>C (g.22062318G>C), XRCC1 c.685 C>T (p.Arg194Trp), PPARGC1A g.92945042C>T (3'UTR 96516), GSTP1 c.342A>C (p.Ile105Val), GSTM1 c.573C>G (p.Lys173Asn), and SULT1A1 g.19934792G>A (p.Arg213His). All were borderline significant, and none were significant at the 5% level. A high number of the SNPs (40%) were not polymorphic in our population. We conclude that instead of looking for single risk factors, investigators should examine individual combinations of potential risk factors to clarify the genetic predisposition to CRC.
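To make the odds ratio in this abstract concrete, the following sketch computes an allelic odds ratio from allele frequencies estimated in a case pool and a control pool. The function name and the example frequencies are hypothetical and chosen only for illustration; they are not data from the study.

```python
def allelic_odds_ratio(freq_cases: float, freq_controls: float) -> float:
    """Odds ratio of carrying the allele in cases versus controls.

    Both arguments are allele frequencies strictly between 0 and 1,
    for example as estimated from pooled DNA samples.
    """
    for freq in (freq_cases, freq_controls):
        if not 0.0 < freq < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# Hypothetical frequencies: 50% in the case pool, 40% in the control pool.
example_or = allelic_odds_ratio(0.5, 0.4)
print(f"allelic odds ratio: {example_or:.2f}")
```

Under these made-up frequencies, a ten-percentage-point difference between pools corresponds to an odds ratio of 1.5, the effect size the abstract reports as detectable.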


Subjects
Colorectal Neoplasms/genetics; DNA/chemistry; Polymorphism, Single Nucleotide; Aged; Alleles; DNA Primers/chemistry; Electrophoresis, Capillary; Female; Gene Frequency; Humans; Male; Middle Aged; Models, Genetic; Odds Ratio
7.
Stat Appl Genet Mol Biol ; 4: Article18, 2005.
Article in English | MEDLINE | ID: mdl-16646835

ABSTRACT

We describe statistical inference in continuous time Markov processes of DNA sequences related by a phylogenetic tree. The maximum likelihood estimator can be found by the expectation maximization (EM) algorithm and an expression for the information matrix is also derived. We provide explicit analytical solutions for the EM algorithm and information matrix.

8.
Cancer Res ; 64(15): 5245-50, 2004 Aug 01.
Article in English | MEDLINE | ID: mdl-15289330

ABSTRACT

Accurate normalization is an absolute prerequisite for correct measurement of gene expression. For quantitative real-time reverse transcription-PCR (RT-PCR), the most commonly used normalization strategy involves standardization to a single constitutively expressed control gene. However, in recent years, it has become clear that no single gene is constitutively expressed in all cell types and under all experimental conditions, implying that the expression stability of the intended control gene has to be verified before each experiment. We outline a novel, innovative, and robust strategy to identify stably expressed genes among a set of candidate normalization genes. The strategy is rooted in a mathematical model of gene expression that enables estimation not only of the overall variation of the candidate normalization genes but also of the variation between sample subgroups of the sample set. Notably, the strategy provides a direct measure for the estimated expression variation, enabling the user to evaluate the systematic error introduced when using the gene. In a side-by-side comparison with a previously published strategy, our model-based approach performed in a more robust manner and showed less sensitivity toward coregulation of the candidate normalization genes. We used the model-based strategy to identify genes suited to normalize quantitative RT-PCR data from colon cancer and bladder cancer. These genes are UBC, GAPD, and TPT1 for the colon and HSPCB, TEGT, and ATP5B for the bladder. The presented strategy can be applied to evaluate the suitability of any normalization gene candidate in any kind of experimental design and should allow more reliable normalization of RT-PCR data.


Subjects
Biomarkers, Tumor/genetics; Colonic Neoplasms/genetics; Gene Expression Profiling/standards; Neoplasm Proteins/genetics; RNA, Messenger/metabolism; Reverse Transcriptase Polymerase Chain Reaction/standards; Urinary Bladder Neoplasms/genetics; Colon/metabolism; Colonic Neoplasms/metabolism; Gene Expression Profiling/methods; Gene Expression Regulation; Humans; Models, Theoretical; RNA, Messenger/genetics; Reference Standards; Reproducibility of Results; Sensitivity and Specificity; Tumor Protein, Translationally-Controlled 1; Urinary Bladder/metabolism; Urinary Bladder Neoplasms/metabolism
9.
BMC Bioinformatics ; 6: 83, 2005 Apr 01.
Article in English | MEDLINE | ID: mdl-15804354

ABSTRACT

BACKGROUND: Two central problems in computational biology are the determination of the alignment and phylogeny of a set of biological sequences. The traditional approach to this problem is to first build a multiple alignment of these sequences, followed by a phylogenetic reconstruction step based on this multiple alignment. However, alignment and phylogenetic inference are fundamentally interdependent, and ignoring this fact leads to biased and overconfident estimations. Whether the main interest be in sequence alignment or phylogeny, a major goal of computational biology is the co-estimation of both. RESULTS: We developed a fully Bayesian Markov chain Monte Carlo method for co-estimating phylogeny and sequence alignment, under the Thorne-Kishino-Felsenstein model of substitution and single nucleotide insertion-deletion (indel) events. In our earlier work, we introduced a novel and efficient algorithm, termed the "indel peeling algorithm", which includes indels as phylogenetically informative evolutionary events, and resembles Felsenstein's peeling algorithm for substitutions on a phylogenetic tree. For a fixed alignment, our extension analytically integrates out both substitution and indel events within a proper statistical model, without the need for data augmentation at internal tree nodes, allowing for efficient sampling of tree topologies and edge lengths. To additionally sample multiple alignments, we here introduce an efficient partial Metropolized independence sampler for alignments, and combine these two algorithms into a fully Bayesian co-estimation procedure for the alignment and phylogeny problem. Our approach results in estimates for the posterior distribution of evolutionary rate parameters, for the maximum a posteriori (MAP) phylogenetic tree, and for the posterior decoding alignment. Estimates for the evolutionary tree and multiple alignment are augmented with confidence estimates for each node height and alignment column. Our results indicate that the patterns in reliability broadly correspond to structural features of the proteins, and thus provide biologically meaningful information which is absent from the usual point estimate of the alignment. Our methods can handle input data of moderate size (10-20 protein sequences, each 100-200 bp), which we analyzed overnight on a standard 2 GHz personal computer. CONCLUSION: Joint analysis of multiple sequence alignment, evolutionary trees and additional evolutionary parameters can now be done within a single coherent statistical framework.
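The general idea behind a Metropolized independence sampler can be sketched on a toy problem. The code below is not the partial sampler for alignments described in the abstract: it assumes a three-state target with hypothetical unnormalized weights and a uniform proposal drawn independently of the current state, and the function name is made up for this example.

```python
import random


def independence_sampler(target_weights, n_steps, rng):
    """Metropolized independence sampler on states 0..k-1 with a uniform proposal.

    target_weights holds positive, unnormalized target probabilities.
    Returns the fraction of steps spent in each state.
    """
    k = len(target_weights)
    state = rng.randrange(k)
    counts = [0] * k
    for _ in range(n_steps):
        proposal = rng.randrange(k)  # drawn independently of the current state
        # Acceptance uses the ratio of importance weights target/proposal;
        # the uniform proposal density cancels, leaving the target ratio.
        accept_prob = min(1.0, target_weights[proposal] / target_weights[state])
        if rng.random() < accept_prob:
            state = proposal
        counts[state] += 1
    return [count / n_steps for count in counts]


frequencies = independence_sampler([1.0, 2.0, 3.0], 60_000, random.Random(1))
print(frequencies)
```

The visit frequencies should approach the normalized target (1/6, 1/3, 1/2) as the number of steps grows; how quickly depends on how well the proposal covers the target.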


Subjects
Computational Biology/methods; Algorithms; Amino Acid Sequence; Animals; Base Sequence; Bayes Theorem; Computer Simulation; Evolution, Molecular; Gene Deletion; Humans; Likelihood Functions; Markov Chains; Models, Genetic; Models, Statistical; Molecular Sequence Data; Monte Carlo Method; Mutation; Myoglobin/chemistry; Phylogeny; Polymorphism, Single Nucleotide/genetics; Sequence Alignment; Sequence Analysis, DNA; Sequence Homology, Amino Acid; Software; Species Specificity
10.
J Comput Biol ; 12(2): 186-203, 2005 Mar.
Article in English | MEDLINE | ID: mdl-15767776

ABSTRACT

Identifying and characterizing the structure in genome sequences is one of the principal challenges in modern molecular biology, and comparative genomics offers a powerful tool. In this paper, we introduce a hidden Markov model that allows a comparative analysis of multiple sequences related by a phylogenetic tree, and we present an efficient method for estimating the parameters of the model. The model integrates structure prediction methods for one sequence, statistical multiple alignment methods, and phylogenetic information. This unified model is particularly useful for a detailed characterization of DNA sequences with a common gene. We illustrate the model on a variety of homologous sequences.


Subjects
Computational Biology/statistics & numerical data; Markov Chains; Sequence Alignment/statistics & numerical data; Sequence Analysis, DNA/statistics & numerical data; Agrobacterium tumefaciens/genetics; Animals; Base Sequence; Data Interpretation, Statistical; Humans; Molecular Sequence Data; Sequence Homology, Nucleic Acid; Sinorhizobium meliloti/genetics
11.
J Comput Biol ; 16(9): 1209-10, 2009 Sep.
Article in English | MEDLINE | ID: mdl-19772432

ABSTRACT

We demonstrate the simplicity and generality of the recently introduced linear space Baum-Welch algorithm for hidden Markov models. We also point to previous literature on the subject.


Subjects
Algorithms; Markov Chains
12.
Neurobiol Aging ; 30(11): 1756-76, 2009 Nov.
Article in English | MEDLINE | ID: mdl-18336954

ABSTRACT

Synaptic changes occur early in the course of Alzheimer's disease and are key to understanding the initial events in associated neurodegenerative processes. The quantitative analysis of synaptic morphology in transgenic mouse models of Alzheimer's disease can provide important insights into these processes. To this end, the total number and the distribution of the diameters of synaptic contacts in the stratum radiatum of the CA1 region of the hippocampus of 12-month-old APP/PS1DeltaE9 transgenic mice and wild type littermates have been evaluated by applying design-based stereological methods to material prepared for electron microscopy. Although there were no differences in the size of the synaptic contacts, the total number of synaptic contacts was significantly larger in the transgenic mice, suggesting that the transgenic effect at this age is synaptotrophic and that the presence of amyloid plaques and an elevated Abeta42/40 ratio are not necessarily detrimental to populations of synapses. The potential of this type of data in evaluating synaptic changes related to Alzheimer's disease is discussed and the methodology described in detail.


Subjects
Alzheimer Disease/genetics; Alzheimer Disease/pathology; CA1 Region, Hippocampal/pathology; Neurons/pathology; Synapses/pathology; Amyloid beta-Protein Precursor/genetics; Animals; Atrophy/pathology; Cell Count/methods; Cerebral Cortex/pathology; Disease Models, Animal; Female; Humans; Mice; Mice, Transgenic; Plaque, Amyloid/pathology; Presenilin-1/genetics; Stereotaxic Techniques
13.
Proc Natl Acad Sci U S A ; 100(25): 14960-5, 2003 Dec 09.
Article in English | MEDLINE | ID: mdl-14657378

ABSTRACT

Algorithms are presented that allow the calculation of the probability of a set of sequences related by a binary tree that have evolved according to the Thorne-Kishino-Felsenstein model for a fixed set of parameters. The algorithms are based on a Markov chain generating sequences and their alignment at nodes in a tree. Depending on whether the complete realization of this Markov chain is decomposed into the first transition and the rest of the realization or the last transition and the first part of the realization, two kinds of recursions are obtained that are computationally similar but probabilistically different. The running time of the algorithms is $O(\prod_{i=1}^{d} L_i)$, where $L_i$ is the length of the $i$th observed sequence and $d$ is the number of sequences. An alternative recursion is also formulated that uses only a Markov chain involving the inner nodes of a tree.
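To give a feel for the running-time bound quoted above, the short sketch below evaluates the product of the sequence lengths for a few hypothetical inputs. The function name and the example lengths are illustrative assumptions, and constant factors hidden by the O-notation are ignored.

```python
import math


def dp_table_cells(sequence_lengths):
    """Return the product of the sequence lengths, the quantity inside the O(.) bound."""
    return math.prod(sequence_lengths)


# Hypothetical inputs: d sequences of 100 residues each.
for d in (2, 3, 4):
    print(d, dp_table_cells([100] * d))
```

Each additional sequence multiplies the bound by that sequence's length, so the cost grows exponentially in the number of sequences $d$.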


Subjects
Biological Evolution; Models, Statistical; Algorithms; Markov Chains; Models, Theoretical