Results 1 - 20 of 70,564
1.
Genet Sel Evol ; 51(1): 43, 2019 Aug 13.
Article in English | MEDLINE | ID: mdl-31409294

ABSTRACT

BACKGROUND: Random regression models (RRM) are widely used to analyze longitudinal data in genetic evaluation systems because they can better account for time-course changes in environmental effects and additive genetic values of animals by fitting the test-day (TD) specific effects. Our objective was to implement a random regression model for the evaluation of dairy production traits in French goats. RESULTS: The data consisted of milk TD records from 30,186 and 32,256 first lactations of Saanen and Alpine goats. Milk yield, fat yield, protein yield, fat content and protein content were considered. Splines were used to model the environmental factors. The genetic and permanent environmental effects were modeled by the same Legendre polynomials. The goodness-of-fit and the genetic parameters derived from functions of the polynomials of orders 0 to 4 were tested. Results were also compared to those from a lactation model with total milk yield calculated over 250 days and to those of a multiple-trait model that considers performance in six periods throughout lactation as different traits. Genetic parameters were consistent between models. Models with fourth-order Legendre polynomials led to the best fit of the data. To reduce complexity and computing time and to simplify interpretation, a rank reduction of the variance-covariance matrix was performed using eigenvalue decomposition. With a reduction to rank 2, the first two principal components correctly summarized the genetic variability of milk yield level and persistency, with a correlation close to 0 between them. CONCLUSIONS: A random regression model was implemented in France to evaluate and select goats for yield traits and persistency, which are independent (i.e., there is no genetic correlation between them) in first lactation.
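
As a rough illustration of the Legendre-polynomial covariates that a random regression model of this kind relies on, the sketch below evaluates normalized Legendre polynomials up to order 4 at a standardized day in milk. The function names and the 5 to 250 day window are assumptions made for the example; the abstract does not state how the time axis was standardized.

```python
import math


def standardize_dim(dim, dim_min, dim_max):
    """Map days in milk (DIM) linearly onto [-1, 1]."""
    return 2.0 * (dim - dim_min) / (dim_max - dim_min) - 1.0


def legendre_basis(x, order):
    """Normalized Legendre polynomials phi_0..phi_order at x in [-1, 1]."""
    p = [1.0, x]  # P_0 and P_1
    for n in range(1, order):
        # Bonnet recurrence: (n + 1) P_{n+1} = (2n + 1) x P_n - n P_{n-1}
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    return [math.sqrt((2 * k + 1) / 2.0) * p[k] for k in range(order + 1)]


# Hypothetical lactation window of 5 to 250 days (assumed for illustration).
x = standardize_dim(127.5, 5, 250)   # midpoint of the window maps to 0.0
covariates = legendre_basis(x, 4)    # five covariates for a fourth-order fit
```

Each animal's genetic and permanent environmental curves would then be linear combinations of these covariates, with one random regression coefficient per polynomial.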


Subjects
Goats/genetics, Lactation/genetics, Genetic Models, Statistical Models, Animals, Dairying, Female, Goats/physiology, Male, Milk, Regression Analysis
2.
Genet Sel Evol ; 51(1): 45, 2019 Aug 19.
Article in English | MEDLINE | ID: mdl-31426753

ABSTRACT

BACKGROUND: Crossbreeding is widely used in pig production because of the benefits of heterosis effects and breed complementarity. Commonly, sire lines are bred for traits such as feed efficiency, growth and meat content, whereas maternal lines are also bred for reproduction and longevity traits, and the resulting three-way crossbred pigs are used for production of meat. The most important genetic basis for heterosis is dominance effects, e.g. removal of inbreeding depression. The aims of this study were to (1) present a modification of a previously developed model with additive, dominance and inbreeding depression genetic effects for analysis of data from a purebred sire line and three-way crossbred pigs; (2) based on this model, present equations for additive genetic variances, additive genetic covariance, and estimated breeding values (EBV) with associated accuracies for purebred and crossbred performances; (3) use the model to analyse four production traits, i.e. ultra-sound recorded backfat thickness (BF), conformation score (CONF), average daily gain (ADG), and feed conversion ratio (FCR), recorded on Danbred Duroc and Danbred Duroc-Landrace-Yorkshire crossbred pigs reared in the same environment; and (4) obtain estimates of genetic parameters, additive genetic correlations between purebred and crossbred performances, and EBV with associated accuracies for purebred and crossbred performances for this data set. RESULTS: Additive genetic correlations (with associated standard errors) between purebred and crossbred performances were equal to 0.96 (0.07), 0.83 (0.16), 0.75 (0.17), and 0.87 (0.18) for BF, CONF, ADG, and FCR, respectively. For BF, ADG, and FCR, the additive genetic variance was smaller for purebred performance than for crossbred performance, but for CONF the reverse was observed. EBV on Duroc boars were more accurate for purebred performance than for crossbred performance for BF, CONF and FCR, but not for ADG. 
CONCLUSIONS: Methodological developments led to equations for genetic (co)variances and EBV with associated accuracies for purebred and crossbred performances in a three-way crossbreeding system. As illustrated by the data analysis, these equations may be useful for implementation of genomic selection in this system.


Subjects
Breeding, Inbreeding Depression, Genetic Models, Statistical Models, Swine/genetics, Animals, Genetic Crosses, Female, Genetic Variation, Genetic Hybridization, Male
3.
Medicine (Baltimore) ; 98(26): e16170, 2019 Jun.
Article in English | MEDLINE | ID: mdl-31261547

ABSTRACT

OBJECTIVE: Non-syndromic cleft of the lip and/or palate (NSCL/P) is one of the most common polygenic diseases. In this study, both case-control and family-based association analyses were used to test whether single nucleotide polymorphisms (SNPs) were associated with NSCL/P. METHODS: A total of 37 nuclear families and 189 controls were recruited. Blood DNA was extracted and SNPs in 27 candidate genes were genotyped by polymerase chain reaction-improved multiple ligase detection reaction (PCR-iMLDR). Case-control statistical analysis was performed using SPSS 19.0. Haplotype relative risk (HRR), the transmission disequilibrium test (TDT), and the family-based association test (FBAT) were used to test for over-transmission of the target alleles in case-parent trios. Gene-gene interactions in NSCL/P were analyzed with Unphased-3.1.4. RESULTS: In the case-control analysis, only C14orf49 chr14_95932477 was statistically significant under the genotype model (P = .03) and the allele model (P = .03). Seven SNPs were statistically significant on TDT. None of the 26 alleles was associated with NSCL/P on FBAT. Some SNPs showed haplotype-haplotype and genotype-genotype interactions. CONCLUSION: C14orf49 chr14_95932477 differed significantly between cases and controls under the genotype and allele models in the case-control design. Seven SNPs were significantly different on HRR. Four SNPs were significantly different on TDT.
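
For readers unfamiliar with the transmission disequilibrium test mentioned above, here is a minimal sketch of its textbook McNemar-type form. The counts are hypothetical and the helper name `tdt` is an assumption for this example; the study's own software and settings are not described beyond the abstract.

```python
import math


def tdt(transmitted, untransmitted):
    """McNemar-type TDT statistic and its 1-df chi-square p-value.

    transmitted / untransmitted: counts of heterozygous parents who did or
    did not pass the target allele to an affected child.
    """
    b, c = transmitted, untransmitted
    chi2 = (b - c) ** 2 / (b + c)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts, not taken from the study.
chi2, p = tdt(30, 10)
```

Under the null hypothesis of no linkage or association, heterozygous parents transmit either allele with equal probability, so a large imbalance between the two counts signals over-transmission.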


Subjects
Cleft Lip/genetics, Cleft Palate/genetics, Genetic Predisposition to Disease, Single Nucleotide Polymorphism, Case-Control Studies, Cleft Lip/complications, Cleft Palate/complications, Family, Female, Genetic Association Studies, Humans, Male, Genetic Models, Phosphoinositide Phosphatases/genetics, Vesicular Transport Proteins/genetics
4.
Genet Sel Evol ; 51(1): 40, 2019 Jul 16.
Article in English | MEDLINE | ID: mdl-31311493

ABSTRACT

BACKGROUND: In modern dairy breeding programmes, high contributions from foreign sires are nearly always present. Genotyping, and therefore genomic selection (GS), concern only a subpopulation of the breeding programme's wider dairy population. These features of a breeding programme contribute in different ways to the rate of genetic gain for the wider industry. METHODS: A deterministic recursive gene flow model across subpopulations of animals in a dairy industry was created to predict the commercial performance of replacement heifers and future artificial insemination bulls. Various breeding strategies were assessed by varying the reliability of breeding values, the genetic contributions from subpopulations, and the genetic trend and merit of the foreign subpopulation. RESULTS: A higher response in the true breeding goal measured in standard deviations (SD) of true merit (G) after 20 years of selection can be achieved when genetic contributions shift towards higher merit alternatives compared to keeping them fixed. A foreign annual genetic trend of 0.08 SD of the breeding goal, while the domestic genetic trend is 0.10 SD, results in the overall net present value of genetic gain increasing by 1.2, 2.3, and 3.4% after 20 years as the reliability of GS in the domestic population increased from 0.3 to 0.45, 0.60 and 0.75. With a foreign genetic trend of 0.10 SD, these increases are more modest; 0.9, 1.7, and 2.4%. Increasing the foreign genetic trend so that it is higher than the domestic trend erodes the benefits of increasing the reliability of domestic GS further. CONCLUSIONS: Having a foreign source of genetic material with a high rate of genetic progress contributes substantially to the benefits of domestic genetic progress while at the same time reducing the expected returns from investments to improve the accuracy of genomic prediction in the home country.


Subjects
Cattle/genetics, Dairying, Genetic Models, Genetic Selection, Selective Breeding, Animals, Female, Gene Flow, Male
5.
Yi Chuan ; 41(6): 469-485, 2019 Jun 20.
Article in Chinese | MEDLINE | ID: mdl-31257196

ABSTRACT

Circular non-coding RNAs (circRNAs) have gradually attracted wide attention with the development of high-throughput sequencing. In this review, we systematically summarize three driving models of circRNA biogenesis: intron-pairing-driven, RNA-binding-protein-driven and lariat-driven. We also briefly introduce current research methods for circRNAs, including high-throughput library construction, bioinformatic identification and common experimental verification. We then summarize the functions of circRNAs, including acting as microRNA (miRNA) or protein sponges, regulating the alternative splicing (AS) and expression of host genes, and extensive translation. Finally, we provide a systematic characterization of circRNAs and their latest research progress, which offers a new perspective for further studies of circRNAs in plants.


Subjects
Alternative Splicing, RNA/genetics, Introns, MicroRNAs, Genetic Models, Plants/genetics, RNA-Binding Proteins
6.
J Chem Phys ; 151(2): 024106, 2019 Jul 14.
Article in English | MEDLINE | ID: mdl-31301707

ABSTRACT

Single cells exhibit a significant amount of variability in transcript levels, which arises from slow, stochastic transitions between gene expression states. Elucidating the nature of these states and understanding how transition rates are affected by different regulatory mechanisms require state-of-the-art methods to infer underlying models of gene expression from single cell data. A Bayesian approach to statistical inference is the most suitable method for model selection and uncertainty quantification of kinetic parameters using small data sets. However, this approach is impractical because current algorithms are too slow to handle typical models of gene expression. To solve this problem, we first show that time-dependent mRNA distributions of discrete-state models of gene expression are dynamic Poisson mixtures, whose mixing kernels are characterized by a piecewise deterministic Markov process. We combined this analytical result with a kinetic Monte Carlo algorithm to create a hybrid numerical method that accelerates the calculation of time-dependent mRNA distributions by 1000-fold compared to current methods. We then integrated the hybrid algorithm into an existing Monte Carlo sampler to estimate the Bayesian posterior distribution of many different, competing models in a reasonable amount of time. We demonstrate that kinetic parameters can be reasonably constrained for modestly sampled data sets if the model is known a priori. If there are many competing models, Bayesian evidence can rigorously quantify the likelihood of a model relative to other models from the data. We demonstrate that Bayesian evidence selects the true model and outperforms approximate metrics typically used for model selection.
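
To make the idea of a Poisson mixture concrete, the sketch below evaluates an mRNA copy-number distribution as a weighted sum of Poisson distributions. It is a simplification under stated assumptions: the abstract describes a time-dependent mixing kernel governed by a piecewise deterministic Markov process, whereas this example uses a fixed, hypothetical two-point kernel, and the function names are invented for illustration.

```python
import math


def poisson_pmf(n, lam):
    """Poisson probability of n events at rate lam > 0 (log form for stability)."""
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))


def poisson_mixture_pmf(n, weights, rates):
    """P(n) = sum_k w_k * Poisson(n; lambda_k) for a discretized mixing kernel."""
    return sum(w * poisson_pmf(n, lam) for w, lam in zip(weights, rates))


# Hypothetical two-point kernel: half the cells in a low-rate state and half
# in a high-rate state (values assumed for illustration).
weights, rates = [0.5, 0.5], [1.0, 10.0]
pmf = [poisson_mixture_pmf(n, weights, rates) for n in range(101)]
mean = sum(n * p for n, p in enumerate(pmf))
```

The resulting distribution is bimodal and overdispersed relative to a single Poisson, which is the qualitative signature of slow switching between expression states.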


Subjects
Algorithms, Gene Expression, Genetic Models, Monte Carlo Method, Single-Cell Analysis, Bayes Theorem
7.
Nat Commun ; 10(1): 2766, 2019 06 24.
Article in English | MEDLINE | ID: mdl-31235692

ABSTRACT

A major challenge in biology is that genetically identical cells in the same environment can display gene expression stochasticity (noise), which contributes to bet-hedging, drug tolerance, and cell-fate switching. The magnitude and timescales of stochastic fluctuations can depend on the gene regulatory network. Currently, it is unclear how gene expression noise of specific networks impacts the evolution of drug resistance in mammalian cells. Answering this question requires adjusting network noise independently from mean expression. Here, we develop positive and negative feedback-based synthetic gene circuits to decouple noise from the mean for Puromycin resistance gene expression in Chinese Hamster Ovary cells. In low Puromycin concentrations, the high-noise, positive-feedback network delays long-term adaptation, whereas it facilitates adaptation under high Puromycin concentration. Accordingly, the low-noise, negative-feedback circuit can maintain resistance by acquiring mutations while the positive-feedback circuit remains mutation-free and regains drug sensitivity. These findings may have profound implications for chemotherapeutic inefficiency and cancer relapse.


Subjects
Antineoplastic Antimetabolites/pharmacology, Antineoplastic Drug Resistance/genetics, Gene Expression Regulation/drug effects, Gene Regulatory Networks/genetics, Genetic Models, Animals, Antineoplastic Antimetabolites/therapeutic use, CHO Cells, Computer Simulation, Cricetulus, Drug Dose-Response Relationship, Antineoplastic Drug Resistance/drug effects, Physiological Feedback, Gene Expression Regulation/genetics, Neoplasms/drug therapy, Neoplasms/pathology, Puromycin/pharmacology, Puromycin/therapeutic use, Stochastic Processes
8.
Genet Sel Evol ; 51(1): 28, 2019 Jun 20.
Article in English | MEDLINE | ID: mdl-31221101

ABSTRACT

BACKGROUND: Single-step genomic best linear unbiased prediction (SSGBLUP) is a comprehensive method for genomic prediction. Point estimates of marker effects from SSGBLUP are often used for genome-wide association studies (GWAS) without a formal framework of hypothesis testing. Our objective was to implement p-values for single-marker GWAS within the single-step GWAS (SSGWAS) framework by deriving computational algorithms and procedures, and by applying these to a large beef cattle population. METHODS: P-values were obtained based on the prediction error (co)variances for single nucleotide polymorphisms (SNPs), which were obtained from the prediction error (co)variances of genomic predictions based on the inverse of the coefficient matrix and formulas to estimate SNP effects. RESULTS: Computation of p-values took a negligible time for a dataset with almost 2 million animals in the pedigree and 1424 genotyped sires, and no inflation of statistics was observed. The SNPs that passed the Bonferroni threshold of $10^{-5.9}$ were the same as those that explained the highest proportion of additive genetic variance, but even at the same significance levels and effects, some of them explained less genetic variance due to lower allele frequency. CONCLUSIONS: The use of a p-value for SSGWAS is a very general and efficient strategy to identify quantitative trait loci (QTL). It can be used for complex datasets such as those used in animal breeding, where only a proportion of the pedigreed animals are genotyped.
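
The following sketch shows one common way to turn an estimated SNP effect and its variance into a two-sided p-value and to compare it with a Bonferroni threshold. It assumes a normal test statistic and that the variance of the estimated effect has already been derived from the prediction error (co)variances; the function names and all numbers are hypothetical and not taken from the study.

```python
import math


def snp_p_value(effect, effect_variance):
    """Two-sided normal p-value for an estimated SNP effect.

    effect_variance is assumed to be the variance of the estimate, already
    derived from the prediction error (co)variances.
    """
    z = abs(effect) / math.sqrt(effect_variance)
    return math.erfc(z / math.sqrt(2.0))


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance level that keeps the family-wise error at alpha."""
    return alpha / n_tests


# Hypothetical numbers for illustration only.
p = snp_p_value(effect=0.5, effect_variance=0.01)   # z = 5
threshold = bonferroni_threshold(40000)              # 1.25e-6
significant = p < threshold
```

With roughly 40,000 tests at a family-wise level of 0.05, the per-test threshold is about $10^{-5.9}$, which is the order of magnitude quoted in the abstract.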


Subjects
Birth Weight/genetics, Cattle/genetics, Genetic Markers, Genome-Wide Association Study/veterinary, Algorithms, Animals, Datasets as Topic, Female, Male, Genetic Models, Single Nucleotide Polymorphism, Quantitative Trait Loci
9.
Genet Sel Evol ; 51(1): 30, 2019 Jun 25.
Article in English | MEDLINE | ID: mdl-31238880

ABSTRACT

BACKGROUND: The preconditioned conjugate gradient (PCG) method is an iterative solver of linear equations systems commonly used in animal breeding. However, the PCG method has been shown to encounter convergence issues when applied to single-step single nucleotide polymorphism BLUP (ssSNPBLUP) models. Recently, we proposed a deflated PCG (DPCG) method for solving ssSNPBLUP efficiently. The DPCG method introduces a second-level preconditioner that annihilates the effect of the largest unfavourable eigenvalues of the ssSNPBLUP preconditioned coefficient matrix on the convergence of the iterative solver. While it solves the convergence issues of ssSNPBLUP, the DPCG method requires substantial additional computations, in comparison to the PCG method. Accordingly, the aim of this study was to develop a second-level preconditioner that decreases the largest eigenvalues of the ssSNPBLUP preconditioned coefficient matrix at a lower cost than the DPCG method, in addition to comparing its performance to the (D)PCG methods applied to two different ssSNPBLUP models. RESULTS: Based on the properties of the ssSNPBLUP preconditioned coefficient matrix, we proposed a second-level diagonal preconditioner that decreases the largest eigenvalues of the ssSNPBLUP preconditioned coefficient matrix under some conditions. This proposed second-level preconditioner is easy to implement in current software and does not result in additional computing costs as it can be combined with the commonly used (block-)diagonal preconditioner. Tested on two different datasets and with two different ssSNPBLUP models, the second-level diagonal preconditioner led to a decrease of the largest eigenvalues and the condition number of the preconditioned coefficient matrices. It resulted in an improvement of the convergence pattern of the iterative solver. 
For the largest dataset, the convergence of the PCG method with the proposed second-level diagonal preconditioner was slower than the DPCG method, but it performed better than the DPCG method in terms of total computing time. CONCLUSIONS: The proposed second-level diagonal preconditioner can improve the convergence of the (D)PCG methods applied to two ssSNPBLUP models. Based on our results, the PCG method combined with the proposed second-level diagonal preconditioner seems to be more efficient than the DPCG method in solving ssSNPBLUP. However, the optimal combination of ssSNPBLUP and solver will most likely be situation-dependent.
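
As background for the solver discussed above, here is a minimal sketch of a preconditioned conjugate gradient iteration with a plain diagonal (Jacobi) preconditioner. It is not the second-level preconditioner or the deflated variant from the study; the dense list-of-lists matrix, the function name `pcg`, and the small test system are assumptions chosen to keep the example self-contained.

```python
import math


def pcg(a, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A with Jacobi-preconditioned CG."""
    n = len(b)
    m_inv = [1.0 / a[i][i] for i in range(n)]  # inverse of the diagonal preconditioner
    x = [0.0] * n
    r = list(b)                                 # residual b - A x for x = 0
    z = [m_inv[i] * r[i] for i in range(n)]     # preconditioned residual
    p = list(z)                                 # first search direction
    rz = sum(ri * zi for ri, zi in zip(r, z))
    for _ in range(max_iter):
        ap = [sum(a[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rz / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        if math.sqrt(sum(ri * ri for ri in r)) < tol:
            break
        z = [m_inv[i] * r[i] for i in range(n)]
        rz_new = sum(ri * zi for ri, zi in zip(r, z))
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


# Small hypothetical system; the exact solution is (1/11, 7/11).
solution = pcg([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

The number of iterations depends on how the eigenvalues of the preconditioned matrix are spread, which is why reducing the largest eigenvalues, as the abstract describes, can speed up convergence.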


Subjects
Cattle/genetics, Statistical Data Interpretation, Genetic Models, Single Nucleotide Polymorphism, Software, Animals, Breeding, Datasets as Topic
10.
Hum Genet ; 138(7): 739-748, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31154530

ABSTRACT

Metabolic syndrome is a complex human disorder characterized by a cluster of conditions (increased blood pressure, hyperglycemia, excessive body fat around the waist, and abnormal cholesterol or triglyceride levels). Any of these conditions increases the risk of serious disorders such as diabetes or cardiovascular disease. Currently, the degree of genetic regulation of this syndrome is under debate and partially unknown. The principal aim of this study was to estimate the genetic component and the common environmental effects in different populations using full pedigree and genomic information. We used three large populations (Gubbio, ARIC, and Ogliastra cohorts) to estimate the heritability of metabolic syndrome. Because both pedigree and genotype data were available, different approaches were used to summarize relatedness. Linear mixed models (LMM) using the average information restricted maximum likelihood (AIREML) algorithm were applied to partition the variances and estimate heritability (h²) and the common sib-household effect (c²). Globally, results obtained from pedigree information showed a significant heritability (h²: 0.286 and 0.271 in Gubbio and Ogliastra, respectively), whereas a lower, but still significant heritability was found using SNP data ([Formula: see text]: 0.167 and 0.254 in ARIC and Ogliastra). The remaining heritability between h² and [Formula: see text] ranged between 0.031 and 0.237. Finally, the common environmental c² in Gubbio and Ogliastra were also significant, accounting for about 11% of the phenotypic variance. The availability of different kinds of populations and data helped us to better understand what happens when the heritability of metabolic syndrome is estimated and to account for possible confounding. Furthermore, the opportunity of comparing different results provided more precise and less biased estimation of heritability.
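
The variance partition behind heritability and the common sib-household effect can be illustrated with a few lines of arithmetic. The sketch assumes a simple three-component model (additive genetic, common environment, residual); the function name and the variance components are hypothetical and were chosen only so the ratios are easy to check by hand.

```python
def variance_ratios(var_additive, var_common_env, var_residual):
    """Return (h2, c2): shares of phenotypic variance due to additive genetic
    and common sib-household effects."""
    var_pheno = var_additive + var_common_env + var_residual
    return var_additive / var_pheno, var_common_env / var_pheno


# Hypothetical variance components chosen only to illustrate the arithmetic.
h2, c2 = variance_ratios(var_additive=2.8, var_common_env=1.1, var_residual=6.1)
```

In practice the components would come from a fitted mixed model, and ignoring the common-environment term would tend to push its variance into the additive component.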


Subjects
Genetic Predisposition to Disease, Population Genetics/methods, Human Genome, Genome-Wide Association Study, Genomics/methods, Metabolic Syndrome/genetics, Single Nucleotide Polymorphism, Cohort Studies, Female, Genotype, Humans, Male, Genetic Models, Pedigree
11.
Genet Sel Evol ; 51(1): 24, 2019 May 30.
Article in English | MEDLINE | ID: mdl-31146682

ABSTRACT

BACKGROUND: In settings with social interactions, the phenotype of an individual is affected by the direct genetic effect (DGE) of the individual itself and by indirect genetic effects (IGE) of its group mates. In the presence of IGE, heritable variance and response to selection depend on size of the interaction group (group size), which can be modelled via a 'dilution' parameter (d) that measures the magnitude of IGE as a function of group size. However, little is known about the estimability of d and the precision of its estimate. Our aim was to investigate how precisely d can be estimated and what determines this precision. METHODS: We simulated data with different group sizes and estimated d using a mixed model that included IGE and d. Schemes included various average group sizes (4, 6, and 8), variation in group size (coefficient of variation (CV) ranging from 0.125 to 1.010), and three values of d (0, 0.5, and 1). A design in which individuals were randomly allocated to groups was used for all schemes and a design with two families per group was used for some schemes. Parameters were estimated using restricted maximum likelihood (REML). Bias and precision of estimates were used to assess their statistical quality. RESULTS: The dilution parameter of IGE can be estimated for simulated data with variation in group size. For all schemes, the length of confidence intervals ranged from 0.114 to 0.927 for d, from 0.149 to 0.198 for variance of DGE, from 0.011 to 0.086 for variance of IGE, and from 0.310 to 0.557 for genetic correlation between DGE and IGE. To estimate d, schemes with groups composed of two families performed slightly better than schemes with randomly composed groups. CONCLUSIONS: Dilution of IGE was estimable, and in general its estimation was more precise when CV of group size was larger. All estimated parameters were unbiased. 
Estimation of dilution of IGE allows the contribution of direct and indirect variance components to heritable variance to be quantified in relation to group size and, thus, it could improve prediction of the expected response to selection in environments with group sizes that differ from the average size.
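
To show how a dilution parameter can link heritable variance to group size, the sketch below assumes one functional form: the indirect effect on each group mate shrinks as 1/(n - 1)^d, so the total breeding value is DGE + (n - 1)^(1 - d) × IGE, with IGE expressed at a reference group size of two. This form, the function name, and the variance components are assumptions for illustration; the abstract does not spell out the model equations.

```python
def total_heritable_variance(n, d, var_dge, var_ige, cov_dge_ige):
    """Variance of total breeding values for group size n and dilution d.

    Assumed model: TBV = DGE + (n - 1)**(1 - d) * IGE, with IGE defined at a
    reference group size of two.
    """
    k = (n - 1) ** (1.0 - d)
    return var_dge + 2.0 * k * cov_dge_ige + k * k * var_ige


# Hypothetical variance components.
no_dilution = total_heritable_variance(n=4, d=0.0, var_dge=1.0, var_ige=0.1, cov_dge_ige=0.0)
full_small = total_heritable_variance(n=4, d=1.0, var_dge=1.0, var_ige=0.1, cov_dge_ige=0.0)
full_large = total_heritable_variance(n=8, d=1.0, var_dge=1.0, var_ige=0.1, cov_dge_ige=0.0)
```

Under this assumed form, d = 0 lets the indirect contribution grow with group size, whereas d = 1 makes heritable variance independent of group size.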


Subjects
Genetic Variation, Livestock/genetics, Genetic Models, Animals, Female, Male, Phenotype, Sample Size, Genetic Selection, Social Behavior
12.
Nat Commun ; 10(1): 2417, 2019 06 03.
Article in English | MEDLINE | ID: mdl-31160569

ABSTRACT

Accumulating evidence from genome-wide association studies (GWAS) suggests an abundance of shared genetic influences among complex human traits and disorders, such as mental disorders. Here we introduce a statistical tool, MiXeR, which quantifies polygenic overlap irrespective of genetic correlation, using GWAS summary statistics. MiXeR results are presented as a Venn diagram of unique and shared polygenic components across traits. At 90% of SNP-heritability explained for each phenotype, MiXeR estimates that 8.3 K variants causally influence schizophrenia and 6.4 K influence bipolar disorder. Among these variants, 6.2 K are shared between the disorders, which have a high genetic correlation. Further, MiXeR uncovers polygenic overlap between schizophrenia and educational attainment. Despite a genetic correlation close to zero, the phenotypes share 8.3 K causal variants, while 2.5 K additional variants influence only educational attainment. By considering the polygenicity, discoverability and heritability of complex phenotypes, MiXeR analysis may improve our understanding of cross-trait genetic architectures.


Subjects
Bipolar Disorder/genetics, Genetic Models, Statistical Models, Multifactorial Inheritance, Schizophrenia/genetics, Gene Frequency, Genome-Wide Association Study, Humans, Linkage Disequilibrium
13.
Nat Commun ; 10(1): 2418, 2019 06 03.
Article in English | MEDLINE | ID: mdl-31160574

ABSTRACT

In transcriptional regulatory networks (TRNs), a canonical 3-node feed-forward loop (FFL) is hypothesized to evolve to filter out short spurious signals. We test this adaptive hypothesis against a novel null evolutionary model. Our mutational model captures the intrinsically high prevalence of weak affinity transcription factor binding sites. We also capture stochasticity and delays in gene expression that distort external signals and intrinsically generate noise. Functional FFLs evolve readily under selection for the hypothesized function but not in negative controls. Interestingly, a 4-node "diamond" motif also emerges as a short spurious signal filter. The diamond uses expression dynamics rather than path length to provide fast and slow pathways. When there is no idealized external spurious signal to filter out, but only internally generated noise, only the diamond and not the FFL evolves. While our results support the adaptive hypothesis, we also show that non-adaptive factors, including the intrinsic expression dynamics, matter.


Subjects
Gene Expression Regulation/physiology, Gene Regulatory Networks/physiology, Messenger RNA/metabolism, Saccharomyces cerevisiae Proteins/genetics, Physiological Adaptation, Physiological Feedback, Genetic Models, Theoretical Models, Saccharomyces cerevisiae
14.
BMC Bioinformatics ; 20(1): 349, 2019 Jun 20.
Article in English | MEDLINE | ID: mdl-31221105

ABSTRACT

BACKGROUND: Testing model adequacy is important before a DNA substitution model is chosen for phylogenetic inference. Using a mis-specified model can negatively impact phylogenetic inference; for example, the maximum likelihood method can be inconsistent when the DNA sequences are generated under a tree topology in the Felsenstein zone and analyzed with a mis-specified or inadequate model. However, model adequacy testing in phylogenetics is underdeveloped. RESULTS: Here we develop a simple, general, powerful and robust model test based on Pearson's goodness-of-fit test and binning of site patterns. We demonstrate through simulation that the test retains high power to reject inadequate models across a wide range of ways of binning site patterns, while the Type I error is well controlled. In the real data analysis we discovered many cases where models chosen by another method can be rejected by this new test; in particular, our proposed test rejects the most complex DNA model (GTR+I+Γ) while the Goldman-Cox test fails to reject the commonly used simple models. CONCLUSIONS: Model adequacy testing and the bootstrap should be used together to assess the reliability of conclusions after model selection and model fitting have been applied. The new goodness-of-fit test proposed in this paper is a simple and powerful model adequacy test that serves this routine model-checking purpose. We caution against deriving strong conclusions from analyses based on inadequate models. At a minimum, results derived from inadequate models can now be readily flagged using the new test and reported as such.
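
A minimal sketch of the general idea, Pearson's statistic computed after pooling site patterns into bins, is given below. The binning rule, function names, and counts are hypothetical; the abstract does not say how the authors choose bins or obtain expected counts, so this only illustrates the arithmetic of the statistic.

```python
def bin_counts(counts, bin_of):
    """Sum per-pattern counts into bins; bin_of[i] is the bin index of pattern i."""
    totals = [0.0] * (max(bin_of) + 1)
    for count, bin_index in zip(counts, bin_of):
        totals[bin_index] += count
    return totals


def pearson_chi2(observed, expected):
    """Pearson goodness-of-fit statistic: sum of (O - E)^2 / E over bins."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical site-pattern counts: observed in the alignment vs. expected
# under a fitted substitution model, pooled into three bins.
observed = [6, 4, 12, 8, 18, 12]
expected = [10, 10, 10, 10, 10, 10]
bin_of = [0, 0, 1, 1, 2, 2]
statistic = pearson_chi2(bin_counts(observed, bin_of), bin_counts(expected, bin_of))
```

Pooling keeps expected counts per bin large enough for the chi-square approximation, which matters because most individual site patterns are rare in a real alignment.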


Subjects
DNA/genetics, Genetic Models, Base Sequence, Bias, Binding Sites, Computer Simulation, Genetic Databases, Humans, Likelihood Functions, Phylogeny
15.
BMC Bioinformatics ; 20(1): 327, 2019 Jun 13.
Article in English | MEDLINE | ID: mdl-31195954

ABSTRACT

BACKGROUND: The gap gene system controls the early cascade of the segmentation pathway in Drosophila melanogaster as well as other insects. Owing to its tractability and key role in embryo patterning, this system has been the focus for both computational modelers and experimentalists. The gap gene expression dynamics can be considered strictly as a one-dimensional process and modeled as a system of reaction-diffusion equations. While substantial progress has been made in modeling this phenomenon, there still remains a deficit of approaches to evaluate competing hypotheses. Most of the model development has happened in isolation and there has been little attempt to compare candidate models. RESULTS: The Bayesian framework offers a means of doing formal model evaluation. Here, we demonstrate how this framework can be used to compare different models of gene expression. We focus on the Papatsenko-Levine formalism, which exploits a fractional occupancy based approach to incorporate activation of the gap genes by the maternal genes and cross-regulation by the gap genes themselves. The Bayesian approach provides insight into the relationships between system parameters. In the regulatory pathway of segmentation, the parameters for number of binding sites and binding affinity have a negative correlation. The model selection analysis supports a stronger binding affinity for Bicoid compared to other regulatory edges, as shown by a larger posterior mean. The procedure does not show support for activation of Kruppel by Bicoid. CONCLUSIONS: We provide an efficient solver for the general representation of the Papatsenko-Levine model. We also demonstrate the utility of the Bayes factor for evaluating candidate models of spatial patterning. In addition, by using the parallel tempering sampler, the convergence of Markov chains can be remarkably improved and robust estimates of Bayes factors obtained.


Subjects
Drosophila melanogaster/genetics, Gene Regulatory Networks, Animals, Bayes Theorem, Drosophila Proteins/genetics, Gene Expression Profiling, Developmental Gene Expression Regulation, Likelihood Functions, Markov Chains, Genetic Models, Monte Carlo Method
16.
BMC Bioinformatics ; 20(1): 324, 2019 Jun 13.
Article in English | MEDLINE | ID: mdl-31195961

ABSTRACT

BACKGROUND: As DNA sequencing technologies improve and become cheaper, genomic data can be used to diagnose many diseases, such as cancer. Raw human genome data are very large for computational systems. Therefore, there is a need for a compact and accurate representation of the valuable information in DNA. Complex genetic disorders often result from multiple gene mutations, and not every mutation contributes equally to the development of a disease. Inspired by the field of information retrieval, we propose using the term frequency (tf) and BM25 term weighting measures with the inverse document frequency (idf) and relevance frequency (rf) measures to weight genes based on their mutations. The underlying assumption is that the more mutations a gene has in patients with a certain disease and the fewer mutations it has in other patients, the more discriminative that gene is. RESULTS: We evaluated the proposed representations on the task of cancer type classification. We applied various machine learning techniques using the tf-idf and tf-rf schemes and their BM25 versions. Our results show that the BM25-tf-rf representation leads to improved classification accuracy and f-score values compared to the other representations. The highest accuracy (76.44%) and f-score (76.95%) are achieved with the BM25-tf-rf based data representation. CONCLUSIONS: In our experiments, the BM25-tf-rf scheme combined with the proposed neural network model was the best-performing classification system for our case study of cancer type classification. This system is further used for causal gene analysis. Several of the genes most influential in its decisions are reported in the literature as target or causal genes.
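
The weighting idea can be sketched by treating each patient as a document and each gene as a term whose frequency is its mutation count. The formulas below are the standard BM25 term-frequency saturation and the usual relevance-frequency definition from text classification; the parameter defaults (k1 = 1.2, b = 0.75), the function names, and the counts are assumptions for illustration, since the abstract does not give the exact variants the authors used.

```python
import math


def bm25_tf(tf, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """BM25-saturated term frequency with document-length normalization."""
    return tf * (k1 + 1.0) / (tf + k1 * (1.0 - b + b * doc_len / avg_doc_len))


def relevance_frequency(pos_with_gene, neg_with_gene):
    """rf = log2(2 + a / max(1, c)), where a and c count patients with and
    without the target disease who carry a mutation in the gene."""
    return math.log2(2.0 + pos_with_gene / max(1, neg_with_gene))


def bm25_tf_rf(tf, doc_len, avg_doc_len, pos_with_gene, neg_with_gene):
    """Weight of one gene in one patient's mutation profile."""
    return bm25_tf(tf, doc_len, avg_doc_len) * relevance_frequency(
        pos_with_gene, neg_with_gene
    )


# Hypothetical patient: one mutation in the gene, 50 mutations in total,
# which equals the cohort average; the gene is mutated in three patients
# with the target cancer type and in one patient without it.
weight = bm25_tf_rf(tf=1, doc_len=50, avg_doc_len=50, pos_with_gene=3, neg_with_gene=1)
```

Genes mutated mostly in one cancer type receive a larger rf factor, while the BM25 saturation limits the influence of genes that accumulate many mutations in a single patient.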


Subjects
Genomics/methods , Genetic Models , Statistical Models , Mutation/genetics , Genetic Databases , Exons/genetics , Humans , Introns/genetics , Machine Learning , Neoplasms/genetics , Neural Networks (Computer)
17.
Nat Commun ; 10(1): 2750, 2019 Jun 21.
Article in English | MEDLINE | ID: mdl-31227714

ABSTRACT

Understanding the clonal architecture and evolutionary history of a tumour is one of the key challenges in overcoming treatment failure due to resistant cell populations. Previously, studies on subclonal tumour evolution have been primarily based on bulk sequencing and in some recent cases on single-cell sequencing data. Either data type alone has shortcomings with regard to this task, but methods integrating both data types have been lacking. Here, we present B-SCITE, the first computational approach that infers tumour phylogenies from combined single-cell and bulk sequencing data. Using a comprehensive set of simulated data, we show that B-SCITE systematically outperforms existing methods with respect to tree reconstruction accuracy and subclone identification. B-SCITE provides high-fidelity reconstructions even with a modest number of single cells and in cases where bulk allele frequencies are affected by copy number changes. On real tumour data, B-SCITE-generated mutation histories show high concordance with expert-generated trees.


Subjects
Clonal Evolution/genetics , Computational Biology/methods , DNA Mutational Analysis/methods , Genetic Models , Neoplasms/genetics , Algorithms , Datasets as Topic , Female , High-Throughput Nucleotide Sequencing/methods , Humans , Mutation , Phylogeny , Single-Cell Analysis/methods , Software
18.
BMC Genomics ; 20(Suppl 6): 435, 2019 Jun 13.
Article in English | MEDLINE | ID: mdl-31189480

ABSTRACT

BACKGROUND: Single-cell gene expression measurements offer opportunities for deriving a mechanistic understanding of complex diseases, including cancer. However, due to the complex regulatory machinery of the cell, gene regulatory network (GRN) model inference based on such data still manifests significant uncertainty. RESULTS: The goal of this paper is to develop optimal classification of single-cell trajectories accounting for potential model uncertainty. Partially-observed Boolean dynamical systems (POBDS) are used for modeling gene regulatory networks observed through noisy gene-expression data. We derive the exact optimal Bayesian classifier (OBC) for binary classification of single-cell trajectories. The application of the OBC becomes impractical for large GRNs, due to computational and memory requirements. To address this, we introduce a particle-based single-cell classification method that is highly scalable for large GRNs, with much lower complexity than the optimal solution. CONCLUSION: The performance of the proposed particle-based method is demonstrated through numerical experiments using a POBDS model of the well-known T-cell large granular lymphocyte (T-LGL) leukemia network with noisy time-series gene-expression data.


Subjects
Algorithms , Bayes Theorem , Computational Biology/methods , Gene Regulatory Networks , Large Granular Lymphocytic Leukemia/genetics , Single-Cell Analysis/methods , Gene Expression Profiling , Humans , Biological Models , Genetic Models , Uncertainty
19.
Gene ; 710: 240-245, 2019 Aug 20.
Article in English | MEDLINE | ID: mdl-31181311

ABSTRACT

Hirschsprung disease (HSCR) is a rare congenital disorder and a developmental neuropathy, characterized by the lack of enteric neurons in variable segments of the distal bowel. Our recent genome-wide association study identified a variant (rs13223150) of testis-specific A13 (TSGA13) as a potential risk locus for total colonic aganglionosis (TCA) in HSCR. The aim of this study was to identify the impact of the variant (rs13223150) and the potential association of genetic variations of TSGA13 with TCA in HSCR. This study performed fine mapping and extended analyses in a Korean population. A total of 9 single nucleotide polymorphisms (SNPs) of TSGA13 were genotyped in a larger HSCR cohort (187 HSCR patients and 283 unaffected controls), and extended genetic analyses using various genetic models, haplotype analyses, and combined analyses were performed. The rs13223150_A allele showed a significant association with TCA (P = 0.003), even after correcting for multiple testing (Pcorr = 0.02). One haplotype (BL1_ht1, G-A-C-C) including rs13223150 also showed a significant association with TCA (P = 0.002, Pcorr = 0.01). Further combined imputation analysis indicated that several SNPs of TSGA13 were significantly associated with TCA in HSCR. Although replications in other population cohorts and functional evaluations are needed, our results suggest that TSGA13 genetic variants may affect TCA in HSCR and/or the extent of aganglionosis during enteric nervous system development.


Subjects
Hirschsprung Disease/genetics , Single Nucleotide Polymorphism , Proteins/genetics , Case-Control Studies , Female , Genetic Association Studies , Genetic Predisposition to Disease , Haplotypes , Humans , Genetic Models , Republic of Korea
20.
Nat Commun ; 10(1): 2611, 2019 Jun 13.
Article in English | MEDLINE | ID: mdl-31197158

ABSTRACT

The abundance of new computational methods for processing and interpreting transcriptomes at the single-cell level raises the need for in silico platforms for evaluation and validation. Here, we present SymSim, a simulator that explicitly models the processes that give rise to data observed in single-cell RNA-Seq experiments. The components of the SymSim pipeline pertain to the three primary sources of variation in single-cell RNA-Seq data: noise intrinsic to the process of transcription, extrinsic variation indicative of different cell states (both discrete and continuous), and technical variation due to low sensitivity and measurement noise and bias. We demonstrate how SymSim can be used for benchmarking methods for clustering, differential expression and trajectory inference, and for examining the effects of various parameters on their performance. We also show how SymSim can be used to evaluate the number of cells required to detect a rare population under various scenarios.


Subjects
Gene Expression Profiling/methods , Genetic Models , RNA Sequence Analysis/methods , Single-Cell Analysis/methods , Cluster Analysis , Computer Simulation , Datasets as Topic , High-Throughput Nucleotide Sequencing/methods , Humans , Kinetics , Transcriptome/genetics