Results 1 - 20 of 85
1.
PLoS Comput Biol ; 17(2): e1007948, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33600408

ABSTRACT

Gene function annotation is important for a variety of downstream analyses of genetic data. But experimental characterization of function remains costly and slow, making computational prediction an important endeavor. Phylogenetic approaches to prediction have been developed, but implementation of a practical Bayesian framework for parameter estimation remains an outstanding challenge. We have developed a computationally efficient model of evolution of gene annotations using phylogenies based on a Bayesian framework using Markov Chain Monte Carlo for parameter estimation. Unlike previous approaches, our method is able to estimate parameters over many different phylogenetic trees and functions. The resulting parameters agree with biological intuition, such as the increased probability of function change following gene duplication. The method performs well on leave-one-out cross-validation, and we further validated some of the predictions in the experimental scientific literature.


Subjects
Models, Genetic , Molecular Sequence Annotation/methods , Phylogeny , Algorithms , Animals , Bayes Theorem , Computational Biology , Databases, Genetic , Evolution, Molecular , Gene Ontology/statistics & numerical data , Humans , Likelihood Functions , Markov Chains , Mice , Models, Statistical , Molecular Sequence Annotation/statistics & numerical data , Monte Carlo Method , Multigene Family
2.
BMC Bioinformatics ; 20(1): 327, 2019 Jun 13.
Article in English | MEDLINE | ID: mdl-31195954

ABSTRACT

BACKGROUND: The gap gene system controls the early cascade of the segmentation pathway in Drosophila melanogaster as well as other insects. Owing to its tractability and key role in embryo patterning, this system has been the focus for both computational modelers and experimentalists. The gap gene expression dynamics can be considered strictly as a one-dimensional process and modeled as a system of reaction-diffusion equations. While substantial progress has been made in modeling this phenomenon, there still remains a deficit of approaches to evaluate competing hypotheses. Most of the model development has happened in isolation and there has been little attempt to compare candidate models. RESULTS: The Bayesian framework offers a means of doing formal model evaluation. Here, we demonstrate how this framework can be used to compare different models of gene expression. We focus on the Papatsenko-Levine formalism, which exploits a fractional occupancy based approach to incorporate activation of the gap genes by the maternal genes and cross-regulation by the gap genes themselves. The Bayesian approach provides insight into the relationships between system parameters. In the regulatory pathway of segmentation, the parameters for number of binding sites and binding affinity have a negative correlation. The model selection analysis supports a stronger binding affinity for Bicoid compared to other regulatory edges, as shown by a larger posterior mean. The procedure does not show support for activation of Kruppel by Bicoid. CONCLUSIONS: We provide an efficient solver for the general representation of the Papatsenko-Levine model. We also demonstrate the utility of the Bayes factor for evaluating candidate models of spatial patterning. In addition, by using the parallel tempering sampler, the convergence of Markov chains can be remarkably improved and robust estimates of Bayes factors obtained.


Subjects
Drosophila melanogaster/genetics , Gene Regulatory Networks , Animals , Bayes Theorem , Drosophila Proteins/genetics , Gene Expression Profiling , Gene Expression Regulation, Developmental , Likelihood Functions , Markov Chains , Models, Genetic , Monte Carlo Method
3.
Anim Cogn ; 20(5): 867-880, 2017 Sep.
Article in English | MEDLINE | ID: mdl-28669114

ABSTRACT

Probabilistic decision-making is a general phenomenon in animal behavior, and has often been interpreted to reflect the relative certainty of animals' beliefs. Extensive neurological and behavioral results increasingly suggest that animal beliefs may be represented as probability distributions, with explicit accounting of uncertainty. Accordingly, we develop a model that describes decision-making in a manner consistent with this understanding of neuronal function in learning and conditioning. This first-order Markov, recursive Bayesian algorithm is as parsimonious as its minimalist point-estimate, Rescorla-Wagner analogue. We show that the Bayesian algorithm can reproduce naturalistic patterns of probabilistic foraging, in simulations of an experiment in bumblebees. We go on to show that the Bayesian algorithm can efficiently describe the behavior of several heuristic models of decision-making, and is consistent with the ubiquitous variation in choice that we observe within and between individuals in implementing heuristic decision-making. By describing learning and decision-making in a single Bayesian framework, we believe we can realistically unify descriptions of behavior across contexts and organisms. A unified cognitive model of this kind may facilitate descriptions of behavioral evolution.


Subjects
Choice Behavior , Learning , Algorithms , Animals , Appetitive Behavior , Bayes Theorem , Bees/physiology , Decision Making , Models, Theoretical
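
The contrast drawn in the abstract of entry 3, between a point-estimate Rescorla-Wagner rule and a recursive Bayesian rule that carries uncertainty, can be illustrated with a minimal sketch. The abstract does not specify the authors' algorithm, so the Beta-Bernoulli belief, the learning rate, and the reward sequence below are assumptions chosen only for illustration. Both rules are first-order Markov in the sense that each update uses only the current state and the latest outcome.

```python
def rescorla_wagner(value, reward, rate=0.1):
    """Point-estimate update: nudge the estimate toward the observed reward."""
    return value + rate * (reward - value)


def beta_update(alpha, beta, reward):
    """Recursive Bayesian update of a Beta(alpha, beta) belief (assumed model).

    reward is 1 (visit rewarded) or 0 (visit not rewarded).
    """
    return alpha + reward, beta + (1 - reward)


def beta_mean_and_variance(alpha, beta):
    """Mean and variance of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    mean = alpha / total
    variance = alpha * beta / (total ** 2 * (total + 1))
    return mean, variance


rewards = [1, 0, 1, 1, 0, 1, 1, 1]  # hypothetical foraging outcomes
value = 0.5                          # Rescorla-Wagner starting estimate
alpha, beta = 1.0, 1.0               # uniform prior on reward probability
for r in rewards:
    value = rescorla_wagner(value, r)
    alpha, beta = beta_update(alpha, beta, r)

mean, variance = beta_mean_and_variance(alpha, beta)
print(f"Rescorla-Wagner estimate: {value:.3f}")
print(f"Bayesian belief: mean {mean:.3f}, variance {variance:.4f}")
```

The Bayesian rule stores two numbers instead of one, and the second number is what lets a simulated forager express how certain it is, for example by sampling a choice from the belief rather than always picking the highest point estimate.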
4.
BMC Genomics ; 17: 176, 2016 Mar 03.
Article in English | MEDLINE | ID: mdl-26940994

ABSTRACT

BACKGROUND: For the last decade the conceptual framework of the Genome-Wide Association Study (GWAS) has dominated the investigation of human disease and other complex traits. While GWAS have been successful in identifying a large number of variants associated with various phenotypes, the overall amount of heritability explained by these variants remains small. This raises the question of how best to follow up on a GWAS, localize causal variants accounting for GWAS hits, and as a consequence explain more of the so-called "missing" heritability. Advances in high throughput sequencing technologies now allow for the efficient and cost-effective collection of vast amounts of fine-scale genomic data to complement GWAS. RESULTS: We investigate these issues using a colon cancer dataset. After QC, our data consisted of 1993 cases, 899 controls. Using marginal tests of associations, we identify 10 variants distributed among six targeted regions that are significantly associated with colorectal cancer, with eight of the variants being novel to this study. Additionally, we perform so-called 'SNP-set' tests of association and identify two sets of variants that implicate both common and rare variants in the etiology of colorectal cancer. CONCLUSIONS: Here we present a large-scale targeted re-sequencing resource focusing on genomic regions implicated in colorectal cancer susceptibility previously identified in several GWAS, which aims to 1) provide fine-scale targeted sequencing data for fine-mapping and 2) provide data resources to address methodological questions regarding the design of sequencing-based follow-up studies to GWAS. Additionally, we show that this strategy successfully identifies novel variants associated with colorectal cancer susceptibility and can implicate both common and rare variants.


Subjects
Genome-Wide Association Study , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Chromosome Mapping , Colonic Neoplasms/genetics , Computational Biology , Genetic Variation , Humans
5.
Bioinformatics ; 31(21): 3549-51, 2015 Nov 01.
Article in English | MEDLINE | ID: mdl-26142186

ABSTRACT

MOTIVATION: The development of Approximate Bayesian Computation (ABC) algorithms for parameter inference which are both computationally efficient and scalable in parallel computing environments is an important area of research. Monte Carlo rejection sampling, a fundamental component of ABC algorithms, is trivial to distribute over multiple processors but is inherently inefficient. While the development of algorithms such as ABC Sequential Monte Carlo (ABC-SMC) helps address the inherent inefficiencies of rejection sampling, such approaches are not as easily scaled on multiple processors. As a result, current Bayesian inference software offerings that use ABC-SMC lack the ability to scale in parallel computing environments. RESULTS: We present al3c, a C++ framework for implementing ABC-SMC in parallel. By requiring only that users define essential functions such as the simulation model and prior distribution function, al3c abstracts the user from both the complexities of parallel programming and the details of the ABC-SMC algorithm. By using the al3c framework, the user is able to scale the ABC-SMC algorithm in parallel computing environments for his or her specific application, with minimal programming overhead. AVAILABILITY AND IMPLEMENTATION: al3c is offered as a static binary for Linux and OS X computing environments. The user completes an XML configuration file and C++ plug-in template for the specific application, which are used by al3c to obtain the desired results. Users can download the static binaries, source code, reference documentation and examples (including those in this article) by visiting https://github.com/ahstram/al3c. CONTACT: astram@usc.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Models, Biological , Software , Algorithms , Animals , Bayes Theorem , Monte Carlo Method
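
The rejection-sampling step that the abstract of entry 5 calls trivial to distribute but inherently inefficient can be sketched in a few lines. This is not al3c and does not use its API; it is a plain, serial ABC rejection sampler on a toy coin-flip problem, and every function name, the tolerance, and the data are assumptions made for illustration. Each draw is independent of the others, which is why the loop could be split across processors, and most draws are discarded, which is the inefficiency that ABC-SMC targets.

```python
import random


def abc_rejection(observed, simulate, prior_sample, distance,
                  tolerance, n_draws, seed=0):
    """Keep prior draws whose simulated data fall within tolerance of the data."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)          # draw a parameter from the prior
        simulated = simulate(theta, rng)   # run the simulation model
        if distance(simulated, observed) <= tolerance:
            accepted.append(theta)
    return accepted


# Toy problem (hypothetical): infer a success probability from 100 trials.
def prior_sample(rng):
    return rng.random()  # Uniform(0, 1) prior


def simulate(theta, rng):
    return sum(rng.random() < theta for _ in range(100))


def distance(simulated, observed):
    return abs(simulated - observed)


posterior_draws = abc_rejection(observed=70, simulate=simulate,
                                prior_sample=prior_sample, distance=distance,
                                tolerance=2, n_draws=20000)
posterior_mean = sum(posterior_draws) / len(posterior_draws)
print(f"accepted {len(posterior_draws)} of 20000 draws, "
      f"posterior mean about {posterior_mean:.2f}")
```

In this toy setting only around one draw in twenty is expected to be accepted, and the fraction shrinks quickly as the tolerance tightens or the data become higher-dimensional.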
6.
Stat Appl Genet Mol Biol ; 14(4): 317-32, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26167984

ABSTRACT

Recent results in Markov chain Monte Carlo (MCMC) show that a chain based on an unbiased estimator of the likelihood can have a stationary distribution identical to that of a chain based on exact likelihood calculations. In this paper we develop such an estimator for elliptically contoured distributions, a large family of distributions that includes and generalizes the multivariate normal. We then show how this estimator, combined with pseudorandom realizations of an elliptically contoured distribution, can be used to run MCMC in a way that replicates the stationary distribution of a likelihood based chain, but does not require explicit likelihood calculations. Because many elliptically contoured distributions do not have closed form densities, our simulation based approach enables exact MCMC based inference in a range of cases where previously it was impossible.


Subjects
Likelihood Functions , Markov Chains , Monte Carlo Method , Algorithms , Computer Simulation
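
The idea summarized in the abstract of entry 6, running MCMC with an unbiased likelihood estimate in place of the exact likelihood, is commonly called pseudo-marginal Metropolis-Hastings. The sketch below is not the paper's estimator for elliptically contoured distributions; it assumes a toy latent-variable model (z ~ N(theta, 1), y ~ N(z, 1), with a N(0, 10^2) prior on theta) whose likelihood is estimated by averaging over simulated z. The detail that matters is that the estimate attached to the current state is stored and reused, not recomputed, until a proposal is accepted.

```python
import math
import random


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def estimate_likelihood(theta, y, rng, n_latent=20):
    """Unbiased Monte Carlo estimate of p(y | theta) = E[p(y | z)], z ~ N(theta, 1)."""
    total = 0.0
    for _ in range(n_latent):
        z = rng.gauss(theta, 1.0)
        total += normal_pdf(y, z, 1.0)
    return total / n_latent


def log_prior(theta, prior_sd=10.0):
    return -0.5 * (theta / prior_sd) ** 2  # N(0, prior_sd^2), up to a constant


def pseudo_marginal_mh(y, n_steps, step=2.0, seed=0):
    rng = random.Random(seed)
    theta = 0.0
    like = estimate_likelihood(theta, y, rng)  # reused until a move is accepted
    chain = []
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step)  # symmetric random-walk proposal
        like_prop = estimate_likelihood(proposal, y, rng)
        if like_prop > 0.0:
            log_ratio = (math.log(like_prop) + log_prior(proposal)
                         - math.log(like) - log_prior(theta))
            if rng.random() < math.exp(min(0.0, log_ratio)):
                theta, like = proposal, like_prop
        chain.append(theta)
    return chain


chain = pseudo_marginal_mh(y=1.0, n_steps=5000)
kept = chain[1000:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)
print(f"posterior mean of theta is roughly {posterior_mean:.2f}")
```

For this toy model the exact marginal likelihood is N(y; theta, 2), so the exact posterior mean is (y / 2) / (1/2 + 1/100), about 0.98 for y = 1, which gives a reference point for checking the chain.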
7.
J Pathol ; 237(3): 355-62, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26119426

ABSTRACT

Intratumoural mutational heterogeneity (ITH) or the presence of different private mutations in different parts of the same tumour is commonly observed in human tumours. The mechanisms generating such ITH are uncertain. Here we find that ITH can be remarkably well structured by measuring point mutations, chromosome copy numbers, and DNA passenger methylation from opposite sides and individual glands of a 6 cm human colorectal adenoma. ITH was present between tumour sides and individual glands, but the private mutations were side-specific and subdivided the adenoma into two major subclones. Furthermore, ITH disappeared within individual glands because the glands were clonal populations composed of cells with identical mutant genotypes. Despite mutation clonality, the glands were relatively old, diverse populations when their individual cells were compared for passenger methylation and by FISH. These observations can be organized into an expanding star-like ancestral tree with co-clonal expansion, where many private mutations and multiple related clones arise during the first few divisions. As a consequence, most detectable mutational ITH in the final tumour originates from the first few divisions. Much of the early history of a tumour, especially the first few divisions, may be embedded within the detectable ITH of tumour genomes.


Subjects
Adenoma/genetics , Biomarkers, Tumor/genetics , Cell Division , Clonal Evolution , Colorectal Neoplasms/genetics , Point Mutation , Adenoma/pathology , Colorectal Neoplasms/pathology , DNA Methylation , DNA Mutational Analysis , Epigenesis, Genetic , Gene Dosage , Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic , Genetic Predisposition to Disease , Humans , Oligonucleotide Array Sequence Analysis , Phenotype , Polymorphism, Single Nucleotide
8.
Nature ; 465(7298): 627-31, 2010 Jun 03.
Article in English | MEDLINE | ID: mdl-20336072

ABSTRACT

Although pioneered by human geneticists as a potential solution to the challenging problem of finding the genetic basis of common human diseases, genome-wide association (GWA) studies have, owing to advances in genotyping and sequencing technology, become an obvious general approach for studying the genetics of natural variation and traits of agricultural importance. They are particularly useful when inbred lines are available, because once these lines have been genotyped they can be phenotyped multiple times, making it possible (as well as extremely cost effective) to study many different traits in many different environments, while replicating the phenotypic measurements to reduce environmental noise. Here we demonstrate the power of this approach by carrying out a GWA study of 107 phenotypes in Arabidopsis thaliana, a widely distributed, predominantly self-fertilizing model plant known to harbour considerable genetic variation for many adaptively important traits. Our results are dramatically different from those of human GWA studies, in that we identify many common alleles of major effect, but they are also, in many cases, harder to interpret because confounding by complex genetics and population structure makes it difficult to distinguish true associations from false. However, a priori candidates are significantly over-represented among these associations as well, making many of them excellent candidates for follow-up experiments. Our study demonstrates the feasibility of GWA studies in A. thaliana and suggests that the approach will be appropriate for many other organisms.


Subjects
Arabidopsis/classification , Arabidopsis/genetics , Genome, Plant/genetics , Genome-Wide Association Study , Phenotype , Alleles , Arabidopsis Proteins/genetics , Flowers/genetics , Genes, Plant/genetics , Genetic Loci/genetics , Genotype , Immunity, Innate/genetics , Inbreeding , Polymorphism, Single Nucleotide/genetics
9.
Am Nat ; 185(6): 797-808, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25996864

ABSTRACT

Understanding the mechanisms that give rise to social structure is central to predicting the evolutionary and ecological outcomes of social interactions. Modeling this process is challenging, because all individuals simultaneously behave in ways that shape their social environments, a process called social niche construction (SNC). In earlier work, we demonstrated that aggression acts as an SNC trait in fruit flies (Drosophila melanogaster), but the mechanisms of that process remained cryptic. Here, we analyze how individual social group preferences generate overall social structure. We use a combination of agent-based simulation and approximate Bayesian computation to fit models to empirical data. We confirm that genetic variation in aggressive behavior influences social group structure. Furthermore, we find that female decamping due to male behavior may play an underappreciated role in structuring social groups. Male-male aggression may sometimes destabilize groups, but it may also be an SNC behavior for shaping desirable groups for females. Density intensifies female social preferences; thus, the role of female behavior in shaping group structure may become more important at high densities. Our ability to model the ontogeny of group structure demonstrates the utility of the Bayesian model-based approach in social behavioral studies.


Subjects
Drosophila melanogaster/physiology , Animals , Bayes Theorem , Behavior, Animal , Drosophila melanogaster/genetics , Female , Genetic Variation , Male , Population Density
10.
J Theor Biol ; 359: 136-45, 2014 Oct 21.
Article in English | MEDLINE | ID: mdl-24907673

ABSTRACT

A tumor is thought to start from a single cell and genome. Yet genomes in the final tumor are typically heterogeneous. The mystery of this intratumoral heterogeneity (ITH) has not yet been uncovered, but much of this ITH may be secondary to replication errors. Methylation of cytosine bases often exhibits ITH and therefore may encode the ancestry of the tumor. In this study, we measure the passenger methylation patterns of a specific CpG region in 9 colorectal tumors by bisulfite sequencing and apply a tumor development model. Based on our model, we are able to retrieve information regarding the ancestry of each tumor using approximate Bayesian computation. With a large simulation study we explore the conditions under which we can estimate the model parameters, and the initial state of the first transformed cell. Finally we apply our analysis to clinical data to gain insight into the dynamics of tumor formation.


Subjects
Computational Biology , Evolution, Molecular , Neoplasms/genetics , Bayes Theorem , Cell Proliferation/genetics , Colorectal Neoplasms/genetics , Colorectal Neoplasms/pathology , CpG Islands/genetics , DNA Methylation , Genetic Heterogeneity , Genome, Human , Humans , Neoplasms/pathology , Sequence Analysis, DNA
11.
Genet Epidemiol ; 36(7): 696-709, 2012 Nov.
Article in English | MEDLINE | ID: mdl-22865643

ABSTRACT

Next-generation sequencing technology provides us with vast amounts of sequence data. It is efficient and cheaper than previous sequencing technologies, but deep resequencing of entire samples is still expensive. Therefore, sensible strategies for choosing subsets of samples to sequence are required. Here we describe an algorithm for selection of a sub-sample of an existing sample if one has either of two possible goals in mind: maximizing the number of new polymorphic sites that are detected, or improving the efficiency with which the remaining unsequenced individuals can have their types imputed at newly discovered polymorphisms. We then describe a variation on our algorithm that is more focused on detecting rarer variants. We demonstrate the performance of our algorithm using simulated data and data from the 1000 Genomes Project.


Subjects
Algorithms , Genome, Human , Polymorphism, Single Nucleotide , Sequence Analysis, DNA/methods , Computer Simulation , Diploidy , Genetic Variation , Genome-Wide Association Study , Haplotypes , High-Throughput Nucleotide Sequencing , Human Genome Project , Humans , Models, Genetic
12.
Bioinformatics ; 28(6): 838-44, 2012 Mar 15.
Article in English | MEDLINE | ID: mdl-22257666

ABSTRACT

MOTIVATION: We introduce a coalescent-based method (RECOAL) for the simulation of new haplotype data from a reference population of haplotypes. A coalescent genealogy for the reference haplotype data is sampled from the appropriate posterior probability distribution, then a coalescent genealogy is simulated which extends the sampled genealogy to include new haplotype data. The new haplotype data will, therefore, contain both some of the existing polymorphic sites and new polymorphisms added based on the structure of the simulated coalescent genealogy. This allows exact coalescent simulation of new haplotype data, compared with other methods which are more approximate in nature. RESULTS: We demonstrate the performance of our method using a variety of data simulated under a coalescent model, before applying it to data from the 1000 Genomes project.


Subjects
Genome, Human , Models, Genetic , Algorithms , Computer Simulation , Genome-Wide Association Study , Haplotypes , Humans , Population Groups , Probability
13.
EMBO Rep ; 12(7): 735-42, 2011 Jul 01.
Article in English | MEDLINE | ID: mdl-21637295

ABSTRACT

We describe a new mechanism by which CTG tract expansion affects myotonic dystrophy (DM1). Changes to the levels of a panel of RNAs involved in muscle development and function that are downregulated in DM1 are due to aberrant localization of the transcription factor SHARP (SMRT/HDAC1-associated repressor protein). Mislocalization of SHARP in DM1 is consistent with increased CRM1-mediated export of SHARP to the cytoplasm. A direct link between CTG repeat expression and SHARP mislocalization is demonstrated as expression of expanded CTG repeats in normal cells recapitulates cytoplasmic SHARP localization. These results demonstrate a role for the inactivation of SHARP transcription in DM1 biology.


Subjects
Cell Nucleus/metabolism , Homeodomain Proteins/metabolism , Myotonic Dystrophy/physiopathology , Nuclear Proteins/metabolism , RNA/metabolism , Antibiotics, Antineoplastic/pharmacology , Cytoplasm/metabolism , DNA-Binding Proteins , Fatty Acids, Unsaturated/pharmacology , Gene Expression Regulation , Homeodomain Proteins/genetics , Humans , Myoblasts/metabolism , Myotonic Dystrophy/genetics , Nuclear Proteins/genetics , Protein Transport/drug effects , RNA Splicing/genetics , RNA-Binding Proteins/metabolism , Trinucleotide Repeat Expansion/genetics
14.
Sci Rep ; 13(1): 5346, 2023 Apr 01.
Article in English | MEDLINE | ID: mdl-37005426

ABSTRACT

Biomarkers such as exhaled nitric oxide (FeNO), a marker of airway inflammation, have applications in the study of chronic respiratory disease where longitudinal studies of within-participant changes in the biomarker are particularly relevant. A cutting-edge approach to assessing FeNO, called multiple flow FeNO, repeatedly assesses FeNO across a range of expiratory flow rates at a single visit and combines these data with a deterministic model of lower respiratory tract NO to estimate parameters quantifying airway wall and alveolar NO sources. Previous methodological work for multiple flow FeNO has focused on methods for data from a single participant or from cross-sectional studies. Performance of existing ad hoc two-stage methods for longitudinal multiple flow FeNO in cohort or panel studies has not been evaluated. In this paper, we present a novel longitudinal extension to a unified hierarchical Bayesian (L_U_HB) model relating longitudinally assessed multiple flow FeNO to covariates. In several simulation study scenarios, we compare the L_U_HB method to other unified and two-stage frequentist methods. In general, L_U_HB produced unbiased estimates, had good power, and its performance was not sensitive to the magnitude of the association with a covariate and correlations between NO parameters. In an application relating height to longitudinal multiple flow FeNO in schoolchildren without asthma, unified analysis methods estimated positive, statistically significant associations of height with airway and alveolar NO concentrations and negative associations with airway wall diffusivity while estimates from two-stage methods were smaller in magnitude and sometimes non-significant.


Subjects
Asthma , Nitric Oxide , Humans , Child , Nitric Oxide/analysis , Bayes Theorem , Cross-Sectional Studies , Bronchi/chemistry , Exhalation , Breath Tests/methods , Biomarkers
15.
Mol Biol Evol ; 28(8): 2231-7, 2011 Aug.
Article in English | MEDLINE | ID: mdl-21368315

ABSTRACT

Analyses of genetic polymorphism data have the potential to be highly informative about the demographic history of Native American populations, but due to a combination of historical and political factors, there are essentially no autosomal sequence polymorphism data from any Native American group. However, there are many resequencing studies involving Latinos, whose genomes contain segments inherited from their Native American ancestors. In this study, we introduce a new method for estimating local ancestry across the genomes of admixed individuals and show how this method, along with dense genotyping and targeted resequencing, can be used to assay genetic variation in ancestral Native American groups. We analyze roughly 6 Mb of resequencing data from 22 Mexican Americans to provide the first large-scale view of sequence level variation in Native Americans. We observe low levels of diversity and high levels of linkage disequilibrium in the Native American-derived sequences, consistent with a recent severe population bottleneck associated with the initial peopling of the Americas. Using two different computational approaches, one novel, we estimate that this bottleneck occurred roughly 12.5 Kya; when uncertainty in the estimation process is taken into account, our results are consistent with archeological estimates for the colonization of the Americas.


Subjects
Hispanic or Latino/genetics , Indians, North American/genetics , Polymorphism, Single Nucleotide/genetics , Alleles , Genetics, Population , Genomics/methods , Genotype , Humans
16.
Stat Appl Genet Mol Biol ; 10(1)2011 Sep 27.
Article in English | MEDLINE | ID: mdl-23089822

ABSTRACT

In this paper, we develop a Genetic Algorithm that can address the fundamental problem of how one should weight the summary statistics included in an approximate Bayesian computation analysis built around an accept/reject algorithm, and how one might choose the tolerance for that analysis. We then demonstrate that using weighted statistics, and a well-chosen tolerance, in such an approximate Bayesian computation approach can result in improved performance, when compared to unweighted analyses, using one example drawn purely from statistics and two drawn from the estimation of population genetics parameters.


Subjects
Algorithms , Bayes Theorem , Computational Biology/methods , Software , Computer Simulation , Genetics, Population/methods , Haplotypes , Humans , Models, Genetic , Mutation , Recombination, Genetic , Reproducibility of Results , Sensitivity and Specificity
17.
Proc Natl Acad Sci U S A ; 106(12): 4828-33, 2009 Mar 24.
Article in English | MEDLINE | ID: mdl-19261858

ABSTRACT

Cancers are clonal expansions, but how a single, transformed human cell grows into a billion-cell tumor is uncertain because serial observations are impractical. Potentially, this history is surreptitiously recorded within genomes that become increasingly numerous, polymorphic, and physically separated after transformation. To correlate physical with epigenetic pairwise distances, small 2,000- to 10,000-cell gland fragments were sampled from left and right sides of 12 primary colorectal cancers, and passenger methylation at 2 CpG-rich regions was measured by bisulfite sequencing. Methylation patterns were polymorphic but differences were similar between different parts of the same tumor, consistent with relatively isotropic or "flat" clonal expansions that could be simulated by rapid initial population expansions. Methylation patterns were too diverse to be consistent with very rare cancer stem cells but were more consistent with multiple (approximately 4 to 1,000) long-lived cancer stem cell lineages per cancer gland. Our study illustrates the potential to reconstruct the unperturbed biology of human cancers from epigenetic passenger variations in their present-day genomes.


Subjects
Colorectal Neoplasms/genetics , Colorectal Neoplasms/pathology , DNA Methylation , Neoplastic Stem Cells/pathology , Cell Count , Cell Lineage , Cell Proliferation , Clone Cells , Computer Simulation , CpG Islands/genetics , Humans , Microscopy, Confocal , Mitosis , Phenotype
18.
BMC Bioinformatics ; 12: 284, 2011 Jul 13.
Article in English | MEDLINE | ID: mdl-21752297

ABSTRACT

BACKGROUND: Etiologic studies of cancer increasingly use molecular features such as gene expression, DNA methylation and sequence mutation to subclassify the cancer type. In large population-based studies, the tumor tissues available for study are archival specimens that provide variable amounts of amplifiable DNA for molecular analysis. As molecular features measured from small amounts of tumor DNA are inherently noisy, we propose a novel approach to improve statistical efficiency when comparing groups of samples. We illustrate the phenomenon using the MethyLight technology, applying our proposed analysis to compare MLH1 DNA methylation levels in males and females studied in the Colon Cancer Family Registry. RESULTS: We introduce two methods for computing empirical weights to model heteroscedasticity that is caused by sampling variable quantities of DNA for molecular analysis. In a simulation study, we show that using these weights in a linear regression model is more powerful for identifying differentially methylated loci than standard regression analysis. The increase in power depends on the underlying relationship between variation in outcome measure and input DNA quantity in the study samples. CONCLUSIONS: Tumor characteristics measured from small amounts of tumor DNA are inherently noisy. We propose a statistical analysis that accounts for the measurement error due to sampling variation of the molecular feature and show how it can improve the power to detect differential characteristics between patient groups.


Subjects
Adaptor Proteins, Signal Transducing/genetics , Colonic Neoplasms/genetics , Computer Simulation , DNA Methylation , DNA Mutational Analysis , Nuclear Proteins/genetics , Alu Elements , Female , Humans , Least-Squares Analysis , Linear Models , Male , MutL Protein Homolog 1 , Regression Analysis
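
The weighted regression described in the abstract of entry 18 can be sketched as a weighted least-squares fit with one predictor. The abstract does not say how the two sets of empirical weights are computed, so the sketch simply assumes that measurement variance is inversely proportional to the input DNA quantity and uses that quantity as the weight; the data values and variable names are hypothetical.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least squares for y = intercept + slope * x.

    Minimizes sum(w_i * (y_i - intercept - slope * x_i) ** 2).
    """
    total_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data: group indicator (0 = female, 1 = male), a methylation
# measurement, and input DNA quantity used as the weight (an assumption).
group = [0, 0, 0, 1, 1, 1]
methylation = [12.0, 30.0, 14.0, 22.0, 24.0, 5.0]
dna_quantity = [8.0, 0.5, 9.0, 7.0, 8.0, 0.4]

intercept, slope = weighted_linear_fit(group, methylation, dna_quantity)
print(f"weighted group difference: {slope:.2f}")
```

With a binary predictor the fitted slope is the difference between the weighted group means, so samples measured from very little DNA pull the estimated group difference far less than they would in an unweighted fit.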
19.
PLoS One ; 16(9): e0253250, 2021.
Article in English | MEDLINE | ID: mdl-34520456

ABSTRACT

Recent DepMap CRISPR-Cas9 single gene disruptions have identified genes more essential to proliferation in tissue culture. It would be valuable to translate these findings with measurements more practical for human tissues. Here we show that DepMap essential genes and other literature curated functional genes exhibit cell-specific preferential epigenetic conservation when DNA methylation measurements are compared between replicate cell lines and between intestinal crypts from the same individual. Culture experiments indicate that epigenetic drift accumulates through time with smaller differences in more functional genes. In NCI-60 cell lines, greater targeted gene conservation correlated with greater drug sensitivity. These studies indicate that two measurements separated in time allow normal or neoplastic cells to signal through conservation which human genes are more essential to their survival in vitro or in vivo.


Subjects
Cell Culture Techniques/methods , DNA Methylation , Genes, Essential , Cell Line, Tumor , Epigenesis, Genetic , Gene Expression Regulation , Genetic Drift , Humans
20.
Sci Rep ; 11(1): 17180, 2021 Aug 25.
Article in English | MEDLINE | ID: mdl-34433846

ABSTRACT

Exhaled breath biomarkers are an important emerging field. The fractional concentration of exhaled nitric oxide (FeNO) is a marker of airway inflammation with clinical and epidemiological applications (e.g., air pollution health effects studies). Systems of differential equations describe FeNO, measured non-invasively at the mouth, as a function of exhalation flow rate and parameters representing airway and alveolar sources of NO in the airway. Traditionally, NO parameters have been estimated separately for each study participant (Stage I) and then related to covariates (Stage II). Statistical properties of these two-step approaches have not been investigated. In simulation studies, we evaluated finite sample properties of existing two-step methods as well as a novel Unified Hierarchical Bayesian (U-HB) model. The U-HB is a one-step estimation method developed with the goal of properly propagating uncertainty as well as increasing power and reducing type I error for estimating associations of covariates with NO parameters. We demonstrated the U-HB method in an analysis of data from the southern California Children's Health Study relating traffic-related air pollution exposure to airway and alveolar airway inflammation.


Subjects
Asthma/epidemiology , Exhalation , Models, Theoretical , Nitric Oxide/metabolism , Pulmonary Alveoli/metabolism , Respiratory Mucosa/metabolism , Asthma/etiology , Bayes Theorem , Biomarkers/metabolism , Breath Tests , Child , Data Interpretation, Statistical , Humans , Inhalation Exposure/statistics & numerical data , Vehicle Emissions/toxicity