1.
PLoS One ; 10(3): e0119254, 2015.
Article in English | MEDLINE | ID: mdl-25787144

ABSTRACT

This work is about assessing model adequacy for negative binomial (NB) regression, particularly (1) assessing the adequacy of the NB assumption, and (2) assessing the appropriateness of models for NB dispersion parameters. Tools for the first are appropriate for NB regression generally; those for the second are primarily intended for RNA sequencing (RNA-Seq) data analysis. The typically small number of biological samples and large number of genes in RNA-Seq analysis motivate us to address the trade-offs between robustness and statistical power using NB regression models. One widely used power-saving strategy, for example, is to assume some commonalities of NB dispersion parameters across genes via simple models relating them to mean expression rates, and many such models have been proposed. As RNA-Seq analysis is becoming ever more popular, it is appropriate to make more thorough investigations into power and robustness of the resulting methods, and into practical tools for model assessment. In this article, we propose simulation-based statistical tests and diagnostic graphics to address model adequacy. We provide simulated and real data examples to illustrate that our proposed methods are effective for detecting the misspecification of the NB mean-variance relationship as well as judging the adequacy of fit of several NB dispersion models.
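
As a minimal illustration of the NB mean-variance relationship this abstract refers to, the Python sketch below assumes the common NB2 parameterization, Var(Y) = mu + phi * mu^2, which the abstract itself does not spell out. The function names are hypothetical, and the code only verifies the relationship numerically from the probability mass function; it is not the authors' simulation-based test.

```python
import math

def nb_log_pmf(y, mu, phi):
    """Log-probability of count y under an NB with mean mu and dispersion phi.

    Assumed parameterization (NB2): size r = 1/phi, success prob p = r / (r + mu).
    """
    r = 1.0 / phi
    p = r / (r + mu)
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(p) + y * math.log(1.0 - p))

def nb_moments(mu, phi, y_max=2000):
    """Mean and variance obtained by summing the pmf over 0..y_max."""
    probs = [math.exp(nb_log_pmf(y, mu, phi)) for y in range(y_max + 1)]
    mean = sum(y * q for y, q in enumerate(probs))
    var = sum((y - mean) ** 2 * q for y, q in enumerate(probs))
    return mean, var

mean, var = nb_moments(mu=10.0, phi=0.5)
print(mean, var)  # compare with mu = 10 and mu + phi * mu**2 = 60
```

Under this parameterization, a dispersion model of the kind the abstract mentions would describe how phi varies with mu across genes; the sketch holds phi fixed for a single gene.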


Subject(s)
Models, Statistical; Sequence Analysis, RNA/methods; Computer Simulation; Regression Analysis
2.
Stat Appl Genet Mol Biol ; 12(1): 49-70, 2013 Mar 26.
Article in English | MEDLINE | ID: mdl-23502340

ABSTRACT

RNA sequencing (RNA-Seq) is the current method of choice for characterizing transcriptomes and quantifying gene expression changes. This next generation sequencing-based method provides unprecedented depth and resolution. The negative binomial (NB) probability distribution has been shown to be a useful model for frequencies of mapped RNA-Seq reads and consequently provides a basis for statistical analysis of gene expression. Negative binomial exact tests are available for two-group comparisons but do not extend to negative binomial regression analysis, which is important for examining gene expression as a function of explanatory variables and for adjusted group comparisons accounting for other factors. We address the adequacy of available large-sample tests for the small sample sizes typically available from RNA-Seq studies and consider a higher-order asymptotic (HOA) adjustment to likelihood ratio tests. We demonstrate that 1) the HOA-adjusted likelihood ratio test is practically indistinguishable from the exact test in situations where the exact test is available, 2) the type I error of the HOA test matches the nominal specification in regression settings we examined via simulation, and 3) the power of the likelihood ratio test does not appear to be affected by the HOA adjustment. This work helps clarify the accuracy of the unadjusted likelihood ratio test and the degree of improvement available with the HOA adjustment. Furthermore, the HOA test may be preferable even when the exact test is available because it does not require ad hoc library size adjustments.
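
To make the large-sample test discussed in this abstract concrete, here is a minimal Python sketch of an unadjusted likelihood ratio test for a two-group NB comparison. It does not implement the HOA adjustment. It rests on several simplifying assumptions that are not in the abstract: the dispersion phi is treated as known, library sizes are equal (so each mean's maximum likelihood estimate is the sample mean), all group means are positive, and the reference distribution is chi-square with one degree of freedom. All function names are hypothetical.

```python
import math

def nb_loglik(counts, mu, phi):
    """NB log-likelihood for counts sharing mean mu > 0 and dispersion phi."""
    r = 1.0 / phi
    p = r / (r + mu)
    return sum(math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
               + r * math.log(p) + y * math.log(1.0 - p) for y in counts)

def sample_mean(xs):
    return sum(xs) / len(xs)

def nb_two_group_lrt(group1, group2, phi):
    """Unadjusted LRT of equal group means, with phi assumed known.

    Returns (statistic, p_value); the p-value uses the chi-square(1)
    large-sample approximation.
    """
    pooled = list(group1) + list(group2)
    ll_alt = (nb_loglik(group1, sample_mean(group1), phi)
              + nb_loglik(group2, sample_mean(group2), phi))
    ll_null = nb_loglik(pooled, sample_mean(pooled), phi)
    stat = max(0.0, 2.0 * (ll_alt - ll_null))
    # Upper tail of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

stat, p_value = nb_two_group_lrt([5, 7, 6], [50, 60, 55], phi=0.1)
print(stat, p_value)
```

With only three counts per group, as in this toy call, the chi-square approximation is exactly the kind of large-sample result whose small-sample accuracy the abstract examines.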


Subject(s)
Gene Expression Profiling/methods; Models, Genetic; Sequence Analysis, RNA; Algorithms; Arabidopsis/genetics; Base Sequence; Computer Simulation; High-Throughput Nucleotide Sequencing; Likelihood Functions; Models, Statistical; Poisson Distribution; Pseudomonas syringae/genetics; RNA, Bacterial/genetics; RNA, Plant/genetics; Regression Analysis
3.
PLoS One ; 6(10): e25279, 2011.
Article in English | MEDLINE | ID: mdl-21998647

ABSTRACT

GENE-counter is a complete Perl-based computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expression. In addition to its use in studying transcriptomes of eukaryotic model organisms, GENE-counter is applicable for prokaryotes and non-model organisms without an available genome reference sequence. For alignments, GENE-counter is configured for CASHX, Bowtie, and BWA, but an end user can use any Sequence Alignment/Map (SAM)-compliant program of preference. To analyze data for differential gene expression, GENE-counter can be run with any one of three statistics packages that are based on variations of the negative binomial distribution. The default method is a new and simple statistical test we developed based on an over-parameterized version of the negative binomial distribution. GENE-counter also includes three different methods for assessing differentially expressed features for enriched gene ontology (GO) terms. Results are transparent and data are systematically stored in a MySQL relational database to facilitate additional analyses as well as quality assessment. We used next generation sequencing to generate a small-scale RNA-Seq dataset derived from the heavily studied defense response of Arabidopsis thaliana and used GENE-counter to process the data. Collectively, the support from analysis of microarrays as well as the observed and substantial overlap in results from each of the three statistics packages demonstrates that GENE-counter is well suited for handling the unique characteristics of small sample sizes and high variability in gene counts.


Subject(s)
Computational Biology/methods; Gene Expression Profiling/methods; Sequence Analysis, RNA; Arabidopsis/genetics; Arabidopsis/immunology; Benchmarking; Conserved Sequence; Data Interpretation, Statistical; Databases, Genetic; Genomics; Oligonucleotide Array Sequence Analysis
4.
Radiat Res ; 166(1 Pt 2): 303-12, 2006 Jul.
Article in English | MEDLINE | ID: mdl-16808615

ABSTRACT

Statistical dose-response analyses in radiation epidemiology can produce misleading results if they fail to account for radiation dose uncertainties. While dosimetries may differ substantially depending on the ways in which the subjects were exposed, the statistical problems typically involve a predominantly linear dose-response curve, multiple sources of uncertainty, and uncertainty magnitudes that are best characterized as proportional rather than additive. We discuss some basic statistical issues in this setting, including the bias and shape distortion induced by classical and Berkson uncertainties, the effect of uncertain dose-prediction model parameters on estimated dose-response curves, and some notes on statistical methods for dose-response estimation in the presence of radiation dose uncertainties.
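
The contrast between classical and Berkson uncertainties can be illustrated with a small simulation. The Python sketch below is a deliberately simplified toy: it uses additive normal errors and a straight-line dose response, whereas the abstract notes that proportional errors are usually the better description. All names and parameter values are hypothetical.

```python
import random

def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate_slopes(n=20000, beta=2.0, seed=0):
    """Fitted slopes when the recorded dose carries classical or Berkson error."""
    rng = random.Random(seed)
    # Classical error: recorded dose = true dose + noise.
    true_c = [rng.gauss(5.0, 1.0) for _ in range(n)]
    rec_c = [d + rng.gauss(0.0, 1.0) for d in true_c]
    y_c = [beta * d + rng.gauss(0.0, 0.5) for d in true_c]
    # Berkson error: true dose = assigned (recorded) dose + noise.
    rec_b = [rng.gauss(5.0, 1.0) for _ in range(n)]
    true_b = [d + rng.gauss(0.0, 1.0) for d in rec_b]
    y_b = [beta * d + rng.gauss(0.0, 0.5) for d in true_b]
    return ols_slope(rec_c, y_c), ols_slope(rec_b, y_b)

classical, berkson = simulate_slopes()
print(classical, berkson)
```

In this setup the classical-error slope is attenuated by the factor var(true) / (var(true) + var(noise)) = 1/2, so it lands near 1.0, while the Berkson-error slope stays near the true value of 2.0 but with more residual scatter.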


Subject(s)
Artifacts; Data Interpretation, Statistical; Dose-Response Relationship, Radiation; Models, Biological; Models, Statistical; Neoplasms, Radiation-Induced/epidemiology; Radiometry/methods; Risk Assessment/methods; Bias; Body Burden; Computer Simulation; Humans; Radiation Dosage; Relative Biological Effectiveness; Risk Factors
5.
Radiat Res ; 161(3): 359-68, 2004 Mar.
Article in English | MEDLINE | ID: mdl-14982478

ABSTRACT

In the 1940s and 1950s, children in Israel were treated for tinea capitis by irradiation to the scalp to induce epilation. Follow-up studies of these patients and of other radiation-exposed populations show an increased risk of malignant and benign thyroid tumors. Those analyses, however, assume that thyroid dose for individuals is estimated precisely without error. Failure to account for uncertainties in dosimetry may affect standard errors and bias dose-response estimates. For the Israeli tinea capitis study, we discuss sources of uncertainties and adjust dosimetry for uncertainties in the prediction of true dose from X-ray treatment parameters. We also account for missing ages at exposure for patients with multiple X-ray treatments, since only ages at first treatment are known, and for missing data on treatment center, which investigators use to define exposure. Our reanalysis of the dose response for thyroid cancer and benign thyroid tumors indicates that uncertainties in dosimetry have minimal effects on dose-response estimation and on inference about the modifying effects of age at first exposure, time since exposure, and other factors. Since the components of the dose uncertainties we describe are likely to be present in other epidemiological studies of patients treated with radiation, our analysis may provide a model for considering the potential role of these uncertainties.


Subject(s)
Data Interpretation, Statistical; Neoplasms, Radiation-Induced/epidemiology; Radiometry/methods; Radiotherapy/statistics & numerical data; Risk Assessment/methods; Thyroid Neoplasms/epidemiology; Tinea Capitis/epidemiology; Tinea Capitis/radiotherapy; Adolescent; Body Burden; Child; Child, Preschool; Dose-Response Relationship, Radiation; Female; Humans; Incidence; Infant; Infant, Newborn; Male; Models, Biological; Models, Statistical; Radiotherapy Dosage; Reproducibility of Results; Sensitivity and Specificity; Thyroid Gland/radiation effects
6.
Biometrics ; 58(2): 448-53, 2002 Jun.
Article in English | MEDLINE | ID: mdl-12071420

ABSTRACT

This article demonstrates semiparametric maximum likelihood estimation of a nonlinear growth model for fish lengths using imprecisely measured ages. Data on the species corvina reina, found in the Gulf of Nicoya, Costa Rica, consist of lengths and imprecise ages for 168 fish and precise ages for a subset of 16 fish. The statistical problem may therefore be classified as nonlinear errors-in-variables regression with internal validation data. Inferential techniques are based on ideas extracted from several previous works on semiparametric maximum likelihood for errors-in-variables problems. The illustration of the example clarifies practical aspects of the associated computational, inferential, and data analytic techniques.


Subject(s)
Likelihood Functions; Nonlinear Dynamics; Regression Analysis; Algorithms; Animals; Biometry; Data Interpretation, Statistical; Fisheries/statistics & numerical data; Fishes/growth & development