A two-stage statistical procedure for feature selection and comparison in functional analysis of metagenomes.
Pookhao, Naruekamol; Sohn, Michael B; Li, Qike; Jenkins, Isaac; Du, Ruofei; Jiang, Hongmei; An, Lingling.
Affiliation
  • Pookhao N; Department of Agricultural & Biosystems Engineering, Interdisciplinary Program in Statistics, University of Arizona, Tucson, AZ, 85721 and Department of Statistics, Northwestern University, Evanston, IL 60208, USA.
  • Sohn MB; Department of Agricultural & Biosystems Engineering, Interdisciplinary Program in Statistics, University of Arizona, Tucson, AZ, 85721 and Department of Statistics, Northwestern University, Evanston, IL 60208, USA.
  • Li Q; Department of Agricultural & Biosystems Engineering, Interdisciplinary Program in Statistics, University of Arizona, Tucson, AZ, 85721 and Department of Statistics, Northwestern University, Evanston, IL 60208, USA.
  • Jenkins I; Department of Agricultural & Biosystems Engineering, Interdisciplinary Program in Statistics, University of Arizona, Tucson, AZ, 85721 and Department of Statistics, Northwestern University, Evanston, IL 60208, USA.
  • Du R; Department of Agricultural & Biosystems Engineering, Interdisciplinary Program in Statistics, University of Arizona, Tucson, AZ, 85721 and Department of Statistics, Northwestern University, Evanston, IL 60208, USA.
  • Jiang H; Department of Agricultural & Biosystems Engineering, Interdisciplinary Program in Statistics, University of Arizona, Tucson, AZ, 85721 and Department of Statistics, Northwestern University, Evanston, IL 60208, USA.
  • An L; Department of Agricultural & Biosystems Engineering, Interdisciplinary Program in Statistics, University of Arizona, Tucson, AZ, 85721 and Department of Statistics, Northwestern University, Evanston, IL 60208, USA.
Bioinformatics; 31(2): 158-65, 2015 Jan 15.
Article in English | MEDLINE | ID: mdl-25256572
MOTIVATION: With new sequencing technologies producing massive amounts of short-read data, metagenomics is growing rapidly, especially in environmental biology and medical science. Metagenomic data are not only high dimensional, with a large number of features and a limited number of samples, but also complex, with a large number of zeros and skewed distributions. Efficient computational and statistical tools are needed to deal with these characteristics of metagenomic sequencing data. One main objective of metagenomic studies is to assess whether and how multiple microbial communities differ under various environmental conditions.
RESULTS: We propose a two-stage statistical procedure for selecting informative features and identifying differentially abundant features between two or more groups of microbial communities. In the functional analysis of metagenomes, the features may be pathways, subsystems, functional roles and so on. In the first stage of the proposed procedure, informative features are selected using the elastic net, which reduces the dimension of the metagenomic data. In the second stage, differentially abundant features are detected using generalized linear models with a negative binomial distribution. Compared with other available methods, the proposed approach performs better in most of the comprehensive simulation studies. The new method is also applied to two real metagenomic datasets related to human health, and the findings are consistent with those in previous reports.
AVAILABILITY: R code and two example datasets are available at http://cals.arizona.edu/~anling/software.htm.
SUPPLEMENTARY INFORMATION: A supplementary file is available at Bioinformatics online.
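The two-stage idea in the abstract (elastic net selection, then a negative binomial GLM per selected feature) can be illustrated with a small Python sketch. This is not the authors' R code, and the abstract does not specify the details, so several choices here are assumptions: the elastic net uses a least-squares loss on a 0/1 group label with log-transformed, standardized counts; the negative binomial dispersion `alpha` is fixed rather than estimated; no library-size offset is used; and significance is assessed with a likelihood ratio test. All function names, parameter values, and the toy counts are hypothetical.

```python
import math

def standardize(col):
    """Center one feature column and scale it to unit variance."""
    n = len(col)
    mean = sum(col) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
    return [(v - mean) / sd if sd > 0 else 0.0 for v in col]

def soft_threshold(z, t):
    """Soft-thresholding operator that produces exact zeros (L1 part)."""
    return math.copysign(max(abs(z) - t, 0.0), z)

def elastic_net(X, y, lam=0.1, l1_ratio=0.5, n_iter=200):
    """Coordinate descent for
    (1/2n)||y - Xb||^2 + lam*(l1_ratio*|b|_1 + (1 - l1_ratio)/2*|b|_2^2).
    X is a list of standardized feature columns; y is centered."""
    n, p = len(y), len(X)
    beta = [0.0] * p
    resid = list(y)
    for _ in range(n_iter):
        for j, xj in enumerate(X):
            # Covariance of feature j with the partial residual.
            rho = sum(xj[i] * (resid[i] + beta[j] * xj[i]) for i in range(n)) / n
            new = soft_threshold(rho, lam * l1_ratio) / (1.0 + lam * (1.0 - l1_ratio))
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * xj[i]
                beta[j] = new
    return beta

def nb_loglik(counts, mu, alpha):
    """Negative binomial log-likelihood, mean mu, variance mu + alpha*mu^2."""
    r = 1.0 / alpha
    ll = 0.0
    for y in counts:
        ll += math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        ll += r * math.log(r / (r + mu))
        if y > 0:
            ll += y * math.log(mu / (r + mu))
    return ll

def nb_lrt_pvalue(counts, groups, alpha=0.5):
    """Likelihood ratio test for the group term in a log-link NB GLM
    count ~ group. With fixed alpha and no offset, the fitted mean of
    each group is its sample mean, so no iterative fitting is needed."""
    g0 = [c for c, g in zip(counts, groups) if g == 0]
    g1 = [c for c, g in zip(counts, groups) if g == 1]
    ll_full = (nb_loglik(g0, sum(g0) / len(g0), alpha)
               + nb_loglik(g1, sum(g1) / len(g1), alpha))
    ll_null = nb_loglik(counts, sum(counts) / len(counts), alpha)
    stat = max(2.0 * (ll_full - ll_null), 0.0)
    return math.erfc(math.sqrt(stat / 2.0))  # chi-square(1) upper tail

def two_stage(count_table, groups, lam=0.1, l1_ratio=0.5, alpha=0.5):
    """Stage 1: keep features with a nonzero elastic net coefficient.
    Stage 2: test each kept feature with the NB likelihood ratio test."""
    names = list(count_table)
    X = [standardize([math.log1p(c) for c in count_table[f]]) for f in names]
    y_mean = sum(groups) / len(groups)
    y = [g - y_mean for g in groups]
    beta = elastic_net(X, y, lam, l1_ratio)
    selected = [f for f, b in zip(names, beta) if b != 0.0]
    return {f: nb_lrt_pvalue(count_table[f], groups, alpha) for f in selected}

# Made-up counts for three hypothetical features across eight samples.
groups = [0, 0, 0, 0, 1, 1, 1, 1]
toy = {
    "pathway_A": [2, 0, 3, 1, 20, 35, 18, 27],
    "pathway_B": [5, 7, 4, 6, 6, 5, 7, 4],
    "pathway_C": [0, 0, 0, 0, 0, 0, 0, 0],
}
result = two_stage(toy, groups)
for feature, p in result.items():
    print(feature, p)
```

Features whose elastic net coefficient is exactly zero never reach the second stage, which is how the first stage reduces the number of tests. In practice the dispersion would be estimated from the data and the p-values adjusted for multiple testing; both steps are omitted here for brevity.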
Subjects

Full text: 1 Database: MEDLINE Main subject: Inflammatory Bowel Diseases / Data Interpretation, Statistical / Metagenomics / Genes, Bacterial / Obesity Type of study: Observational_studies / Prognostic_studies / Risk_factors_studies Limits: Humans Language: En Year of publication: 2015 Document type: Article
