ABSTRACT
BACKGROUND: Many microarray experiments search for genes with differential expression between a common "reference" group and multiple "test" groups. In such cases, currently employed statistical approaches based on t-tests or close derivatives have limited efficacy, mainly because the standard error is estimated from only two groups at a time. Alternative approaches based on ANOVA correctly capture within-group variance from all the groups, but do not compare individual test groups with the reference. Ideally, a t-test better suited to this type of data would compare each test group with the reference while using within-group variance calculated from all the groups.
RESULTS: We implemented an R-Bioconductor package named Mulcom, with a statistical test derived from Dunnett's t-test and designed to compare multiple test groups individually against a common reference. Dunnett's test uses, as the denominator of each comparison, a within-group standard error aggregated from all the experimental groups. In addition to the basic Dunnett's t value, the package includes an optional minimal fold-change threshold, m. Thanks to automated, permutation-based estimation of the False Discovery Rate (FDR), the package also permits fast optimization of the test, to obtain the maximum number of significant genes at a given FDR value. When applied to a time-course experiment profiled in parallel on two microarray platforms, and compared with two commonly used tests, Mulcom displayed better concordance of significant genes between the two array platforms (39% vs. 26% or 15%) and higher enrichment in functional annotation for categories related to the biology of the experiment (p value < 0.001 in 4 categories vs. 3).
CONCLUSIONS: The Mulcom package provides a powerful tool for the identification of differentially expressed genes when several experimental conditions are compared against a common reference. The results of the practical example presented here show that lists of differentially expressed genes generated by Mulcom are particularly consistent across microarray platforms and enriched in genes belonging to functionally significant groups.
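The core idea described in the abstract, comparing each test group with the reference while pooling within-group variance over all groups, can be illustrated with a minimal Python sketch for a single gene. This is not the Mulcom package's code or API: the function names and toy data are hypothetical, and the way the threshold m enters the score (subtracted from the absolute mean difference) is an assumption made for illustration, since the abstract does not give the formula.

```python
import math
from statistics import mean


def pooled_within_group_variance(groups):
    """Within-group mean square pooled over ALL groups (as in one-way ANOVA)."""
    sum_sq = 0.0
    deg_free = 0
    for values in groups.values():
        mu = mean(values)
        sum_sq += sum((v - mu) ** 2 for v in values)
        deg_free += len(values) - 1
    return sum_sq / deg_free


def dunnett_style_scores(groups, reference, m=0.0):
    """Score each test group against the reference for one gene.

    Assumption (not taken from the source): the fold-change threshold m is
    subtracted from the absolute difference of group means before dividing
    by the pooled standard error.
    """
    mse = pooled_within_group_variance(groups)
    ref_values = groups[reference]
    ref_mean = mean(ref_values)
    scores = {}
    for name, values in groups.items():
        if name == reference:
            continue
        diff = mean(values) - ref_mean
        # The standard error uses the variance pooled from every group,
        # not only from the two groups being compared.
        std_err = math.sqrt(mse * (1.0 / len(values) + 1.0 / len(ref_values)))
        scores[name] = (abs(diff) - m) / std_err
    return scores


# Hypothetical log-expression values for one gene in three groups.
example = {
    "reference": [1.0, 2.0, 3.0],
    "test_1": [3.0, 4.0, 5.0],
    "test_2": [1.0, 2.0, 3.0],
}
scores_no_threshold = dunnett_style_scores(example, "reference")
scores_with_threshold = dunnett_style_scores(example, "reference", m=0.5)
```

With m = 0 the score reduces to a Dunnett-style t value. A positive m lowers every score, so a group is flagged only when its mean difference exceeds the threshold by a margin that is large relative to the pooled standard error.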
Subjects
Algorithms, Gene Expression Profiling/methods, Oligonucleotide Array Sequence Analysis/methods, Humans, Neoplasms/genetics, Regression Analysis
ABSTRACT
We investigated whether residual material from diagnostic smears of fine needle aspirations (FNAs) of mammographically detected breast lesions can be successfully used to extract RNA for reliable gene expression analysis. Twenty-eight patients underwent FNA of breast lesions under ultrasonographic guidance. After smearing slides for cytology, residual cells were rinsed with TRIzol to recover RNA. RNA yield ranged from 0.78 to 88.40 µg per sample. FNA leftovers from 23 nonpalpable breast cancers were selected for gene expression profiling using oligonucleotide microarrays. Clusters generated from global expression profiles partitioned samples into well-separated subgroups that overlapped with clusters obtained using "biologic scores" (cytohistologic variables) and differed from clusters based on "technical scores" (RNA/complementary RNA/microarray quality). Grade of differentiation, estrogen receptor status, and ERBB2/HER2 status measured by microarray profiling reflected the results obtained by histology and immunohistochemistry. Given that proliferative status is not always assessable in FNA material, we designed a multiprobe genomic signature of proliferation genes and applied it to the FNA leftovers; it correlated strongly with the Ki67 index assessed on histologic material. These findings show that residual cells from cytologic smears of FNAs are suitable for obtaining high-quality RNA for high-throughput analysis, even when taken from small nonpalpable breast lesions.