ABSTRACT
BACKGROUND: Gene expression microarray technologies are widely used across most areas of biological and medical research. Comparing and integrating microarray data from different experiments would be very useful, but is currently very challenging due to differences in experimental and hybridization conditions, as well as in data preprocessing and normalization methods. Furthermore, even in the case of the widely used, industry-standard Affymetrix oligonucleotide microarrays, the various array generations have different probe sets representing different genes, which hinders data integration. RESULTS: In this study our objective is to find systematic approaches to normalize the data emerging from different Affymetrix array generations and from different laboratories. We compare and assess the accuracy of five normalization methods for Affymetrix gene expression data using 6,926 Affymetrix experiments from five array generations. The methods that we compare include 1) standardization, 2) housekeeping gene based normalization, 3) equalized quantile normalization, 4) Weibull distribution based normalization and 5) array generation based gene centering. Our results indicate that the best results are achieved when the data are normalized first within a sample and then between samples with Array Generation based gene Centering (AGC) normalization. CONCLUSION: We conclude that integrating different Affymetrix datasets with the AGC method results in values that are significantly more comparable across the array generations than in the cases where no array generation based normalization is used. The AGC method was found to be the best method for normalizing the data from several different array generations and for achieving comparable gene values across thousands of samples.
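The abstract does not give the formula behind array generation based gene centering, so the following is only a minimal sketch of one plausible reading: after within-sample normalization, each gene is centered by subtracting its mean over the samples of the same array generation. The function name, the generation labels `gen_A` and `gen_B`, and the toy values are all hypothetical and are not taken from the paper.

```python
import numpy as np

def center_genes_by_generation(expr, generations):
    """Center each gene within each array generation.

    expr        : (genes x samples) array, assumed to be already
                  normalized within each sample.
    generations : one label per sample naming its array generation.
    Assumption: centering means subtracting the per-gene mean computed
    over the samples of the same generation.
    """
    expr = np.asarray(expr, dtype=float)
    generations = np.asarray(generations)
    centered = np.empty_like(expr)
    for gen in np.unique(generations):
        cols = generations == gen  # samples belonging to this generation
        gene_means = expr[:, cols].mean(axis=1, keepdims=True)
        centered[:, cols] = expr[:, cols] - gene_means
    return centered

# Toy data: 2 genes, 4 samples, 2 hypothetical array generations.
expr = np.array([[1.0, 3.0, 10.0, 14.0],
                 [2.0, 2.0, 5.0, 7.0]])
generations = ["gen_A", "gen_A", "gen_B", "gen_B"]
centered = center_genes_by_generation(expr, generations)
print(centered)
```

After this step every gene has mean zero within each generation, which is what removes generation-level offsets in this simplified reading.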
Subject(s)
Computational Biology/methods , Gene Expression Profiling/standards , Oligonucleotide Array Sequence Analysis/standards , Statistical Distributions , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods
ABSTRACT
BACKGROUND: Gene copy number and gene expression values play important roles in cancer initiation and progression. Both can be measured with high-throughput microarrays and some methodologies to integrate and analyze these data exist. However, varying gene sets within different gene expression and copy number microarrays present significant challenges. RESULTS: We report an advanced version of the previously published CGH-Plotter that can rapidly identify amplified and deleted areas using gene copy number data. With CGH-Plotter v2, the copy number values can be filtered based on the genomic location in base pair units. After filtering, the values for the missing genes can be interpolated. Moreover, the effect of non-informative areas in the genome can be systematically removed by smoothing and interpolating. Further, we developed a tool (ECN) to illustrate the CGH data values annotated based on the gene expression. The ECN-tool is a MATLAB toolbox enabling straightforward illustration of copy numbers annotated based on the gene expression levels. CONCLUSION: CGH-Plotter v2 provides two methods for analyzing copy number data: dynamic programming and genomic location based smoothing. With the ECN-tool, the data analyzed with CGH-Plotter v2 can easily be illustrated along the chromosomes individually or along the whole genome. The ECN-tool plots the copy number data annotated based on the gene expression data, making it easy to find the genes that are both over-expressed and amplified or under-expressed and deleted in the samples. From the resulting figures it is straightforward to select interesting genes.
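The interpolation step for missing genes can be illustrated with a short sketch. This is not CGH-Plotter code (the tools described are MATLAB based, and the abstract does not say which interpolation scheme is used); it assumes linear interpolation along base pair position within one chromosome, with positions sorted in increasing order. The function name and the toy values are hypothetical.

```python
import numpy as np

def interpolate_missing(positions_bp, copy_numbers):
    """Fill missing copy number values (NaN) by linear interpolation
    along genomic position.

    Assumptions: positions_bp is sorted in increasing order and covers
    a single chromosome; linear interpolation is an illustrative choice.
    """
    positions_bp = np.asarray(positions_bp, dtype=float)
    copy_numbers = np.asarray(copy_numbers, dtype=float)
    missing = np.isnan(copy_numbers)
    filled = copy_numbers.copy()
    # Interpolate only at the missing positions, using the measured ones.
    filled[missing] = np.interp(positions_bp[missing],
                                positions_bp[~missing],
                                copy_numbers[~missing])
    return filled

# Toy data: four genes, the second one has no measured copy number.
positions = [100, 200, 300, 400]
values = [1.0, np.nan, 3.0, 4.0]
filled = interpolate_missing(positions, values)
print(filled)
```

Note that `np.interp` holds the nearest measured value constant for positions outside the measured range, so missing genes at the chromosome ends are not extrapolated.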
Subject(s)
Gene Dosage/genetics , Gene Expression Profiling/methods , Gene Expression , Comparative Genomic Hybridization , Genome, Human , Head and Neck Neoplasms/genetics , Humans , Software , Tongue Neoplasms/genetics
ABSTRACT
Gene amplifications and deletions are frequent in head and neck squamous cell carcinomas (SCC) but the association of these alterations with gene expression is mostly unknown. Here, we characterized genome-wide copy number and gene expression changes on microarrays for 18 oral tongue SCC (OTSCC) cell lines. We identified a number of altered regions including nine high-level amplifications such as 6q12-q14 (CD109, MYO6), 9p24 (JAK2, CD274, SLC1A1, RLN1), 11p12-p13 (TRAF6, COMMD9, TRIM44, FJX1, CD44, PDHX, APIP), 11q13 (FADD, PPFIA1, CTTN), and 14q24 (ABCD4, HBLD1, LTBP2, ZNF410, COQ6, ACYP1, JDP2) where 9% to 64% of genes showed overexpression. Across the whole genome, 26% of the amplified genes had associated overexpression in OTSCC. Furthermore, our data indicated that OTSCC cell lines harbored genomic alterations similar to those of the laryngeal SCC cell lines we have previously analyzed, suggesting that despite differences in clinicopathological features there are no marked differences in the molecular genetic alterations of these two HNSCC sites. To identify genes whose expression was associated with copy number increase in head and neck SCC, a statistical analysis of the oral tongue and laryngeal SCC cell line data was performed. We pinpointed 1,192 genes that had a statistically significant association between copy number and gene expression. These results suggest that genomic alterations with associated gene expression changes play an important role in the malignant behavior of head and neck SCC. The identified genes provide a basis for further functional validation and may lead to the identification of novel candidates for targeted therapies. This article contains Supplementary Material available at http://www.interscience.wiley.com/jpages/1045-2257/suppmat.
Subject(s)
Carcinoma, Squamous Cell/genetics , Gene Amplification , Gene Dosage , Gene Expression Regulation, Neoplastic , Laryngeal Neoplasms/genetics , Tongue Neoplasms/genetics , Carcinoma, Squamous Cell/metabolism , Carcinoma, Squamous Cell/pathology , Cell Line, Tumor/metabolism , Gene Deletion , Gene Expression Profiling , Humans , Laryngeal Neoplasms/metabolism , Laryngeal Neoplasms/pathology , Neoplasm Proteins/biosynthesis , Neoplasm Proteins/genetics , Nucleic Acid Hybridization , Oligonucleotide Array Sequence Analysis , Tongue Neoplasms/metabolism , Tongue Neoplasms/pathology
ABSTRACT
BACKGROUND: The 70 kDa ribosomal protein S6 kinase (RPS6KB1), located at 17q23, is amplified and overexpressed in 10-30% of primary breast cancers and breast cancer cell lines. p70S6K is a serine/threonine kinase regulated by the PI3K/mTOR pathway, which plays a crucial role in the control of cell cycle, growth and survival. Our aim was to determine p70S6K and PI3K/mTOR/p70S6K pathway dependent gene expression profiles by microarrays using five breast cancer cell lines with predefined gene copy number and gene expression alterations. The p70S6K dependent profiles were determined by siRNA silencing of RPS6KB1 in two breast cancer cell lines overexpressing p70S6K. These profiles were further correlated with gene expression alterations caused by inhibition of the PI3K/mTOR pathway with the PI3K inhibitor LY294002 or the mTOR inhibitor rapamycin. RESULTS: Altogether, the silencing of p70S6K altered the expression of 109 and 173 genes in the two breast cancer cell lines, and 67 genes were altered in both cell lines in addition to RPS6KB1. Furthermore, 17 genes including VTCN1 and CDKN2B showed overlap with genes differentially expressed after PI3K or mTOR inhibition. The gene expression signatures responsive to inhibition of both the PI3K/mTOR pathway and p70S6K revealed previously unidentified genes, suggesting novel downstream targets for the PI3K/mTOR/p70S6K pathway. CONCLUSION: Since p70S6K overexpression is associated with aggressive disease and poor prognosis of breast cancer patients, the potential downstream targets of p70S6K and the whole PI3K/mTOR/p70S6K pathway identified in our study may have diagnostic value.
Subject(s)
Breast Neoplasms/genetics , Gene Expression Regulation, Neoplastic , Phosphatidylinositol 3-Kinases/genetics , Protein Kinases/genetics , Ribosomal Protein S6 Kinases, 70-kDa/genetics , Antibiotics, Antineoplastic/pharmacology , Apoptosis/drug effects , Cell Line, Tumor , Chromones/pharmacology , Gene Expression Profiling , Gene Silencing , Humans , Morpholines/pharmacology , Oligonucleotide Array Sequence Analysis , Phosphoinositide-3 Kinase Inhibitors , Phosphorylation , Protein Kinases/drug effects , RNA, Small Interfering/genetics , Ribosomal Protein S6 Kinases, 70-kDa/drug effects , Sirolimus/pharmacology , TOR Serine-Threonine Kinases
ABSTRACT
Our knowledge of the tissue- and disease-specific functions of human genes is rather limited and highly context-specific. Here, we have developed a method for the comparison of mRNA expression levels of most human genes across 9,783 Affymetrix gene expression array experiments representing 43 normal human tissue types, 68 cancer types, and 64 other diseases. This database of gene expression patterns in normal human tissues and pathological conditions covers 113 million data points and is available from the GeneSapiens website.