ABSTRACT
The modern drug discovery process involves multiple sources of high-dimensional data. This imposes the challenge of data integration. A typical example is the integration of chemical structure (fingerprint features), phenotypic bioactivity (bioassay read-outs) data for targets of interest, and transcriptomic (gene expression) data in early drug discovery to better understand the chemical and biological mechanisms of candidate drugs, and to facilitate early detection of safety issues prior to the later, expensive phases of the drug development cycle. In this paper, we discuss a joint model for the transcriptomic and the phenotypic variables conditioned on the chemical structure. This modeling approach can be used to uncover, for a given set of compounds, the association between gene expression and biological activity, taking into account the influence of the chemical structure of the compound on both variables. The model allows the detection of genes that are associated with the bioactivity data, facilitating the identification of potential genomic biomarkers for compound efficacy. In addition, the effect of every structural feature on both genes and pIC50 and their associations can be simultaneously investigated. Two oncology projects are used to illustrate the applicability and usefulness of the joint model to integrate multi-source high-dimensional information to aid drug discovery.
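The idea of measuring the gene-activity association after accounting for chemical structure can be illustrated with a small sketch. It is not the joint model of the paper: it assumes a single binary fingerprint feature, removes its effect by subtracting group means, and then correlates the residuals. The function names and the toy data are hypothetical.

```python
from statistics import mean

def residualize(values, feature):
    """Subtract the mean of each fingerprint group (feature absent = 0, present = 1)."""
    group_mean = {z: mean(v for v, f in zip(values, feature) if f == z)
                  for z in set(feature)}
    return [v - group_mean[f] for v, f in zip(values, feature)]

def pearson(a, b):
    """Plain Pearson correlation of two equally long sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def adjusted_association(gene, activity, feature):
    """Correlation between gene expression and activity after removing
    the effect of the structural feature from both variables."""
    return pearson(residualize(gene, feature), residualize(activity, feature))

# Toy data (hypothetical): six compounds, one fingerprint feature.
feature = [0, 0, 0, 1, 1, 1]
gene = [1.0, 1.1, 0.9, 3.0, 3.1, 2.9]
activity = [5.0, 4.9, 5.1, 7.0, 6.9, 7.1]
unadjusted = pearson(gene, activity)
adjusted = adjusted_association(gene, activity, feature)
```

In this toy example the unadjusted correlation is strongly positive because the feature shifts both variables, while the adjusted correlation is negative, which is the kind of distinction a structure-conditioned analysis is meant to expose.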
Subjects
Biomarkers/chemistry , Pharmaceutical Chemistry/methods , Drug Discovery , Gene Expression , Genetic Models , Genomics , Molecular Structure
ABSTRACT
During drug discovery and development, the early identification of adverse effects is expected to reduce costly late-stage failures of candidate drugs. As risk/safety assessment takes place rather late during the development process and due to the limited ability of animal models to predict the human situation, modern unbiased high-dimensional biology readouts are sought, such as molecular signatures predictive for in vivo response using high-throughput cell-based assays. In this theoretical proof of concept, we provide findings of an in-depth exploration of a single chemical core structure. Via transcriptional profiling, we identified a subset of close analogues that commonly downregulate multiple tubulin genes across cellular contexts, suggesting possible spindle poison effects. Confirmation via a qualified toxicity assay (in vitro micronucleus test) and the identification of a characteristic aggregate-formation phenotype via exploratory high-content imaging validated the initial findings. SAR analysis triggered the synthesis of a new set of compounds and allowed us to extend the series showing the genotoxic effect. We demonstrate the potential to flag toxicity issues by utilizing data from exploratory experiments that are typically generated for target evaluation purposes during early drug discovery. We share our thoughts on how this approach may be incorporated into drug development strategies.
Subjects
Drug Discovery , Gene Expression Profiling , Animals , Tumor Cell Line , HEK293 Cells , Humans , Confocal Microscopy , Phosphodiesterase Inhibitors/chemistry , Phosphodiesterase Inhibitors/metabolism , Phosphodiesterase Inhibitors/toxicity , Phosphoric Diester Hydrolases/chemistry , Phosphoric Diester Hydrolases/metabolism , Pyrrolidines/chemistry , Pyrrolidines/metabolism , Pyrrolidines/toxicity , Structure-Activity Relationship , Transcriptome/drug effects , Tubulin/metabolism
ABSTRACT
Illumina bead arrays are microarrays that contain a random number of technical replicates (beads) for every probe (bead type) within the same array. Typically around 30 beads are placed at random positions on the array surface, which opens unique opportunities for quality control. Most preprocessing methods for Illumina bead arrays are ported from the Affymetrix microarray platform and ignore the availability of the technical replicates. The many beads for a particular bead type on the same array, however, should be highly correlated; otherwise they measure only noise and can be removed from the downstream analysis. Hence, filtering bead types can be considered an important step of the preprocessing procedure for the Illumina platform. This paper proposes a filtering method for Illumina bead arrays, which builds upon the mixed model framework. Bead types are called informative/non-informative (I/NI) based on a trade-off between within-array and between-array variabilities. The method is illustrated on a publicly available Illumina Spike-in data set (Dunning et al., 2008), and we also show that filtering results in a more powerful analysis of differentially expressed genes.
Subjects
Gene Expression Profiling/methods , Statistical Models , Oligonucleotide Array Sequence Analysis , Computational Biology/methods , High-Throughput Nucleotide Sequencing
ABSTRACT
Cost-effective oligonucleotide genotyping arrays like the Affymetrix SNP 6.0 are still the predominant technique to measure DNA copy number variations (CNVs). However, CNV detection methods for microarrays overestimate both the number and the size of CNV regions and, consequently, suffer from a high false discovery rate (FDR). A high FDR means that many CNVs are wrongly detected and therefore not associated with a disease in a clinical study, though correction for multiple testing takes them into account and thereby decreases the study's discovery power. For controlling the FDR, we propose a probabilistic latent variable model, 'cn.FARMS', which is optimized by a Bayesian maximum a posteriori approach. cn.FARMS controls the FDR through the information gain of the posterior over the prior. The prior represents the null hypothesis of copy number 2 for all samples, from which the posterior can only deviate by strong and consistent signals in the data. On HapMap data, cn.FARMS clearly outperformed the two most prevalent methods with respect to sensitivity and FDR. The software cn.FARMS is publicly available as an R package at http://www.bioinf.jku.at/software/cnfarms/cnfarms.html.
Subjects
DNA Copy Number Variations , Statistical Models , Oligonucleotide Array Sequence Analysis/methods , Software , Algorithms , Alleles , Computational Biology , Single Nucleotide Polymorphism
ABSTRACT
In this article, we discuss methods to select three different types of genes (treatment related, response related, or both) and investigate whether they can serve as biomarkers for a binary outcome variable. We consider an extension of the joint model introduced by Lin et al. (2010) and Tilahun et al. (2010) for a continuous response. As the model has certain drawbacks in a binary setting, we also present a way to use classical selection methods to identify subgroups of genes that are treatment related and/or response related. We evaluate their potential to serve as biomarkers by applying diagonal linear discriminant analysis (DLDA) to predict the response level.
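For readers unfamiliar with DLDA, the following minimal sketch shows the classification rule in its textbook form: each class is summarized by its per-gene means, all classes share one pooled variance per gene, and a new sample is assigned to the class with the smallest variance-scaled squared distance. It assumes equal class priors and nonzero variances; the function names and toy data are hypothetical and not taken from the article.

```python
from statistics import mean

def dlda_fit(samples, labels):
    """Return per-class gene means and pooled within-class variances per gene."""
    classes = sorted(set(labels))
    n_genes = len(samples[0])
    means = {
        c: [mean(x[j] for x, y in zip(samples, labels) if y == c)
            for j in range(n_genes)]
        for c in classes
    }
    dof = len(samples) - len(classes)  # pooled degrees of freedom
    variances = [
        sum((x[j] - means[y][j]) ** 2 for x, y in zip(samples, labels)) / dof
        for j in range(n_genes)
    ]
    return means, variances

def dlda_predict(x, means, variances):
    """Assign x to the class with the smallest diagonal (variance-scaled) distance."""
    def score(c):
        return sum((xj - mj) ** 2 / vj
                   for xj, mj, vj in zip(x, means[c], variances))
    return min(means, key=score)

# Toy data (hypothetical): two genes, three samples per response class.
samples = [[1.0, 5.0], [1.2, 5.5], [0.8, 4.5],
           [3.0, 5.2], [3.2, 4.8], [2.8, 5.0]]
labels = [0, 0, 0, 1, 1, 1]
means, variances = dlda_fit(samples, labels)
```

Because the covariance matrix is restricted to its diagonal, the rule stays usable when the number of genes far exceeds the number of samples, which is the usual reason for choosing it in microarray studies.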
Subjects
Drug Discovery/methods , Genetic Markers/genetics , Genomics/methods , Oligonucleotide Array Sequence Analysis/methods , Animals , Biomarkers , Humans , Time Factors , Treatment Outcome
ABSTRACT
MOTIVATION: Biclustering of transcriptomic data groups genes and samples simultaneously. It is emerging as a standard tool for extracting knowledge from gene expression measurements. We propose a novel generative approach for biclustering called 'FABIA: Factor Analysis for Bicluster Acquisition'. FABIA is based on a multiplicative model, which accounts for linear dependencies between gene expression and conditions, and also captures heavy-tailed distributions as observed in real-world transcriptomic data. The generative framework allows the use of well-founded model selection methods and the application of Bayesian techniques. RESULTS: On 100 simulated datasets with known true, artificially implanted biclusters, FABIA clearly outperformed all 11 competitors. On these datasets, FABIA was able to separate spurious biclusters from true biclusters by ranking biclusters according to their information content. FABIA was tested on three microarray datasets with known subclusters, where it was the best method twice and the second best once among the compared biclustering approaches. AVAILABILITY: FABIA is available as an R package on Bioconductor (http://www.bioconductor.org). All datasets, results and software are available at http://www.bioinf.jku.at/software/fabia/fabia.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Gene Expression Profiling/methods , Software , Algorithms , Factor Analysis , Gene Expression , Oligonucleotide Array Sequence Analysis/methods , Automated Pattern Recognition
ABSTRACT
The strength and weakness of microarray technology can be attributed to the enormous amount of information it generates. To fully realize the benefit of microarray technology for testing differentially expressed genes and for classification, there is a need to minimize the number of irrelevant genes present in microarray data. A major interest is to use probe-level data to call genes informative or noninformative based on the trade-off between the array-to-array variability and the measurement error. Existing works in this direction include filtering likely uninformative sets of hybridization (FLUSH; Calza et al., 2007) and I/NI calls for the exclusion of noninformative genes using FARMS (I/NI calls; Talloen et al., 2007; Hochreiter et al., 2006). In this paper, we propose a linear mixed model as a more flexible method that performs as well as I/NI calls and outperforms FLUSH. We also introduce other criteria for gene filtering, such as R² and the intra-cluster correlation. Additionally, we include some objective criteria based on likelihood ratio testing, the Akaike information criterion (AIC; Akaike, 1973) and the Bayesian information criterion (BIC; Schwarz, 1978). Based on the HGU-133A Spiked-in data set, it is shown that the linear mixed model approach outperforms FLUSH, a method that filters genes based on a quantile regression. The linear model is equivalent to a factor analysis model when either the factor loadings are set to a constant with the variance of the latent factor equal to one, or the factor loadings are set to one together with an unconstrained variance of the latent factor. Filtering based on conditional variance calls a probe set informative when the intensity of one or more probes is consistent across the arrays, while filtering using R² or the intra-cluster correlation calls a probe set informative only when the average intensity of a probe set is consistent across the arrays. Filtering based on the likelihood ratio test, AIC, and BIC is less stringent than filtering based on the other criteria.
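The intra-cluster correlation criterion can be sketched with a simple moment estimator. This is not the fitted linear mixed model of the paper: it assumes a balanced layout in which each array contributes the same number of probe intensities for one probe set, assumes probe-specific effects have already been removed, and uses the classical one-way ANOVA estimator. The function name and toy data are hypothetical.

```python
from statistics import mean

def intra_cluster_correlation(groups):
    """ANOVA (method-of-moments) estimate of the ICC for balanced groups.

    groups: list of equally sized lists; each inner list holds the probe
    intensities of one probe set on one array (hypothetical layout).
    """
    k = len(groups)        # number of arrays
    n = len(groups[0])     # probes per array (balanced design assumed)
    grand = mean(v for g in groups for v in g)
    group_means = [mean(g) for g in groups]
    # Between-array and within-array mean squares.
    ms_between = n * sum((m - grand) ** 2 for m in group_means) / (k - 1)
    ms_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means)
                    for v in g) / (k * (n - 1))
    var_between = max((ms_between - ms_within) / n, 0.0)  # truncate at zero
    total = var_between + ms_within
    return var_between / total if total > 0 else 0.0

# Toy probe sets (hypothetical): three arrays, three probes each.
informative = [[1.0, 1.2, 0.8], [3.0, 3.1, 2.9], [5.0, 5.2, 4.8]]
noise_only = [[1.0, 3.0, 5.0], [5.0, 1.0, 3.0], [3.0, 5.0, 1.0]]
icc_informative = intra_cluster_correlation(informative)
icc_noise = intra_cluster_correlation(noise_only)
```

A probe set whose probes agree within each array but differ between arrays gives an ICC close to one, whereas a probe set dominated by measurement error gives an ICC near zero; a filter would keep the former and drop the latter, with the threshold left to the analyst.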
Subjects
Gene Expression , Genetic Models , Statistical Models , Bayes Theorem , Biostatistics , Genetic Databases , Gene Expression Profiling/statistics & numerical data , Likelihood Functions , Linear Models , Molecular Probe Techniques/statistics & numerical data , Oligonucleotide Array Sequence Analysis/statistics & numerical data
ABSTRACT
Phenomic profiles are high-dimensional sets of readouts that can comprehensively capture the biological impact of chemical and genetic perturbations in cellular assay systems. Phenomic profiling of compound libraries can be used for compound target identification or mechanism of action (MoA) prediction and other applications in drug discovery. To devise an economical set of phenomic profiling assays, we assembled a library of 1,008 approved drugs and well-characterized tool compounds manually annotated to 218 unique MoAs, and we profiled each compound at four concentrations in live-cell, high-content imaging screens against a panel of 15 reporter cell lines, which expressed a diverse set of fluorescent organelle and pathway markers in three distinct cell lineages. For 41 of 83 testable MoAs, phenomic profiles accurately ranked the reference compounds (AUC-ROC ≥ 0.9). MoAs could be better resolved by screening compounds at multiple concentrations than by including replicates at a single concentration. Screening additional cell lineages and fluorescent markers increased the number of distinguishable MoAs but this effect quickly plateaued. There remains a substantial number of MoAs that were hard to distinguish from others under the current study's conditions. We discuss ways to close this gap, which will inform the design of future phenomic profiling efforts.
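The AUC-ROC criterion used to judge how well reference compounds are ranked can be computed directly from scores and labels. The sketch below uses the Mann-Whitney formulation (the probability that a randomly chosen positive outranks a randomly chosen negative, with ties counted as one half). The scores, for example profile similarities to a reference mechanism, and the labels are hypothetical.

```python
def auc_roc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # A tie between a positive and a negative counts as half a win.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Toy example (hypothetical): label 1 marks compounds sharing the query MoA.
scores = [0.9, 0.8, 0.7, 0.3, 0.2]
labels = [1, 1, 0, 1, 0]
auc = auc_roc(scores, labels)
```

An AUC of 1 means every reference compound with the mechanism is ranked above every other compound, and 0.5 corresponds to a random ranking, which puts the abstract's threshold of 0.9 into context.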
Subjects
Biological Products/pharmacology , Luminescent Proteins/genetics , Phenomics/methods , Small Molecule Libraries/pharmacology , A549 Cells , Cell Line , Drug Discovery , Gene Expression Regulation/drug effects , Hep G2 Cells , Humans , Luminescent Proteins/metabolism
ABSTRACT
SNP arrays offer the opportunity to get a genome-wide view of copy number alterations and are increasingly used in oncology. DNA from formalin-fixed paraffin-embedded (FFPE) material is partially degraded, which limits the application of those technologies for retrospective studies. We present the use of Affymetrix GeneChip SNP6.0 for identification of copy number alterations in fresh frozen (FF) and matched FFPE samples. Fifteen pairs of adenocarcinomas with both frozen and FFPE material were analyzed. We present an optimization of the sample preparation and show the importance of correcting the measured intensities for fragment length and GC content when using FFPE samples. The absence of GC-content correction results in a chromosome-specific "wave pattern" which may lead to the misclassification of genomic regions as being altered. The highest concordance between FFPE and matched FF was found in samples with the highest call rates. Nineteen of the 23 high-level amplifications (83%) seen using FF samples were also detected in the corresponding FFPE material. To limit the rate of "false positive" alterations, we have chosen a conservative False Discovery Rate (FDR). We observed better results using SNP probes than CNV probes for copy number analysis of FFPE material. This is the first report on the detection of copy number alterations in FFPE samples using Affymetrix GeneChip SNP6.0.
Subjects
Gene Dosage , Human Genome , Oligonucleotide Array Sequence Analysis/methods , Single Nucleotide Polymorphism , Neoplasm DNA/analysis , Formaldehyde/chemistry , Humans , Paraffin Embedding/methods
ABSTRACT
Most candidate drugs currently fail later-stage clinical trials, largely due to poor prediction of efficacy on early target selection [1]. Drug targets with genetic support are more likely to be therapeutically valid [2,3], but the translational use of genome-scale data such as from genome-wide association studies for drug target discovery in complex diseases remains challenging [4-6]. Here, we show that integration of functional genomic and immune-related annotations, together with knowledge of network connectivity, maximizes the informativeness of genetics for target validation, defining the target prioritization landscape for 30 immune traits at the gene and pathway level. We demonstrate how our genetics-led drug target prioritization approach (the priority index) successfully identifies current therapeutics, predicts activity in high-throughput cellular screens (including L1000, CRISPR, mutagenesis and patient-derived cell assays), enables prioritization of under-explored targets and allows for determination of target-level trait relationships. The priority index is an open-access, scalable system accelerating early-stage drug target selection for immune-mediated disease.
Subjects
Rheumatoid Arthritis/genetics , Drug Discovery , Gene Regulatory Networks , Human Genome , Innate Immunity/genetics , Quantitative Trait Loci , Genetic Selection , Rheumatoid Arthritis/drug therapy , Rheumatoid Arthritis/immunology , Gene Expression Regulation , Genome-Wide Association Study , Humans , Single Nucleotide Polymorphism
ABSTRACT
BACKGROUND & AIMS: Irritable bowel syndrome (IBS) has been associated with mucosal dysfunction, mild inflammation, and altered colonic bacteria. We used microarray expression profiling of sigmoid colon mucosa to assess whether there are stably expressed sets of genes that suggest there are objective molecular biomarkers associated with IBS. METHODS: Gene expression profiling was performed using Human Genome U133 Plus 2.0 (Affymetrix) GeneChips with RNA from sigmoid colon mucosal biopsy specimens from 36 IBS patients and 25 healthy control subjects. Real-time quantitative polymerase chain reaction was used to confirm the data in 12 genes of interest. Statistical methods for microarray data were applied to search for differentially expressed genes, and to assess the stability of molecular signatures in IBS patients. RESULTS: Mucosal gene expression profiles were consistent across different sites within the sigmoid colon and were stable on repeat biopsy over approximately 3 months. Differentially expressed genes suggest functional alterations of several components of the host mucosal immune response to microbial pathogens. The most strikingly increased expression involved a yet uncharacterized gene, DKFZP564O0823. Identified specific genes suggest the hypothesis that molecular signatures may enable distinction of a subset of IBS patients from healthy controls. By using 75% of the biopsy specimens as a validation set to develop a gene profile, the test set (25%) was predicted correctly with approximately 70% accuracy. CONCLUSIONS: Mucosal gene expression analysis shows there are relatively stable alterations in colonic mucosal immunity in IBS. These molecular alterations provide the basis to test the hypothesis that objective biomarkers may be identified in IBS and enhance understanding of the disease.
Subjects
Colon/immunology , Mucosal Immunity/genetics , Intestinal Mucosa/immunology , Irritable Bowel Syndrome/immunology , Adolescent , Adult , Aged , Female , Gene Expression Profiling , Gene Expression Regulation , Humans , Male , Middle Aged , Oligonucleotide Array Sequence Analysis/methods , RNA/genetics , RNA/isolation & purification , Messenger RNA/genetics , Messenger RNA/isolation & purification , Reverse Transcriptase Polymerase Chain Reaction/methods
ABSTRACT
Probe-level data from Affymetrix GeneChips can be summarized in many ways to produce probe-set level gene expression measures (GEMs). Disturbingly, the different approaches not only generate quite different measures but can also yield very different analysis results. Here, we explore the question of how much the analysis results really do differ, first at the gene level, then at the biological process level. We demonstrate that, even though the gene level results may not necessarily match each other particularly well, as long as there is reasonably strong differentiation between the groups in the data, the various GEMs do in fact produce results that are similar to one another at the biological process level. Moreover, the results are biologically relevant. As the extent of differentiation drops, the degree of concurrence weakens, although the biological relevance of findings at the biological process level may yet remain.
Subjects
Algorithms , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Reproducibility of Results , Sensitivity and Specificity
ABSTRACT
MOTIVATION: DNA microarray technology typically generates many measurements of which only a relatively small subset is informative for the interpretation of the experiment. To avoid false positive results, it is therefore critical to select the informative genes from the large noisy data before the actual analysis. Most currently available filtering techniques are supervised and therefore suffer from a potential risk of overfitting. The unsupervised filtering techniques, on the other hand, are either not very efficient or too stringent, as they may mix up signal with noise. We propose to use the multiple probes measuring the same target mRNA as repeated measures to quantify the signal-to-noise ratio of that specific probe set. A Bayesian factor analysis with specifically chosen prior settings, which models this probe-level information, provides an objective feature filtering technique, named informative/non-informative calls (I/NI calls). RESULTS: Based on 30 real-life data sets (including various human, rat, mouse and Arabidopsis studies) and a spiked-in data set, it is shown that I/NI calls is highly effective, with exclusion rates ranging from 70% to 99%. Consequently, it offers a critical solution to the curse of high dimensionality in the analysis of microarray data. AVAILABILITY: This filtering approach is publicly available as a function implemented in the R package FARMS (www.bioinf.jku.at/software/farms/farms.html).
Subjects
Algorithms , Cluster Analysis , Statistical Data Interpretation , Gene Expression Profiling/methods , Multigene Family/physiology , Oligonucleotide Array Sequence Analysis/methods , Software , Artificial Intelligence , Automated Pattern Recognition/methods
ABSTRACT
Dose-response studies are commonly used in experiments in pharmaceutical research in order to investigate the dependence of the response on dose, i.e., a trend of the response level toxicity with respect to dose. In this paper we focus on dose-response experiments within a microarray setting in which several microarrays are available for a sequence of increasing dose levels. A gene is called differentially expressed if there is a monotonic trend (with respect to dose) in the gene expression. We review several testing procedures which can be used in order to test equality among the gene expression means against ordered alternatives with respect to dose, namely Williams' (Williams 1971 and 1972), Marcus' (Marcus 1976), global likelihood ratio test (Bartholomew 1961, Barlow et al. 1972, and Robertson et al. 1988), and M (Hu et al. 2005) statistics. Additionally we introduce a modification to the standard error of the M statistic. We compare the performance of these five test statistics. Moreover, we discuss the issue of one-sided versus two-sided testing procedures. False Discovery Rate (Benjamini and Hochberg 1995, Ge et al. 2003), and resampling-based Familywise Error Rate (Westfall and Young 1993) are used to handle the multiple testing issue. The methods above are applied to a data set with 4 doses (3 arrays per dose) and 16,998 genes. Results on the number of significant genes from each statistic are discussed. A simulation study is conducted to investigate the power of each statistic. An R library IsoGene implementing the methods is available from the first author.
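Tests against ordered alternatives are commonly built on isotonic (order-restricted) estimates of the dose means. The sketch below shows the pool-adjacent-violators algorithm for a non-decreasing trend; it illustrates that building block only and does not reproduce any of the test statistics or the IsoGene implementation. The function name and the example means are hypothetical; the weights stand for the number of arrays per dose.

```python
def isotonic_means(means, weights):
    """Pool-adjacent-violators: weighted least-squares non-decreasing fit."""
    blocks = []  # each block: [pooled mean, total weight, number of doses]
    for m, w in zip(means, weights):
        blocks.append([m, w, 1])
        # Merge backwards while the monotone order is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            pooled = (m1 * w1 + m2 * w2) / (w1 + w2)
            blocks.append([pooled, w1 + w2, c1 + c2])
    fitted = []
    for m, _, c in blocks:
        fitted.extend([m] * c)  # every dose in a block gets the pooled mean
    return fitted

# Toy gene (hypothetical): 4 doses, 3 arrays per dose, one order violation.
fitted = isotonic_means([1.0, 3.0, 2.0, 4.0], [3, 3, 3, 3])
```

Here the means of the second and third dose violate the increasing order and are pooled to their weighted average, while the first and last dose keep their sample means. A decreasing trend can be handled by negating the means before and after the call.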
Subjects
Oligonucleotide Array Sequence Analysis/methods , Gene Library , Humans , Likelihood Functions , Psychological Tests , Reproducibility of Results
ABSTRACT
By adding biological information, beyond the chemical properties and desired effect of a compound, uncharted compound areas and connections can be explored. In this study, we add transcriptional information for 31K compounds of Janssen's primary screening deck, using the HT L1000 platform, and assess (a) the transcriptional connection score for generating compound similarities, (b) machine learning algorithms for generating target activity predictions, and (c) the scaffold hopping potential of the resulting hits. We demonstrate that the transcriptional connection score is best computed from the significant genes only and should be interpreted within its confidence interval, for which we provide the statistics. These guidelines help to reduce noise, increase reproducibility, and enable the separation of specific and promiscuous compounds. The added value of machine learning is demonstrated for the NR3C1 and HSP90 targets. Support Vector Machine models yielded balanced accuracy values ≥80% when the expression values from DDIT4 & SERPINE1 and TMEM97 & SPR were used to predict the NR3C1 and HSP90 activity, respectively. Combining both models resulted in 22 new and confirmed HSP90-independent NR3C1 inhibitors, providing two scaffolds (i.e., pyrimidine and pyrazolo-pyrimidine), which could potentially be of interest in the treatment of depression by inhibiting the glucocorticoid receptor (NR3C1) while leaving its chaperone, HSP90, unaffected. As such, the initial hit rate increased by a factor of 300, as less, but more specific, chemistry could be screened, based on the upfront computed activity predictions.
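Balanced accuracy, the metric quoted for the Support Vector Machine models, is the average of sensitivity and specificity. The minimal sketch below computes it from true and predicted binary labels; the function name and the toy labels are hypothetical and unrelated to the study's data.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = active)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)   # fraction of actives recovered
    specificity = tn / (tn + fp)   # fraction of inactives recognized
    return (sensitivity + specificity) / 2

# Toy screen (hypothetical): 2 actives among 10 compounds.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
score = balanced_accuracy(y_true, y_pred)
```

In this toy screen plain accuracy would be 0.8 although only half of the actives are found; balanced accuracy is 0.6875, which is why the metric is often preferred when actives are rare.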
Subjects
HSP90 Heat-Shock Proteins/genetics , High-Throughput Screening Assays , Pyrazoles/pharmacology , Pyrimidines/pharmacology , Glucocorticoid Receptors/genetics , Transcriptome , HSP90 Heat-Shock Proteins/metabolism , Humans , Glucocorticoid Receptors/metabolism , Support Vector Machine
ABSTRACT
Slc17a5-/- mice represent an animal model for the infantile form of sialic acid storage disease (SASD). We analyzed genetic and histological time-course expression of myelin and oligodendrocyte (OL) lineage markers in different parts of the CNS, and related this to postnatal neurobehavioral development in these mice. Sialin-deficient mice display a distinct spatiotemporal pattern of sialic acid storage, CNS hypomyelination and leukoencephalopathy. Whereas few genes are differentially expressed in the perinatal stage (p0), microarray analysis revealed increased differential gene expression in later postnatal stages (p10-p18). This included progressive upregulation of neuroinflammatory genes, as well as continuous down-regulation of genes that encode myelin constituents and typical OL lineage markers. Age-related histopathological analysis indicates that initial myelination occurs normally in hindbrain regions, but progression to more frontal areas is affected in Slc17a5-/- mice. This course of progressive leukoencephalopathy and CNS hypomyelination delays neurobehavioral development in sialin-deficient mice. Slc17a5-/- mice successfully achieve early neurobehavioral milestones, but exhibit progressive delay of later-stage sensory and motor milestones. The present findings may contribute to further understanding of the processes of CNS myelination as well as help to develop therapeutic strategies for SASD and other myelination disorders.
Subjects
Brain/pathology , Developmental Gene Expression Regulation/genetics , Leukoencephalopathies , Mental Disorders/etiology , Organic Anion Transporters/deficiency , Sialic Acid Storage Disease , Symporters/deficiency , Age Factors , Animals , Newborn Animals , Brain/metabolism , Developmental Disabilities/etiology , Developmental Disabilities/genetics , Animal Disease Models , Glial Fibrillary Acidic Protein/metabolism , Intermediate Filaments/metabolism , Leukoencephalopathies/complications , Leukoencephalopathies/etiology , Leukoencephalopathies/genetics , Lysosomal-Associated Membrane Protein 1/metabolism , Mice , Inbred C57BL Mice , Transgenic Mice , Organic Anion Transporters/genetics , Sialic Acid Storage Disease/complications , Sialic Acid Storage Disease/genetics , Sialic Acid Storage Disease/pathology , Symporters/genetics
ABSTRACT
Vagal afferent neurons are thought to convey primarily physiological information, whereas spinal afferents transmit noxious signals from the viscera to the central nervous system. To elucidate molecular identities for these different properties, we compared gene expression profiles of neurons located in nodose ganglia (NG) and dorsal root ganglia (DRG) in mice. Intraperitoneal administration of Alexa Fluor-488-conjugated cholera toxin B allowed enrichment for neurons projecting to the viscera. Fluorescent neurons in DRG (from T10 to T13) and NG were isolated using laser-capture microdissection. Gene expression profiles of these afferent neurons, obtained by microarray hybridization, were analyzed using multivariate spectral map analysis, the significance analysis of microarrays (SAM) algorithm, and fold-difference filtering. A total of 1,996 genes were differentially expressed in DRG vs. NG, including 41 G protein-coupled receptors and 60 ion channels. Expression profiles obtained on laser-captured neurons were contrasted to those obtained on whole ganglia, demonstrating striking differences and the need for microdissection when studying visceral sensory neurons because of dilution of the signal by somatic sensory neurons. Furthermore, we provide a detailed catalog of all adrenergic and cholinergic, GABA, glutamate, serotonin, and dopamine receptors; voltage-gated potassium, sodium, and calcium channels; and transient receptor potential cation channels present in afferents projecting to the peritoneal cavity. Our genome-wide expression profiling data provide novel insight into molecular signatures that underlie both functional differences and similarities between NG and DRG sensory neurons. Moreover, these findings will offer novel insight into the mode of action of pharmacological agents modulating visceral sensation.
Subjects
Spinal Ganglia/metabolism , Gene Expression Profiling/methods , Neurons/metabolism , Nodose Ganglion/metabolism , Peritoneal Cavity/microbiology , Animals , Female , Sensory Ganglia , Mice , Inbred BALB C Mice , Peritoneal Cavity/cytology , Signal Transduction
ABSTRACT
When small biological samples are collected by microdissection or other methods, amplification techniques are required to provide sufficient target for hybridization to expression arrays. One such technique is to perform two successive rounds of T7-based in vitro transcription. However, the use of random primers, required to regenerate cDNA from the first round of transcription, results in shortened copies of cDNA from which the 5' end is missing. In this paper we describe an experiment designed to compare the quality of data obtained from labeling small RNA samples using the Affymetrix Two-Cycle Eukaryotic Target Labeling procedure to that of data obtained using the One-Cycle Eukaryotic Target Labeling protocol. We utilized different preprocessing algorithms to compare the data generated using both labeling methods and present a new algorithm that improves upon existing ones in this setting.
Subjects
Nucleic Acid Amplification Techniques/methods , Oligonucleotide Array Sequence Analysis/methods , RNA/analysis , Algorithms , Complementary DNA/genetics , DNA-Directed RNA Polymerases , Microdissection , Nucleic Acid Hybridization , RNA/genetics , Genetic Transcription , Viral Proteins
ABSTRACT
The NIH-funded LINCS program has been initiated to generate a library of integrated, network-based, cellular signatures (LINCS). A novel high-throughput gene-expression profiling assay known as L1000 was the main technology used to generate more than a million transcriptional profiles. The profiles are based on the treatment of 14 cell lines with one of many perturbation agents of interest at a single concentration for durations of 6 and 24 hours. In this study, we focus on the chemical compound treatments within the LINCS data set. The experimental variables available include number of replicates, cell lines, and time points. Our study reveals that compound characterization based on three cell lines at two time points results in more genes being affected than six cell lines at a single time point. Based on the available LINCS data, we conclude that the optimal experimental design to characterize a large set of compounds is to test them in duplicate in three different cell lines. Our conclusions are constrained by the fact that the compounds were profiled at a single, relatively high concentration, and the longer time point is likely to result in phenotypic rather than mechanistic effects being recorded.