1.
Nat Commun ; 15(1): 3745, 2024 May 03.
Article in English | MEDLINE | ID: mdl-38702304

ABSTRACT

Early childhood tumours arise from transformed embryonic cells, which often carry large copy number alterations (CNA). However, it remains unclear how CNAs contribute to embryonic tumourigenesis due to a lack of suitable models. Here we employ female human embryonic stem cell (hESC) differentiation and single-cell transcriptome and epigenome analysis to assess the effects of chromosome 17q/1q gains, which are prevalent in the embryonal tumour neuroblastoma (NB). We show that CNAs impair the specification of trunk neural crest (NC) cells and their sympathoadrenal derivatives, the putative cells-of-origin of NB. This effect is exacerbated upon overexpression of MYCN, whose amplification co-occurs with CNAs in NB. Moreover, CNAs potentiate the pro-tumourigenic effects of MYCN and mutant NC cells resemble NB cells in tumours. These changes correlate with a stepwise aberration of developmental transcription factor networks. Together, our results sketch a mechanistic framework for the CNA-driven initiation of embryonal tumours.


Subject(s)
Cell Differentiation; DNA Copy Number Variations; N-Myc Proto-Oncogene Protein; Neural Crest; Neuroblastoma; Humans; Neuroblastoma/genetics; Neuroblastoma/pathology; Neural Crest/metabolism; Neural Crest/pathology; Female; N-Myc Proto-Oncogene Protein/genetics; N-Myc Proto-Oncogene Protein/metabolism; Chromosome Aberrations; Human Embryonic Stem Cells/metabolism; Transcriptome; Cell Line, Tumor; Gene Expression Regulation, Neoplastic
2.
Blood ; 142(9): 827-845, 2023 08 31.
Article in English | MEDLINE | ID: mdl-37249233

ABSTRACT

The nuclear factor of activated T cells (NFAT) family of transcription factors plays central roles in adaptive immunity in murine models; however, their contribution to human immune homeostasis remains poorly defined. In a multigenerational pedigree, we identified 3 patients who carry germ line biallelic missense variants in NFATC1, presenting with recurrent infections, hypogammaglobulinemia, and decreased antibody responses. The compound heterozygous NFATC1 variants identified in these patients caused decreased stability and reduced the binding of DNA and interacting proteins. We observed defects in early activation and proliferation of T and B cells from these patients, amenable to rescue upon genetic reconstitution. Stimulation-induced early T-cell activation and proliferation responses were delayed but not lost, reaching that of healthy controls at day 7, indicative of an adaptive capacity of the cells. Assessment of the metabolic capacity of patient T cells revealed that NFATc1 dysfunction rendered T cells unable to engage in glycolysis after stimulation, although oxidative metabolic processes were intact. We hypothesized that NFATc1-mutant T cells could compensate for the energy deficit due to defective glycolysis by using enhanced lipid metabolism as an adaptation, leading to delayed, but not lost, activation responses. Indeed, we observed increased 13C-labeled palmitate incorporation into citrate, indicating higher fatty acid oxidation, and we demonstrated that metformin and rosiglitazone improved patient T-cell effector functions. Collectively, enabled by our molecular dissection of the consequences of loss-of-function NFATC1 mutations and extending the role of NFATc1 in human immunity beyond receptor signaling, we provide evidence of metabolic plasticity in the context of impaired glycolysis observed in patient T cells, alleviating delayed effector responses.


Subject(s)
NFATC Transcription Factors; T-Lymphocytes; Humans; Mice; Animals; T-Lymphocytes/metabolism; NFATC Transcription Factors/metabolism; CD8-Positive T-Lymphocytes; Glycolysis/genetics; Mutation
3.
Nat Commun ; 11(1): 5504, 2020 10 30.
Article in English | MEDLINE | ID: mdl-33127880

ABSTRACT

Single-cell RNA-sequencing (scRNA-Seq) is a compelling approach to directly and simultaneously measure cellular composition and state, which can otherwise only be estimated by applying deconvolution methods to bulk RNA-Seq estimates. However, it has not yet become a widely used tool in population-scale analyses, due to its prohibitively high cost. Here we show that given the same budget, the statistical power of cell-type-specific expression quantitative trait loci (eQTL) mapping can be increased through low-coverage per-cell sequencing of more samples rather than high-coverage sequencing of fewer samples. We use simulations starting from one of the largest available real single-cell RNA-Seq datasets, from 120 individuals, to also show that multiple experimental designs with different numbers of samples, cells per sample and reads per cell could have similar statistical power, and choosing an appropriate design can yield large cost savings, especially when multiplexed workflows are considered. Finally, we provide a practical approach for selecting cost-effective designs for maximizing cell-type-specific eQTL power, which is available in the form of a web tool.


Subject(s)
Quantitative Trait Loci/genetics; Sequence Analysis, RNA/methods; Single-Cell Analysis/methods; Base Sequence; Computational Biology; Gene Expression; Gene Expression Profiling/methods; Genomics; Humans
4.
Life Sci Alliance ; 3(11), 2020 11.
Article in English | MEDLINE | ID: mdl-32972997

ABSTRACT

Single-cell RNA-sequencing (scRNAseq) technologies are rapidly evolving. Although very informative, in standard scRNAseq experiments, the spatial organization of the cells in the tissue of origin is lost. Conversely, spatial RNA-seq technologies designed to maintain cell localization have limited throughput and gene coverage. Mapping scRNAseq to genes with spatial information increases coverage while providing spatial location. However, methods to perform such mapping have not yet been benchmarked. To fill this gap, we organized the DREAM Single-Cell Transcriptomics challenge focused on the spatial reconstruction of cells from the Drosophila embryo from scRNAseq data, leveraging, as a silver standard, genes with in situ hybridization data from the Berkeley Drosophila Transcription Network Project reference atlas. The 34 participating teams used diverse algorithms for gene selection and location prediction, while being able to correctly localize clusters of cells. Selection of predictor genes was essential for this task. Predictor genes showed a relatively high expression entropy, high spatial clustering and included prominent developmental genes such as gap and pair-rule genes and tissue markers. Application of the top 10 methods to a zebrafish embryo dataset yielded performance and statistical properties of the selected genes similar to those in the Drosophila data. This suggests that methods developed in this challenge are able to extract generalizable properties of genes that are useful to accurately reconstruct the spatial arrangement of cells in tissues.


Subject(s)
Computational Biology/methods; Gene Expression Profiling/methods; Single-Cell Analysis/methods; Spatial Analysis; Algorithms; Animals; Databases, Genetic; Drosophila/genetics; Forecasting/methods; Gene Expression Regulation, Developmental/genetics; Gene Regulatory Networks/genetics; Sequence Analysis, RNA/methods; Transcriptome/genetics; Zebrafish/genetics
5.
mSystems ; 5(3), 2020 Jun 02.
Article in English | MEDLINE | ID: mdl-32487739

ABSTRACT

Small noncoding RNAs (sRNAs) are key regulators of bacterial gene expression. Through complementary base pairing, sRNAs affect mRNA stability and translation efficiency. Here, we describe a network inference approach designed to identify sRNA-mediated regulation of transcript levels. We use existing transcriptional data sets and prior knowledge to infer sRNA regulons using our network inference tool, the Inferelator. This approach produces genome-wide gene regulatory networks that include contributions by both transcription factors and sRNAs. We show the benefits of estimating and incorporating sRNA activities into network inference pipelines using available experimental data. We also demonstrate how these estimated sRNA regulatory activities can be mined to identify the experimental conditions where sRNAs are most active. We uncover 45 novel experimentally supported sRNA-mRNA interactions in Escherichia coli, outperforming previous network-based efforts. Additionally, our pipeline complements sequence-based sRNA-mRNA interaction prediction methods by adding a data-driven filtering step. Finally, we show the general applicability of our approach by identifying 24 novel, experimentally supported, sRNA-mRNA interactions in Pseudomonas aeruginosa, Staphylococcus aureus, and Bacillus subtilis. Overall, our strategy generates novel insights into the functional context of sRNA regulation in multiple bacterial species. IMPORTANCE: Individual bacterial genomes can have dozens of small noncoding RNAs with largely unexplored regulatory functions. Although bacterial sRNAs influence a wide range of biological processes, including antibiotic resistance and pathogenicity, our current understanding of sRNA-mediated regulation is far from complete. Most of the available information is restricted to a few well-studied bacterial species; and even in those species, only partial sets of sRNA targets have been characterized in detail. To close this information gap, we developed a computational strategy that takes advantage of available transcriptional data and knowledge about validated and putative sRNA-mRNA interactions for inferring expanded sRNA regulons. Our approach facilitates the identification of experimentally supported novel interactions while filtering out false-positive results. Due to its data-driven nature, our method prioritizes biologically relevant interactions among lists of candidate sRNA-target pairs predicted in silico from sequence analysis or derived from sRNA-mRNA binding experiments.

6.
Genome Biol ; 20(1): 296, 2019 12 23.
Article in English | MEDLINE | ID: mdl-31870423

ABSTRACT

Single-cell RNA-seq (scRNA-seq) data exhibits significant cell-to-cell variation due to technical factors, including the number of molecules detected in each cell, which can confound biological heterogeneity with technical effects. To address this, we present a modeling framework for the normalization and variance stabilization of molecular count data from scRNA-seq experiments. We propose that the Pearson residuals from "regularized negative binomial regression," where cellular sequencing depth is utilized as a covariate in a generalized linear model, successfully remove the influence of technical characteristics from downstream analyses while preserving biological heterogeneity. Importantly, we show that an unconstrained negative binomial model may overfit scRNA-seq data, and overcome this by pooling information across genes with similar abundances to obtain stable parameter estimates. Our procedure omits the need for heuristic steps including pseudocount addition or log-transformation and improves common downstream analytical tasks such as variable gene selection, dimensional reduction, and differential expression. Our approach can be applied to any UMI-based scRNA-seq dataset and is freely available as part of the R package sctransform, with a direct interface to our single-cell toolkit Seurat.
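
As a rough illustration of the residual described in this abstract, the sketch below computes a Pearson residual for a single UMI count under a negative binomial model whose mean depends on sequencing depth. The function name, the log-linear form of the mean, and the parameters `beta0`, `beta1`, and `theta` are assumptions made for illustration. The abstract describes fitting the parameters and then stabilizing them by pooling information across genes, which this sketch does not attempt.

```python
import math


def nb_pearson_residual(count, depth, beta0, beta1, theta):
    """Pearson residual of one count under a negative binomial model.

    Assumed model (illustrative): log(mu) = beta0 + beta1 * log10(depth),
    with variance mu + mu^2 / theta. All parameters are supplied by the
    caller; nothing is fitted or regularized here.
    """
    mu = math.exp(beta0 + beta1 * math.log10(depth))
    variance = mu + mu * mu / theta
    return (count - mu) / math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to exercise the function.
    print(nb_pearson_residual(count=3, depth=1000, beta0=0.0, beta1=0.0, theta=1.0))
```

A residual near zero means the observed count is close to what sequencing depth alone would predict, so large residuals are the ones that may carry biological signal.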


Subject(s)
Sequence Analysis, RNA; Single-Cell Analysis; Software; Humans; Regression Analysis
7.
Cell ; 177(7): 1888-1902.e21, 2019 06 13.
Article in English | MEDLINE | ID: mdl-31178118

ABSTRACT

Single-cell transcriptomics has transformed our ability to characterize cell states, but deep biological understanding requires more than a taxonomic listing of clusters. As new methods arise to measure distinct cellular modalities, a key analytical challenge is to integrate these datasets to better understand cellular identity and function. Here, we develop a strategy to "anchor" diverse datasets together, enabling us to integrate single-cell measurements not only across scRNA-seq technologies, but also across different modalities. After demonstrating improvement over existing methods for integrating scRNA-seq data, we anchor scRNA-seq experiments with scATAC-seq to explore chromatin differences in closely related interneuron subsets and project protein expression measurements onto a bone marrow atlas to characterize lymphocyte populations. Lastly, we harmonize in situ gene expression and scRNA-seq datasets, allowing transcriptome-wide imputation of spatial gene expression patterns. Our work presents a strategy for the assembly of harmonized references and transfer of information across datasets.


Subject(s)
Databases, Nucleic Acid; Gene Expression Profiling; Sequence Analysis, RNA; Single-Cell Analysis; Software; Transcriptome; Humans
8.
Nature ; 555(7697): 457-462, 2018 03 22.
Article in English | MEDLINE | ID: mdl-29513653

ABSTRACT

Diverse subsets of cortical interneurons have vital roles in higher-order brain functions. To investigate how this diversity is generated, here we used single-cell RNA sequencing to profile the transcriptomes of mouse cells collected along a developmental time course. Heterogeneity within mitotic progenitors in the ganglionic eminences is driven by a highly conserved maturation trajectory, alongside eminence-specific transcription factor expression that seeds the emergence of later diversity. Upon becoming postmitotic, progenitors diverge and differentiate into transcriptionally distinct states, including an interneuron precursor state. By integrating datasets across developmental time points, we identified shared sources of transcriptomic heterogeneity between adult interneurons and their precursors, and uncovered the embryonic emergence of cardinal interneuron subtypes. Our analysis revealed that the transcription factor Mef2c, which is linked to various neuropsychiatric and neurodevelopmental disorders, delineates early precursors of parvalbumin-expressing neurons, and is essential for their development. These findings shed new light on the molecular diversification of early inhibitory precursors, and identify gene modules that may influence the specification of human interneuron subtypes.


Subject(s)
Cell Differentiation; Interneurons/cytology; Interneurons/physiology; Neural Inhibition; Visual Cortex/cytology; Animals; Cell Differentiation/genetics; Embryo, Mammalian/cytology; Female; Ganglia/cytology; Ganglia/metabolism; Gene Expression Profiling; Humans; MEF2 Transcription Factors/metabolism; Male; Mice; Mitosis/genetics; Parvalbumins/metabolism; RNA, Small Cytoplasmic/genetics; Single-Cell Analysis
9.
Nat Methods ; 14(9): 865-868, 2017 Sep.
Article in English | MEDLINE | ID: mdl-28759029

ABSTRACT

High-throughput single-cell RNA sequencing has transformed our understanding of complex cell populations, but it does not provide phenotypic information such as cell-surface protein levels. Here, we describe cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq), a method in which oligonucleotide-labeled antibodies are used to integrate cellular protein and transcriptome measurements into an efficient, single-cell readout. CITE-seq is compatible with existing single-cell sequencing approaches and scales readily with throughput increases.


Subject(s)
Epitope Mapping/methods; Epitopes/immunology; Gene Expression Profiling/methods; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, RNA/methods; Tissue Array Analysis/methods; Transcriptome/physiology
10.
Plant Cell ; 28(10): 2365-2384, 2016 10.
Article in English | MEDLINE | ID: mdl-27655842

ABSTRACT

Environmental gene regulatory influence networks (EGRINs) coordinate the timing and rate of gene expression in response to environmental signals. EGRINs encompass many layers of regulation, which culminate in changes in accumulated transcript levels. Here, we inferred EGRINs for the response of five tropical Asian rice (Oryza sativa) cultivars to high temperatures, water deficit, and agricultural field conditions by systematically integrating time-series transcriptome data, patterns of nucleosome-free chromatin, and the occurrence of known cis-regulatory elements. First, we identified 5447 putative target genes for 445 transcription factors (TFs) by connecting TFs with genes harboring known cis-regulatory motifs in nucleosome-free regions proximal to their transcriptional start sites. We then used network component analysis to estimate the regulatory activity for each TF based on the expression of its putative target genes. Finally, we inferred an EGRIN using the estimated transcription factor activity (TFA) as the regulator. The EGRINs include regulatory interactions between 4052 target genes regulated by 113 TFs. We resolved distinct regulatory roles for members of the heat shock factor family, including a putative regulatory connection between abiotic stress and the circadian clock. TFA estimation using network component analysis is an effective way of incorporating multiple genome-scale measurements into network inference.
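
The idea of estimating transcription factor activity from the expression of putative targets can be sketched as a least-squares problem: given a prior connectivity matrix P (genes x TFs) and expression X (genes x samples), solve X ≈ P · A for the activities A. This is a simplified stand-in and not the full network component analysis algorithm the abstract refers to. The function name and the toy matrices are hypothetical.

```python
def estimate_tfa(prior, expression):
    """Least-squares TF activities A such that expression ~ prior * A.

    prior:      list of rows, genes x TFs (nonzero where a TF is assumed
                to regulate a gene)
    expression: list of rows, genes x samples
    returns:    list of rows, TFs x samples
    """
    n_genes, n_tfs = len(prior), len(prior[0])
    n_samples = len(expression[0])
    # Normal equations: (P^T P) A = P^T X
    ptp = [[sum(prior[g][i] * prior[g][j] for g in range(n_genes))
            for j in range(n_tfs)] for i in range(n_tfs)]
    ptx = [[sum(prior[g][i] * expression[g][s] for g in range(n_genes))
            for s in range(n_samples)] for i in range(n_tfs)]
    # Gauss-Jordan elimination with partial pivoting on [P^T P | P^T X]
    aug = [ptp[i] + ptx[i] for i in range(n_tfs)]
    for col in range(n_tfs):
        pivot = max(range(col, n_tfs), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("prior matrix is rank deficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n_tfs):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n_tfs:] for row in aug]


if __name__ == "__main__":
    # Toy example: TF 0 targets genes 0 and 1, TF 1 targets gene 2.
    prior = [[1, 0], [1, 0], [0, 1]]
    expression = [[2, 4], [4, 6], [5, 1]]
    print(estimate_tfa(prior, expression))
```

In this toy case the activity of each TF is simply the mean expression of its targets in each sample. With overlapping target sets the least-squares solution apportions expression among the TFs that share a gene.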


Subject(s)
Oryza/metabolism; Plant Proteins/metabolism; Water/metabolism; Gene Expression Regulation, Plant/physiology; Temperature; Transcription Factors/metabolism
11.
Elife ; 4, 2015 Nov 26.
Article in English | MEDLINE | ID: mdl-26609814

ABSTRACT

Plants rely on transcriptional dynamics to respond to multiple climatic fluctuations and contexts in nature. We analyzed the genome-wide gene expression patterns of rice (Oryza sativa) growing in rainfed and irrigated fields during two distinct tropical seasons and determined simple linear models that relate transcriptomic variation to climatic fluctuations. These models combine multiple environmental parameters to account for patterns of expression in the field of co-expressed gene clusters. We examined the similarities of our environmental models between tropical and temperate field conditions, using previously published data. We found that field type and macroclimate had broad impacts on transcriptional responses to environmental fluctuations, especially for genes involved in photosynthesis and development. Nevertheless, variation in solar radiation and temperature at the timescale of hours had reproducible effects across environmental contexts. These results provide a basis for broad-based predictive modeling of plant gene expression in the field.


Subject(s)
Environmental Exposure; Gene Expression Regulation, Plant; Oryza/growth & development; Oryza/genetics; Climate; Soil/chemistry; Sunlight; Transcription, Genetic
12.
Mol Syst Biol ; 11(11): 839, 2015 Nov 17.
Article in English | MEDLINE | ID: mdl-26577401

ABSTRACT

Organisms from all domains of life use gene regulation networks to control cell growth, identity, function, and responses to environmental challenges. Although accurate global regulatory models would provide critical evolutionary and functional insights, they remain incomplete, even for the best studied organisms. Efforts to build comprehensive networks are confounded by challenges including network scale, degree of connectivity, complexity of organism-environment interactions, and difficulty of estimating the activity of regulatory factors. Taking advantage of the large number of known regulatory interactions in Bacillus subtilis and two transcriptomics datasets (including one with 38 separate experiments collected specifically for this study), we use a new combination of network component analysis and model selection to simultaneously estimate transcription factor activities and learn a substantially expanded transcriptional regulatory network for this bacterium. In total, we predict 2,258 novel regulatory interactions and recall 74% of the previously known interactions. We obtained experimental support for 391 (out of 635 evaluated) novel regulatory edges (62% accuracy), thus significantly increasing our understanding of various cell processes, such as spore formation.


Subject(s)
Bacillus subtilis/genetics; Gene Expression Regulation, Bacterial/genetics; Gene Regulatory Networks/genetics; Transcriptome/genetics; Databases, Genetic; Genes, Bacterial/genetics; Models, Genetic; Spores, Bacterial/genetics; Systems Biology
13.
Bioinformatics ; 31(4): 501-8, 2015 Feb 15.
Article in English | MEDLINE | ID: mdl-25150249

ABSTRACT

MOTIVATION: Experiments in animal models are often conducted to infer how humans will respond to stimuli by assuming that the same biological pathways will be affected in both organisms. The limitations of this assumption were tested in the IMPROVER Species Translation Challenge, where 52 stimuli were applied to both human and rat cells and perturbed pathways were identified. In the Inter-species Pathway Perturbation Prediction sub-challenge, multiple teams proposed methods to use rat transcription data from 26 stimuli to predict human gene set and pathway activity under the same perturbations. Submissions were evaluated using three performance metrics on data from the remaining 26 stimuli. RESULTS: We present two approaches, ranked second in this challenge, that do not rely on sequence-based orthology between rat and human genes to translate pathway perturbation state but instead identify transcriptional response orthologs across a set of training conditions. The translation from rat to human accomplished by these so-called direct methods is not dependent on the particular analysis method used to identify perturbed gene sets. In contrast, machine learning-based methods require performing a pathway analysis initially and then mapping the pathway activity between organisms. Unlike most machine learning approaches, direct methods can be used to predict the activation of a human pathway for a new (test) stimulus, even when that pathway was never activated by a training stimulus. AVAILABILITY: Gene expression data are available from ArrayExpress (accession E-MTAB-2091), while software implementations are available from http://bioinformaticsprb.med.wayne.edu?p=50 and http://goo.gl/hJny3h. CONTACT: christoph.hafemeister@nyu.edu or atarca@med.wayne.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms; Artificial Intelligence; Cytokines/metabolism; Gene Expression Profiling/methods; Phosphoproteins/metabolism; Software; Systems Biology/methods; Animals; Bronchi/cytology; Bronchi/metabolism; Cells, Cultured; Databases, Factual; Epithelial Cells/cytology; Epithelial Cells/metabolism; Gene Expression Regulation; Humans; Models, Animal; Oligonucleotide Array Sequence Analysis; Phosphorylation; Rats; Signal Transduction; Species Specificity; Translational Research, Biomedical
14.
IEEE Trans Vis Comput Graph ; 20(12): 1903-12, 2014 Dec.
Article in English | MEDLINE | ID: mdl-26356904

ABSTRACT

Elucidation of transcriptional regulatory networks (TRNs) is a fundamental goal in biology, and among the most important components of TRNs are transcription factors (TFs), proteins that specifically bind to gene promoter and enhancer regions to alter target gene expression patterns. Advances in genomic technologies as well as advances in computational biology have led to multiple large regulatory network models (directed networks), each with a large corpus of supporting data and gene annotation. There are multiple possible biological motivations for exploring large regulatory network models, including: validating TF-target gene relationships, figuring out co-regulation patterns, and exploring the coordination of cell processes in response to changes in cell state or environment. Here we focus on queries aimed at validating regulatory network models, and on coordinating visualization of primary data and directed weighted gene regulatory networks. The large size of both the network models and the primary data can make such coordinated queries cumbersome with existing tools and, in particular, inhibits the sharing of results between collaborators. In this work, we develop and demonstrate a web-based framework for coordinating visualization and exploration of expression data (RNA-seq, microarray), network models and gene-binding data (ChIP-seq). Using specialized data structures and multiple coordinated views, we design an efficient querying model to support interactive analysis of the data. Finally, we show the effectiveness of our framework through case studies for the mouse immune system (a dataset focused on a subset of key cellular functions) and a model bacterium (a small genome with high data-completeness).


Subject(s)
Computational Biology/methods; Computer Graphics; Databases, Genetic; Gene Regulatory Networks; Internet; Animals; Mice; Transcription Factors; User-Computer Interface
15.
Bioinformatics ; 29(8): 1060-7, 2013 Apr 15.
Article in English | MEDLINE | ID: mdl-23525069

ABSTRACT

MOTIVATION: Inferring global regulatory networks (GRNs) from genome-wide data is a computational challenge central to the field of systems biology. Although the primary data currently used to infer GRNs consist of gene expression and proteomics measurements, there is a growing abundance of alternate data types that can reveal regulatory interactions, e.g. ChIP-Chip, literature-derived interactions, protein-protein interactions. GRN inference requires the development of integrative methods capable of using these alternate data as priors on the GRN structure. Each source of structure priors has its unique biases and inherent potential errors; thus, GRN methods using these data must be robust to noisy inputs. RESULTS: We developed two methods for incorporating structure priors into GRN inference. Both methods [Modified Elastic Net (MEN) and Bayesian Best Subset Regression (BBSR)] extend the previously described Inferelator framework, enabling the use of prior information. We test our methods on one synthetic and two bacterial datasets, and show that both MEN and BBSR infer accurate GRNs even when the structure prior used has significant amounts of error (>90% erroneous interactions). We find that BBSR outperforms MEN at inferring GRNs from expression data and noisy structure priors. AVAILABILITY AND IMPLEMENTATION: Code, datasets and networks presented in this article are available at http://bonneaulab.bio.nyu.edu/software.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Gene Regulatory Networks; Algorithms; Bacillus subtilis/genetics; Bacillus subtilis/metabolism; Bayes Theorem; Escherichia coli/genetics; Escherichia coli/metabolism; Gene Expression; Models, Genetic; Regression Analysis; Systems Biology/methods; Transcription Factors/metabolism
16.
PLoS One ; 7(10): e46965, 2012.
Article in English | MEDLINE | ID: mdl-23056544

ABSTRACT

Post-transcriptional regulation of gene expression contributes to the protein output of a cell; however, methods for measuring translational regulation in complex in vivo systems are lacking. Here, we describe a sensitive method for measuring translational regulation in defined cell populations from heterogeneous tissue in vivo. We adapted the translating ribosome affinity purification (TRAP) methodology to measure the relative occupancy of individual mRNA transcripts in translating ribosomes in the Olig2-positive tumor cell population in a genetically engineered mouse model (GEM) of glioma. Global measurement of paired ribosome-bound and total cellular mRNA populations from tumor cells in vivo identified a broad distribution of relative ribosome occupancies amongst mRNA species that was highly reproducible across biological samples. Comparison of the translation state of glioma cells to non-transformed oligodendrocyte progenitor cells in normal brain identified global alteration of translation in tumor, and specifically of genes involved in cell division and synthetic metabolism. Furthermore, investigation of alteration in steady state translational efficiencies upon loss of PTEN, one of the most frequently mutated and deleted tumor suppressors in glioma, identified differential translation of proteins involved in cellular respiration, canonically regulated by PI3K/Akt signaling, and cellular glycosylation profiles, deregulation of which is known to be associated with tumor progression. Application of the translation efficiency profiling method described here to other biological contexts and conditions would extend our knowledge of the scope and impact of this important mode of gene regulation in complex in vivo systems.


Subject(s)
Gene Expression Profiling; Glioma/genetics; Protein Biosynthesis/genetics; Animals; Cell Respiration/genetics; Cell Transformation, Neoplastic/genetics; Female; Gene Deletion; Glioma/metabolism; Glioma/pathology; Male; Mice; PTEN Phosphohydrolase/deficiency; PTEN Phosphohydrolase/genetics; Phosphatidylinositol 3-Kinases/metabolism; RNA, Messenger/genetics; RNA, Messenger/metabolism; RNA, Ribosomal/genetics; Signal Transduction/genetics
17.
Article in English | MEDLINE | ID: mdl-21358006

ABSTRACT

For designing oligonucleotide tiling arrays, popular current methods still rely on simple criteria like Hamming distance or longest common factors, neglecting base stacking effects which strongly contribute to binding energies. Consequently, probes are often prone to cross-hybridization, which reduces the signal-to-noise ratio and complicates downstream analysis. We propose the first computationally efficient method using hybridization energy to identify specific oligonucleotide probes. Our Cross-Hybridization Potential (CHP) is computed with a Nearest Neighbor Alignment, which efficiently estimates a lower bound for the Gibbs free energy of the duplex formed by two DNA sequences of bounded length. It is derived from our simplified reformulation of t-gap insertion-deletion-like metrics. The computations are accelerated by a filter using weighted ungapped q-grams to arrive at seeds. The computation of the CHP is implemented in our software OSProbes, available under the GPL, which computes sets of viable probe candidates. The user can choose a trade-off between running time and quality of probes selected. We obtain very favorable results in comparison with prior approaches with respect to specificity and sensitivity for cross-hybridization and genome coverage with high-specificity probes. The combination of OSProbes and our Tileomatic method, which computes optimal tiling paths from candidate sets, yields globally optimal tiling arrays, balancing probe distance, hybridization conditions, and uniqueness of hybridization.
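
For context, the two simple criteria that this abstract contrasts with energy-based scoring can be sketched in a few lines. The code below is not the CHP or the Nearest Neighbor Alignment. It only illustrates the baseline measures (Hamming distance and longest common factor, i.e. longest shared contiguous substring), and the function names and example sequences are hypothetical.

```python
def hamming_distance(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def longest_common_factor(a, b):
    """Length of the longest contiguous substring shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            if x == y:
                # Extend the common run ending at the previous characters.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


if __name__ == "__main__":
    print(hamming_distance("ACGTAC", "ACCTAG"))
    print(longest_common_factor("ACGTAC", "TTGTACG"))
```

Both measures treat every base and every position alike, which is the limitation the abstract points to: neither accounts for base stacking between neighboring bases.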


Subject(s)
Genome; Oligonucleotide Probes/chemistry; Base Sequence; DNA/chemistry; Gene Expression Profiling; Nucleic Acid Hybridization; Thermodynamics
18.
Bioinformatics ; 27(7): 946-52, 2011 Apr 01.
Article in English | MEDLINE | ID: mdl-21266444

ABSTRACT

MOTIVATION: Analyzing short time-courses is a frequent and relevant problem in molecular biology, as, for example, 90% of gene expression time-course experiments span at most nine time-points. The biological or clinical questions addressed are elucidating gene regulation by identification of co-expressed genes, predicting response to treatment in clinical, trial-like settings or classifying novel toxic compounds based on similarity of gene expression time-courses to those of known toxic compounds. The latter problem is characterized by irregular and infrequent sample times and a total lack of prior assumptions about the incoming query, which stands in stark contrast to clinical settings and requires implicitly performing a local, gapped alignment of time series. The current state-of-the-art method (SCOW) uses a variant of dynamic time warping and models time series as higher order polynomials (splines). RESULTS: We suggest modeling time-courses monitoring response to toxins by piecewise constant functions, which are modeled as left-right Hidden Markov Models. A Bayesian approach to parameter estimation and inference helps to cope with the short, but highly multivariate time-courses. We improve prediction accuracy by 7% and 4%, respectively, when classifying toxicology and stress response data. We also reduce running times by at least a factor of 140; note that reasonable running times are crucial when classifying response to toxins. In conclusion, we have demonstrated that appropriate reduction of model complexity can result in substantial improvements both in classification performance and running time. AVAILABILITY: A Python package implementing the methods described is freely available under the GPL from http://bioinformatics.rutgers.edu/Software/MVQueries/.
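
The left-right structure mentioned in this abstract can be illustrated by its transition matrix: each state either persists or hands over to the next state, never to an earlier one, so a path through the states traces a piecewise constant profile over time. The sketch below only builds such a matrix. The function name and the single shared `stay_prob` parameter are assumptions for illustration and are not taken from the package the abstract describes.

```python
def left_right_transitions(n_states, stay_prob):
    """Transition matrix of a left-right HMM.

    State i stays in place with probability stay_prob or advances to
    state i + 1 otherwise; the last state is absorbing.
    """
    if n_states < 1:
        raise ValueError("n_states must be at least 1")
    if not 0.0 <= stay_prob <= 1.0:
        raise ValueError("stay_prob must lie in [0, 1]")
    matrix = []
    for i in range(n_states):
        row = [0.0] * n_states
        if i == n_states - 1:
            row[i] = 1.0  # final segment: no further transitions
        else:
            row[i] = stay_prob
            row[i + 1] = 1.0 - stay_prob
        matrix.append(row)
    return matrix


if __name__ == "__main__":
    for row in left_right_transitions(3, 0.75):
        print(row)
```

Because all entries below the diagonal are zero, the number of free transition parameters grows linearly with the number of states, which fits the abstract's point about reducing model complexity for short time-courses.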


Subject(s)
Gene Expression Profiling/methods , Animals , Bayes Theorem , Classification , Gene Expression/drug effects , Kinetics , Mice , Toxins, Biological/pharmacology
19.
Bioinformatics ; 25(12): i6-14, 2009 Jun 15.
Article in English | MEDLINE | ID: mdl-19478017

ABSTRACT

MOTIVATION: Personalized medicine based on molecular aspects of diseases, such as gene expression profiling, has become increasingly popular. However, one faces multiple challenges when analyzing clinical gene expression data; most of the well-known theoretical issues, such as high-dimensional feature spaces versus few examples, noise and missing data, apply. Special care is needed when designing classification procedures that support personalized diagnosis and choice of treatment. Here, we focus on classifying interferon-beta (IFNbeta) treatment response in Multiple Sclerosis (MS) patients, which has attracted substantial attention in the recent past. Half of the patients remain unaffected by IFNbeta treatment, which is still the standard. For them, the treatment should be stopped promptly to mitigate the side effects. RESULTS: We propose constrained estimation of mixtures of hidden Markov models as a methodology to classify patient response to IFNbeta treatment. The advantages of our approach are that it takes the temporal nature of the data into account and that it is robust to noise, missing data and mislabeled samples. Moreover, mixture estimation makes it possible to explore the presence of response sub-groups of patients at the transcriptional level. We clearly outperformed all prior approaches in terms of prediction accuracy, raising it, for the first time, above 90%. Additionally, we were able to identify potentially mislabeled samples and to sub-divide the good responders into two sub-groups that exhibited different transcriptional response programs. This is supported by recent findings on MS pathology and therefore may raise interesting clinical follow-up questions. AVAILABILITY: The method is implemented in the GQL framework and is available at http://www.ghmm.org/gql. Datasets are available at http://www.cin.ufpe.br/~igcf/MSConst. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Computational Biology/methods , Gene Expression Profiling/methods , Classification/methods , Humans , Interferon-beta/chemistry , Interferon-beta/pharmacology , Markov Chains , Multiple Sclerosis/genetics , Multiple Sclerosis/metabolism
20.
Bioinformatics ; 24(13): i156-64, 2008 Jul 01.
Article in English | MEDLINE | ID: mdl-18586709

ABSTRACT

MOTIVATION: The regulation of proliferation and differentiation of embryonic and adult stem cells into mature cells is central to developmental biology. Gene expression measured in distinguishable developmental stages helps to elucidate underlying molecular processes. In previous work we showed that functional gene modules, which act distinctly in the course of development, can be represented by a mixture of trees. In general, the similarities in the gene expression programs of cell populations reflect the similarities in the differentiation path. RESULTS: We propose a novel model for gene expression profiles and an unsupervised learning method to estimate developmental similarity and infer differentiation pathways. We assess the performance of our model on simulated data and find that it compares favorably with related methods. We also infer differentiation pathways and predict functional modules in gene expression data of lymphoid development. CONCLUSIONS: We demonstrate for the first time how, in principle, the incorporation of structural knowledge about the dependence structure helps to reveal differentiation pathways and potentially relevant functional gene modules from microarray datasets. Our method applies in any area of developmental biology where it is possible to obtain cells of distinguishable differentiation stages. AVAILABILITY: The implementation of our method (GPL license), data and additional results are available at http://algorithmics.molgen.mpg.de/Supplements/InfDif/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , Cell Cycle Proteins/physiology , Cell Differentiation/physiology , Gene Expression Profiling/methods , Models, Biological , Signal Transduction/physiology , Computer Simulation
...