Results 1 - 20 of 21
1.
BMC Bioinformatics ; 20(1): 226, 2019 May 03.
Article in English | MEDLINE | ID: mdl-31053060

ABSTRACT

BACKGROUND: RNA sequencing (RNA-seq) has become the standard means of analyzing gene and transcript expression in high-throughput. While previously sequence alignment was a time demanding step, fast alignment methods and even more so transcript counting methods which avoid mapping and quantify gene and transcript expression by evaluating whether a read is compatible with a transcript, have led to significant speed-ups in data analysis. Now, the most time demanding step in the analysis of RNA-seq data is preprocessing the raw sequence data, such as running quality control and adapter, contamination and quality filtering before transcript or gene quantification. To do so, many researchers chain different tools, but a comprehensive, flexible and fast software that covers all preprocessing steps is currently missing. RESULTS: We here present FastqPuri, a light-weight and highly efficient preprocessing tool for fastq data. FastqPuri provides sequence quality reports on the sample and dataset level with new plots which facilitate decision making for subsequent quality filtering. Moreover, FastqPuri efficiently removes adapter sequences and sequences from biological contamination from the data. It accepts both single- and paired-end data in uncompressed or compressed fastq files. FastqPuri can be run stand-alone and is suitable to be run within pipelines. We benchmarked FastqPuri against existing tools and found that FastqPuri is superior in terms of speed, memory usage, versatility and comprehensiveness. CONCLUSIONS: FastqPuri is a new tool which covers all aspects of short read sequence data preprocessing. It was designed for RNA-seq data to meet the needs for fast preprocessing of fastq data to allow transcript and gene counting, but it is suitable to process any short read sequencing data of which high sequence quality is needed, such as for genome assembly or SNV (single nucleotide variant) detection. 
FastqPuri is most flexible in filtering undesired biological sequences by offering two approaches to optimize speed and memory usage dependent on the total size of the potential contaminating sequences. FastqPuri is available at https://github.com/jengelmann/FastqPuri . It is implemented in C and R and licensed under GPL v3.
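
The quality-filtering step described in this abstract can be illustrated with a minimal sketch. This is not FastqPuri's implementation (which is written in C and R); it only assumes plain 4-line FASTQ records with Phred+33 quality encoding. The function names and the thresholds `min_mean_q` and `max_n` are hypothetical choices made for illustration.

```python
from typing import Iterable, Iterator, List, Tuple

# (header, sequence, quality string); a hypothetical record type for this sketch
FastqRecord = Tuple[str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield (header, sequence, quality) from 4-line FASTQ records."""
    block: List[str] = []
    for line in lines:
        block.append(line.rstrip("\n"))
        if len(block) == 4:
            header, seq, plus, qual = block
            if not header.startswith("@") or not plus.startswith("+"):
                raise ValueError(f"malformed FASTQ record near {header!r}")
            if len(seq) != len(qual):
                raise ValueError(f"sequence/quality length mismatch in {header!r}")
            yield header, seq, qual
            block = []


def mean_phred(qual: str, offset: int = 33) -> float:
    """Mean Phred score of a non-empty quality string (Phred+33 assumed)."""
    return sum(ord(c) - offset for c in qual) / len(qual)


def quality_filter(
    records: Iterable[FastqRecord],
    min_mean_q: float = 20.0,  # assumed threshold, not taken from the paper
    max_n: int = 0,            # assumed limit on ambiguous bases per read
) -> Iterator[FastqRecord]:
    """Keep reads with few N bases and a sufficiently high mean quality."""
    for header, seq, qual in records:
        if not seq:
            continue  # skip empty reads
        if seq.upper().count("N") > max_n:
            continue  # too many ambiguous bases
        if mean_phred(qual) < min_mean_q:
            continue  # overall quality too low
        yield header, seq, qual
```

A real preprocessing tool would additionally handle compressed input, paired-end synchronisation, and adapter or contamination removal, all of which the abstract lists as FastqPuri features and none of which are sketched here.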


Subjects
High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, RNA/methods , Humans , Software
2.
Methods Mol Biol ; 1526: 205-229, 2017.
Article in English | MEDLINE | ID: mdl-27896744

ABSTRACT

New technologies allow for high-dimensional profiling of patients. For instance, genome-wide gene expression analysis in tumors or in blood is feasible with microarrays, if all transcripts are known, or even without this restriction using high-throughput RNA sequencing. Other technologies like NMR finger printing allow for high-dimensional profiling of metabolites in blood or urine. Such technologies for high-dimensional patient profiling represent novel possibilities for molecular diagnostics. In clinical profiling studies, researchers aim to predict disease type, survival, or treatment response for new patients using high-dimensional profiles. In this process, they encounter a series of obstacles and pitfalls. We review fundamental issues from machine learning and recommend a procedure for the computational aspects of a clinical profiling study.


Subjects
Computational Biology/methods , Animals , Gene Expression Profiling , High-Throughput Nucleotide Sequencing , Humans , Machine Learning , Oligonucleotide Array Sequence Analysis/methods
3.
Bioinformatics ; 31(2): 259-61, 2015 Jan 15.
Article in English | MEDLINE | ID: mdl-25260699

ABSTRACT

UNLABELLED: The R package EasyStrata facilitates the evaluation and visualization of stratified genome-wide association meta-analyses (GWAMAs) results. It provides (i) statistical methods to test and account for between-strata difference as a means to tackle gene-strata interaction effects and (ii) extended graphical features tailored for stratified GWAMA results. The software provides further features also suitable for general GWAMAs including functions to annotate, exclude or highlight specific loci in plots or to extract independent subsets of loci from genome-wide datasets. It is freely available and includes a user-friendly scripting interface that simplifies data handling and allows for combining statistical and graphical functions in a flexible fashion. AVAILABILITY: EasyStrata is available for free (under the GNU General Public License v3) from our Web site www.genepi-regensburg.de/easystrata and from the CRAN R package repository cran.r-project.org/web/packages/EasyStrata/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology/methods , Computer Graphics , Genome, Human , Genome-Wide Association Study , Meta-Analysis as Topic , Software , Datasets as Topic , Humans
4.
PLoS One ; 8(11): e78935, 2013.
Article in English | MEDLINE | ID: mdl-24223867

ABSTRACT

BACKGROUND: An important phenomenon observed in glioma metabolism is increased aerobic glycolysis in tumor cells, which is generally referred to as the Warburg effect. Transforming growth factor (TGF)-beta2, which we previously showed to be induced by lactic acid, is a key pathophysiological factor in glioblastoma, leading to increased invasion and severe local immunosuppression after proteolytic cleavage from its latency associated peptide. In this study we tested the hypothesis, that lactate regulates TGF-beta2 expression and glioma cell migration via induction of Thrombospondin-1 (THBS-1), a TGF-beta activating protein. METHODS: Lactate levels were reduced by knockdown of LDH-A using specific small interfering RNA (siRNA) and competitive inhibition of LDH-A by sodium oxamate. Knockdown of THBS-1 was performed using specific siRNA. Western Blot, qRT-PCR, and ELISA were used to investigate expression levels of LDH-A, LDH-B, TGF-beta2 and THBS-1. Migration of cells was examined by Spheroid, Scratch and Boyden Chamber assays. RESULTS: Knockdown of LDH-A with subsequent decrease of lactate concentration leads to reduced levels of THBS-1 and TGF-beta2 in glioma cells. Lactate addition increases THBS-1 protein, leading to increased activation of TGF-beta2. Inhibition of THBS-1 reduces TGF-beta2 protein and migration of glioma cells. Addition of synthetic THBS-1 can rescue reduced TGF-beta2 protein levels and glioma cell migration in siLDH-A treated cells. CONCLUSION: We define a regulatory cascade between lactate, THBS-1 and TGF-beta2, leading to enhanced migration of glioma cells. Our results demonstrate a specific interaction between tumor metabolism and migration and provide a better understanding of the mechanisms underlying glioma cell invasion.


Subjects
Cell Movement/drug effects , Lactic Acid/pharmacology , Thrombospondin 1/metabolism , Transforming Growth Factor beta2/metabolism , Apoptosis/drug effects , Apoptosis/genetics , Blotting, Western , Cell Line, Tumor , Cell Movement/genetics , Cell Proliferation/drug effects , Enzyme-Linked Immunosorbent Assay , Glioma/genetics , Glioma/metabolism , Glioma/pathology , Humans , Isoenzymes/genetics , Isoenzymes/metabolism , L-Lactate Dehydrogenase/genetics , L-Lactate Dehydrogenase/metabolism , Lactate Dehydrogenase 5 , Lactic Acid/metabolism , RNA Interference , Reverse Transcriptase Polymerase Chain Reaction , Thrombospondin 1/genetics , Transcriptional Activation/drug effects , Transforming Growth Factor beta2/genetics , Tumor Cells, Cultured
5.
J Am Soc Nephrol ; 24(11): 1830-48, 2013 Nov.
Article in English | MEDLINE | ID: mdl-23990680

ABSTRACT

Mutations of the LMX1B gene cause nail-patella syndrome, a rare autosomal-dominant disorder affecting the development of the limbs, eyes, brain, and kidneys. The characterization of conventional Lmx1b knockout mice has shown that LMX1B regulates the development of podocyte foot processes and slit diaphragms, but studies using podocyte-specific Lmx1b knockout mice have yielded conflicting results regarding the importance of LMX1B for maintaining podocyte structures. In order to address this question, we generated inducible podocyte-specific Lmx1b knockout mice. One week of Lmx1b inactivation in adult mice resulted in proteinuria with only minimal foot process effacement. Notably, expression levels of slit diaphragm and basement membrane proteins remained stable at this time point, and basement membrane charge properties also did not change, suggesting that alternative mechanisms mediate the development of proteinuria in these mice. Cell biological and biophysical experiments with primary podocytes isolated after 1 week of Lmx1b inactivation indicated dysregulation of actin cytoskeleton organization, and time-resolved DNA microarray analysis identified the genes encoding actin cytoskeleton-associated proteins, including Abra and Arl4c, as putative LMX1B targets. Chromatin immunoprecipitation experiments in conditionally immortalized human podocytes and gel shift assays showed that LMX1B recognizes AT-rich binding sites (FLAT elements) in the promoter regions of ABRA and ARL4C, and knockdown experiments in zebrafish support a model in which LMX1B and ABRA act in a common pathway during pronephros development. Our report establishes the importance of LMX1B in fully differentiated podocytes and argues that LMX1B is essential for the maintenance of an appropriately structured actin cytoskeleton in podocytes.


Subjects
LIM-Homeodomain Proteins/physiology , Podocytes/cytology , Transcription Factors/physiology , Actins/physiology , Aging , Animals , Apoptosis , Cell Differentiation , Collagen Type IV/genetics , Intracellular Signaling Peptides and Proteins/genetics , LIM-Homeodomain Proteins/genetics , Membrane Proteins/genetics , Mice , Mice, Inbred C57BL , Nail-Patella Syndrome/etiology , Oligonucleotide Array Sequence Analysis , Podocytes/chemistry , Podocytes/ultrastructure , Proteinuria/etiology , Transcription Factors/genetics , Zebrafish
6.
Blood ; 120(18): e83-92, 2012 Nov 01.
Article in English | MEDLINE | ID: mdl-22976956

ABSTRACT

Acute myeloid leukemia (AML) is characterized by molecular heterogeneity. As commonly altered genomic regions point to candidate genes involved in leukemogenesis, we used microarray-based comparative genomic hybridization and single nucleotide polymorphism profiling data of 391 AML cases to further narrow down genomic regions of interest. Targeted resequencing of 1000 genes located in the critical regions was performed in a representative cohort of 50 AML samples comprising all major cytogenetic subgroups. We identified 120 missense/nonsense mutations as well as 60 insertions/deletions affecting 73 different genes (∼ 3.6 tumor-specific aberrations/AML). While most of the newly identified alterations were nonrecurrent, we observed an enrichment of mutations affecting genes involved in epigenetic regulation including known candidates like TET2, TET1, DNMT3A, and DNMT1, as well as mutations in the histone methyltransferases NSD1, EZH2, and MLL3. Furthermore, we found mutations in the splicing factor SFPQ and in the nonclassic regulators of mRNA processing CTCF and RAD21. These splicing-related mutations affected 10% of AML patients in a mutually exclusive manner. In conclusion, we could identify a large number of alterations in genes involved in aberrant splicing and epigenetic regulation in genomic regions commonly altered in AML, highlighting their important role in the molecular pathogenesis of AML.


Subjects
Chromatin Assembly and Disassembly/genetics , Leukemia, Myeloid, Acute/genetics , Mutation , RNA Splicing/genetics , Comparative Genomic Hybridization , Humans , Oligonucleotide Array Sequence Analysis , Polymorphism, Single Nucleotide
7.
Nature ; 488(7413): 675-9, 2012 Aug 30.
Article in English | MEDLINE | ID: mdl-22914092

ABSTRACT

The blood-brain barrier (BBB) and the environment of the central nervous system (CNS) guard the nervous tissue from peripheral immune cells. In the autoimmune disease multiple sclerosis, myelin-reactive T-cell blasts are thought to transgress the BBB and create a pro-inflammatory environment in the CNS, thereby making possible a second autoimmune attack that starts from the leptomeningeal vessels and progresses into the parenchyma. Using a Lewis rat model of experimental autoimmune encephalomyelitis, we show here that contrary to the expectations of this concept, T-cell blasts do not efficiently enter the CNS and are not required to prepare the BBB for immune-cell recruitment. Instead, intravenously transferred T-cell blasts gain the capacity to enter the CNS after residing transiently within the lung tissues. Inside the lung tissues, they move along and within the airways to bronchus-associated lymphoid tissues and lung-draining mediastinal lymph nodes before they enter the blood circulation from where they reach the CNS. Effector T cells transferred directly into the airways showed a similar migratory pattern and retained their full pathogenicity. On their way the T cells fundamentally reprogrammed their gene-expression profile, characterized by downregulation of their activation program and upregulation of cellular locomotion molecules together with chemokine and adhesion receptors. The adhesion receptors include ninjurin 1, which participates in T-cell intravascular crawling on cerebral blood vessels. We detected that the lung constitutes a niche not only for activated T cells but also for resting myelin-reactive memory T cells. After local stimulation in the lung, these cells strongly proliferate and, after assuming migratory properties, enter the CNS and induce paralytic disease.
The lung could therefore contribute to the activation of potentially autoaggressive T cells and their transition to a migratory mode as a prerequisite to entering their target tissues and inducing autoimmune disease.


Subjects
Brain/pathology , Cell Movement , Encephalomyelitis, Autoimmune, Experimental/immunology , Encephalomyelitis, Autoimmune, Experimental/pathology , Lung/pathology , T-Lymphocytes/pathology , Adoptive Transfer , Animals , Autoimmunity/immunology , Blood-Brain Barrier/immunology , Brain/cytology , Brain/immunology , Cell Adhesion Molecules, Neuronal/metabolism , Cerebrovascular Circulation , Disease Models, Animal , Gene Expression Profiling , Immunologic Memory , Lung/cytology , Lung/immunology , Lymphocyte Activation , Myelin Sheath/immunology , Nerve Growth Factors/metabolism , Rats , Rats, Inbred Lew , T-Lymphocytes/cytology , T-Lymphocytes/immunology , T-Lymphocytes/metabolism
8.
Cancer Res ; 70(5): 2030-40, 2010 Mar 01.
Article in English | MEDLINE | ID: mdl-20145155

ABSTRACT

Glioblastoma multiforme (GBM) is paradigmatic for the investigation of cancer stem cells (CSC) in solid tumors. Growing evidence suggests that different types of CSC lead to the formation of GBM. This has prompted the present comparison of gene expression profiles between 17 GBM CSC lines and their different putative founder cells. Using a newly derived 24-gene signature, we can now distinguish two subgroups of GBM: Type I CSC lines display "proneural" signature genes and resemble fetal neural stem cell (fNSC) lines, whereas type II CSC lines show "mesenchymal" transcriptional profiles similar to adult NSC (aNSC) lines. Phenotypically, type I CSC lines are CD133 positive and grow as neurospheres. Type II CSC lines, in contrast, display (semi-)adherent growth and lack CD133 expression. Molecular differences between type I and type II CSC lines include the expression of extracellular matrix molecules and the transcriptional activity of the WNT and the transforming growth factor-beta/bone morphogenetic protein signaling pathways. Importantly, these characteristics were not affected by induced adherence on laminin. Comparing CSC lines with their putative cells of origin, we observed greatly increased proliferation and impaired differentiation capacity in both types of CSC lines but no cancer-associated activation of otherwise silent signaling pathways. Thus, our data suggest that the heterogeneous tumor entity GBM may derive from cells that have preserved or acquired properties of either fNSC or aNSC but lost the corresponding differentiation potential. Moreover, we propose a gene signature that enables the subclassification of GBM according to their putative cells of origin.


Subjects
Antigens, CD/biosynthesis , Glioblastoma/genetics , Glioblastoma/pathology , Glycoproteins/biosynthesis , Neoplastic Stem Cells/physiology , AC133 Antigen , Cell Growth Processes/genetics , Cell Line, Tumor , Gene Expression Profiling , Humans , Mesoderm/pathology , Neoplastic Stem Cells/pathology , Peptides , Transcription, Genetic
9.
Blood ; 115(3): e10-9, 2010 Jan 21.
Article in English | MEDLINE | ID: mdl-19965649

ABSTRACT

Blood of both humans and mice contains 2 main monocyte subsets. Here, we investigated the extent of their similarity using a microarray approach. Approximately 270 genes in humans and 550 genes in mice were differentially expressed between subsets by 2-fold or more. More than 130 of these gene expression differences were conserved between mouse and human monocyte subsets. We confirmed numerous of these differences at the cell surface protein level. Despite overall conservation, some molecules were conversely expressed between the 2 species' subsets, including CD36, CD9, and TREM-1. Other differences included a prominent peroxisome proliferator-activated receptor gamma (PPARgamma) signature in mouse monocytes, which is absent in humans, and strikingly opposed patterns of receptors involved in uptake of apoptotic cells and other phagocytic cargo between human and mouse monocyte subsets. Thus, whereas human and mouse monocyte subsets are far more broadly conserved than currently recognized, important differences between the species deserve consideration when models of human disease are studied in mice.


Subjects
Gene Expression Profiling , Monocytes/metabolism , Oligonucleotide Array Sequence Analysis , Animals , Cells, Cultured , Humans , Male , Mice , Mice, Inbred C57BL
10.
BMC Bioinformatics ; 10: 440, 2009 Dec 22.
Article in English | MEDLINE | ID: mdl-20028526

ABSTRACT

BACKGROUND: The Affymetrix MitoChip v2.0 is an oligonucleotide tiling array for the resequencing of the human mitochondrial (mt) genome. For each of 16,569 nucleotide positions of the mt genome it holds two sets of four 25-mer probes each that match the heavy and the light strand of a reference mt genome and vary only at their central position to interrogate all four possible alleles. In addition, the MitoChip v2.0 carries alternative local context probes to account for known mtDNA variants. These probes have been neglected in most studies due to the lack of software for their automated analysis. RESULTS: We provide ReseqChip, a free software that automates the process of resequencing mtDNA using multiple local context probes on the MitoChip v2.0. ReseqChip significantly improves base call rate and sequence accuracy. ReseqChip is available at http://code.open-bio.org/svnweb/index.cgi/bioperl/browse/bioperl-live/trunk/Bio/Microarray/Tools/. CONCLUSIONS: ReseqChip allows for the automated consolidation of base calls from alternative local mt genome context probes. It thereby improves the accuracy of resequencing, while reducing the number of non-called bases.


Subjects
Base Sequence , Computational Biology/methods , DNA, Mitochondrial/chemistry , Genome, Mitochondrial/genetics , Software , Sequence Analysis, DNA
11.
Anal Chem ; 81(14): 5731-9, 2009 Jul 15.
Article in English | MEDLINE | ID: mdl-19522528

ABSTRACT

Comprehensive two-dimensional gas chromatography-time-of-flight mass spectrometry (GC x GC-TOF-MS) was applied to the comparative metabolic fingerprinting of a wild-type versus a double mutant strain of Escherichia coli lacking the transhydrogenases UdhA and PntAB. Using peak lists generated with the Leco ChromaTOF software as input, we developed retention time correction and data alignment tools (INCA). The accuracy of peak alignment and detection of 1.1- to 4-fold changes in metabolite concentration was validated by a spike-in experiment with 20 standard compounds. A list of 48 significant features that differentiated the two E. coli strains was obtained with an estimated false discovery rate (FDR) of <0.05. A total of 27 metabolites, mainly from the citrate cycle, were identified. That the signal intensity of the m/z 73 trace of the trimethylsilyl (TMS) group reflected true differences in metabolite abundance was confirmed by quantification of pyruvate, fumarate, malate, succinate, alpha-ketoglutarate, citrate, cis-aconitate, myo-inositol, and glucose-6-phosphate using compound specific fragment ions and stable isotope labeled standards. Relative standard deviations for metabolite extraction and GC x GC-TOF-MS analysis of those analytes ranged from 13.2 to 26.3% for the universal m/z 73 trace and 7.4 to 24.5% for the analyte specific fragment ion trace.
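
The abstract reports a feature list at an estimated FDR below 0.05 but does not say how the FDR was estimated. As a hedged illustration only, the sketch below uses the Benjamini-Hochberg procedure, a common way to control the FDR over a list of per-feature p-values. The choice of this procedure and the function name are assumptions, not details taken from the paper.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used here purely for illustration; the abstract
    does not name the FDR estimation method.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values are monotone in the rank of the raw p-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_features(p_values: Sequence[float], fdr: float = 0.05) -> List[int]:
    """Indices of features whose adjusted p-value is below the FDR level."""
    return [i for i, q in enumerate(benjamini_hochberg(p_values)) if q < fdr]
```

Features returned by `significant_features` would play the role of the "significant features" list, under the stated assumption about the correction method.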


Subjects
Escherichia coli/metabolism , Metabolomics/methods , Chromatography, Gas , Electronic Data Processing , Escherichia coli/classification , Escherichia coli/genetics , Linear Models , Mass Spectrometry , Mutation , NADP/metabolism , NADP Transhydrogenases/genetics , NADP Transhydrogenases/metabolism , Principal Component Analysis , Reproducibility of Results , Time Factors
12.
Methods Mol Biol ; 453: 281-96, 2008.
Article in English | MEDLINE | ID: mdl-18712310

ABSTRACT

Gene expression profiling using micro-arrays is a modern approach for molecular diagnostics. In clinical micro-array studies, researchers aim to predict disease type, survival, or treatment response using gene expression profiles. In this process, they encounter a series of obstacles and pitfalls. This chapter reviews fundamental issues from machine learning and recommends a procedure for the computational aspects of a clinical micro-array study.


Subjects
Computational Biology/methods , Gene Expression Profiling/methods , Molecular Diagnostic Techniques , Oligonucleotide Array Sequence Analysis/methods , Animals , Databases, Genetic , Humans
13.
Nucleic Acids Res ; 36(16): e105, 2008 Sep.
Article in English | MEDLINE | ID: mdl-18660515

ABSTRACT

Novel sequencing technologies permit the rapid production of large sequence data sets. These technologies are likely to revolutionize genetics and biomedical research, but a thorough characterization of the ultra-short read output is necessary. We generated and analyzed two Illumina 1G ultra-short read data sets, i.e. 2.8 million 27mer reads from a Beta vulgaris genomic clone and 12.3 million 36mers from the Helicobacter acinonychis genome. We found that error rates range from 0.3% at the beginning of reads to 3.8% at the end of reads. Wrong base calls are frequently preceded by base G. Base substitution error frequencies vary by 10- to 11-fold, with A > C transversion being among the most frequent and C > G transversions among the least frequent substitution errors. Insertions and deletions of single bases occur at very low rates. When simulating re-sequencing we found a 20-fold sequencing coverage to be sufficient to compensate errors by correct reads. The read coverage of the sequenced regions is biased; the highest read density was found in intervals with elevated GC content. High Solexa quality scores are over-optimistic and low scores underestimate the data quality. Our results show different types of biases and ways to detect them. Such biases have implications on the use and interpretation of Solexa data, for de novo sequencing, re-sequencing, the identification of single nucleotide polymorphisms and DNA methylation sites, as well as for transcriptome analysis.
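
The position-dependent error rates reported above (lower at the start of reads, higher at the end) can be measured with a simple tally. The sketch below is not the authors' pipeline; it assumes reads that are already placed on a known reference without gaps and on the forward strand, and the function name and input layout are hypothetical.

```python
from typing import List, Sequence, Tuple


def per_position_error_rate(
    reference: str,
    alignments: Sequence[Tuple[int, str]],
) -> List[float]:
    """Fraction of mismatching bases at each read position.

    `alignments` holds (start offset in reference, read sequence) pairs.
    Assumption: ungapped, forward-strand placements; indels are ignored.
    """
    if not alignments:
        return []
    read_len = max(len(read) for _, read in alignments)
    mismatches = [0] * read_len
    totals = [0] * read_len
    for start, read in alignments:
        for pos, base in enumerate(read):
            ref_index = start + pos
            if ref_index >= len(reference):
                break  # read runs past the end of the reference
            totals[pos] += 1
            if base != reference[ref_index]:
                mismatches[pos] += 1
    # Positions never observed get a rate of 0.0 rather than a division error.
    return [m / t if t else 0.0 for m, t in zip(mismatches, totals)]
```

Plotting the returned list against read position would show the kind of trend the abstract describes, provided the placements are trustworthy.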


Subjects
Sequence Analysis, DNA/standards , Base Composition , Beta vulgaris/genetics , DNA/chemistry , Helicobacter/genetics , Sequence Deletion , Software
14.
Genome Res ; 17(11): 1697-706, 2007 Nov.
Article in English | MEDLINE | ID: mdl-17908823

ABSTRACT

The latest revolution in the DNA sequencing field has been brought about by the development of automated sequencers that are capable of generating giga base pair data sets quickly and at low cost. Applications of such technologies seem to be limited to resequencing and transcript discovery, due to the shortness of the generated reads. In order to extend the fields of application to de novo sequencing, we developed the SHARCGS algorithm to assemble short-read (25-40-mer) data with high accuracy and speed. The efficiency of SHARCGS was tested on BAC inserts from three eukaryotic species, on two yeast chromosomes, and on two bacterial genomes (Haemophilus influenzae, Escherichia coli). We show that 30-mer-based BAC assemblies have N50 sizes >20 kbp for Drosophila and Arabidopsis and >4 kbp for human in simulations taking missing reads and wrong base calls into account. We assembled 949,974 contigs with length >50 bp, and only one single contig could not be aligned error-free against the reference sequences. We generated 36-mer reads for the genome of Helicobacter acinonychis on the Illumina 1G sequencing instrument and assembled 937 contigs covering 98% of the genome with an N50 size of 3.7 kbp. With the exception of five contigs that differ in 1-4 positions relative to the reference sequence, all contigs matched the genome error-free. Thus, SHARCGS is a suitable tool for fully exploiting novel sequencing technologies by assembling sequence contigs de novo with high confidence and by outperforming existing assembly algorithms in terms of speed and accuracy.
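
The N50 sizes quoted in this abstract follow the usual assembly definition: the contig length L such that contigs of length L or longer together contain at least half of all assembled bases. A minimal sketch of that calculation follows; the function name is hypothetical and the code is independent of SHARCGS itself.

```python
from typing import Sequence


def n50(contig_lengths: Sequence[int]) -> int:
    """Return the N50 of a non-empty collection of contig lengths."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Accumulate lengths from the longest contig downwards until
    # at least half of the total assembled bases are covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")
```

For example, contigs of 6, 5, 4, 3 and 2 kbp sum to 20 kbp; the two longest already cover 11 kbp, so the N50 is 5 kbp.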


Subjects
Algorithms , Genomics/methods , Chromosomes, Artificial, Bacterial/chemistry , Contig Mapping , Humans , Sequence Analysis, DNA/methods
15.
Mol Biol Evol ; 24(12): 2610-8, 2007 Dec.
Article in English | MEDLINE | ID: mdl-17890238

ABSTRACT

B cells develop in the mammalian bone marrow through a sequence of precursor stages, which can be ordered by the recombination status of their immunoglobulin loci. This developmental pathway is functionally similar between mice and man. However, whether this similarity is based on usage of the same genes is unknown. We show that large-scale gene expression patterns differ substantially between human and mouse B-cell development. Among 644 genes which were differentially expressed in 4 early stages of human B-cell development, only 48, 86, and 75 genes could be identified, which are upregulated in both human and mouse pre-BI, large pre-BII, and small pre-BII cells, respectively. A comparison of mouse B- and T-cell development reveals that gene expression patterns of early murine B- and T-cell precursors are most similar, whereas in more differentiated precursors, human and mouse B cells have a more similar gene expression profile. We conclude that large-scale differences in gene expression patterns between human and mouse B-cell precursors may stem from either selective neutrality or compensatory evolution, whereas the few similarities may stem from negative selection. Gene expression patterns are shaped by ontogenic relationships in early and by functional specialization in later stages of development.


Subjects
B-Lymphocytes/cytology , B-Lymphocytes/metabolism , Biological Evolution , Cell Differentiation/genetics , Dosage Compensation, Genetic , Gene Expression Profiling , Selection, Genetic , Animals , Child , Child, Preschool , Female , Humans , Infant , Male , Mice , Species Specificity , T-Lymphocytes/cytology , T-Lymphocytes/metabolism , Up-Regulation
16.
Bioinformatics ; 23(17): 2256-64, 2007 Sep 01.
Article in English | MEDLINE | ID: mdl-17586546

ABSTRACT

MOTIVATION: Clustering algorithms are widely used in the analysis of microarray data. In clinical studies, they are often applied to find groups of co-regulated genes. Clustering, however, can also stratify patients by similarity of their gene expression profiles, thereby defining novel disease entities based on molecular characteristics. Several distance-based cluster algorithms have been suggested, but little attention has been given to the distance measure between patients. Even with the Euclidean metric, including and excluding genes from the analysis leads to different distances between the same objects, and consequently different clustering results. RESULTS: We describe a new clustering algorithm, in which gene selection is used to derive biologically meaningful clusterings of samples by combining expression profiles and functional annotation data. According to gene annotations, candidate gene sets with specific functional characterizations are generated. Each set defines a different distance measure between patients, leading to different clusterings. These clusterings are filtered using a resampling-based significance measure. Significant clusterings are reported together with the underlying gene sets and their functional definition. CONCLUSIONS: Our method reports clusterings defined by biologically focused sets of genes. In annotation-driven clusterings, we have recovered clinically relevant patient subgroups through biologically plausible sets of genes as well as new subgroupings. We conjecture that our method has the potential to reveal so far unknown, clinically relevant classes of patients in an unsupervised manner. AVAILABILITY: We provide the R package adSplit as part of Bioconductor release 1.9 and on http://compdiag.molgen.mpg.de/software.
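
The central observation of this abstract, that the Euclidean distance between the same patients changes with the set of genes included, can be shown with a toy example. The sketch below is not the adSplit algorithm (it contains no annotation lookup and no resampling-based significance measure); the gene names, patient profiles and helper functions are all hypothetical.

```python
import math
from typing import Dict, Iterable

# Expression profile of one patient: gene name -> expression value.
Profile = Dict[str, float]


def euclidean(a: Profile, b: Profile, genes: Iterable[str]) -> float:
    """Euclidean distance between two profiles restricted to `genes`."""
    return math.sqrt(sum((a[g] - b[g]) ** 2 for g in genes))


def nearest(target: str, profiles: Dict[str, Profile], genes: Iterable[str]) -> str:
    """Name of the patient closest to `target` under the given gene set."""
    gene_list = list(genes)
    others = [name for name in profiles if name != target]
    return min(others, key=lambda n: euclidean(profiles[target], profiles[n], gene_list))


# Hypothetical toy data: three patients, four genes.
PROFILES: Dict[str, Profile] = {
    "P1": {"g1": 0.0, "g2": 0.0, "g3": 0.0, "g4": 0.0},
    "P2": {"g1": 0.0, "g2": 1.0, "g3": 5.0, "g4": 5.0},
    "P3": {"g1": 4.0, "g2": 4.0, "g3": 0.0, "g4": 1.0},
}
GENE_SET_A = ["g1", "g2"]  # stands in for one functional annotation
GENE_SET_B = ["g3", "g4"]  # stands in for another annotation
```

Under `GENE_SET_A` patient P1 is closest to P2, while under `GENE_SET_B` it is closest to P3, so any distance-based clustering of these patients would depend on which gene set defines the metric.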


Subjects
Algorithms , Biomarkers, Tumor/analysis , Gene Expression Profiling/methods , Neoplasm Proteins/analysis , Neoplasms/diagnosis , Neoplasms/metabolism , Oligonucleotide Array Sequence Analysis/methods , Diagnosis, Computer-Assisted/methods , Humans
17.
Blood ; 110(4): 1291-300, 2007 Aug 15.
Article in English | MEDLINE | ID: mdl-17485551

ABSTRACT

Core binding factor (CBF) leukemias, characterized by either inv(16)/t(16;16) or t(8;21), constitute acute myeloid leukemia (AML) subgroups with favorable prognosis. However, there exists substantial biologic and clinical heterogeneity within these cytogenetic groups that is not fully reflected by the current classification system. To improve the molecular characterization we profiled gene expression in a large series (n = 93) of AML patients with CBF leukemia [(inv (16), n = 55; t(8;21), n = 38)]. By unsupervised hierarchical clustering we were able to define a subgroup of CBF cases (n = 35) characterized by shorter overall survival times (P = .03). While there was no obvious correlation with fusion gene transcript levels, FLT3 tyrosine kinase domain, KIT, and NRAS mutations, the newly defined inv(16)/t(8;21) subgroup was associated with elevated white blood cell counts and FLT3 internal tandem duplications (P = .011 and P = .026, respectively). Supervised analyses of gene expression suggested alternative cooperating pathways leading to transformation. In the "favorable" CBF leukemias, antiapoptotic mechanisms and deregulated mTOR signaling and, in the newly defined "unfavorable" subgroup, aberrant MAPK signaling and chemotherapy-resistance mechanisms might play a role. While the leukemogenic relevance of these signatures remains to be validated, their existence nevertheless supports a prognostically relevant biologic basis for the heterogeneity observed in CBF leukemia.


Subjects
Biomarkers, Tumor/genetics , Core Binding Factors/genetics , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Leukemia, Myeloid, Acute/genetics , Mutation/genetics , Adult , Aged , Chromosome Aberrations , Chromosome Inversion , Cluster Analysis , Female , Humans , Karyotyping , Leukemia, Myeloid, Acute/classification , Male , Middle Aged , Oligonucleotide Array Sequence Analysis , Prognosis , Proto-Oncogene Proteins c-kit/genetics , Survival Rate , fms-Like Tyrosine Kinase 3/genetics
18.
Clin Cancer Res ; 12(15): 4553-61, 2006 Aug 01.
Article in English | MEDLINE | ID: mdl-16899601

ABSTRACT

PURPOSE: In childhood acute lymphoblastic leukemia (ALL), approximately 25% of patients suffer from relapse. In recurrent disease, despite intensified therapy, overall cure rates of 40% remain unsatisfactory and survival rates are particularly poor in certain subgroups. The probability of long-term survival after relapse is predicted from well-established prognostic factors (i.e., time and site of relapse, immunophenotype, and minimal residual disease). However, the underlying biological determinants of these prognostic factors remain poorly understood. EXPERIMENTAL DESIGN: To identify molecular pathways associated with these clinically well-defined prognostic factors, we performed gene expression profiling on 60 prospectively collected samples from patients with first relapse enrolled in the relapse trial ALL-REZ BFM 2002 of the Berlin-Frankfurt-Münster study group. RESULTS: We show here that patients with very early relapse of ALL are characterized by a distinctive gene expression pattern. We identified a set of 83 genes differentially expressed in very early relapsed ALL compared with late relapsed disease. The vast majority of genes were up-regulated and many were late cell cycle genes with a function in mitosis. In addition, samples from patients with very early relapse showed a significant increase in the percentage of S and G2-M phase cells, and this correlated well with the expression level of cell cycle genes. CONCLUSIONS: Very early relapse of ALL is characterized by an increased proliferative capacity of leukemic blasts and up-regulated mitotic genes. The latter suggests that novel drugs targeting late cell cycle proteins might be beneficial for these patients, who typically face a dismal prognosis.


Subjects
Cell Cycle Proteins/genetics , Gene Expression Profiling , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics , Cell Cycle/genetics , Cell Cycle Proteins/biosynthesis , Cell Division/genetics , Cell Proliferation , Child , G2 Phase/genetics , Humans , Oligonucleotide Array Sequence Analysis/methods , Precursor Cell Lymphoblastic Leukemia-Lymphoma/diagnosis , Predictive Value of Tests , Prognosis , Prospective Studies , Recurrence , Reverse Transcriptase Polymerase Chain Reaction/methods , Up-Regulation/genetics
19.
Bioinformatics ; 22(18): 2315-6, 2006 Sep 15.
Article in English | MEDLINE | ID: mdl-16844712

ABSTRACT

OrderedList is a Bioconductor-compliant package for meta-analysis based on ordered gene lists, such as those resulting from differential gene expression analysis. The package quantifies the similarity between gene lists. The significance of the similarity score is estimated from random scores computed on perturbed data. OrderedList illustrates list similarity in intuitive plots and determines the score-driving genes for further analysis. AVAILABILITY: http://www.bioconductor.org CONTACT: claudio.lottaz@molgen.mpg.de SUPPLEMENTARY INFORMATION: Please visit our webpage at http://compdiag.molgen.mpg.de/software.
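
As a rough illustration of scoring the similarity of two ordered gene lists and judging the score against random scores, the following Python sketch may help. It does not reproduce the OrderedList package or its API: the exponentially weighted top-n overlap, the parameter `alpha`, the permutation of one list as a stand-in for "perturbed data", and the function names are all assumptions made for this example.

```python
import math
import random

def weighted_overlap(list_a, list_b, alpha=0.1):
    """Sum over n of exp(-alpha * n) * |top-n of A intersected with top-n of B|.

    Overlaps near the top of the lists receive the largest weights.
    """
    seen_a, seen_b = set(), set()
    score = 0.0
    for n, (gene_a, gene_b) in enumerate(zip(list_a, list_b), start=1):
        seen_a.add(gene_a)
        seen_b.add(gene_b)
        score += math.exp(-alpha * n) * len(seen_a & seen_b)
    return score

def permutation_p_value(list_a, list_b, n_perm=500, alpha=0.1, seed=0):
    """Empirical p-value: share of random orderings of list_b that score
    at least as high as the observed ordering (assumed null model)."""
    rng = random.Random(seed)
    observed = weighted_overlap(list_a, list_b, alpha)
    shuffled = list(list_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if weighted_overlap(list_a, shuffled, alpha) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical gene identifiers, ranked identically in two studies.
genes = [f"gene{i}" for i in range(30)]
score, p_value = permutation_p_value(genes, genes)
print(round(score, 3), p_value)
```

Shuffling a ranked list is a simplification. The abstract states that random scores are computed on perturbed data, which suggests the perturbation happens before the lists are derived rather than on the lists themselves.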


Subjects
Algorithms , Gene Expression Profiling/methods , Models, Biological , Multigene Family/physiology , Oligonucleotide Array Sequence Analysis/methods , Software , Artificial Intelligence , Computer Simulation , Pattern Recognition, Automated
20.
BMC Bioinformatics ; 6: 211, 2005 Aug 25.
Article in English | MEDLINE | ID: mdl-16122395

ABSTRACT

BACKGROUND: Genome-wide microarray studies have the potential to unveil novel disease entities. Clinically homogeneous groups of patients can have diverse gene expression profiles. The definition of novel subclasses based on gene expression is a difficult problem not addressed systematically by currently available software tools. RESULTS: We present a computational tool for semi-supervised molecular disease entity detection. It automatically discovers molecular heterogeneities in phenotypically defined disease entities and suggests alternative molecular sub-entities of clinical phenotypes. This is done using both gene expression data and functional gene annotations. We provide stam, a Bioconductor-compliant software package for the statistical programming environment R. We demonstrate that our tool detects gene expression patterns that are characteristic of only a subset of patients from an established disease entity. We call such expression patterns molecular symptoms. Furthermore, stam finds novel sub-group stratifications of patients according to the absence or presence of molecular symptoms. CONCLUSION: Our software is easy to install and can be applied to a wide range of datasets. It has the potential to reveal clinically relevant patient sub-groups that have so far been indistinguishable.


Subjects
Computers, Molecular , Gene Expression Profiling/methods , Protein Array Analysis/methods , Software , Calibration , Cluster Analysis , Humans , Internet , Phenotype , User-Computer Interface