Results 1 - 19 of 19
1.
Brief Bioinform ; 19(2): 179-187, 2018 03 01.
Article in English | MEDLINE | ID: mdl-27802932

ABSTRACT

Motivation: Despite being essential for numerous clinical and research applications, high-resolution human leukocyte antigen (HLA) typing remains challenging and laboratory tests are also time-consuming and labour intensive. With next-generation sequencing data becoming widely accessible, on-demand in silico HLA typing offers an economical and efficient alternative. Results: In this study we evaluate the HLA typing accuracy and efficiency of five computational HLA typing methods by comparing their predictions against a curated set of > 1000 published polymerase chain reaction-derived HLA genotypes on three different data sets (whole genome sequencing, whole exome sequencing and transcriptomic sequencing data). The highest accuracy at clinically relevant resolution (four digits) we observe is 81% on RNAseq data by PHLAT and 99% accuracy by OptiType when limited to Class I genes only. We also observed variability between the tools for resource consumption, with runtime ranging from an average of 5 h (HLAminer) to 7 min (seq2HLA) and memory from 12.8 GB (HLA-VBSeq) to 0.46 GB (HLAminer) per sample. While a minimal coverage is required, other factors also determine prediction accuracy and the results between tools do not correlate well. Therefore, by combining tools, there is the potential to develop a highly accurate ensemble method that is able to deliver fast, economical HLA typing from existing sequencing data.
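
The abstract only points to the potential of an ensemble method and does not describe one. As a minimal sketch of the idea, assuming a simple majority vote over per-tool genotype calls (the function name, the `min_votes` threshold and the example alleles are all hypothetical and not taken from the study):

```python
from collections import Counter

def consensus_call(calls_by_tool, min_votes=2):
    """Return the allele pair reported by the most tools, or None if support is weak.

    calls_by_tool maps a tool name to the pair of four-digit alleles it predicted
    for one locus; tools that made no call map to None.
    """
    # Sort each pair so that the same genotype in either allele order counts once.
    votes = Counter(tuple(sorted(pair)) for pair in calls_by_tool.values() if pair)
    if not votes:
        return None
    genotype, count = votes.most_common(1)[0]
    return genotype if count >= min_votes else None

# Hypothetical predictions for HLA-A in a single sample (illustrative values only).
example_calls = {
    "OptiType": ("A*01:01", "A*02:01"),
    "PHLAT": ("A*02:01", "A*01:01"),
    "seq2HLA": ("A*01:01", "A*03:01"),
    "HLAminer": None,
}
print(consensus_call(example_calls))
```

A real ensemble would probably weight tools by their per-locus accuracy rather than count votes equally; equal voting is used here only to keep the sketch short.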


Subjects
Algorithms , HLA Antigens/genetics , Histocompatibility Testing/methods , Sequence Analysis, DNA/methods , Computational Biology/methods , Exome , Genotype , Humans
2.
Clin Chem ; 60(8): 1105-14, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24899692

ABSTRACT

BACKGROUND: We describe a novel approach that harnesses the ubiquity of copy number deletion polymorphisms in human genomes to definitively detect and quantify chimeric DNA in clinical samples. Unlike other molecular approaches to chimerism analysis, the copy number deletion (CND) method targets genomic loci (>50 base pairs in length) that are wholly absent from wild-type (i.e., self) background DNA sequences in a sex-independent manner. METHODS: Bespoke quantitative PCR (qPCR) CND assays were developed and validated using a series of DNA standards and chimeric plasma DNA samples collected from 2 allogeneic kidney transplant recipients and 12 pregnant women. Assay performance and informativeness were assessed using appropriate statistical methods. RESULTS: The CND qPCR assays showed high sensitivity, precision, and reliability for linear quantification of DNA chimerism down to 16 genomic equivalents (i.e., 106 pg). Fetal fraction (%) in 12 singleton male pregnancies was calculated using the CND qPCR approach, which showed closer agreement with single-nucleotide polymorphism-based massively parallel sequencing than the SRY (sex determining region Y) (Y chromosome) qPCR assay. The latter consistently underestimated the fetal fraction relative to the other methods. We also were able to measure biological changes in plasma nonself DNA concentrations in 2 renal transplant recipients. CONCLUSIONS: The CND qPCR technique is suitable for measurement of chimerism for monitoring of rejection in allogeneic organ transplantation and quantification of the cell-free fetal DNA fraction in maternal plasma samples used for noninvasive prenatal genetic testing.
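
The detection limit quoted above (16 genomic equivalents, i.e., 106 pg) implies roughly 6.6 pg of DNA per genomic equivalent, the usual approximation for one diploid human genome. A small sketch of that conversion, together with a naive non-self percentage, might look as follows. The constant, function names and example counts are assumptions for illustration; the percentage applies no zygosity correction and is not the authors' assay calculation.

```python
PG_PER_GENOME_EQUIVALENT = 6.6  # assumed mass of one diploid human genome, in picograms

def genome_equivalents_to_pg(genome_equivalents):
    """Convert a number of diploid genome copies to DNA mass in picograms."""
    return genome_equivalents * PG_PER_GENOME_EQUIVALENT

def nonself_fraction_percent(nonself_ge, total_ge):
    """Percentage of genome equivalents that carry the non-self marker."""
    if total_ge <= 0:
        raise ValueError("total_ge must be positive")
    return 100.0 * nonself_ge / total_ge

# 16 genomic equivalents correspond to about 106 pg under the assumed constant.
print(round(genome_equivalents_to_pg(16)))
# Hypothetical counts: 50 non-self copies among 1000 total copies.
print(nonself_fraction_percent(50, 1000))
```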


Subjects
Chimera/genetics , DNA Copy Number Variations , Humans , Limit of Detection , Polymerase Chain Reaction/methods , Reproducibility of Results
3.
BMC Genomics ; 11: 540, 2010 Oct 06.
Article in English | MEDLINE | ID: mdl-20925945

ABSTRACT

BACKGROUND: The demands of microarray expression technologies for quantities of RNA place a limit on the questions they can address. As a consequence, the RNA requirements have reduced over time as technologies have improved. In this paper we investigate the costs of reducing the starting quantity of RNA for the Illumina BeadArray platform. This we do via a dilution data set generated from two reference RNA sources that have become the standard for investigations into microarray and sequencing technologies. RESULTS: We find that the starting quantity of RNA has an effect on observed intensities despite the fact that the quantity of cRNA being hybridized remains constant. We see a loss of sensitivity when using lower quantities of RNA, but no great rise in the false positive rate. Even with 10 ng of starting RNA, the positive results are reliable although many differentially expressed genes are missed. We see that there is some scope for combining data from samples that have contributed differing quantities of RNA, but note also that sample sizes should increase to compensate for the loss of signal-to-noise when using low quantities of starting RNA. CONCLUSIONS: The BeadArray platform maintains a low false discovery rate even when small amounts of starting RNA are used. In contrast, the sensitivity of the platform drops off noticeably over the same range. Thus, those conducting experiments should not opt for low quantities of starting RNA without consideration of the costs of doing so. The implications for experimental design, and the integration of data from different starting quantities, are complex.


Subjects
Microspheres , Oligonucleotide Array Sequence Analysis/methods , Oligonucleotide Array Sequence Analysis/standards , RNA/analysis , False Positive Reactions , Gene Expression Profiling , Gene Expression Regulation , Humans , RNA/genetics , Reference Standards
4.
Nucleic Acids Res ; 36(15): 4823-32, 2008 Sep.
Article in English | MEDLINE | ID: mdl-18653532

ABSTRACT

The alternative splicing code that controls and coordinates the transcriptome in complex multicellular organisms remains poorly understood. It has long been argued that regulation of alternative splicing relies on combinatorial interactions between multiple proteins, and that tissue-specific splicing decisions most likely result from differences in the concentration and/or activity of these proteins. However, large-scale data to systematically address this issue have just recently started to become available. Here we show that splicing factor gene expression signatures can be identified that reflect cell type and tissue-specific patterns of alternative splicing. We used a computational approach to analyze microarray-based gene expression profiles of splicing factors from mouse, chimpanzee and human tissues. Our results show that brain and testis, the two tissues with highest levels of alternative splicing events, have the largest number of splicing factor genes that are most highly differentially expressed. We further identified SR protein kinases and small nuclear ribonucleoprotein particle (snRNP) proteins among the splicing factor genes that are most highly differentially expressed in a particular tissue. These results indicate the power of generating signature-based predictions as an initial computational approach into a global view of tissue-specific alternative splicing regulation.


Subjects
Alternative Splicing , Gene Expression Profiling , RNA-Binding Proteins/metabolism , Animals , Cell Differentiation/genetics , Cell Line , Computational Biology , Humans , Mice , Oligonucleotide Array Sequence Analysis , Pan troglodytes/genetics , RNA, Messenger/analysis , RNA, Messenger/metabolism , RNA-Binding Proteins/genetics , Tissue Distribution
5.
Nucleic Acids Res ; 34(20): e136, 2006.
Article in English | MEDLINE | ID: mdl-17041235

ABSTRACT

We describe an optimized microarray method for identifying genome-wide CpG island methylation called microarray-based methylation assessment of single samples (MMASS) which directly compares methylated to unmethylated sequences within a single sample. To improve previous methods we used bioinformatic analysis to predict an optimized combination of methylation-sensitive enzymes that had the highest utility for CpG-island probes and different methods to produce unmethylated representations of test DNA for more sensitive detection of differential methylation by hybridization. Subtraction or methylation-dependent digestion with McrBC was used with optimized (MMASS-v2) or previously described (MMASS-v1, MMASS-sub) methylation-sensitive enzyme combinations and compared with a published McrBC method. Comparison was performed using DNA from the cell line HCT116. We show that the distribution of methylation microarray data is inherently skewed and requires exogenous spiked controls for normalization and that analysis of digestion of methylated and unmethylated control sequences together with linear fit models of replicate data showed superior statistical power for the MMASS-v2 method. Comparison with previous methylation data for HCT116 and validation of CpG islands from PXMP4, SFRP2, DCC, RARB and TSEN2 confirmed the accuracy of MMASS-v2 results. The MMASS-v2 method offers improved sensitivity and statistical power for high-throughput microarray identification of differential methylation.


Subjects
CpG Islands , DNA Methylation , Oligonucleotide Array Sequence Analysis/methods , Cell Line, Tumor , Computational Biology , DNA Probes/chemistry , DNA Restriction Enzymes , Genes, Neoplasm , Genomics/methods , Humans , Sequence Analysis, DNA
6.
Cancer Res ; 66(15): 7606-14, 2006 Aug 01.
Article in English | MEDLINE | ID: mdl-16885360

ABSTRACT

Metabolite profiling using ¹H nuclear magnetic resonance (NMR) spectroscopy was used to investigate the metabolic changes associated with deletion of the gene for the transcriptional coactivator p300 in the human colon carcinoma cell line HCT116. Multivariate statistical methods were used to distinguish between metabolite patterns that were dependent on cell growth conditions and those that were specifically associated with loss of p300 function. In the absence of serum, wild-type cells showed slower growth, which was accompanied by a marked decrease in phosphocholine concentration, which was not observed in otherwise isogenic cell lines lacking p300. In the presence of serum, several metabolites were identified as being significantly different between the two cell types, including glutamate and glutamine, a nicotinamide-related compound and glycerophosphocholine (GPC). However, in the absence of serum, these metabolites, with the exception of GPC, were not significantly different, leading us to conclude that most of these changes were context dependent. Transcript profiling, using DNA microarrays, showed changes in the levels of transcripts for several enzymes involved in choline metabolism, which might explain the change in GPC concentration. Localized in vivo ¹H NMR measurements on the tumors formed following s.c. implantation of these cells into mice showed an increase in the intensity of the peak from choline-containing compounds in the p300(-) tumors. These data show that NMR-based metabolite profiling has sufficient sensitivity to identify the metabolic consequences of p300 gene deletion in tumor cells in vitro and in vivo.


Subjects
Colonic Neoplasms/genetics , Colonic Neoplasms/metabolism , p300-CBP Transcription Factors/deficiency , Animals , Culture Media, Conditioned , Gene Deletion , Gene Expression Profiling , HCT116 Cells , Humans , Mice , Mice, SCID , Nuclear Magnetic Resonance, Biomolecular , Transplantation, Heterologous , p300-CBP Transcription Factors/genetics
7.
BMC Bioinformatics ; 8: 26, 2007 Jan 25.
Article in English | MEDLINE | ID: mdl-17254358

ABSTRACT

BACKGROUND: There are mechanisms, notably ozone degradation, that can damage a single channel of two-channel microarray experiments. Resulting analyses therefore often choose between the unacceptable inclusion of poor quality data or the unpalatable exclusion of some (possibly a lot of) good quality data along with the bad. Two such approaches would be a single channel analysis using some of the data from all of the arrays, and an analysis of all of the data, but only from unaffected arrays. In this paper we examine a 'combined' approach to the analysis of such affected experiments that uses all of the unaffected data. RESULTS: A simulation experiment shows that while a single channel analysis performs relatively well when the majority of arrays are affected, and excluding affected arrays performs relatively well when few arrays are affected (as would be expected in both cases), the combined approach outperforms both. There are benefits to actively estimating the key parameter of the approach, but whether these compensate for the increased computational cost and complexity over just setting that parameter to a fixed value is not clear. Inclusion of ozone-affected data results in poor performance, with a clear spatial effect in the damage being apparent. CONCLUSION: There is no need to exclude unaffected data in order to remove those which are damaged. The combined approach discussed here is shown to outperform more usual approaches, although it seems that if the damage is limited to very few arrays, or extends to very nearly all, then the benefits will be limited. In other circumstances though, large improvements in performance can be achieved by adopting such an approach.


Subjects
Artifacts , Biomarkers, Tumor/analysis , Gene Expression Profiling/methods , Microscopy, Fluorescence, Multiphoton/methods , Neoplasm Proteins/analysis , Neoplasms/diagnosis , Oligonucleotide Array Sequence Analysis/methods , Humans , Neoplasms/metabolism , Reproducibility of Results , Sensitivity and Specificity
8.
Ann Clin Transl Neurol ; 4(5): 318-325, 2017 05.
Article in English | MEDLINE | ID: mdl-28491899

ABSTRACT

OBJECTIVE: To explore the diagnostic utility and cost effectiveness of whole exome sequencing (WES) in a cohort of individuals with peripheral neuropathy. METHODS: Singleton WES was performed in individuals recruited through one pediatric and one adult tertiary center between February 2014 and December 2015. Initial analysis was restricted to a virtual panel of 55 genes associated with peripheral neuropathies. Patients with uninformative results underwent expanded analysis of the WES data. Data on the cost of prior investigations and assessments performed for diagnostic purposes in each patient were collected. RESULTS: Fifty patients with a peripheral neuropathy were recruited (median age 18 years; range 2-68 years). The median time from initial presentation to study enrollment was 6 years 9 months (range 2 months-62 years), and the average cost of prior investigations and assessments for diagnostic purposes was AU$4013 per patient. Eleven individuals received a diagnosis from the virtual panel. Eight individuals received a diagnosis following expanded analysis of the WES data, increasing the overall diagnostic yield to 38%. Two additional individuals were diagnosed with pathogenic copy number variants through SNP microarray. CONCLUSIONS: This study provides evidence that WES has a high diagnostic utility and is cost effective in patients with a peripheral neuropathy. Expanded analysis of WES data significantly improves the diagnostic yield in patients in whom a diagnosis is not found on the initial targeted analysis. This is primarily due to diagnosis of conditions caused by newly discovered genes and the resolution of complex and atypical phenotypes.

9.
BMC Med Genomics ; 8: 29, 2015 Jun 17.
Article in English | MEDLINE | ID: mdl-26081108

ABSTRACT

BACKGROUND: High-throughput sequencing of cell-free DNA fragments found in human plasma has been used to non-invasively detect fetal aneuploidy, monitor organ transplants and investigate tumor DNA. However, many biological properties of this extracellular genetic material remain unknown. Research that further characterizes circulating DNA could substantially increase its diagnostic value by allowing the application of more sophisticated bioinformatics tools that lead to an improved signal to noise ratio in the sequencing data. METHODS: In this study, we investigate various features of cell-free DNA in plasma using deep-sequencing data from two pregnant women (>70X, >50X) and compare them with matched cellular DNA. We utilize a descriptive approach to examine how the biological cleavage of cell-free DNA affects different sequence signatures such as fragment lengths, sequence motifs at fragment ends and the distribution of cleavage sites along the genome. RESULTS: We show that the size distributions of these cell-free DNA molecules are dependent on their autosomal and mitochondrial origin as well as the genomic location within chromosomes. DNA mapping to particular microsatellites and alpha repeat elements displays unique size signatures. By correlating the mapping locations of these fragments with ENCODE annotation of chromatin organization, we show that cell-free fragments occur in clusters along the genome, localize to nucleosomal arrays and are preferentially cleaved at linker regions. Our work further demonstrates that cell-free autosomal DNA cleavage is sequence dependent. The region spanning up to 10 positions on either side of the DNA cleavage site shows a consistent pattern of preference for specific nucleotides. This sequence motif is present in cleavage sites localized to nucleosomal cores and linker regions but is absent in nucleosome-free mitochondrial DNA. CONCLUSIONS: These background signals in cell-free DNA sequencing data stem from the non-random biological cleavage of these fragments. This sequence structure can be harnessed to improve bioinformatics algorithms, in particular for CNV and structural variant detection. Descriptive measures for cell-free DNA features developed here could also be used in biomarker analysis to monitor the changes that occur during different pathological conditions.
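
The nucleotide preference around cleavage sites described in this abstract can be summarised as a table of per-position base frequencies. The sketch below is one generic way to tabulate such frequencies and is not the authors' pipeline; the function name and the toy windows are hypothetical, and it assumes the sequence windows around each fragment end have already been extracted from the reference genome.

```python
from collections import Counter

def position_base_frequencies(windows):
    """Per-position nucleotide frequencies for equal-length sequence windows.

    Each window is the reference sequence centred on one fragment end, so
    position i in every window is the same offset from the cleavage site.
    """
    if not windows:
        return []
    width = len(windows[0])
    if any(len(w) != width for w in windows):
        raise ValueError("all windows must have the same length")
    frequencies = []
    for i in range(width):
        counts = Counter(w[i] for w in windows)
        frequencies.append({base: counts[base] / len(windows) for base in "ACGT"})
    return frequencies

# Hypothetical 4-base windows (2 bases either side of the cut), for illustration only.
toy_windows = ["ACGT", "ACGA", "TCGT", "ACCT"]
for offset, freqs in zip((-2, -1, 1, 2), position_base_frequencies(toy_windows)):
    print(offset, freqs)
```

For the study's setting the windows would span 10 positions on either side of the cleavage site, and the resulting frequencies would be compared against the genome-wide base composition to reveal a preference.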


Subjects
DNA Cleavage , DNA/genetics , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Cell-Free System , Chromosomes, Human/genetics , DNA/blood , Female , Genomics , Humans , Pregnancy
10.
Genome Med ; 7(1): 68, 2015.
Article in English | MEDLINE | ID: mdl-26217397

ABSTRACT

The benefits of implementing high throughput sequencing in the clinic are quickly becoming apparent. However, few freely available bioinformatics pipelines have been built from the ground up with clinical genomics in mind. Here we present Cpipe, a pipeline designed specifically for clinical genetic disease diagnostics. Cpipe was developed by the Melbourne Genomics Health Alliance, an Australian initiative to promote common approaches to genomics across healthcare institutions. As such, Cpipe has been designed to provide fast, effective and reproducible analysis, while also being highly flexible and customisable to meet the individual needs of diverse clinical settings. Cpipe is being shared with the clinical sequencing community as an open source project and is available at http://cpipeline.org.

11.
PLoS One ; 9(7): e102079, 2014.
Article in English | MEDLINE | ID: mdl-25014031

ABSTRACT

We apply a novel gene expression network analysis to a cohort of 182 recently reported candidate Epileptic Encephalopathy genes to identify those most likely to be true Epileptic Encephalopathy genes. These candidate genes were identified as having single variants of likely pathogenic significance discovered in a large-scale massively parallel sequencing study. Candidate Epileptic Encephalopathy genes were prioritized according to their co-expression with 29 known Epileptic Encephalopathy genes. We utilized developing brain and adult brain gene expression data from the Allen Human Brain Atlas (AHBA) and compared this to data from Celsius: a large, heterogeneous gene expression data warehouse. We show replicable prioritization results using these three independent gene expression resources, two of which are brain-specific, with small sample size, and the third derived from a heterogeneous collection of tissues with large sample size. Of the nineteen genes that we predicted with the highest likelihood to be true Epileptic Encephalopathy genes, two (GNAO1 and GRIN2B) have recently been independently reported and confirmed. We compare our results to those produced by an established in silico prioritization approach called Endeavour, and finally present gene expression networks for the known and candidate Epileptic Encephalopathy genes. This highlights sub-networks of gene expression, particularly in the network derived from the adult AHBA gene expression dataset. These networks give clues to the likely biological interactions between Epileptic Encephalopathy genes, potentially highlighting underlying mechanisms and avenues for therapeutic targets.
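
The abstract does not specify how co-expression with the known genes was scored. As a rough illustration of the general idea only, the sketch below ranks candidates by their mean Pearson correlation with a set of known genes. This is a simplified stand-in rather than the network analysis used in the paper, and all gene names and expression values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)

def rank_candidates(candidates, known):
    """Rank candidate genes by mean correlation with the known disease genes."""
    scores = {
        gene: sum(pearson(profile, ref) for ref in known.values()) / len(known)
        for gene, profile in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical expression profiles across five samples (illustrative values only).
known_genes = {"KNOWN1": [1.0, 2.0, 3.0, 4.0, 5.0], "KNOWN2": [2.0, 4.1, 5.9, 8.2, 9.8]}
candidate_genes = {
    "CAND_A": [1.1, 1.9, 3.2, 3.9, 5.1],  # tracks the known genes
    "CAND_B": [5.0, 4.0, 3.0, 2.0, 1.0],  # anti-correlated with them
}
for gene, score in rank_candidates(candidate_genes, known_genes):
    print(gene, round(score, 3))
```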


Subjects
Biomarkers/metabolism , Computational Biology , Databases, Genetic , Epilepsy/genetics , Gene Expression Profiling , Gene Expression Regulation , Gene Regulatory Networks , Adult , Humans , Oligonucleotide Array Sequence Analysis
12.
PLoS One ; 9(1): e86993, 2014.
Article in English | MEDLINE | ID: mdl-24489824

ABSTRACT

Pregnant women carry a mixture of cell-free DNA fragments from self and fetus (non-self) in their circulation. In recent years multiple independent studies have demonstrated the ability to detect fetal trisomies such as trisomy 21, the cause of Down syndrome, by Next-Generation Sequencing of maternal plasma. The current clinical tests based on this approach show very high sensitivity and specificity, although as yet they have not become the standard diagnostic test. Here we describe improvements to the analysis of the sequencing data by reducing GC bias and better handling of the genomic repeats. We show substantial improvements in the sensitivity of the standard trisomy 21 statistical tests, which we measure by artificially reducing read coverage. We also explore the bias stemming from the natural cleavage of plasma DNA by examining DNA motifs and position specific base distributions. We propose a model to correct this fragmentation bias and observe that incorporating this bias does not lead to any further improvements in the detection of fetal trisomy. The improved bias corrections that we demonstrate in this work can be readily adopted into existing fetal trisomy detection protocols and should also lead to improvements in sub-chromosomal copy number variation detection.
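
The abstract refers to "standard trisomy 21 statistical tests" without spelling them out. A common form of such a test is a z-score of the sample's chromosome 21 read fraction against euploid reference samples, and the sketch below assumes that form. The function name, the read fractions and the cutoff of 3 are illustrative assumptions, not values from the study.

```python
from statistics import mean, stdev

def chr21_z_score(test_fraction, reference_fractions):
    """Z-score of a sample's chr21 read fraction against euploid reference samples."""
    mu = mean(reference_fractions)
    sigma = stdev(reference_fractions)
    if sigma == 0:
        raise ValueError("reference fractions have zero variance")
    return (test_fraction - mu) / sigma

# Hypothetical chr21 read fractions (illustrative values, not data from the study).
euploid_reference = [0.01300, 0.01302, 0.01298, 0.01301, 0.01299]
z = chr21_z_score(0.01340, euploid_reference)
print(z > 3)  # a cutoff of 3 is a common convention, assumed here
```

Bias corrections such as the GC correction mentioned in the abstract would be applied to the read counts before the fractions are computed. A smaller spread among the reference samples then raises the z-score for the same excess of chromosome 21 reads.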


Subjects
DNA/genetics , High-Throughput Nucleotide Sequencing/statistics & numerical data , Prenatal Diagnosis , Trisomy/diagnosis , Adult , Bias , DNA/blood , Female , Fetus , Genetic Testing , Gestational Age , Humans , Karyotyping , Pregnancy , Trisomy/genetics
13.
Cell Cycle ; 7(22): 3525-33, 2008 Nov 15.
Article in English | MEDLINE | ID: mdl-19001879

ABSTRACT

Invasive urothelial cell carcinoma (UCC) is characterized by increased chromosomal instability and follows an aggressive clinical course in contrast to non-invasive disease. To identify molecular processes that confer and maintain an aggressive malignant phenotype, we used a high-throughput genome-wide approach to interrogate a cohort of high and low clinical risk UCC tumors. Differential expression analyses highlighted cohesive dysregulation of critical genes involved in the G2/M checkpoint in aggressive UCC. Hierarchical clustering based on DNA Damage Response (DDR) genes separated tumors according to a pre-defined clinical risk phenotype. Using array-comparative genomic hybridization, we confirmed that the DDR was disrupted in tumors displaying high genomic instability. We identified DNA copy number gains at 20q13.2-q13.3 (AURKA locus) and determined that overexpression of AURKA accompanied dysregulation of DDR genes in high risk tumors. We postulated that DDR-deficient UCC tumors are advantaged by a selective pressure for AURKA associated override of M phase barriers and confirmed this in an independent tissue microarray series. This mechanism that enables cancer cells to maintain an aggressive phenotype forms a rationale for targeting AURKA as a therapeutic strategy in advanced stage UCC.


Subjects
DNA Repair/genetics , Gene Expression Regulation, Neoplastic , Protein Serine-Threonine Kinases/genetics , Urologic Neoplasms/genetics , Urologic Neoplasms/pathology , Aurora Kinase A , Aurora Kinases , Cell Cycle Proteins/genetics , Chromosomes, Human, Pair 20 , Cohort Studies , Gene Dosage , Gene Expression Profiling , Genomics , Humans , Neoplasm Invasiveness , Phenotype , Urothelium/pathology
14.
Nat Biotechnol ; 26(7): 779-85, 2008 Jul.
Article in English | MEDLINE | ID: mdl-18612301

ABSTRACT

DNA methylation is an indispensable epigenetic modification required for regulating the expression of mammalian genomes. Immunoprecipitation-based methods for DNA methylome analysis are rapidly shifting the bottleneck in this field from data generation to data analysis, necessitating the development of better analytical tools. In particular, an inability to estimate absolute methylation levels remains a major analytical difficulty associated with immunoprecipitation-based DNA methylation profiling. To address this issue, we developed a cross-platform algorithm, Bayesian tool for methylation analysis (Batman), for analyzing methylated DNA immunoprecipitation (MeDIP) profiles generated using oligonucleotide arrays (MeDIP-chip) or next-generation sequencing (MeDIP-seq). We developed the latter approach to provide a high-resolution whole-genome DNA methylation profile (DNA methylome) of a mammalian genome. Strong correlation of our data, obtained using mature human spermatozoa, with those obtained using bisulfite sequencing suggests that combining MeDIP-seq or MeDIP-chip with Batman provides a robust, quantitative and cost-effective functional genomic strategy for elucidating the function of DNA methylation.


Subjects
Algorithms , Chromatin Immunoprecipitation/methods , Chromosome Mapping/methods , DNA Methylation , DNA/genetics , Pattern Recognition, Automated/methods , Sequence Analysis, DNA/methods , Base Sequence , Bayes Theorem , Molecular Sequence Data
15.
Genome Res ; 18(9): 1518-29, 2008 Sep.
Article in English | MEDLINE | ID: mdl-18577705

ABSTRACT

We report a novel resource (methylation profiles of DNA, or mPod) for human genome-wide tissue-specific DNA methylation profiles. mPod consists of three fully integrated parts: genome-wide DNA methylation reference profiles of 13 normal somatic tissues, placenta, sperm, and an immortalized cell line; a visualization tool that has been integrated with the Ensembl genome browser; and a new algorithm for the analysis of immunoprecipitation-based DNA methylation profiles. We demonstrate the utility of our resource by identifying the first comprehensive genome-wide set of tissue-specific differentially methylated regions (tDMRs) that may play a role in cellular identity and the regulation of tissue-specific genome function. We also discuss the implications of our findings with respect to the regulatory potential of regions with varied CpG density, gene expression, transcription factor motifs, gene ontology, and correlation with other epigenetic marks such as histone modifications.


Subjects
DNA Methylation , Genome, Human , Software , Algorithms , CpG Islands , DNA/metabolism , Epigenesis, Genetic , Gene Expression Profiling/methods , Humans
16.
Genome Biol ; 8(10): R228, 2007.
Article in English | MEDLINE | ID: mdl-17961237

ABSTRACT

BACKGROUND: Large-scale high throughput studies using microarray technology have established that copy number variation (CNV) throughout the genome is more frequent than previously thought. Such variation is known to play an important role in the presence and development of phenotypes such as HIV-1 infection and Alzheimer's disease. However, methods for analyzing the complex data produced and identifying regions of CNV are still being refined. RESULTS: We describe the presence of a genome-wide technical artifact, spatial autocorrelation or 'wave', which occurs in a large dataset used to determine the location of CNV across the genome. By removing this artifact we are able to obtain both a more biologically meaningful clustering of the data and an increase in the number of CNVs identified by current calling methods without a major increase in the number of false positives detected. Moreover, removing this artifact is critical for the development of a novel model-based CNV calling algorithm - CNVmix - that uses cross-sample information to identify regions of the genome where CNVs occur. For regions of CNV that are identified by both CNVmix and current methods, we demonstrate that CNVmix is better able to categorize samples into groups that represent copy number gains or losses. CONCLUSION: Removing artifactual 'waves' (which appear to be a general feature of array comparative genomic hybridization (aCGH) datasets) and using cross-sample information when identifying CNVs enables more biological information to be extracted from aCGH experiments designed to investigate copy number variation in normal individuals.


Subjects
Algorithms , Gene Dosage/genetics , Genetic Variation , Microarray Analysis/methods , Nucleic Acid Hybridization/genetics , Data Interpretation, Statistical
17.
PLoS One ; 2(10): e1061, 2007 Oct 24.
Article in English | MEDLINE | ID: mdl-17957245

ABSTRACT

Maintaining quiescent cells in G0 phase is achieved in part through the multiprotein subunit complex known as DREAM, and in human cell lines the transcription factor E2F4 directs this complex to its cell cycle targets. We found that E2F4 binds a highly overlapping set of human genes among three diverse primary tissues and an asynchronous cell line, which suggests that tissue-specific binding partners and chromatin structure have minimal influence on E2F4 targeting. To investigate the conservation of these transcription factor binding events, we identified the mouse genes bound by E2f4 in seven primary mouse tissues and a cell line. E2f4 bound a set of mouse genes that was common among mouse tissues, but largely distinct from the genes bound in human. The evolutionarily conserved set of E2F4 bound genes is highly enriched for functionally relevant regulatory interactions important for maintaining cellular quiescence. In contrast, we found minimal mRNA expression perturbations in this core set of E2f4 bound genes in the liver, kidney, and testes of E2f4 null mice. Thus, the regulatory mechanisms maintaining quiescence are robust even to complete loss of conserved transcription factor binding events.


Subjects
E2F4 Transcription Factor/genetics , E2F4 Transcription Factor/metabolism , Gene Expression Regulation , Animals , Cells, Cultured , Conserved Sequence , Evolution, Molecular , Genome , Heterozygote , Humans , Mice , Mice, Inbred C57BL , Mice, Transgenic , Models, Biological , Tissue Distribution , Transcription Factors/metabolism
18.
Genome Biol ; 8(10): R214, 2007.
Article in English | MEDLINE | ID: mdl-17922911

ABSTRACT

BACKGROUND: MicroRNAs (miRNAs), a class of short non-coding RNAs found in many plants and animals, often act post-transcriptionally to inhibit gene expression. RESULTS: Here we report the analysis of miRNA expression in 93 primary human breast tumors, using a bead-based flow cytometric miRNA expression profiling method. Of 309 human miRNAs assayed, we identify 133 miRNAs expressed in human breast and breast tumors. We used mRNA expression profiling to classify the breast tumors as luminal A, luminal B, basal-like, HER2+ and normal-like. A number of miRNAs are differentially expressed between these molecular tumor subtypes and individual miRNAs are associated with clinicopathological factors. Furthermore, we find that miRNAs could classify basal versus luminal tumor subtypes in an independent data set. In some cases, changes in miRNA expression correlate with genomic loss or gain; in others, changes in miRNA expression are likely due to changes in primary transcription and/or miRNA biogenesis. Finally, the expression of DICER1 and AGO2 is correlated with tumor subtype and may explain some of the changes in miRNA expression observed. CONCLUSION: This study represents the first integrated analysis of miRNA expression, mRNA expression and genomic changes in human breast cancer and may serve as a basis for functional studies of the role of miRNAs in the etiology of breast cancer. Furthermore, we demonstrate that bead-based flow cytometric miRNA expression profiling might be a suitable platform to classify breast cancer into prognostic molecular subtypes.


Subjects
Breast Neoplasms/metabolism , Gene Expression Profiling/methods , Gene Expression Regulation , Mammary Glands, Human/metabolism , MicroRNAs/metabolism , RNA, Messenger/metabolism , Argonaute Proteins , Breast Neoplasms/classification , DEAD-box RNA Helicases/metabolism , DNA Primers , Endoribonucleases/metabolism , Eukaryotic Initiation Factor-2/metabolism , Female , Flow Cytometry , Humans , MicroRNAs/genetics , Ribonuclease III
19.
Genome Biol ; 8(10): R215, 2007.
Article in English | MEDLINE | ID: mdl-17925008

ABSTRACT

BACKGROUND: The characterization of copy number alteration patterns in breast cancer requires high-resolution genome-wide profiling of a large panel of tumor specimens. To date, most genome-wide array comparative genomic hybridization studies have used tumor panels of relatively large tumor size and high Nottingham Prognostic Index (NPI) that are not as representative of breast cancer demographics. RESULTS: We performed an oligo-array-based high-resolution analysis of copy number alterations in 171 primary breast tumors of relatively small size and low NPI, which was therefore more representative of breast cancer demographics. Hierarchical clustering over the common regions of alteration identified a novel subtype of high-grade estrogen receptor (ER)-negative breast cancer, characterized by a low genomic instability index. We were able to validate the existence of this genomic subtype in one external breast cancer cohort. Using matched array expression data we also identified the genomic regions showing the strongest coordinate expression changes ('hotspots'). We show that several of these hotspots are located in the phosphatome, kinome and chromatinome, and harbor members of the 122-breast cancer CAN-list. Furthermore, we identify frequently amplified hotspots on 8q22.3 (EDD1, WDSOF1), 8q24.11-13 (THRAP6, DCC1, SQLE, SPG8) and 11q14.1 (NDUFC2, ALG8, USP35) associated with significantly worse prognosis. Amplification of any of these regions identified 37 samples with significantly worse overall survival (hazard ratio (HR) = 2.3 (1.3-1.4) p = 0.003) and time to distant metastasis (HR = 2.6 (1.4-5.1) p = 0.004) independently of NPI. CONCLUSION: We present strong evidence for the existence of a novel subtype of high-grade ER-negative tumors that is characterized by a low genomic instability index. We also provide a genome-wide list of common copy number alteration regions in breast cancer that show strong coordinate aberrant expression, and further identify novel frequently amplified regions that correlate with poor prognosis. Many of the genes associated with these regions represent likely novel oncogenes or tumor suppressors.


Subjects
Breast Neoplasms/genetics , Chromosomes, Human, Pair 11/genetics , Chromosomes, Human, Pair 8/genetics , Gene Dosage/genetics , Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic , Nucleic Acid Hybridization/methods , Oncogenes/genetics , Breast Neoplasms/classification , Female , Genomic Instability , Genomics/methods , Humans , Receptors, Estrogen/metabolism