Results 1 - 20 of 21
1.
Blood ; 135(25): 2235-2251, 2020 06 18.
Article in English | MEDLINE | ID: mdl-32384151

ABSTRACT

Aging is associated with significant changes in the hematopoietic system, including increased inflammation, impaired hematopoietic stem cell (HSC) function, and increased incidence of myeloid malignancy. Inflammation of aging ("inflammaging") has been proposed as a driver of age-related changes in HSC function and myeloid malignancy, but mechanisms linking these phenomena remain poorly defined. We identified loss of miR-146a as driving aging-associated inflammation in AML patients. miR-146a expression declined in old wild-type mice, and loss of miR-146a promoted premature HSC aging and inflammation in young miR-146a-null mice, preceding development of aging-associated myeloid malignancy. Using single-cell assays of HSC quiescence, stemness, differentiation potential, and epigenetic state to probe HSC function and population structure, we found that loss of miR-146a depleted a subpopulation of primitive, quiescent HSCs. DNA methylation and transcriptome profiling implicated NF-κB, IL6, and TNF as potential drivers of HSC dysfunction, activating an inflammatory signaling relay promoting IL6 and TNF secretion from mature miR-146a-/- myeloid and lymphoid cells. Reducing inflammation by targeting Il6 or Tnf was sufficient to restore single-cell measures of miR-146a-/- HSC function and subpopulation structure and reduced the incidence of hematological malignancy in miR-146a-/- mice. miR-146a-/- HSCs exhibited enhanced sensitivity to IL6 stimulation, indicating that loss of miR-146a affects HSC function via both cell-extrinsic inflammatory signals and increased cell-intrinsic sensitivity to inflammation. Thus, loss of miR-146a regulates cell-extrinsic and -intrinsic mechanisms linking HSC inflammaging to the development of myeloid malignancy.


Subjects
Aging/genetics , Inflammation/genetics , Interleukin-6/physiology , Acute Myeloid Leukemia/etiology , MicroRNAs/genetics , Tumor Necrosis Factor-alpha/physiology , Adolescent , Adult , Aged , Aging/immunology , Animals , Cell Differentiation , Cell Self Renewal , Cellular Senescence , Cytokines/biosynthesis , DNA Methylation , Female , Hematopoietic Stem Cells/metabolism , Hematopoietic Stem Cells/pathology , Humans , Inflammation/physiopathology , Interleukin-6/antagonists & inhibitors , Male , Mice , Knockout Mice , MicroRNAs/biosynthesis , Middle Aged , NF-kappa B/physiology , Single-Cell Analysis , Transcriptome , Tumor Necrosis Factor-alpha/antagonists & inhibitors , Young Adult
2.
Genome Res ; 21(12): 2224-41, 2011 Dec.
Article in English | MEDLINE | ID: mdl-21926179

ABSTRACT

Low-cost short-read sequencing technology has revolutionized genomics, though it is only just becoming practical for the high-quality de novo assembly of a novel large genome. We describe the Assemblathon 1 competition, which aimed to comprehensively assess the state of the art in de novo assembly methods when applied to current sequencing technologies. In a collaborative effort, teams were asked to assemble a simulated Illumina HiSeq data set of an unknown, simulated diploid genome. A total of 41 assemblies from 17 different groups were received. Novel haplotype-aware assessments of coverage, contiguity, structure, base calling, and copy number were made. We establish that within this benchmark: (1) it is possible to assemble the genome to a high level of coverage and accuracy, and (2) large differences exist between the assemblies, suggesting room for further improvements in current methods. The simulated benchmark, including the correct answer, the assemblies, and the code that was used to evaluate the assemblies, is now public and freely available from http://www.assemblathon.org/.


Subjects
Genome/physiology , Genomics/methods , DNA Sequence Analysis/methods
3.
BMC Genomics ; 14: 550, 2013 Aug 14.
Article in English | MEDLINE | ID: mdl-23941359

ABSTRACT

BACKGROUND: Chimeric transcripts, including partial and internal tandem duplications (PTDs, ITDs) and gene fusions, are important in the detection, prognosis, and treatment of human cancers. RESULTS: We describe Barnacle, a production-grade analysis tool that detects such chimeras in de novo assemblies of RNA-seq data, and supports prioritizing them for review and validation by reporting the relative coverage of co-occurring chimeric and wild-type transcripts. We demonstrate applications in large-scale disease studies, by identifying PTDs in MLL, ITDs in FLT3, and reciprocal fusions between PML and RARA, in two deeply sequenced acute myeloid leukemia (AML) RNA-seq datasets. CONCLUSIONS: Our analyses of real and simulated data sets show that, with appropriate filter settings, Barnacle makes highly specific predictions for three types of chimeric transcripts that are important in a range of cancers: PTDs, ITDs, and fusions. High specificity makes manual review and validation efficient, which is necessary in large-scale disease studies. Characterizing an extended range of chimera types will help generate insights into progression, treatment, and outcomes for complex diseases.


Subjects
Gene Duplication/genetics , Gene Expression Profiling/methods , Gene Fusion/genetics , Genomics , Breast Neoplasms/genetics , Exons/genetics , Humans , Acute Myeloid Leukemia/genetics , Molecular Sequence Annotation , Messenger RNA/genetics , Statistics as Topic
4.
Proc Natl Acad Sci U S A ; 107(38): 16589-94, 2010 Sep 21.
Article in English | MEDLINE | ID: mdl-20807748

ABSTRACT

The Pleiades Promoter Project integrates genomewide bioinformatics with large-scale knockin mouse production and histological examination of expression patterns to develop MiniPromoters and related tools designed to study and treat the brain by directed gene expression. Genes with brain expression patterns of interest are subjected to bioinformatic analysis to delineate candidate regulatory regions, which are then incorporated into a panel of compact human MiniPromoters to drive expression to brain regions and cell types of interest. Using single-copy, homologous-recombination "knockins" in embryonic stem cells, each MiniPromoter reporter is integrated immediately 5' of the Hprt locus in the mouse genome. MiniPromoter expression profiles are characterized in differentiation assays of the transgenic cells or in mouse brains following transgenic mouse production. Histological examination of adult brains, eyes, and spinal cords for reporter gene activity is coupled to costaining with cell-type-specific markers to define expression. The publicly available Pleiades MiniPromoter Project is a key resource to facilitate research on brain development and therapies.


Subjects
Brain/metabolism , Genetic Promoter Regions , Nucleic Acid Regulatory Sequences , Animals , Cell Differentiation/genetics , Computational Biology , Genetic Databases , Embryonic Stem Cells/cytology , Embryonic Stem Cells/metabolism , Gene Expression , Gene Expression Profiling/statistics & numerical data , Gene Knock-In Techniques , Reporter Genes , Genomics , Humans , Mice , Transgenic Mice , Neurons/cytology , Neurons/metabolism
5.
Leukemia ; 37(4): 776-787, 2023 04.
Article in English | MEDLINE | ID: mdl-36788336

ABSTRACT

We recently described a 16-gene expression signature for improved risk stratification of acute myeloid leukemia (AML) patients called the AML Prognostic Score (APS). A subset of APS-high-risk AML patients showed increased levels of focal adhesion kinase (FAK), encoded by the Protein Tyrosine Kinase 2 (PTK2) gene, which was correlated with RUNX1 mutations. RUNX1 mutant cells are more sensitive to PTK2 inhibitors. As we were not able to detect RUNX1-binding sites in the PTK2 promoter, we hypothesized that RUNX1 might regulate micro(mi)RNAs that repress PTK2, such that loss-of-function RUNX1 mutations would result in reduced miRNA expression and derepression of PTK2. Examination of paired RNA-seq and miRNA-seq data from 301 AML cases revealed two miRNAs that positively correlated with RUNX1 expression, contained RUNX1-binding sites in their promoters and were predicted to target PTK2. We show that the hsa-let7a-2-3p and hsa-miR-135a-5p promoters are regulated by RUNX1, and that PTK2 is a direct target of both miRNAs. Even in the absence of RUNX1 mutations, hsa-let7a-2-3p and hsa-miR-135a-5p regulate PTK2 expression, and reduced expression of these two miRNAs sensitizes AML cells to PTK2 inhibition. These data explain how RUNX1 regulates PTK2, and identify potential miRNA biomarkers for targeting AML with PTK2 inhibitors.


Subjects
Acute Myeloid Leukemia , MicroRNAs , Humans , Core Binding Factor Alpha 2 Subunit/genetics , Focal Adhesion Kinase 1 , Focal Adhesion Protein-Tyrosine Kinases , Acute Myeloid Leukemia/genetics , MicroRNAs/genetics , MicroRNAs/metabolism
6.
BMC Genomics ; 12: 450, 2011 Sep 16.
Article in English | MEDLINE | ID: mdl-21923906

ABSTRACT

BACKGROUND: As scientists continue to pursue various 'omics-based research, there is a need for high quality data for the most fundamental 'omics of all: genomics. The bacterium Paenibacillus larvae is the causative agent of the honey bee disease American foulbrood. If untreated, it can lead to the demise of an entire hive; the highly social nature of bees also leads to easy disease spread, between both individuals and colonies. Biologists have studied this organism since the early 1900s, and a century later, the molecular mechanism of infection remains elusive. Transcriptomics and proteomics, because of their ability to analyze multiple genes and proteins in a high-throughput manner, may be very helpful to its study. However, the power of these methodologies is severely limited without a complete genome; we undertake to address that deficiency here. RESULTS: We used the Illumina GAIIx platform and conventional Sanger sequencing to generate a 182-fold sequence coverage of the P. larvae genome, and assembled the data using ABySS into a total of 388 contigs spanning 4.5 Mbp. Comparative genomics analysis against fully-sequenced soil bacteria P. JDR2 and P. vortex showed that regions of poor conservation may contain putative virulence factors. We used GLIMMER to predict 3568 gene models, and named them based on homology revealed by BLAST searches; proteases, hemolytic factors, toxins, and antibiotic resistance enzymes were identified in this way. Finally, mass spectrometry was used to provide experimental evidence that at least 35% of the genes are expressed at the protein level. CONCLUSIONS: This update on the genome of P. larvae and annotation represents an immense advancement from what we had previously known about this species. We provide here a reliable resource that can be used to elucidate the mechanism of infection, and by extension, more effective methods to control and cure this widespread honey bee disease.


Subjects
Bees/microbiology , Bacterial Genome , Paenibacillus/genetics , Animals , Comparative Genomic Hybridization , Computational Biology , Bacterial DNA/genetics , Molecular Sequence Annotation , Proteomics , DNA Sequence Analysis
7.
J Mol Diagn ; 23(4): 455-466, 2021 04.
Article in English | MEDLINE | ID: mdl-33486075

ABSTRACT

Clinical reporting of solid tumor sequencing requires reliable assessment of the accuracy and reproducibility of each assay. Somatic mutation variant allele fractions may be below 10% in many samples due to sample heterogeneity, tumor clonality, and/or sample degradation in fixatives such as formalin. The toolkits available to the clinical sequencing community for correlating assay design parameters with assay sensitivity remain limited, and large-scale empirical assessments are often relied upon due to the lack of clear theoretical grounding. To address this uncertainty, a theoretical model was developed for predicting the expected variant calling sensitivity for a given library complexity and sequencing depth. Binomial models were found to be appropriate when assay sensitivity was only limited by library complexity or sequencing depth, but functional scaling for library complexity was necessary when both library complexity and sequencing depth were co-limiting. This model was empirically validated with sequencing experiments by using a series of DNA input amounts and sequencing depths. Based on these findings, a workflow is proposed for determining the limiting factors to sensitivity in different assay designs, and the formulas for these scenarios are presented. The approach described here provides designers of clinical assays with the methods to theoretically predict assay design outcomes a priori, potentially reducing burden in clinical tumor assay design and validation efforts.


Subjects
Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Statistical Models , Neoplasms/genetics , Polymerase Chain Reaction/methods , Alleles , DNA/genetics , DNA/isolation & purification , Humans , Limit of Detection , Mutation , Single Nucleotide Polymorphism , Reproducibility of Results , Sensitivity and Specificity
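The binomial case mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that reads covering a site are independent draws, that sequencing depth alone is limiting, and that a variant is called once at least `min_alt_reads` variant-supporting reads are observed. The function name, the calling threshold, and the example values are illustrative assumptions; this is not the set of formulas published in the paper, and it does not cover the co-limited case that the abstract says needs additional scaling for library complexity.

```python
from math import comb


def detection_probability(depth: int, vaf: float, min_alt_reads: int) -> float:
    """Probability of seeing at least `min_alt_reads` variant reads,
    modelling the variant read count as Binomial(depth, vaf)."""
    if depth < 0 or min_alt_reads < 0:
        raise ValueError("depth and min_alt_reads must be non-negative")
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must be between 0 and 1")
    # Sum the probabilities of every outcome that falls below the threshold,
    # then take the complement.
    upper = min(min_alt_reads, depth + 1)
    p_below = sum(
        comb(depth, k) * vaf**k * (1.0 - vaf) ** (depth - k)
        for k in range(upper)
    )
    return 1.0 - p_below


if __name__ == "__main__":
    # Hypothetical design question: 5% variant allele fraction, 5 reads required.
    for depth in (100, 250, 500):
        print(depth, round(detection_probability(depth, 0.05, 5), 4))
```

Under these assumptions, raising the depth at a fixed allele fraction and threshold can only increase the predicted sensitivity, which is the kind of a priori comparison the abstract describes.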
8.
Nat Commun ; 12(1): 2474, 2021 04 30.
Article in English | MEDLINE | ID: mdl-33931648

ABSTRACT

As more clinically-relevant genomic features of myeloid malignancies are revealed, it has become clear that targeted clinical genetic testing is inadequate for risk stratification. Here, we develop and validate a clinical transcriptome-based assay for stratification of acute myeloid leukemia (AML). Comparison of ribonucleic acid sequencing (RNA-Seq) to whole genome and exome sequencing reveals that a standalone RNA-Seq assay offers the greatest diagnostic return, enabling identification of expressed gene fusions, single nucleotide and short insertion/deletion variants, and whole-transcriptome expression information. Expression data from 154 AML patients are used to develop a novel AML prognostic score, which is strongly associated with patient outcomes across 620 patients from three independent cohorts, and 42 patients from a prospective cohort. When combined with molecular risk guidelines, the risk score allows for the re-stratification of 22.1 to 25.3% of AML patients from three independent cohorts into correct risk groups. Within the adverse-risk subgroup, we identify a subset of patients characterized by dysregulated integrin signaling and RUNX1 or TP53 mutation. We show that these patients may benefit from therapy with inhibitors of focal adhesion kinase, encoded by PTK2, demonstrating additional utility of transcriptome-based testing for therapy selection in myeloid malignancy.


Subjects
Tumor Biomarkers/metabolism , Neoplastic Gene Expression Regulation/genetics , Acute Myeloid Leukemia/diagnosis , Acute Myeloid Leukemia/metabolism , Tumor Biomarkers/genetics , Tumor Cell Line , Cohort Studies , Core Binding Factor Alpha 2 Subunit/genetics , Core Binding Factor Alpha 2 Subunit/metabolism , Female , Gene Fusion , Humans , INDEL Mutation , Integrins/genetics , Integrins/metabolism , Acute Myeloid Leukemia/genetics , Male , Single Nucleotide Polymorphism , Prognosis , Prospective Studies , RNA-Seq , Risk Factors , Signal Transduction/genetics , Survival Analysis , Transcriptome , Tumor Suppressor Protein p53/genetics , Tumor Suppressor Protein p53/metabolism , Exome Sequencing , Whole Genome Sequencing
9.
BMC Genomics ; 11: 536, 2010 Oct 04.
Article in English | MEDLINE | ID: mdl-20920358

ABSTRACT

BACKGROUND: Grosmannia clavigera is a bark beetle-vectored fungal pathogen of pines that causes wood discoloration and may kill trees by disrupting nutrient and water transport. Trees respond to attacks from beetles and associated fungi by releasing terpenoid and phenolic defense compounds. It is unclear which genes are important for G. clavigera's ability to overcome antifungal pine terpenoids and phenolics. RESULTS: We constructed seven cDNA libraries from eight G. clavigera isolates grown under various culture conditions, and Sanger sequenced the 5' and 3' ends of 25,000 cDNA clones, resulting in 44,288 high quality ESTs. The assembled dataset of unique transcripts (unigenes) consists of 6,265 contigs and 2,459 singletons that mapped to 6,467 locations on the G. clavigera reference genome, representing ~70% of the predicted G. clavigera genes. Although only 54% of the unigenes matched characterized proteins at the NCBI database, this dataset extensively covers major metabolic pathways, cellular processes, and genes necessary for response to environmental stimuli and genetic information processing. Furthermore, we identified genes expressed in spores prior to germination, and genes involved in response to treatment with lodgepole pine phloem extract (LPPE). CONCLUSIONS: We provide a comprehensively annotated EST dataset for G. clavigera that represents a rich resource for gene characterization in this and other ophiostomatoid fungi. Genes expressed in response to LPPE treatment are indicative of fungal oxidative stress response. We identified two clusters of potentially functionally related genes responsive to LPPE treatment. Furthermore, we report a simple method for identifying contig misassemblies in de novo assembled EST collections caused by gene overlap on the genome.


Subjects
Beetles/microbiology , Fungal Genes/genetics , Insect Vectors/microbiology , Ophiostomatales/genetics , Pinus/microbiology , Plant Bark/microbiology , Trees/microbiology , Animals , Beetles/drug effects , Genetic Databases , Expressed Sequence Tags , Fungal Gene Expression Regulation/drug effects , Gene Library , Insect Vectors/drug effects , Metabolic Networks and Pathways/drug effects , Metabolic Networks and Pathways/genetics , Mycelium/drug effects , Mycelium/genetics , Ophiostomatales/drug effects , Ophiostomatales/isolation & purification , Phloem/chemistry , Phloem/drug effects , Pinus/drug effects , Plant Bark/drug effects , Plant Extracts/pharmacology , Messenger RNA/genetics , Messenger RNA/metabolism , Reverse Transcriptase Polymerase Chain Reaction , Fungal Spores/drug effects , Fungal Spores/genetics , Trees/drug effects
10.
J Mol Diagn ; 22(2): 141-146, 2020 02.
Article in English | MEDLINE | ID: mdl-31837431

ABSTRACT

Sample tracking and identity are essential when processing multiple samples in parallel. Sequencing applications often involve high sample numbers, and the data are frequently used in a clinical setting. As such, a simple and accurate intrinsic sample tracking process through a sequencing pipeline is essential. Various solutions have been implemented to verify sample identity, including variant detection at the start and end of the pipeline using arrays or genotyping, bioinformatic comparisons, and optical barcoding of samples. None of these approaches are optimal. To establish a more effective approach using genetic barcoding, we developed a panel of unique DNA sequences cloned into a common vector. A unique DNA sequence is added to the sample when it is first received and can be detected by PCR and/or sequencing at any stage of the process. The control sequences are approximately 200 bases long with low identity to any sequence in the National Center for Biotechnology Information nonredundant database (<30 bases) and contain no long homopolymer (>7) stretches. When a spiked next-generation sequencing library is sequenced, sequence reads derived from this control sequence are generated along with the standard sequencing run and are used to confirm sample identity and determine cross-contamination levels. This approach is used in our targeted clinical diagnostic whole-genome and RNA-sequencing pipelines and is an inexpensive, flexible, and platform-agnostic solution.


Subjects
High-Throughput Nucleotide Sequencing/methods , High-Throughput Nucleotide Sequencing/standards , Computational Biology , DNA Contamination , Nucleic Acid Databases , Gene Library , Humans , Reference Standards , Reproducibility of Results , DNA Sequence Analysis
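One of the design constraints stated in the abstract above, that control sequences contain no homopolymer stretch longer than 7 bases, is easy to express as a check. The sketch below is an illustration only: the function names and the example sequences are hypothetical, and it does not reproduce the authors' design pipeline or the database-identity screen they describe.

```python
from itertools import groupby


def longest_homopolymer(seq: str) -> int:
    """Length of the longest run of a single repeated base (0 for an empty string)."""
    # groupby yields one group per run of identical consecutive characters.
    return max((sum(1 for _ in run) for _, run in groupby(seq.upper())), default=0)


def passes_homopolymer_rule(seq: str, max_run: int = 7) -> bool:
    """True when no homopolymer run exceeds `max_run` bases."""
    return longest_homopolymer(seq) <= max_run


if __name__ == "__main__":
    # Hypothetical candidate sequences, not taken from the published panel.
    for candidate in ("ACGTTTTTTTACGT", "ACGTTTTTTTTACGT"):
        print(candidate, longest_homopolymer(candidate), passes_homopolymer_rule(candidate))
```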
11.
Nat Cell Biol ; 22(5): 526-533, 2020 05.
Article in English | MEDLINE | ID: mdl-32251398

ABSTRACT

Interstitial deletion of the long arm of chromosome 5 (del(5q)) is the most common structural genomic variant in myelodysplastic syndromes (MDS) [1]. Lenalidomide (LEN) is the treatment of choice for patients with del(5q) MDS, but half of the responding patients become resistant [2] within 2 years. TP53 mutations are detected in ~20% of LEN-resistant patients [3]. Here we show that patients who become resistant to LEN harbour recurrent variants of TP53 or RUNX1. LEN upregulated RUNX1 protein and function in a CRBN- and TP53-dependent manner in del(5q) cells, and mutation or downregulation of RUNX1 rendered cells resistant to LEN. LEN induced megakaryocytic differentiation of del(5q) cells followed by cell death that was dependent on calpain activation and CSNK1A1 degradation [4,5]. We also identified GATA2 as a LEN-responsive gene that is required for LEN-induced megakaryocyte differentiation. Megakaryocytic gene-promoter analyses suggested that LEN-induced degradation of IKZF1 enables a RUNX1-GATA2 complex to drive megakaryocytic differentiation. Overexpression of GATA2 restored LEN sensitivity in the context of RUNX1 or TP53 mutations by enhancing LEN-induced megakaryocytic differentiation. Screening for mutations that block LEN-induced megakaryocytic differentiation should identify patients who are resistant to LEN.


Subjects
Cell Differentiation/drug effects , Cell Differentiation/genetics , Human Chromosome Pair 5/genetics , Lenalidomide/pharmacology , Megakaryocytes/drug effects , Myelodysplastic Syndromes/genetics , Cell Line , Human Chromosome Pair 5/drug effects , Core Binding Factor Alpha 2 Subunit/genetics , Down-Regulation/drug effects , Down-Regulation/genetics , GATA2 Transcription Factor/genetics , HEK293 Cells , Humans , Mutation/drug effects , Mutation/genetics , Tumor Suppressor Protein p53/genetics
12.
Int J Lab Hematol ; 41 Suppl 1: 117-125, 2019 May.
Article in English | MEDLINE | ID: mdl-31069982

ABSTRACT

Clinical genetic testing in the myeloid malignancies is undergoing a rapid transition from the era of cytogenetics and single-gene testing to an era dominated by next-generation sequencing (NGS). This transition promises to better reveal the genetic alterations underlying disease, but there are distinct risks and benefits associated with different NGS testing platforms. NGS offers the potential benefit of being able to survey alterations across a wider set of genes, but analytic and clinical challenges associated with incidental findings, germ line variation, turnaround time, and limits of detection must be addressed. Additionally, transcriptome-based testing may offer several distinct benefits beyond traditional DNA-based methods. In addition to testing at disease diagnosis, research indicates potential benefits of genetic testing both prior to disease onset and at remission. In this review, we discuss the transition from the era of cytogenetics and single-gene tests to the era of NGS panels and genome-wide sequencing, highlighting both the potential and the drawbacks of these novel technologies.


Subjects
Tumor Biomarkers/genetics , Genetic Predisposition to Disease , Genetic Testing/methods , Genomics/methods , Hematologic Neoplasms/genetics , Myeloproliferative Disorders/genetics , DNA Sequence Analysis/methods , Humans
13.
J Mol Diagn ; 21(4): 705-717, 2019 07.
Article in English | MEDLINE | ID: mdl-31055024

ABSTRACT

Formalin fixation is the standard method for the preservation of tissue for diagnostic purposes, including pathologic review and molecular assays. However, this method is known to cause artifacts that can affect the accuracy of molecular genetic test results. We assessed the applicability of alternative fixatives to determine whether these perform significantly better on next-generation sequencing assays, and whether adequate morphology is retained for primary diagnosis, in a prospective study using a clinical-grade, laboratory-developed targeted resequencing assay. Several parameters relating to sequencing quality and variant calling were examined and quantified in tumor and normal colon epithelial tissues. We identified an alternative fixative that suppresses many formalin-related artifacts while retaining adequate morphology for pathologic review.


Subjects
High-Throughput Nucleotide Sequencing , DNA Sequence Analysis , Tissue Fixation , High-Throughput Nucleotide Sequencing/methods , High-Throughput Nucleotide Sequencing/standards , Humans , Immunohistochemistry , Paraffin Embedding , Single Nucleotide Polymorphism , DNA Sequence Analysis/methods , DNA Sequence Analysis/standards
14.
Sci Rep ; 8(1): 6951, 2018 05 03.
Article in English | MEDLINE | ID: mdl-29725024

ABSTRACT

Network analysis is the preferred approach for the detection of subtle but coordinated changes in expression of an interacting and related set of genes. We introduce a novel method based on the analyses of coexpression networks and Bayesian networks, and we use this new method to classify two types of hematological malignancies, namely acute myeloid leukemia (AML) and myelodysplastic syndrome (MDS). Our classifier has an accuracy of 93%, a precision of 98%, and a recall of 90% on the training dataset (n = 366), which outperforms the results reported by other scholars on the same dataset. Although our training dataset consists of microarray data, our model performs remarkably well on the RNA-Seq test dataset (n = 74, accuracy = 89%, precision = 88%, recall = 98%), which confirms that eigengenes are robust with respect to expression profiling technology. These signatures are useful in classification and in correctly predicting the diagnosis. They might also provide valuable information about the underlying biology of diseases. Our network analysis approach is generalizable and can be useful for classifying other diseases based on gene expression profiles. Our previously published Pigengene package is publicly available through Bioconductor, which can be used to conveniently fit a Bayesian network to gene expression data.


Subjects
Hematologic Neoplasms/genetics , Acute Myeloid Leukemia/genetics , Myelodysplastic Syndromes/genetics , Transcriptome , Bayes Theorem , Neoplastic Gene Expression Regulation , Gene Regulatory Networks , Hematologic Neoplasms/diagnosis , Humans , Acute Myeloid Leukemia/diagnosis , Myelodysplastic Syndromes/diagnosis , RNA Sequence Analysis
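The accuracy, precision, and recall figures quoted in the abstract above follow the standard definitions for a two-class classifier. As a reminder of how the three relate, here is a minimal sketch computed from confusion-matrix counts. The function name and the counts in the example are hypothetical; the abstract does not report the underlying confusion matrices, so no attempt is made to reproduce its percentages.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy, precision, and recall from true/false positive/negative counts."""
    total = tp + fp + tn + fn
    if total == 0 or (tp + fp) == 0 or (tp + fn) == 0:
        raise ValueError("metrics are undefined for these counts")
    return {
        "accuracy": (tp + tn) / total,   # fraction of all calls that are correct
        "precision": tp / (tp + fp),     # fraction of positive calls that are correct
        "recall": tp / (tp + fn),        # fraction of true positives that are found
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the function.
    print(classification_metrics(tp=9, fp=1, tn=8, fn=2))
```

Because precision and recall use different denominators, a classifier can trade one against the other between datasets, as the training and test figures in the abstract suggest.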
15.
Pac Symp Biocomput ; : 347-58, 2015.
Article in English | MEDLINE | ID: mdl-25592595

ABSTRACT

In eukaryotic cells, alternative cleavage of 3' untranslated regions (UTRs) can affect transcript stability, transport and translation. For polyadenylated (poly(A)) transcripts, cleavage sites can be characterized with short-read sequencing using specialized library construction methods. However, for large-scale cohort studies as well as for clinical sequencing applications, it is desirable to characterize such events using RNA-seq data, as the latter are already widely applied to identify other relevant information, such as mutations, alternative splicing and chimeric transcripts. Here we describe KLEAT, an analysis tool that uses de novo assembly of RNA-seq data to characterize cleavage sites on 3' UTRs. We demonstrate the performance of KLEAT on three cell line RNA-seq libraries constructed and sequenced by the ENCODE project, and assembled using Trans-ABySS. Validating the KLEAT predictions with matched ENCODE RNA-seq and RNA-PET libraries, we show that the tool has over 90% positive predictive value when there are at least three RNA-seq reads supporting a poly(A) tail and requiring at least three RNA-PET reads mapping within 100 nucleotides as validation. We also compare the performance of KLEAT with other popular RNA-seq analysis pipelines that reconstruct 3' UTR ends, and show that it performs favourably, based on an ROC-like curve.


Subjects
Transcriptome , 3' Untranslated Regions , Binding Sites , Cell Line , Computational Biology , Gene Library , Humans , ROC Curve , Sequence Alignment/statistics & numerical data , RNA Sequence Analysis/statistics & numerical data
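The validation criterion stated in the abstract above (at least three RNA-seq reads supporting a poly(A) tail, and at least three RNA-PET reads mapping within 100 nucleotides) can be sketched as a small calculation of positive predictive value. This is a simplified illustration, not KLEAT's code: it assumes all coordinates lie on one chromosome and strand, represents each RNA-PET read by a single position, and uses hypothetical function names and example data.

```python
from bisect import bisect_left, bisect_right


def count_within(sorted_positions: list, site: int, window: int = 100) -> int:
    """Number of positions within `window` nucleotides of `site` (inclusive)."""
    lo = bisect_left(sorted_positions, site - window)
    hi = bisect_right(sorted_positions, site + window)
    return hi - lo


def positive_predictive_value(predictions, pet_positions,
                              min_tail_reads: int = 3,
                              min_pet_reads: int = 3,
                              window: int = 100) -> float:
    """Fraction of sufficiently supported predicted sites that are validated.

    `predictions` is an iterable of (site, tail_read_count) pairs.
    """
    pet_sorted = sorted(pet_positions)
    kept = [site for site, tail_reads in predictions if tail_reads >= min_tail_reads]
    if not kept:
        raise ValueError("no prediction meets the read-support threshold")
    validated = sum(
        1 for site in kept
        if count_within(pet_sorted, site, window) >= min_pet_reads
    )
    return validated / len(kept)


if __name__ == "__main__":
    # Hypothetical cleavage-site predictions and RNA-PET positions.
    predictions = [(1000, 5), (5000, 3), (9000, 1)]
    pet_positions = [950, 1010, 1100, 5200, 5300]
    print(positive_predictive_value(predictions, pet_positions))
```

Sorting the validation positions once and using binary search keeps each lookup logarithmic, which matters when the number of predicted sites is large.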
16.
Genome Biol ; 14(3): R27, 2013 Mar 27.
Article in English | MEDLINE | ID: mdl-23537049

ABSTRACT

BACKGROUND: The mountain pine beetle, Dendroctonus ponderosae Hopkins, is the most serious insect pest of western North American pine forests. A recent outbreak destroyed more than 15 million hectares of pine forests, with major environmental effects on forest health, and economic effects on the forest industry. The outbreak has in part been driven by climate change, and will contribute to increased carbon emissions through decaying forests. RESULTS: We developed a genome sequence resource for the mountain pine beetle to better understand the unique aspects of this insect's biology. A draft de novo genome sequence was assembled from paired-end, short-read sequences from an individual field-collected male pupa, and scaffolded using mate-paired, short-read genomic sequences from pooled field-collected pupae, paired-end short-insert whole-transcriptome shotgun sequencing reads of mRNA from adult beetle tissues, and paired-end Sanger EST sequences from various life stages. We describe the cytochrome P450, glutathione S-transferase, and plant cell wall-degrading enzyme gene families important to the survival of the mountain pine beetle in its harsh and nutrient-poor host environment, and examine genome-wide single-nucleotide polymorphism variation. A horizontally transferred bacterial sucrose-6-phosphate hydrolase was evident in the genome, and its tissue-specific transcription suggests a functional role for this beetle. CONCLUSIONS: Despite Coleoptera being the largest insect order with over 400,000 described species, including many agricultural and forest pest species, this is only the second genome sequence reported in Coleoptera, and will provide an important resource for the Curculionoidea and other insects.


Subjects
Beetles/genetics , Ecosystem , Forests , Insect Genome/genetics , Animals , Cell Wall/metabolism , Beetles/enzymology , Female , Horizontal Gene Transfer/genetics , Genetic Linkage , Heterozygote , Male , Multigene Family , Phylogeny , Plant Cells/metabolism , Single Nucleotide Polymorphism/genetics , Nucleic Acid Repetitive Sequences/genetics , Nucleic Acid Sequence Homology , Sex Chromosomes/genetics , Synteny/genetics
17.
J Mol Diagn ; 15(6): 796-809, 2013 Nov.
Article in English | MEDLINE | ID: mdl-24094589

ABSTRACT

Individuals who inherit mutations in BRCA1 or BRCA2 are predisposed to breast and ovarian cancers. However, identifying mutations in these large genes by conventional dideoxy sequencing in a clinical testing laboratory is both time consuming and costly, and similar challenges exist for other large genes, or sets of genes, with relevance in the clinical setting. Second-generation sequencing technologies have the potential to improve the efficiency and throughput of clinical diagnostic sequencing, once clinically validated methods become available. We have developed a method for detection of variants based on automated small-amplicon PCR followed by sample pooling and sequencing with a second-generation instrument. To demonstrate the suitability of this method for clinical diagnostic sequencing, we analyzed the coding exons and the intron-exon boundaries of BRCA1 and BRCA2 in 91 hereditary breast cancer patient samples. Our method generated high-quality sequence coverage across all targeted regions, with median coverage greater than 4000-fold for each sample in pools of 24. Sensitive and specific automated variant detection, without false-positive or false-negative results, was accomplished with a standard software pipeline using bwa for sequence alignment and samtools for variant detection. We experimentally derived a minimum threshold of 100-fold sequence depth for confident variant detection. The results demonstrate that this method is suitable for sensitive, automatable, high-throughput sequence variant detection in the clinical laboratory.


Subjects
DNA Mutational Analysis/methods , Genes, BRCA1 , Genes, BRCA2 , Hereditary Breast and Ovarian Cancer Syndrome/genetics , Base Sequence , Gene Frequency , Gene Library , High-Throughput Nucleotide Sequencing , Humans , Prospective Studies , Sensitivity and Specificity
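
The 100-fold minimum depth reported in the abstract above can be illustrated with a minimal sketch. It assumes per-position depth is available as tab-separated lines of chromosome, position, and depth (the three-column layout that `samtools depth` writes); the function name `low_depth_positions` and the constant `MIN_DEPTH` are hypothetical, and this is not the authors' pipeline.

```python
from typing import Iterable, List, Tuple

# Minimum fold depth for confident variant detection, as reported in the abstract.
MIN_DEPTH = 100


def low_depth_positions(
    depth_lines: Iterable[str], min_depth: int = MIN_DEPTH
) -> List[Tuple[str, int, int]]:
    """Return (chrom, pos, depth) for every position with depth below min_depth.

    Each input line is assumed to hold three tab-separated fields:
    chromosome name, 1-based position, and read depth.
    """
    flagged = []
    for line in depth_lines:
        line = line.strip()
        # Skip blank lines and comment/header lines.
        if not line or line.startswith("#"):
            continue
        chrom, pos, depth = line.split("\t")[:3]
        if int(depth) < min_depth:
            flagged.append((chrom, int(pos), int(depth)))
    return flagged
```

Positions returned by such a check would be candidates for re-sequencing before any variant call at those sites is reported.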
18.
Gigascience ; 2(1): 10, 2013 Jul 22.
Article in English | MEDLINE | ID: mdl-23870653

ABSTRACT

BACKGROUND: The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, acceptance of newer read technologies) and in their final output (composition of assembled sequence). More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. The Assemblathon competitions are intended to assess current state-of-the-art methods in genome assembly. RESULTS: In Assemblathon 2, we provided a variety of sequence data to be assembled for three vertebrate species (a bird, a fish, and a snake). This resulted in a total of 43 submitted assemblies from 21 participating teams. We evaluated these assemblies using a combination of optical map data, Fosmid sequences, and several statistical methods. From over 100 different metrics, we chose ten key measures by which to assess the overall quality of the assemblies. CONCLUSIONS: Many current genome assemblers produced useful assemblies, containing a significant representation of their genes and overall genome structure. However, the high degree of variability between the entries suggests that there is still much room for improvement in the field of genome assembly and that approaches which work well in assembling the genome of one species may not necessarily work well for another.

19.
Insect Biochem Mol Biol ; 42(8): 525-36, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22516182

ABSTRACT

Bark beetles (Coleoptera: Curculionidae: Scolytinae) are major insect pests of many woody plants around the world. The mountain pine beetle (MPB), Dendroctonus ponderosae Hopkins, is a significant historical pest of western North American pine forests. It is currently devastating pine forests in western North America--particularly in British Columbia, Canada--and is beginning to expand its host range eastward into the Canadian boreal forest, which extends to the Atlantic coast of North America. Limited genomic resources are available for this and other bark beetle pests, restricting the use of genomics-based information to help monitor, predict, and manage the spread of these insects. To overcome these limitations, we generated comprehensive transcriptome resources from fourteen full-length enriched cDNA libraries through paired-end Sanger sequencing of 100,000 cDNA clones, and single-end Roche 454 pyrosequencing of three of these cDNA libraries. Hybrid de novo assembly of the 3.4 million sequences resulted in 20,571 isotigs in 14,410 isogroups and 246,848 singletons. In addition, over 2300 non-redundant full-length cDNA clones putatively containing complete open reading frames, including 47 cytochrome P450s, were sequenced fully to high quality. This first large-scale genomics resource for bark beetles provides the relevant sequence information for gene discovery; functional and population genomics; comparative analyses; and for future efforts to annotate the MPB genome. These resources permit the study of this beetle at the molecular level and will inform research in other Dendroctonus spp. and more generally in the Curculionidae and other Coleoptera.


Subjects
Coleoptera/genetics , Pinus/parasitology , Transcriptome , 3' Untranslated Regions , 5' Untranslated Regions , Animals , Arthropod Antennae/metabolism , Coleoptera/metabolism , Cytochrome P-450 Enzyme System/metabolism , Fat Body/metabolism , Female , Male , Multigene Family , Open Reading Frames , Sequence Analysis, DNA
20.
Genome Biol ; 10(9): R94, 2009.
Article in English | MEDLINE | ID: mdl-19747388

ABSTRACT

Sequencing-by-synthesis technologies can reduce the cost of generating de novo genome assemblies. We report a method for assembling draft genome sequences of eukaryotic organisms that integrates sequence information from different sources, and demonstrate its effectiveness by assembling an approximately 32.5 Mb draft genome sequence for the forest pathogen Grosmannia clavigera, an ascomycete fungus. We also developed a method for assessing draft assemblies using Illumina paired end read data and demonstrate how we are using it to guide future sequence finishing. Our results demonstrate that eukaryotic genome sequences can be accurately assembled by combining Illumina, 454 and Sanger sequence data.


Subjects
Ascomycota/genetics , Genome, Fungal/genetics , Sequence Analysis, DNA/methods , Algorithms , Fungal Proteins/genetics , Genomics/methods , Open Reading Frames/genetics , Reproducibility of Results