Results 1 - 20 of 21
1.
Blood ; 135(25): 2235-2251, 2020 06 18.
Article in English | MEDLINE | ID: mdl-32384151

ABSTRACT

Aging is associated with significant changes in the hematopoietic system, including increased inflammation, impaired hematopoietic stem cell (HSC) function, and increased incidence of myeloid malignancy. Inflammation of aging ("inflammaging") has been proposed as a driver of age-related changes in HSC function and myeloid malignancy, but mechanisms linking these phenomena remain poorly defined. We identified loss of miR-146a as driving aging-associated inflammation in AML patients. miR-146a expression declined in old wild-type mice, and loss of miR-146a promoted premature HSC aging and inflammation in young miR-146a-null mice, preceding development of aging-associated myeloid malignancy. Using single-cell assays of HSC quiescence, stemness, differentiation potential, and epigenetic state to probe HSC function and population structure, we found that loss of miR-146a depleted a subpopulation of primitive, quiescent HSCs. DNA methylation and transcriptome profiling implicated NF-κB, IL6, and TNF as potential drivers of HSC dysfunction, activating an inflammatory signaling relay promoting IL6 and TNF secretion from mature miR-146a-/- myeloid and lymphoid cells. Reducing inflammation by targeting Il6 or Tnf was sufficient to restore single-cell measures of miR-146a-/- HSC function and subpopulation structure and reduced the incidence of hematological malignancy in miR-146a-/- mice. miR-146a-/- HSCs exhibited enhanced sensitivity to IL6 stimulation, indicating that loss of miR-146a affects HSC function via both cell-extrinsic inflammatory signals and increased cell-intrinsic sensitivity to inflammation. Thus, loss of miR-146a regulates cell-extrinsic and -intrinsic mechanisms linking HSC inflammaging to the development of myeloid malignancy.


Subject(s)
Aging/genetics , Inflammation/genetics , Interleukin-6/physiology , Leukemia, Myeloid, Acute/etiology , MicroRNAs/genetics , Tumor Necrosis Factor-alpha/physiology , Adolescent , Adult , Aged , Aging/immunology , Animals , Cell Differentiation , Cell Self Renewal , Cellular Senescence , Cytokines/biosynthesis , DNA Methylation , Female , Hematopoietic Stem Cells/metabolism , Hematopoietic Stem Cells/pathology , Humans , Inflammation/physiopathology , Interleukin-6/antagonists & inhibitors , Male , Mice , Mice, Knockout , MicroRNAs/biosynthesis , Middle Aged , NF-kappa B/physiology , Single-Cell Analysis , Transcriptome , Tumor Necrosis Factor-alpha/antagonists & inhibitors , Young Adult
2.
Genome Res ; 21(12): 2224-41, 2011 Dec.
Article in English | MEDLINE | ID: mdl-21926179

ABSTRACT

Low-cost short read sequencing technology has revolutionized genomics, though it is only just becoming practical for the high-quality de novo assembly of a novel large genome. We describe the Assemblathon 1 competition, which aimed to comprehensively assess the state of the art in de novo assembly methods when applied to current sequencing technologies. In a collaborative effort, teams were asked to assemble a simulated Illumina HiSeq data set of an unknown, simulated diploid genome. A total of 41 assemblies from 17 different groups were received. Novel haplotype aware assessments of coverage, contiguity, structure, base calling, and copy number were made. We establish that within this benchmark: (1) It is possible to assemble the genome to a high level of coverage and accuracy, and that (2) large differences exist between the assemblies, suggesting room for further improvements in current methods. The simulated benchmark, including the correct answer, the assemblies, and the code that was used to evaluate the assemblies is now public and freely available from http://www.assemblathon.org/.


Subject(s)
Genome/physiology , Genomics/methods , Sequence Analysis, DNA/methods
3.
BMC Genomics ; 14: 550, 2013 Aug 14.
Article in English | MEDLINE | ID: mdl-23941359

ABSTRACT

BACKGROUND: Chimeric transcripts, including partial and internal tandem duplications (PTDs, ITDs) and gene fusions, are important in the detection, prognosis, and treatment of human cancers. RESULTS: We describe Barnacle, a production-grade analysis tool that detects such chimeras in de novo assemblies of RNA-seq data, and supports prioritizing them for review and validation by reporting the relative coverage of co-occurring chimeric and wild-type transcripts. We demonstrate applications in large-scale disease studies, by identifying PTDs in MLL, ITDs in FLT3, and reciprocal fusions between PML and RARA, in two deeply sequenced acute myeloid leukemia (AML) RNA-seq datasets. CONCLUSIONS: Our analyses of real and simulated data sets show that, with appropriate filter settings, Barnacle makes highly specific predictions for three types of chimeric transcripts that are important in a range of cancers: PTDs, ITDs, and fusions. High specificity makes manual review and validation efficient, which is necessary in large-scale disease studies. Characterizing an extended range of chimera types will help generate insights into progression, treatment, and outcomes for complex diseases.


Subject(s)
Gene Duplication/genetics , Gene Expression Profiling/methods , Gene Fusion/genetics , Genomics , Breast Neoplasms/genetics , Exons/genetics , Humans , Leukemia, Myeloid, Acute/genetics , Molecular Sequence Annotation , RNA, Messenger/genetics , Statistics as Topic
4.
Proc Natl Acad Sci U S A ; 107(38): 16589-94, 2010 Sep 21.
Article in English | MEDLINE | ID: mdl-20807748

ABSTRACT

The Pleiades Promoter Project integrates genomewide bioinformatics with large-scale knockin mouse production and histological examination of expression patterns to develop MiniPromoters and related tools designed to study and treat the brain by directed gene expression. Genes with brain expression patterns of interest are subjected to bioinformatic analysis to delineate candidate regulatory regions, which are then incorporated into a panel of compact human MiniPromoters to drive expression to brain regions and cell types of interest. Using single-copy, homologous-recombination "knockins" in embryonic stem cells, each MiniPromoter reporter is integrated immediately 5' of the Hprt locus in the mouse genome. MiniPromoter expression profiles are characterized in differentiation assays of the transgenic cells or in mouse brains following transgenic mouse production. Histological examination of adult brains, eyes, and spinal cords for reporter gene activity is coupled to costaining with cell-type-specific markers to define expression. The publicly available Pleiades MiniPromoter Project is a key resource to facilitate research on brain development and therapies.


Subject(s)
Brain/metabolism , Promoter Regions, Genetic , Regulatory Sequences, Nucleic Acid , Animals , Cell Differentiation/genetics , Computational Biology , Databases, Genetic , Embryonic Stem Cells/cytology , Embryonic Stem Cells/metabolism , Gene Expression , Gene Expression Profiling/statistics & numerical data , Gene Knock-In Techniques , Genes, Reporter , Genomics , Humans , Mice , Mice, Transgenic , Neurons/cytology , Neurons/metabolism
5.
Leukemia ; 37(4): 776-787, 2023 04.
Article in English | MEDLINE | ID: mdl-36788336

ABSTRACT

We recently described a 16-gene expression signature for improved risk stratification of acute myeloid leukemia (AML) patients called the AML Prognostic Score (APS). A subset of APS-high-risk AML patients showed increased levels of focal adhesion kinase (FAK), encoded by the Protein Tyrosine Kinase 2 (PTK2) gene, which was correlated with RUNX1 mutations. RUNX1 mutant cells are more sensitive to PTK2 inhibitors. As we were not able to detect RUNX1-binding sites in the PTK2 promoter, we hypothesized that RUNX1 might regulate micro(mi)RNAs that repress PTK2, such that loss-of-function RUNX1 mutations would result in reduced miRNA expression and derepression of PTK2. Examination of paired RNA-seq and miRNA-seq data from 301 AML cases revealed two miRNAs that positively correlated with RUNX1 expression, contained RUNX1-binding sites in their promoters and were predicted to target PTK2. We show that the hsa-let7a-2-3p and hsa-miR-135a-5p promoters are regulated by RUNX1, and that PTK2 is a direct target of both miRNAs. Even in the absence of RUNX1 mutations, hsa-let7a-2-3p and hsa-miR-135a-5p regulate PTK2 expression, and reduced expression of these two miRNAs sensitizes AML cells to PTK2 inhibition. These data explain how RUNX1 regulates PTK2, and identify potential miRNA biomarkers for targeting AML with PTK2 inhibitors.


Subject(s)
Leukemia, Myeloid, Acute , MicroRNAs , Humans , Core Binding Factor Alpha 2 Subunit/genetics , Focal Adhesion Kinase 1 , Focal Adhesion Protein-Tyrosine Kinases , Leukemia, Myeloid, Acute/genetics , MicroRNAs/genetics , MicroRNAs/metabolism
6.
BMC Genomics ; 12: 450, 2011 Sep 16.
Article in English | MEDLINE | ID: mdl-21923906

ABSTRACT

BACKGROUND: As scientists continue to pursue various 'omics-based research, there is a need for high quality data for the most fundamental 'omics of all: genomics. The bacterium Paenibacillus larvae is the causative agent of the honey bee disease American foulbrood. If untreated, it can lead to the demise of an entire hive; the highly social nature of bees also leads to easy disease spread, between both individuals and colonies. Biologists have studied this organism since the early 1900s, and a century later, the molecular mechanism of infection remains elusive. Transcriptomics and proteomics, because of their ability to analyze multiple genes and proteins in a high-throughput manner, may be very helpful to its study. However, the power of these methodologies is severely limited without a complete genome; we undertake to address that deficiency here. RESULTS: We used the Illumina GAIIx platform and conventional Sanger sequencing to generate a 182-fold sequence coverage of the P. larvae genome, and assembled the data using ABySS into a total of 388 contigs spanning 4.5 Mbp. Comparative genomics analysis against fully-sequenced soil bacteria P. JDR2 and P. vortex showed that regions of poor conservation may contain putative virulence factors. We used GLIMMER to predict 3568 gene models, and named them based on homology revealed by BLAST searches; proteases, hemolytic factors, toxins, and antibiotic resistance enzymes were identified in this way. Finally, mass spectrometry was used to provide experimental evidence that at least 35% of the genes are expressed at the protein level. CONCLUSIONS: This update on the genome of P. larvae and annotation represents an immense advancement from what we had previously known about this species. We provide here a reliable resource that can be used to elucidate the mechanism of infection, and by extension, more effective methods to control and cure this widespread honey bee disease.


Subject(s)
Bees/microbiology , Genome, Bacterial , Paenibacillus/genetics , Animals , Comparative Genomic Hybridization , Computational Biology , DNA, Bacterial/genetics , Molecular Sequence Annotation , Proteomics , Sequence Analysis, DNA
7.
J Mol Diagn ; 23(4): 455-466, 2021 04.
Article in English | MEDLINE | ID: mdl-33486075

ABSTRACT

Clinical reporting of solid tumor sequencing requires reliable assessment of the accuracy and reproducibility of each assay. Somatic mutation variant allele fractions may be below 10% in many samples due to sample heterogeneity, tumor clonality, and/or sample degradation in fixatives such as formalin. The toolkits available to the clinical sequencing community for correlating assay design parameters with assay sensitivity remain limited, and large-scale empirical assessments are often relied upon due to the lack of clear theoretical grounding. To address this uncertainty, a theoretical model was developed for predicting the expected variant calling sensitivity for a given library complexity and sequencing depth. Binomial models were found to be appropriate when assay sensitivity was only limited by library complexity or sequencing depth, but functional scaling for library complexity was necessary when both library complexity and sequencing depth were co-limiting. This model was empirically validated with sequencing experiments by using a series of DNA input amounts and sequencing depths. Based on these findings, a workflow is proposed for determining the limiting factors to sensitivity in different assay designs, and the formulas for these scenarios are presented. The approach described here provides designers of clinical assays with the methods to theoretically predict assay design outcomes a priori, potentially reducing burden in clinical tumor assay design and validation efforts.


Subject(s)
Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Models, Statistical , Neoplasms/genetics , Polymerase Chain Reaction/methods , Alleles , DNA/genetics , DNA/isolation & purification , Humans , Limit of Detection , Mutation , Polymorphism, Single Nucleotide , Reproducibility of Results , Sensitivity and Specificity
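
The abstract of entry 7 states that binomial models were appropriate when sensitivity was limited by only one factor (library complexity or sequencing depth). As a minimal sketch of that single-limit case, and not the paper's own formulas, the following Python function treats variant-supporting reads as Binomial(depth, VAF) and returns the probability of seeing at least a chosen number of them. The function name, the `min_alt_reads` calling threshold, and the example numbers are assumptions introduced for illustration; the co-limited scaling described in the abstract is not reproduced here.

```python
from math import comb


def detection_sensitivity(depth: int, vaf: float, min_alt_reads: int) -> float:
    """Probability of observing at least `min_alt_reads` variant reads.

    Assumes (hypothetically) that the number of variant-supporting reads
    follows Binomial(depth, vaf) and that a variant is called once
    `min_alt_reads` such reads are seen.
    """
    if depth < 0:
        raise ValueError("depth must be non-negative")
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must lie in [0, 1]")
    if min_alt_reads <= 0:
        return 1.0
    # Sum the probability of seeing fewer than min_alt_reads variant reads,
    # then take the complement.
    p_below = sum(
        comb(depth, k) * vaf**k * (1.0 - vaf) ** (depth - k)
        for k in range(min(min_alt_reads, depth + 1))
    )
    return 1.0 - p_below


if __name__ == "__main__":
    # Illustrative values only: 5% variant allele fraction, 5 supporting reads.
    for depth in (100, 500, 1000):
        print(depth, round(detection_sensitivity(depth, 0.05, 5), 4))
```

Under these assumptions the same function can be applied with unique library molecules in place of reads when library complexity, rather than depth, is the limiting factor.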
8.
Nat Commun ; 12(1): 2474, 2021 04 30.
Article in English | MEDLINE | ID: mdl-33931648

ABSTRACT

As more clinically-relevant genomic features of myeloid malignancies are revealed, it has become clear that targeted clinical genetic testing is inadequate for risk stratification. Here, we develop and validate a clinical transcriptome-based assay for stratification of acute myeloid leukemia (AML). Comparison of ribonucleic acid sequencing (RNA-Seq) to whole genome and exome sequencing reveals that a standalone RNA-Seq assay offers the greatest diagnostic return, enabling identification of expressed gene fusions, single nucleotide and short insertion/deletion variants, and whole-transcriptome expression information. Expression data from 154 AML patients are used to develop a novel AML prognostic score, which is strongly associated with patient outcomes across 620 patients from three independent cohorts, and 42 patients from a prospective cohort. When combined with molecular risk guidelines, the risk score allows for the re-stratification of 22.1 to 25.3% of AML patients from three independent cohorts into correct risk groups. Within the adverse-risk subgroup, we identify a subset of patients characterized by dysregulated integrin signaling and RUNX1 or TP53 mutation. We show that these patients may benefit from therapy with inhibitors of focal adhesion kinase, encoded by PTK2, demonstrating additional utility of transcriptome-based testing for therapy selection in myeloid malignancy.


Subject(s)
Biomarkers, Tumor/metabolism , Gene Expression Regulation, Neoplastic/genetics , Leukemia, Myeloid, Acute/diagnosis , Leukemia, Myeloid, Acute/metabolism , Biomarkers, Tumor/genetics , Cell Line, Tumor , Cohort Studies , Core Binding Factor Alpha 2 Subunit/genetics , Core Binding Factor Alpha 2 Subunit/metabolism , Female , Gene Fusion , Humans , INDEL Mutation , Integrins/genetics , Integrins/metabolism , Leukemia, Myeloid, Acute/genetics , Male , Polymorphism, Single Nucleotide , Prognosis , Prospective Studies , RNA-Seq , Risk Factors , Signal Transduction/genetics , Survival Analysis , Transcriptome , Tumor Suppressor Protein p53/genetics , Tumor Suppressor Protein p53/metabolism , Exome Sequencing , Whole Genome Sequencing
9.
BMC Genomics ; 11: 536, 2010 Oct 04.
Article in English | MEDLINE | ID: mdl-20920358

ABSTRACT

BACKGROUND: Grosmannia clavigera is a bark beetle-vectored fungal pathogen of pines that causes wood discoloration and may kill trees by disrupting nutrient and water transport. Trees respond to attacks from beetles and associated fungi by releasing terpenoid and phenolic defense compounds. It is unclear which genes are important for G. clavigera's ability to overcome antifungal pine terpenoids and phenolics. RESULTS: We constructed seven cDNA libraries from eight G. clavigera isolates grown under various culture conditions, and Sanger sequenced the 5' and 3' ends of 25,000 cDNA clones, resulting in 44,288 high quality ESTs. The assembled dataset of unique transcripts (unigenes) consists of 6,265 contigs and 2,459 singletons that mapped to 6,467 locations on the G. clavigera reference genome, representing ~70% of the predicted G. clavigera genes. Although only 54% of the unigenes matched characterized proteins at the NCBI database, this dataset extensively covers major metabolic pathways, cellular processes, and genes necessary for response to environmental stimuli and genetic information processing. Furthermore, we identified genes expressed in spores prior to germination, and genes involved in response to treatment with lodgepole pine phloem extract (LPPE). CONCLUSIONS: We provide a comprehensively annotated EST dataset for G. clavigera that represents a rich resource for gene characterization in this and other ophiostomatoid fungi. Genes expressed in response to LPPE treatment are indicative of fungal oxidative stress response. We identified two clusters of potentially functionally related genes responsive to LPPE treatment. Furthermore, we report a simple method for identifying contig misassemblies in de novo assembled EST collections caused by gene overlap on the genome.


Subject(s)
Coleoptera/microbiology , Genes, Fungal/genetics , Insect Vectors/microbiology , Ophiostomatales/genetics , Pinus/microbiology , Plant Bark/microbiology , Trees/microbiology , Animals , Coleoptera/drug effects , Databases, Genetic , Expressed Sequence Tags , Gene Expression Regulation, Fungal/drug effects , Gene Library , Insect Vectors/drug effects , Metabolic Networks and Pathways/drug effects , Metabolic Networks and Pathways/genetics , Mycelium/drug effects , Mycelium/genetics , Ophiostomatales/drug effects , Ophiostomatales/isolation & purification , Phloem/chemistry , Phloem/drug effects , Pinus/drug effects , Plant Bark/drug effects , Plant Extracts/pharmacology , RNA, Messenger/genetics , RNA, Messenger/metabolism , Reverse Transcriptase Polymerase Chain Reaction , Spores, Fungal/drug effects , Spores, Fungal/genetics , Trees/drug effects
10.
J Mol Diagn ; 22(2): 141-146, 2020 02.
Article in English | MEDLINE | ID: mdl-31837431

ABSTRACT

Sample tracking and identity are essential when processing multiple samples in parallel. Sequencing applications often involve high sample numbers, and the data are frequently used in a clinical setting. As such, a simple and accurate intrinsic sample tracking process through a sequencing pipeline is essential. Various solutions have been implemented to verify sample identity, including variant detection at the start and end of the pipeline using arrays or genotyping, bioinformatic comparisons, and optical barcoding of samples. None of these approaches are optimal. To establish a more effective approach using genetic barcoding, we developed a panel of unique DNA sequences cloned into a common vector. A unique DNA sequence is added to the sample when it is first received and can be detected by PCR and/or sequencing at any stage of the process. The control sequences are approximately 200 bases long with low identity to any sequence in the National Center for Biotechnology Information nonredundant database (<30 bases) and contain no long homopolymer (>7) stretches. When a spiked next-generation sequencing library is sequenced, sequence reads derived from this control sequence are generated along with the standard sequencing run and are used to confirm sample identity and determine cross-contamination levels. This approach is used in our targeted clinical diagnostic whole-genome and RNA-sequencing pipelines and is an inexpensive, flexible, and platform-agnostic solution.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , High-Throughput Nucleotide Sequencing/standards , Computational Biology , DNA Contamination , Databases, Nucleic Acid , Gene Library , Humans , Reference Standards , Reproducibility of Results , Sequence Analysis, DNA
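
The abstract of entry 10 lists design constraints for the spike-in control sequences, including the absence of long homopolymer stretches (>7). One of those constraints can be sketched in Python as follows. This is only an illustration of the homopolymer rule as stated in the abstract; the function names and the example sequences are hypothetical, and the database-identity check against the NCBI nonredundant database is not attempted.

```python
from itertools import groupby


def longest_homopolymer(seq: str) -> int:
    """Length of the longest run of a single repeated base in `seq`."""
    return max((sum(1 for _ in run) for _, run in groupby(seq.upper())), default=0)


def passes_homopolymer_rule(seq: str, max_run: int = 7) -> bool:
    """True when no homopolymer run exceeds `max_run` bases.

    The default of 7 mirrors the ">7" limit quoted in the abstract.
    """
    return longest_homopolymer(seq) <= max_run


if __name__ == "__main__":
    # Made-up candidate sequences, not real control sequences.
    for candidate in ("ACGTACGTTTTTTTAC", "ACGAAAAAAAAGT"):
        print(candidate, longest_homopolymer(candidate), passes_homopolymer_rule(candidate))
```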
11.
Nat Cell Biol ; 22(5): 526-533, 2020 05.
Article in English | MEDLINE | ID: mdl-32251398

ABSTRACT

Interstitial deletion of the long arm of chromosome 5 (del(5q)) is the most common structural genomic variant in myelodysplastic syndromes (MDS) [1]. Lenalidomide (LEN) is the treatment of choice for patients with del(5q) MDS, but half of the responding patients become resistant [2] within 2 years. TP53 mutations are detected in ~20% of LEN-resistant patients [3]. Here we show that patients who become resistant to LEN harbour recurrent variants of TP53 or RUNX1. LEN upregulated RUNX1 protein and function in a CRBN- and TP53-dependent manner in del(5q) cells, and mutation or downregulation of RUNX1 rendered cells resistant to LEN. LEN induced megakaryocytic differentiation of del(5q) cells followed by cell death that was dependent on calpain activation and CSNK1A1 degradation [4,5]. We also identified GATA2 as a LEN-responsive gene that is required for LEN-induced megakaryocyte differentiation. Megakaryocytic gene-promoter analyses suggested that LEN-induced degradation of IKZF1 enables a RUNX1-GATA2 complex to drive megakaryocytic differentiation. Overexpression of GATA2 restored LEN sensitivity in the context of RUNX1 or TP53 mutations by enhancing LEN-induced megakaryocytic differentiation. Screening for mutations that block LEN-induced megakaryocytic differentiation should identify patients who are resistant to LEN.


Subject(s)
Cell Differentiation/drug effects , Cell Differentiation/genetics , Chromosomes, Human, Pair 5/genetics , Lenalidomide/pharmacology , Megakaryocytes/drug effects , Myelodysplastic Syndromes/genetics , Cell Line , Chromosomes, Human, Pair 5/drug effects , Core Binding Factor Alpha 2 Subunit/genetics , Down-Regulation/drug effects , Down-Regulation/genetics , GATA2 Transcription Factor/genetics , HEK293 Cells , Humans , Mutation/drug effects , Mutation/genetics , Tumor Suppressor Protein p53/genetics
12.
Int J Lab Hematol ; 41 Suppl 1: 117-125, 2019 May.
Article in English | MEDLINE | ID: mdl-31069982

ABSTRACT

Clinical genetic testing in the myeloid malignancies is undergoing a rapid transition from the era of cytogenetics and single-gene testing to an era dominated by next-generation sequencing (NGS). This transition promises to better reveal the genetic alterations underlying disease, but there are distinct risks and benefits associated with different NGS testing platforms. NGS offers the potential benefit of being able to survey alterations across a wider set of genes, but analytic and clinical challenges associated with incidental findings, germ line variation, turnaround time, and limits of detection must be addressed. Additionally, transcriptome-based testing may offer several distinct benefits beyond traditional DNA-based methods. In addition to testing at disease diagnosis, research indicates potential benefits of genetic testing both prior to disease onset and at remission. In this review, we discuss the transition from the era of cytogenetics and single-gene tests to the era of NGS panels and genome-wide sequencing, highlighting both the potential and the drawbacks of these novel technologies.


Subject(s)
Biomarkers, Tumor/genetics , Genetic Predisposition to Disease , Genetic Testing/methods , Genomics/methods , Hematologic Neoplasms/genetics , Myeloproliferative Disorders/genetics , Sequence Analysis, DNA/methods , Humans
13.
J Mol Diagn ; 21(4): 705-717, 2019 07.
Article in English | MEDLINE | ID: mdl-31055024

ABSTRACT

Formalin fixation is the standard method for the preservation of tissue for diagnostic purposes, including pathologic review and molecular assays. However, this method is known to cause artifacts that can affect the accuracy of molecular genetic test results. We assessed the applicability of alternative fixatives to determine whether these perform significantly better on next-generation sequencing assays, and whether adequate morphology is retained for primary diagnosis, in a prospective study using a clinical-grade, laboratory-developed targeted resequencing assay. Several parameters relating to sequencing quality and variant calling were examined and quantified in tumor and normal colon epithelial tissues. We identified an alternative fixative that suppresses many formalin-related artifacts while retaining adequate morphology for pathologic review.


Subject(s)
High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA , Tissue Fixation , High-Throughput Nucleotide Sequencing/methods , High-Throughput Nucleotide Sequencing/standards , Humans , Immunohistochemistry , Paraffin Embedding , Polymorphism, Single Nucleotide , Sequence Analysis, DNA/methods , Sequence Analysis, DNA/standards
14.
Sci Rep ; 8(1): 6951, 2018 05 03.
Article in English | MEDLINE | ID: mdl-29725024

ABSTRACT

Network analysis is the preferred approach for the detection of subtle but coordinated changes in expression of an interacting and related set of genes. We introduce a novel method based on the analyses of coexpression networks and Bayesian networks, and we use this new method to classify two types of hematological malignancies, namely acute myeloid leukemia (AML) and myelodysplastic syndrome (MDS). Our classifier has an accuracy of 93%, a precision of 98%, and a recall of 90% on the training dataset (n = 366), which outperforms the results reported by other scholars on the same dataset. Although our training dataset consists of microarray data, our model has a remarkable performance on the RNA-Seq test dataset (n = 74, accuracy = 89%, precision = 88%, recall = 98%), which confirms that eigengenes are robust with respect to expression profiling technology. These signatures are useful in classification and in correctly predicting the diagnosis. They might also provide valuable information about the underlying biology of diseases. Our network analysis approach is generalizable and can be useful for classifying other diseases based on gene expression profiles. Our previously published Pigengene package is publicly available through Bioconductor, which can be used to conveniently fit a Bayesian network to gene expression data.


Subject(s)
Hematologic Neoplasms/genetics , Leukemia, Myeloid, Acute/genetics , Myelodysplastic Syndromes/genetics , Transcriptome , Bayes Theorem , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Hematologic Neoplasms/diagnosis , Humans , Leukemia, Myeloid, Acute/diagnosis , Myelodysplastic Syndromes/diagnosis , Sequence Analysis, RNA
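
The abstract of entry 14 reports accuracy, precision, and recall for a two-class (AML versus MDS) classifier. For readers less familiar with how those three figures relate, the sketch below computes them from the four cells of a binary confusion matrix using the standard definitions. The function name and the counts in the example are hypothetical; the abstract does not give the underlying confusion matrix, so none of the paper's numbers are reconstructed here.

```python
from typing import NamedTuple


class BinaryMetrics(NamedTuple):
    accuracy: float
    precision: float
    recall: float


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> BinaryMetrics:
    """Standard accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("at least one sample is required")
    accuracy = (tp + tn) / total
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return BinaryMetrics(accuracy, precision, recall)


if __name__ == "__main__":
    # Made-up counts for illustration only.
    print(binary_metrics(tp=40, fp=5, fn=10, tn=45))
```

Because precision and recall depend on which class is treated as positive, the two can differ widely on the same data even when accuracy is similar.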
15.
Pac Symp Biocomput ; : 347-58, 2015.
Article in English | MEDLINE | ID: mdl-25592595

ABSTRACT

In eukaryotic cells, alternative cleavage of 3' untranslated regions (UTRs) can affect transcript stability, transport and translation. For polyadenylated (poly(A)) transcripts, cleavage sites can be characterized with short-read sequencing using specialized library construction methods. However, for large-scale cohort studies as well as for clinical sequencing applications, it is desirable to characterize such events using RNA-seq data, as the latter are already widely applied to identify other relevant information, such as mutations, alternative splicing and chimeric transcripts. Here we describe KLEAT, an analysis tool that uses de novo assembly of RNA-seq data to characterize cleavage sites on 3' UTRs. We demonstrate the performance of KLEAT on three cell line RNA-seq libraries constructed and sequenced by the ENCODE project, and assembled using Trans-ABySS. Validating the KLEAT predictions with matched ENCODE RNA-seq and RNA-PET libraries, we show that the tool has over 90% positive predictive value when there are at least three RNA-seq reads supporting a poly(A) tail and requiring at least three RNA-PET reads mapping within 100 nucleotides as validation. We also compare the performance of KLEAT with other popular RNA-seq analysis pipelines that reconstruct 3' UTR ends, and show that it performs favourably, based on an ROC-like curve.


Subject(s)
Transcriptome , 3' Untranslated Regions , Binding Sites , Cell Line , Computational Biology , Gene Library , Humans , ROC Curve , Sequence Alignment/statistics & numerical data , Sequence Analysis, RNA/statistics & numerical data
16.
Genome Biol ; 14(3): R27, 2013 Mar 27.
Article in English | MEDLINE | ID: mdl-23537049

ABSTRACT

BACKGROUND: The mountain pine beetle, Dendroctonus ponderosae Hopkins, is the most serious insect pest of western North American pine forests. A recent outbreak destroyed more than 15 million hectares of pine forests, with major environmental effects on forest health, and economic effects on the forest industry. The outbreak has in part been driven by climate change, and will contribute to increased carbon emissions through decaying forests. RESULTS: We developed a genome sequence resource for the mountain pine beetle to better understand the unique aspects of this insect's biology. A draft de novo genome sequence was assembled from paired-end, short-read sequences from an individual field-collected male pupa, and scaffolded using mate-paired, short-read genomic sequences from pooled field-collected pupae, paired-end short-insert whole-transcriptome shotgun sequencing reads of mRNA from adult beetle tissues, and paired-end Sanger EST sequences from various life stages. We describe the cytochrome P450, glutathione S-transferase, and plant cell wall-degrading enzyme gene families important to the survival of the mountain pine beetle in its harsh and nutrient-poor host environment, and examine genome-wide single-nucleotide polymorphism variation. A horizontally transferred bacterial sucrose-6-phosphate hydrolase was evident in the genome, and its tissue-specific transcription suggests a functional role for this beetle. CONCLUSIONS: Despite Coleoptera being the largest insect order with over 400,000 described species, including many agricultural and forest pest species, this is only the second genome sequence reported in Coleoptera, and will provide an important resource for the Curculionoidea and other insects.


Subject(s)
Coleoptera/genetics , Ecosystem , Forests , Genome, Insect/genetics , Animals , Cell Wall/metabolism , Coleoptera/enzymology , Female , Gene Transfer, Horizontal/genetics , Genetic Linkage , Heterozygote , Male , Multigene Family , Phylogeny , Plant Cells/metabolism , Polymorphism, Single Nucleotide/genetics , Repetitive Sequences, Nucleic Acid/genetics , Sequence Homology, Nucleic Acid , Sex Chromosomes/genetics , Synteny/genetics
17.
J Mol Diagn ; 15(6): 796-809, 2013 Nov.
Article in English | MEDLINE | ID: mdl-24094589

ABSTRACT

Individuals who inherit mutations in BRCA1 or BRCA2 are predisposed to breast and ovarian cancers. However, identifying mutations in these large genes by conventional dideoxy sequencing in a clinical testing laboratory is both time consuming and costly, and similar challenges exist for other large genes, or sets of genes, with relevance in the clinical setting. Second-generation sequencing technologies have the potential to improve the efficiency and throughput of clinical diagnostic sequencing, once clinically validated methods become available. We have developed a method for detection of variants based on automated small-amplicon PCR followed by sample pooling and sequencing with a second-generation instrument. To demonstrate the suitability of this method for clinical diagnostic sequencing, we analyzed the coding exons and the intron-exon boundaries of BRCA1 and BRCA2 in 91 hereditary breast cancer patient samples. Our method generated high-quality sequence coverage across all targeted regions, with median coverage greater than 4000-fold for each sample in pools of 24. Sensitive and specific automated variant detection, without false-positive or false-negative results, was accomplished with a standard software pipeline using bwa for sequence alignment and samtools for variant detection. We experimentally derived a minimum threshold of 100-fold sequence depth for confident variant detection. The results demonstrate that this method is suitable for sensitive, automatable, high-throughput sequence variant detection in the clinical laboratory.


Subject(s)
DNA Mutational Analysis/methods , BRCA1 Genes , BRCA2 Genes , Hereditary Breast and Ovarian Cancer Syndrome/genetics , Base Sequence , Gene Frequency , Gene Library , High-Throughput Nucleotide Sequencing , Humans , Prospective Studies , Sensitivity and Specificity
18.
Gigascience ; 2(1): 10, 2013 Jul 22.
Article in English | MEDLINE | ID: mdl-23870653

ABSTRACT

BACKGROUND: The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, acceptance of newer read technologies) and in their final output (composition of assembled sequence). More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. The Assemblathon competitions are intended to assess current state-of-the-art methods in genome assembly. RESULTS: In Assemblathon 2, we provided a variety of sequence data to be assembled for three vertebrate species (a bird, a fish, and a snake). This resulted in a total of 43 submitted assemblies from 21 participating teams. We evaluated these assemblies using a combination of optical map data, Fosmid sequences, and several statistical methods. From over 100 different metrics, we chose ten key measures by which to assess the overall quality of the assemblies. CONCLUSIONS: Many current genome assemblers produced useful assemblies, containing a significant representation of their genes and overall genome structure. However, the high degree of variability between the entries suggests that there is still much room for improvement in the field of genome assembly and that approaches which work well in assembling the genome of one species may not necessarily work well for another.

19.
Insect Biochem Mol Biol ; 42(8): 525-36, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22516182

ABSTRACT

Bark beetles (Coleoptera: Curculionidae: Scolytinae) are major insect pests of many woody plants around the world. The mountain pine beetle (MPB), Dendroctonus ponderosae Hopkins, is a significant historical pest of western North American pine forests. It is currently devastating pine forests in western North America--particularly in British Columbia, Canada--and is beginning to expand its host range eastward into the Canadian boreal forest, which extends to the Atlantic coast of North America. Limited genomic resources are available for this and other bark beetle pests, restricting the use of genomics-based information to help monitor, predict, and manage the spread of these insects. To overcome these limitations, we generated comprehensive transcriptome resources from fourteen full-length enriched cDNA libraries through paired-end Sanger sequencing of 100,000 cDNA clones, and single-end Roche 454 pyrosequencing of three of these cDNA libraries. Hybrid de novo assembly of the 3.4 million sequences resulted in 20,571 isotigs in 14,410 isogroups and 246,848 singletons. In addition, over 2300 non-redundant full-length cDNA clones putatively containing complete open reading frames, including 47 cytochrome P450s, were sequenced fully to high quality. This first large-scale genomics resource for bark beetles provides the relevant sequence information for gene discovery; functional and population genomics; comparative analyses; and for future efforts to annotate the MPB genome. These resources permit the study of this beetle at the molecular level and will inform research in other Dendroctonus spp. and more generally in the Curculionidae and other Coleoptera.


Subject(s)
Beetles/genetics , Pinus/parasitology , Transcriptome , 3' Untranslated Regions , 5' Untranslated Regions , Animals , Arthropod Antennae/metabolism , Beetles/metabolism , Cytochrome P-450 Enzyme System/metabolism , Fat Body/metabolism , Female , Male , Multigene Family , Open Reading Frames , DNA Sequence Analysis
20.
Genome Biol ; 10(9): R94, 2009.
Article in English | MEDLINE | ID: mdl-19747388

ABSTRACT

Sequencing-by-synthesis technologies can reduce the cost of generating de novo genome assemblies. We report a method for assembling draft genome sequences of eukaryotic organisms that integrates sequence information from different sources, and demonstrate its effectiveness by assembling an approximately 32.5 Mb draft genome sequence for the forest pathogen Grosmannia clavigera, an ascomycete fungus. We also developed a method for assessing draft assemblies using Illumina paired end read data and demonstrate how we are using it to guide future sequence finishing. Our results demonstrate that eukaryotic genome sequences can be accurately assembled by combining Illumina, 454 and Sanger sequence data.


Subject(s)
Ascomycota/genetics , Fungal Genome/genetics , DNA Sequence Analysis/methods , Algorithms , Fungal Proteins/genetics , Genomics/methods , Open Reading Frames/genetics , Reproducibility of Results