1.
Nature ; 526(7575): 700-4, 2015 Oct 29.
Article in English | MEDLINE | ID: mdl-26466568

ABSTRACT

Neuroblastoma is a malignant paediatric tumour of the sympathetic nervous system. Roughly half of these tumours regress spontaneously or are cured by limited therapy. By contrast, high-risk neuroblastomas have an unfavourable clinical course despite intensive multimodal treatment, and their molecular basis has remained largely elusive. Here we have performed whole-genome sequencing of 56 neuroblastomas (high-risk, n = 39; low-risk, n = 17) and discovered recurrent genomic rearrangements affecting a chromosomal region at 5p15.33 proximal of the telomerase reverse transcriptase gene (TERT). These rearrangements occurred only in high-risk neuroblastomas (12/39, 31%) in a mutually exclusive fashion with MYCN amplifications and ATRX mutations, which are known genetic events in this tumour type. In an extended case series (n = 217), TERT rearrangements defined a subgroup of high-risk tumours with particularly poor outcome. Despite a large structural diversity of these rearrangements, they all induced massive transcriptional upregulation of TERT. In the remaining high-risk tumours, TERT expression was also elevated in MYCN-amplified tumours, whereas alternative lengthening of telomeres was present in neuroblastomas without TERT or MYCN alterations, suggesting that telomere lengthening represents a central mechanism defining this subtype. The 5p15.33 rearrangements juxtapose the TERT coding sequence to strong enhancer elements, resulting in massive chromatin remodelling and DNA methylation of the affected region. Supporting a functional role of TERT, neuroblastoma cell lines bearing rearrangements or amplified MYCN exhibited both upregulated TERT expression and enzymatic telomerase activity. In summary, our findings show that remodelling of the genomic context abrogates transcriptional silencing of TERT in high-risk neuroblastoma and places telomerase activation in the centre of transformation in a large fraction of these tumours.


Subject(s)
Gene Expression Regulation, Neoplastic/genetics , Genome, Human/genetics , Neuroblastoma/genetics , Neuroblastoma/pathology , Recombination, Genetic/genetics , Telomerase/genetics , Telomerase/metabolism , Cell Line, Tumor , Cell Transformation, Neoplastic/genetics , Chromatin/genetics , Chromatin/metabolism , Chromosomes, Human, Pair 5/genetics , DNA Helicases/genetics , DNA Methylation , Enhancer Elements, Genetic/genetics , Enzyme Activation/genetics , Gene Amplification/genetics , Gene Silencing , Humans , Infant , N-Myc Proto-Oncogene Protein , Neuroblastoma/classification , Neuroblastoma/enzymology , Nuclear Proteins/genetics , Oncogene Proteins/genetics , Prognosis , RNA, Messenger/analysis , RNA, Messenger/genetics , Risk , Translocation, Genetic/genetics , Up-Regulation/genetics , X-linked Nuclear Protein
2.
BMC Bioinformatics ; 20(1): 405, 2019 Jul 25.
Article in English | MEDLINE | ID: mdl-31345161

ABSTRACT

BACKGROUND: Next-generation sequencing technologies can produce tens of millions of reads, often paired-end, from transcripts or genomes. But few programs can align RNA on the genome and accurately discover introns, especially with long reads. We introduce Magic-BLAST, a new aligner based on ideas from the Magic pipeline. RESULTS: Magic-BLAST uses innovative techniques that include the optimization of a spliced alignment score and selective masking during seed selection. We evaluate the performance of Magic-BLAST to accurately map short or long sequences and its ability to discover introns on real RNA-seq data sets from PacBio, Roche and Illumina runs, and on six benchmarks, and compare it to other popular aligners. Additionally, we look at alignments of human idealized RefSeq mRNA sequences perfectly matching the genome. CONCLUSIONS: We show that Magic-BLAST is the best at intron discovery over a wide range of conditions and the best at mapping reads longer than 250 bases, from any platform. It is versatile and robust to high levels of mismatches or extreme base composition, and reasonably fast. It can align reads to a BLAST database or a FASTA file. It can accept a FASTQ file as input or automatically retrieve an accession from the SRA repository at the NCBI.


Subject(s)
RNA/genetics , Sequence Alignment , Sequence Analysis, RNA/methods , Software , Algorithms , Base Sequence , Databases, Nucleic Acid , Humans , Introns/genetics , ROC Curve , Time Factors
3.
Immunity ; 31(6): 941-52, 2009 Dec 18.
Article in English | MEDLINE | ID: mdl-20064451

ABSTRACT

Interleukin-21 (IL-21) is a pleiotropic cytokine that induces expression of transcription factor BLIMP1 (encoded by Prdm1), which regulates plasma cell differentiation and T cell homeostasis. We identified an IL-21 response element downstream of Prdm1 that binds the transcription factors STAT3 and IRF4, which are required for optimal Prdm1 expression. Genome-wide ChIP-Seq mapping of STAT3- and IRF4-binding sites showed that most regions with IL-21-induced STAT3 binding also bound IRF4 in vivo and furthermore revealed that the noncanonical TTCnnnTAA GAS motif critical in Prdm1 was broadly used for STAT3 binding. Comparing genome-wide expression array data to binding sites revealed that most IL-21-regulated genes were associated with combined STAT3-IRF4 sites rather than pure STAT3 sites. Correspondingly, ChIP-Seq analysis of Irf4(-/-) T cells showed greatly diminished STAT3 binding after IL-21 treatment, and Irf4(-/-) mice showed impaired IL-21-induced Tfh cell differentiation in vivo. These results reveal broad cooperative gene regulation by STAT3 and IRF4.


Subject(s)
Gene Expression Regulation , Interferon Regulatory Factors/metabolism , Interleukins/metabolism , STAT3 Transcription Factor/metabolism , Transcription Factors/genetics , Animals , B-Lymphocytes/immunology , Base Sequence , Binding Sites , CD4-Positive T-Lymphocytes/immunology , Cell Differentiation , Genome-Wide Association Study , Interferon Regulatory Factors/genetics , Introns , Mice , Mice, Inbred C57BL , Mice, Knockout , Molecular Sequence Data , Positive Regulatory Domain I-Binding Factor 1 , STAT3 Transcription Factor/genetics
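
Note on record 3: the abstract names a noncanonical GAS motif, TTCnnnTAA. As a minimal sketch of how such a motif could be located in a DNA sequence, the Python below scans both strands with a regular expression. The function names and the example sequence are hypothetical, and this is not the authors' ChIP-Seq analysis, only an illustration of the motif pattern itself.

```python
import re

# Pattern taken from the abstract: TTC, any three bases, then TAA.
# The lookahead wrapper lets overlapping occurrences be reported too.
NONCANONICAL_GAS = re.compile(r"(?=(TTC[ACGT]{3}TAA))")
MOTIF_LENGTH = 9


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_gas_sites(seq: str):
    """Return sorted (0-based forward-strand start, strand, 9-mer) tuples."""
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in NONCANONICAL_GAS.finditer(seq)]
    rc = reverse_complement(seq)
    for m in NONCANONICAL_GAS.finditer(rc):
        # Map the reverse-strand coordinate back onto the forward strand.
        start = len(seq) - (m.start() + MOTIF_LENGTH)
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical sequence containing one forward-strand site.
    print(find_gas_sites("AATTCAGGTAACC"))
```

Because TTCnnnTAA is not its own reverse complement, both strands have to be scanned separately, which is why the sketch reports a strand label with each hit.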
4.
Cephalalgia ; 38(5): 912-932, 2018 04.
Article in English | MEDLINE | ID: mdl-28699403

ABSTRACT

BACKGROUND: The trigeminal ganglion contains neurons that relay sensations of pain, touch, pressure, and many other somatosensory modalities to the central nervous system. The ganglion is also a reservoir for latent herpes virus 1 infection. To gain a better understanding of molecular factors contributing to migraine and headache, transcriptome analyses were performed on postmortem human trigeminal ganglia. METHODS: RNA-Seq measurements of gene expression were conducted on small sub-regions of 16 human trigeminal ganglia. The samples were also characterized for transcripts derived from viral and microbial genomes. Herpes simplex virus 1 (HSV-1) antibodies in blood were measured using the luciferase immunoprecipitation assay. RESULTS: Observed molecular heterogeneity could be explained by sampling of anatomically distinct sub-regions of the excised ganglia consistent with neurally-enriched and non-neural, i.e. Schwann cell, enriched subregions. The levels of HSV-1 transcripts detected in trigeminal ganglia correlated with blood levels of HSV-1 antibodies. Multiple migraine susceptibility genes were strongly expressed in neurally-enriched trigeminal samples, while others were expressed in blood vessels. CONCLUSIONS: These data provide a comprehensive human trigeminal transcriptome and a framework for evaluation of inhomogeneous post-mortem tissues through extensive quality control and refined downstream analyses for RNA-Seq methodologies. Expression profiling of migraine susceptibility genes identified by genetic association appears to emphasize the blood vessel component of the trigeminovascular system. Other genes displayed enriched expression in the trigeminal compared to dorsal root ganglion, and in-depth transcriptomic analysis of the KCNK18 gene underlying familial migraine shows selective neural expression within two specific populations of ganglionic neurons. These data suggest that expression profiling of migraine-associated genes can extend and amplify the underlying neurobiological insights obtained from genetic association studies.


Subject(s)
Herpesvirus 1, Human/genetics , Potassium Channels/genetics , RNA/genetics , Sequence Analysis, RNA/methods , Trigeminal Ganglion/pathology , Adolescent , Adult , Autopsy , Female , Humans , Male , Middle Aged , Trigeminal Ganglion/physiology , Trigeminal Ganglion/virology , Young Adult
5.
Nucleic Acids Res ; 43(Database issue): D737-42, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25392405

ABSTRACT

The non-human primate reference transcriptome resource (NHPRTR, available online at http://nhprtr.org/) aims to generate comprehensive RNA-seq data from a wide variety of non-human primates (NHPs), from lemurs to hominids. In the 2012 Phase I of the NHPRTR project, 19 billion fragments or 3.8 terabases of transcriptome sequences were collected from pools of ∼ 20 tissues in 15 species and subspecies. Here we describe a major expansion of NHPRTR by adding 10.1 billion fragments of tissue-specific RNA-seq data. For this effort, we selected 11 of the original 15 NHP species and subspecies and constructed total RNA libraries for the same ∼ 15 tissues in each. The sequence quality is such that 88% of the reads align to human reference sequences, allowing us to compute the full list of expression abundance across all tissues for each species, using the reads mapped to human genes. This update also includes improved transcript annotations derived from RNA-seq data for rhesus and cynomolgus macaques, two of the most commonly used NHP models and additional RNA-seq data compiled from related projects. Together, these comprehensive reference transcriptomes from multiple primates serve as a valuable community resource for genome annotation, gene dynamics and comparative functional analysis.


Subject(s)
Databases, Genetic , Gene Expression Profiling , Primates/genetics , Sequence Analysis, RNA , Animals , Internet , Macaca , Molecular Sequence Annotation , Organ Specificity , Reference Standards , Sequence Alignment/standards
6.
Nucleic Acids Res ; 41(Database issue): D906-14, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23203872

ABSTRACT

RNA-based next-generation sequencing (RNA-Seq) provides a tremendous amount of new information regarding gene and transcript structure, expression and regulation. This is particularly true for non-coding RNAs where whole transcriptome analyses have revealed that much of the genome is transcribed and that many non-coding transcripts have widespread functionality. However, uniform resources for raw, cleaned and processed RNA-Seq data are sparse for most organisms and this is especially true for non-human primates (NHPs). Here, we describe a large-scale RNA-Seq data and analysis infrastructure, the NHP reference transcriptome resource (http://nhprtr.org); it presently hosts data from 12 species of primates, to be expanded to 15 species/subspecies spanning great apes, old world monkeys, new world monkeys and prosimians. Data are collected for each species using pools of RNA from comparable tissues. We provide data access in advance of its deposition at NCBI, as well as browsable tracks of alignments against the human genome using the UCSC genome browser. This resource will continue to host additional RNA-Seq data, alignments and assemblies as they are generated over the coming years and provide a key resource for the annotation of NHP genomes as well as informing primate studies on evolution, reproduction, infection, immunity and pharmacology.


Subject(s)
Databases, Nucleic Acid , Genomics , Primates/genetics , Transcriptome , Animals , Genome, Human , Humans , Internet , Primates/metabolism , Sequence Alignment , Sequence Analysis, RNA
7.
Mol Pain ; 10: 44, 2014 Aug 14.
Article in English | MEDLINE | ID: mdl-25123163

ABSTRACT

BACKGROUND: Three neuropeptides, gastrin releasing peptide (GRP), natriuretic precursor peptide B (NPPB), and neuromedin B (NMB) have been proposed to play roles in itch sensation. However, the tissues in which these peptides are expressed and their positions in the itch circuit have recently become the subject of debate. Here we used next-gen RNA-Seq to examine the expression of transcripts coding for GRP, NPPB, NMB, and other peptides in DRG, trigeminal ganglion, and the spinal cord as well as expression levels for their cognate receptors in these tissues. RESULTS: RNA-Seq demonstrates that GRP is not transcribed in mouse, rat, or human sensory ganglia. NPPB, which activates natriuretic peptide receptor 1 (NPR1), is well expressed in mouse DRG and less so in rat and human, whereas NPPA, which also acts on the NPR1 receptor, is expressed in all three species. Analysis of transcripts expressed in the spinal cord of mouse, rat, and human reveals no expression of Nppb, but unambiguously detects expression of Grp and the GRP-receptor (Grpr). The transcripts coding for NMB and tachykinin peptides are among the most highly expressed in DRG. Bioinformatics comparisons using the sequence of the peptides used to produce GRP-antibodies with proteome databases revealed that the C-terminal primary sequence of NMB and Substance P can potentially account for results from previous studies which showed GRP-immunostaining in the DRG. CONCLUSIONS: RNA-Seq corroborates a primary itch afferent role for NPPB in mouse and potentially NPPB and NPPA in rats and humans, but does not support GRP as a primary itch neurotransmitter in mouse, rat, or humans. As such, our results are at odds with the initial proposal of Sun and Chen (2007) that GRP is expressed in DRG. By contrast, our data strongly support an itch pathway where the itch-inducing actions of GRP are exerted through its release from spinal cord neurons.


Subject(s)
Ganglia, Spinal/metabolism , Gastrin-Releasing Peptide/metabolism , Natriuretic Peptide, Brain/metabolism , Spinal Cord/cytology , Trigeminal Ganglion/metabolism , Animals , Base Sequence , Computational Biology , Gastrin-Releasing Peptide/genetics , Humans , Mice , Natriuretic Peptide, Brain/genetics , Rats , Receptors, Neuropeptide/genetics , Receptors, Neuropeptide/metabolism , Species Specificity
8.
Nat Commun ; 15(1): 4950, 2024 Jun 11.
Article in English | MEDLINE | ID: mdl-38862496

ABSTRACT

The advent of civilian spaceflight challenges scientists to precisely describe the effects of spaceflight on human physiology, particularly at the molecular and cellular level. Newer, nanopore-based sequencing technologies can quantitatively map changes in chemical structure and expression at single molecule resolution across entire isoforms. We perform long-read, direct RNA nanopore sequencing, as well as Ultima high-coverage RNA-sequencing, of whole blood sampled longitudinally from four SpaceX Inspiration4 astronauts at seven timepoints, spanning pre-flight, day of return, and post-flight recovery. We report key genetic pathways, including changes in erythrocyte regulation, stress induction, and immune changes affected by spaceflight. We also present the first m6A methylation profiles for a human space mission, suggesting a significant spike in m6A levels immediately post-flight. These data and results represent the first longitudinal long-read RNA profiles and RNA modification maps for each gene for astronauts, improving our understanding of the human transcriptome's dynamic response to spaceflight.


Subject(s)
Astronauts , Sequence Analysis, RNA , Space Flight , Humans , Sequence Analysis, RNA/methods , Transcriptome/genetics , Weightlessness , Male , Hematopoiesis/genetics , Nanopore Sequencing/methods , Adult , RNA/genetics , RNA/blood , Methylation , Middle Aged
9.
Nat Biotechnol ; 2023 Sep 07.
Article in English | MEDLINE | ID: mdl-37679545

ABSTRACT

Certified RNA reference materials are indispensable for assessing the reliability of RNA sequencing to detect intrinsically small biological differences in clinical settings, such as molecular subtyping of diseases. As part of the Quartet Project for quality control and data integration of multi-omics profiling, we established four RNA reference materials derived from immortalized B-lymphoblastoid cell lines from four members of a monozygotic twin family. Additionally, we constructed ratio-based transcriptome-wide reference datasets between two samples, providing cross-platform and cross-laboratory 'ground truth'. Investigation of the intrinsically subtle biological differences among the Quartet samples enables sensitive assessment of cross-batch integration of transcriptomic measurements at the ratio level. The Quartet RNA reference materials, combined with the ratio-based reference datasets, can serve as unique resources for assessing and improving the quality of transcriptomic data in clinical and biological settings.
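
Note on record 9: the abstract describes ratio-based reference datasets between two samples. The sketch below shows one plausible way to put per-gene expression on a ratio scale, by taking the log2 ratio of a study sample to a reference sample measured alongside it. The function name, the dictionary layout and the pseudocount of 1.0 are assumptions made for illustration and are not taken from the Quartet Project.

```python
import math


def log2_ratios(sample, reference, pseudocount=1.0):
    """Per-gene log2(sample / reference) for genes present in both inputs.

    `sample` and `reference` map gene identifiers to expression values.
    The pseudocount (an assumed value) avoids division by zero and log(0).
    """
    shared = sample.keys() & reference.keys()
    return {
        gene: math.log2((sample[gene] + pseudocount) / (reference[gene] + pseudocount))
        for gene in sorted(shared)
    }


if __name__ == "__main__":
    # Hypothetical expression values for two samples from the same batch.
    study = {"GENE_A": 7.0, "GENE_B": 1.0}
    ref = {"GENE_A": 3.0, "GENE_B": 1.0, "GENE_C": 5.0}
    print(log2_ratios(study, ref))
```

Expressing each measurement relative to a co-profiled reference is the general idea behind ratio-level comparison across batches; genes missing from either sample are simply dropped in this sketch.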

10.
Genome Res ; 19(12): 2288-99, 2009 Dec.
Article in English | MEDLINE | ID: mdl-19767418

ABSTRACT

The organization of mammalian DNA replication is poorly understood. We have produced high-resolution dynamic maps of the timing of replication in human erythroid, mesenchymal, and embryonic stem (ES) cells using TimEX, a method that relies on Gaussian convolution of massive, highly redundant determinations of DNA copy-number variations during S phase to produce replication timing profiles. We first obtained timing maps of 3% of the genome using high-density oligonucleotide tiling arrays and then extended the TimEX method genome-wide using massively parallel sequencing. We show that in untransformed human cells, timing of replication is highly regulated and highly synchronous, and that many genomic segments are replicated in temporal transition regions devoid of initiation, where replication forks progress unidirectionally from origins that can be hundreds of kilobases away. Absence of initiation in one transition region is shown at the molecular level by single molecule analysis of replicated DNA (SMARD). Comparison of ES and erythroid cell replication patterns revealed that these cells replicate about 20% of their genome in different quarters of S phase. Importantly, we detected a strong inverse relationship between timing of replication and distance to the closest expressed gene. This relationship can be used to predict tissue-specific timing of replication profiles from expression data and genomic annotations. We also provide evidence that early origins of replication are preferentially located near highly expressed genes, that mid-firing origins are located near moderately expressed genes, and that late-firing origins are located far from genes.


Subject(s)
DNA Replication Timing , DNA Replication , Embryonic Stem Cells , Erythroid Cells , Gene Expression Profiling , Mesenchymal Stem Cells , S Phase , Cell Differentiation , DNA/biosynthesis , DNA/genetics , Embryonic Stem Cells/cytology , Embryonic Stem Cells/metabolism , Erythroid Cells/cytology , Erythroid Cells/metabolism , Gene Dosage , Humans , Mesenchymal Stem Cells/cytology , Mesenchymal Stem Cells/metabolism , Normal Distribution
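
Note on record 10: the abstract says TimEX relies on Gaussian convolution of DNA copy-number determinations during S phase. The Python below is a rough sketch of that general idea only: it forms a per-bin ratio of S-phase to G1 signal and smooths it with a normalised Gaussian kernel. The bin layout, the G1 normalisation, the kernel width (`sigma_bins`) and all function names are assumptions, not details of the published method.

```python
import math


def gaussian_kernel(sigma_bins: float, radius: int):
    """Discrete Gaussian weights over [-radius, radius], normalised to sum to 1."""
    weights = [math.exp(-0.5 * (i / sigma_bins) ** 2) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_profile(values, sigma_bins: float = 2.0):
    """Gaussian-smooth a per-bin signal, renormalising weights near the edges."""
    radius = max(1, int(3 * sigma_bins))
    kernel = gaussian_kernel(sigma_bins, radius)
    smoothed = []
    for i in range(len(values)):
        acc = 0.0
        norm = 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < len(values):
                acc += w * values[j]
                norm += w
        smoothed.append(acc / norm)
    return smoothed


def timing_profile(s_phase_signal, g1_signal, sigma_bins: float = 2.0):
    """Smoothed S-phase/G1 ratio per genomic bin (assumes non-zero G1 values).

    A higher ratio is read here as earlier replication, since early-replicating
    DNA is over-represented in S-phase cells.
    """
    ratios = [s / g for s, g in zip(s_phase_signal, g1_signal)]
    return smooth_profile(ratios, sigma_bins)


if __name__ == "__main__":
    # Hypothetical per-bin signals along one chromosome segment.
    s_phase = [20, 22, 30, 38, 40, 39, 28, 21, 20]
    g1 = [20, 20, 20, 20, 20, 20, 20, 20, 20]
    print([round(x, 3) for x in timing_profile(s_phase, g1, sigma_bins=1.0)])
```

Smoothing trades positional resolution for noise reduction, so the kernel width would need to be chosen against the density of the underlying measurements.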
11.
Proc Natl Acad Sci U S A ; 106(44): 18674-9, 2009 Nov 03.
Article in English | MEDLINE | ID: mdl-19846761

ABSTRACT

Endogenous small interfering RNAs (endo-siRNAs) regulate diverse gene expression programs in eukaryotes by either binding and cleaving mRNA targets or mediating heterochromatin formation; however, the mechanisms of endo-siRNA biogenesis, sorting, and target regulation remain poorly understood. Here we report the identification and function of a specific class of germline-generated endo-siRNAs in Caenorhabditis elegans that are 26 nt in length and contain a guanine at the first nucleotide position (i.e., 26G RNAs). 26G RNAs regulate gene expression during spermatogenesis and zygotic development, and their biogenesis requires the ERI-1 exonuclease and the RRF-3 RNA-dependent RNA polymerase (RdRP). Remarkably, we identified two nonoverlapping subclasses of 26G RNAs that sort into specific RNA-induced silencing complexes (RISCs) and differentially regulate distinct mRNA targets. Class I 26G RNAs target genes that are expressed during spermatogenesis, whereas class II 26G RNAs are maternally inherited and silence gene expression during zygotic development. These findings implicate a class of endo-siRNAs in the global regulation of transcriptional programs required for fertility and development.


Subject(s)
Caenorhabditis elegans/embryology , Caenorhabditis elegans/genetics , Gene Expression Regulation, Developmental , Guanine/metabolism , RNA, Small Interfering/metabolism , Spermatogenesis/genetics , Zygote/metabolism , Animals , Caenorhabditis elegans Proteins/metabolism , Exoribonucleases/metabolism , Gene Silencing , Germ Cells/metabolism , Male , RNA, Helminth/classification , RNA, Helminth/metabolism , RNA, Small Interfering/biosynthesis , RNA, Small Interfering/classification , Sequence Analysis, DNA
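
Note on record 11: the abstract defines 26G RNAs as 26 nt long with a guanine at the first position. A minimal sketch of that definition as a filter over small-RNA reads follows; the function names and example reads are hypothetical, and the sketch ignores everything else a real small-RNA pipeline would do (adapter trimming, mapping, strand assignment).

```python
def is_26g_rna(read: str) -> bool:
    """True if the read is exactly 26 nt long and starts with G."""
    read = read.upper()
    return len(read) == 26 and read.startswith("G")


def select_26g(reads):
    """Keep only reads matching the 26G definition, preserving input order."""
    return [read for read in reads if is_26g_rna(read)]


if __name__ == "__main__":
    # Hypothetical reads: only the first satisfies both criteria.
    reads = ["G" + "A" * 25, "A" + "G" * 25, "G" + "A" * 21]
    print(select_26g(reads))
```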
12.
J Pain ; 21(9-10): 988-1004, 2020.
Article in English | MEDLINE | ID: mdl-31931229

ABSTRACT

Understanding molecular alterations associated with peripheral inflammation is a critical factor in selectively controlling acute and persistent pain. The present report employs in situ hybridization of the 2 opioid precursor mRNAs coupled with quantitative measurements of 2 peptides derived from the prodynorphin and proenkephalin precursor proteins: dynorphin A 1-8 and [Met5]-enkephalin-Arg6-Gly7-Leu8. In dorsal spinal cord ipsilateral to the inflammation, dynorphin A 1-8 was elevated after inflammation, and persisted as long as the inflammation was sustained. Qualitative identification by high performance liquid chromatography and gel permeation chromatography revealed the major immunoreactive species in control and inflamed extracts to be dynorphin A 1-8. In situ hybridization in spinal cord after administration of the inflammatory agent, carrageenan, showed increased expression of prodynorphin (Pdyn) mRNA somatotopically in medial superficial dorsal horn neurons. The fold increase in preproenkephalin mRNA (Penk) was comparatively lower, although the basal expression is substantially higher than Pdyn. While Pdyn is not expressed in the dorsal root ganglion (DRG) in basal conditions, it can be induced by nerve injury, but not by inflammation alone. A bioinformatic meta-analysis of multiple nerve injury datasets confirmed Pdyn upregulation in DRG across different nerve injury models. These data support the idea that activation of endogenous opioids, notably dynorphin, is a dynamic indicator of persistent pain states in spinal cord and of nerve injury in DRG. PERSPECTIVE: This is a systematic, quantitative assessment of dynorphin and enkephalin peptides and mRNA in dorsal spinal cord and DRG neurons in response to peripheral inflammation and axotomy. These studies form the foundational framework for understanding how endogenous spinal opioid peptides are involved in nociceptive circuit modulation.


Subject(s)
Dynorphins/metabolism , Enkephalins/metabolism , Ganglia, Spinal/metabolism , Hyperalgesia/metabolism , Inflammation Mediators/metabolism , Spinal Cord/metabolism , Animals , Dynorphins/analysis , Enkephalins/analysis , Ganglia, Spinal/chemistry , Inflammation Mediators/analysis , Male , Opioid Peptides/analysis , Opioid Peptides/metabolism , Peptide Fragments/analysis , Peptide Fragments/metabolism , RNA, Messenger/analysis , RNA, Messenger/metabolism , Rats , Rats, Sprague-Dawley , Spinal Cord/chemistry
13.
bioRxiv ; 2020 May 01.
Article in English | MEDLINE | ID: mdl-32511352

ABSTRACT

The Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) has caused thousands of deaths worldwide, including >18,000 in New York City (NYC) alone. The sudden emergence of this pandemic has highlighted a pressing clinical need for rapid, scalable diagnostics that can detect infection, interrogate strain evolution, and identify novel patient biomarkers. To address these challenges, we designed a fast (30-minute) colorimetric test (LAMP) for SARS-CoV-2 infection from naso/oropharyngeal swabs, plus a large-scale shotgun metatranscriptomics platform (total-RNA-seq) for host, bacterial, and viral profiling. We applied both technologies across 857 SARS-CoV-2 clinical specimens and 86 NYC subway samples, providing a broad molecular portrait of the COVID-19 NYC outbreak. Our results define new features of SARS-CoV-2 evolution, nominate a novel, NYC-enriched viral subclade, reveal specific host responses in interferon, ACE, hematological, and olfaction pathways, and examine risks associated with use of ACE inhibitors and angiotensin receptor blockers. Together, these findings have immediate applications to SARS-CoV-2 diagnostics, public health, and new therapeutic targets.

14.
BMC Genomics ; 10: 264, 2009 Jun 12.
Article in English | MEDLINE | ID: mdl-19523228

ABSTRACT

BACKGROUND: Transcriptome sequencing using next-generation sequencing platforms will soon be competing with DNA microarray technologies for global gene expression analysis. As a preliminary evaluation of these promising technologies, we performed deep sequencing of cDNA synthesized from the Microarray Quality Control (MAQC) reference RNA samples using Roche's 454 Genome Sequencer FLX. RESULTS: We generated more than 3.6 million sequence reads of average length 250 bp for the MAQC A and B samples and introduced a data analysis pipeline for translating cDNA read counts into gene expression levels. Using BLAST, 90% of the reads mapped to the human genome and 64% of the reads mapped to the RefSeq database of well annotated genes with e-values

Subject(s)
Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Sequence Analysis, RNA/methods , DNA, Complementary/genetics , Databases, Genetic , Gene Library , Genome, Human , Humans , Quality Control , Reference Standards , Sensitivity and Specificity , Sequence Alignment , Software
15.
Nat Biotechnol ; 24(9): 1123-31, 2006 Sep.
Article in English | MEDLINE | ID: mdl-16964226

ABSTRACT

We have assessed the utility of RNA titration samples for evaluating microarray platform performance and the impact of different normalization methods on the results obtained. As part of the MicroArray Quality Control project, we investigated the performance of five commercial microarray platforms using two independent RNA samples and two titration mixtures of these samples. Focusing on 12,091 genes common across all platforms, we determined the ability of each platform to detect the correct titration response across the samples. Global deviations from the response predicted by the titration ratios were observed. These differences could be explained by variations in relative amounts of messenger RNA as a fraction of total RNA between the two independent samples. Overall, both the qualitative and quantitative correspondence across platforms was high. In summary, titration samples may be regarded as a valuable tool, not only for assessing microarray platform performance and different analysis methods, but also for determining some underlying biological features of the samples.


Subject(s)
Equipment Failure Analysis/methods , Gene Expression Profiling/instrumentation , Gene Expression Profiling/standards , Oligonucleotide Array Sequence Analysis/instrumentation , Oligonucleotide Array Sequence Analysis/standards , RNA/analysis , RNA/genetics , Algorithms , Reference Values , Reproducibility of Results , Sensitivity and Specificity , United States
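
Note on record 15: the abstract reports that deviations from the response predicted by the titration ratios could be explained by differences in the mRNA fraction of total RNA between the two samples. The sketch below illustrates that reasoning with a simple linear mixing model. The model, the function names and every number in the example (a 0.75 mixing fraction, the mRNA fractions, the signal values) are illustrative assumptions and are not taken from the abstract.

```python
def expected_mixture_signal(signal_a: float, signal_b: float, fraction_a: float) -> float:
    """Linear-mixing prediction for a gene's signal in a titration mixture."""
    return fraction_a * signal_a + (1.0 - fraction_a) * signal_b


def effective_fraction(fraction_a: float, mrna_frac_a: float, mrna_frac_b: float) -> float:
    """Share of the mixture's mRNA that comes from sample A.

    `fraction_a` is the share of total RNA from sample A; `mrna_frac_a` and
    `mrna_frac_b` are the (assumed) mRNA fractions of total RNA in each sample.
    """
    from_a = fraction_a * mrna_frac_a
    from_b = (1.0 - fraction_a) * mrna_frac_b
    return from_a / (from_a + from_b)


if __name__ == "__main__":
    # Equal mRNA content: the effective fraction equals the nominal one.
    print(effective_fraction(0.75, 0.02, 0.02))
    # Sample A richer in mRNA: a nominal 50:50 mix behaves like 75:25.
    f_eff = effective_fraction(0.5, 0.03, 0.01)
    print(f_eff, expected_mixture_signal(100.0, 200.0, f_eff))
```

Under this model a global shift in the observed titration response carries information about the relative mRNA content of the two source samples, which matches the abstract's point that titration samples can reveal underlying biological features.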
16.
Front Cell Dev Biol ; 7: 299, 2019.
Article in English | MEDLINE | ID: mdl-31824949

ABSTRACT

Secreted proteins (SPs) play important roles in diverse biological processes; however, a comprehensive and high-quality list of human SPs is still lacking. Here we identified 6,943 high-confidence human SPs (3,522 of them novel) based on 330,427 human proteins derived from the UniProt, Ensembl, AceView, and RefSeq databases. Notably, 6,267 of 6,943 (90.3%) SPs are supported by evidence from a large amount of mass spectrometry (MS) and RNA-seq data. We found that the SPs were broadly expressed in diverse tissues as well as human body fluids, and a significant portion of them exhibited tissue-specific expression. Moreover, we identified 14 cancer-specific SPs whose expression levels were significantly associated with patient survival in eight different tumors, which could be potential prognostic biomarkers. Strikingly, 89.21% of the 6,943 SPs (including 2,927 novel SPs) contain known protein domains. The novel SPs were mainly enriched in known immunity-related domains, such as the Immunoglobulin V-set and C1-set domains. We constructed a user-friendly and freely accessible database, SPRomeDB (www.unimd.org/SPRomeDB), to catalog these SPs. Our comprehensive SP identification and characterization provides insights into the human secretome and a valuable resource for future research.

17.
Nucleic Acids Res ; 34(14): 3917-28, 2006.
Article in English | MEDLINE | ID: mdl-16914452

ABSTRACT

We report the first genome-wide identification and characterization of alternative splicing in human gene transcripts based on analysis of the full-length cDNAs. Applying both manual and computational analyses for 56,419 completely sequenced and precisely annotated full-length cDNAs selected for the H-Invitational human transcriptome annotation meetings, we identified 6877 alternative splicing genes with 18,297 different alternative splicing variants. A total of 37,670 exons were involved in these alternative splicing events. The encoded protein sequences were affected in 6005 of the 6877 genes. Notably, alternative splicing affected protein motifs in 3015 genes, subcellular localizations in 2982 genes and transmembrane domains in 1348 genes. We also identified interesting patterns of alternative splicing, in which two distinct genes seemed to be bridged, nested or having overlapping protein coding sequences (CDSs) of different reading frames (multiple CDS). In these cases, completely unrelated proteins are encoded by a single locus. Genome-wide annotations of alternative splicing, relying on full-length cDNAs, should lay firm groundwork for exploring in detail the diversification of protein function, which is mediated by the fast expanding universe of alternative splicing variants.


Subject(s)
Alternative Splicing , DNA, Complementary/chemistry , Genome, Human , Proteins/genetics , RNA, Messenger/chemistry , Amino Acid Motifs , Amino Acid Sequence , Base Sequence , Computational Biology/methods , Exons , Genetic Variation , Genomics/methods , Humans , Proteins/chemistry , Proteins/physiology , RNA, Messenger/metabolism , Sequence Analysis, DNA
19.
PLoS Biol ; 2(6): e162, 2004 Jun.
Article in English | MEDLINE | ID: mdl-15103394

ABSTRACT

The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. 
In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology.


Subject(s)
Computational Biology/methods ; DNA, Complementary/genetics ; Databases, Genetic ; Genes/physiology ; Genome, Human ; Alternative Splicing/genetics ; Genes/genetics ; Humans ; Internet ; Microsatellite Repeats/genetics ; Open Reading Frames/genetics ; Polymorphism, Genetic ; Polymorphism, Single Nucleotide ; Protein Structure, Tertiary
20.
BMC Syst Biol ; 9: 75, 2015 Nov 06.
Article in English | MEDLINE | ID: mdl-26542228

ABSTRACT

BACKGROUND: Cellular function and diversity are orchestrated by complex interactions of fundamental biomolecules including DNA, RNA and proteins. Technological advances in genomics, epigenomics, transcriptomics and proteomics have enabled massively parallel and unbiased measurements. Such high-throughput technologies have been extensively used to carry out broad, unbiased studies, particularly in the context of human diseases. Nevertheless, a unified analysis of the genome, epigenome, transcriptome and proteome of a single human cell type to obtain a coherent view of the complex interplay between various biomolecules has not yet been undertaken. Here, we report the first multi-omic analysis of human primary naïve CD4+ T cells isolated from a single individual. RESULTS: Integrating multi-omics datasets allowed us to investigate genome-wide methylation and its effect on mRNA/protein expression patterns, extent of RNA editing under normal physiological conditions and allele specific expression in naïve CD4+ T cells. In addition, we carried out a multi-omic comparative analysis of naïve with primary resting memory CD4+ T cells to identify molecular changes underlying T cell differentiation. This analysis provided mechanistic insights into how several molecules involved in T cell receptor signaling are regulated at the DNA, RNA and protein levels. Phosphoproteomics revealed downstream signaling events that regulate these two cellular states. Availability of multi-omics data from an identical genetic background also allowed us to employ novel proteogenomics approaches to identify individual-specific variants and putative novel protein coding regions in the human genome. CONCLUSIONS: We utilized multiple high-throughput technologies to derive a comprehensive profile of two primary human cell types, naïve CD4+ T cells and memory CD4+ T cells, from a single donor. 
Through vertical as well as horizontal integration of whole genome sequencing, methylation arrays, RNA-Seq, miRNA-Seq, proteomics, and phosphoproteomics, we derived an integrated and comparative map of these two closely related immune cells and identified potential molecular effectors of immune cell differentiation following antigen encounter.


Subject(s)
CD4-Positive T-Lymphocytes/metabolism ; Immunity, Innate/physiology ; Models, Biological ; DNA Methylation ; Epigenomics ; Gene Expression Profiling ; Genetic Variation ; Genome, Human ; Genomics ; High-Throughput Nucleotide Sequencing ; Humans ; Immunity, Innate/genetics ; Phosphorylation ; Proteomics ; RNA Editing/drug effects ; RNA, Messenger/metabolism ; Signal Transduction/genetics ; Transcriptome