Results 1 - 20 of 31
1.
Nature ; 526(7575): 700-4, 2015 Oct 29.
Article in English | MEDLINE | ID: mdl-26466568

ABSTRACT

Neuroblastoma is a malignant paediatric tumour of the sympathetic nervous system. Roughly half of these tumours regress spontaneously or are cured by limited therapy. By contrast, high-risk neuroblastomas have an unfavourable clinical course despite intensive multimodal treatment, and their molecular basis has remained largely elusive. Here we have performed whole-genome sequencing of 56 neuroblastomas (high-risk, n = 39; low-risk, n = 17) and discovered recurrent genomic rearrangements affecting a chromosomal region at 5p15.33 proximal of the telomerase reverse transcriptase gene (TERT). These rearrangements occurred only in high-risk neuroblastomas (12/39, 31%) in a mutually exclusive fashion with MYCN amplifications and ATRX mutations, which are known genetic events in this tumour type. In an extended case series (n = 217), TERT rearrangements defined a subgroup of high-risk tumours with particularly poor outcome. Despite a large structural diversity of these rearrangements, they all induced massive transcriptional upregulation of TERT. In the remaining high-risk tumours, TERT expression was also elevated in MYCN-amplified tumours, whereas alternative lengthening of telomeres was present in neuroblastomas without TERT or MYCN alterations, suggesting that telomere lengthening represents a central mechanism defining this subtype. The 5p15.33 rearrangements juxtapose the TERT coding sequence to strong enhancer elements, resulting in massive chromatin remodelling and DNA methylation of the affected region. Supporting a functional role of TERT, neuroblastoma cell lines bearing rearrangements or amplified MYCN exhibited both upregulated TERT expression and enzymatic telomerase activity. In summary, our findings show that remodelling of the genomic context abrogates transcriptional silencing of TERT in high-risk neuroblastoma and places telomerase activation in the centre of transformation in a large fraction of these tumours.


Subjects
Gene Expression Regulation, Neoplastic/genetics , Genome, Human/genetics , Neuroblastoma/genetics , Neuroblastoma/pathology , Recombination, Genetic/genetics , Telomerase/genetics , Telomerase/metabolism , Cell Line, Tumor , Cell Transformation, Neoplastic/genetics , Chromatin/genetics , Chromatin/metabolism , Chromosomes, Human, Pair 5/genetics , DNA Helicases/genetics , DNA Methylation , Enhancer Elements, Genetic/genetics , Enzyme Activation/genetics , Gene Amplification/genetics , Gene Silencing , Humans , Infant , N-Myc Proto-Oncogene Protein , Neuroblastoma/classification , Neuroblastoma/enzymology , Nuclear Proteins/genetics , Oncogene Proteins/genetics , Prognosis , RNA, Messenger/analysis , RNA, Messenger/genetics , Risk , Translocation, Genetic/genetics , Up-Regulation/genetics , X-linked Nuclear Protein
2.
BMC Bioinformatics ; 20(1): 405, 2019 Jul 25.
Article in English | MEDLINE | ID: mdl-31345161

ABSTRACT

BACKGROUND: Next-generation sequencing technologies can produce tens of millions of reads, often paired-end, from transcripts or genomes. But few programs can align RNA on the genome and accurately discover introns, especially with long reads. We introduce Magic-BLAST, a new aligner based on ideas from the Magic pipeline. RESULTS: Magic-BLAST uses innovative techniques that include the optimization of a spliced alignment score and selective masking during seed selection. We evaluate the performance of Magic-BLAST to accurately map short or long sequences and its ability to discover introns on real RNA-seq data sets from PacBio, Roche and Illumina runs, and on six benchmarks, and compare it to other popular aligners. Additionally, we look at alignments of human idealized RefSeq mRNA sequences perfectly matching the genome. CONCLUSIONS: We show that Magic-BLAST is the best at intron discovery over a wide range of conditions and the best at mapping reads longer than 250 bases, from any platform. It is versatile and robust to high levels of mismatches or extreme base composition, and reasonably fast. It can align reads to a BLAST database or a FASTA file. It can accept a FASTQ file as input or automatically retrieve an accession from the SRA repository at the NCBI.


Subjects
RNA/genetics , Sequence Alignment , Sequence Analysis, RNA/methods , Software , Algorithms , Base Sequence , Databases, Nucleic Acid , Humans , Introns/genetics , ROC Curve , Time Factors
3.
Immunity ; 31(6): 941-52, 2009 Dec 18.
Article in English | MEDLINE | ID: mdl-20064451

ABSTRACT

Interleukin-21 (IL-21) is a pleiotropic cytokine that induces expression of transcription factor BLIMP1 (encoded by Prdm1), which regulates plasma cell differentiation and T cell homeostasis. We identified an IL-21 response element downstream of Prdm1 that binds the transcription factors STAT3 and IRF4, which are required for optimal Prdm1 expression. Genome-wide ChIP-Seq mapping of STAT3- and IRF4-binding sites showed that most regions with IL-21-induced STAT3 binding also bound IRF4 in vivo and furthermore revealed that the noncanonical TTCnnnTAA GAS motif critical in Prdm1 was broadly used for STAT3 binding. Comparing genome-wide expression array data to binding sites revealed that most IL-21-regulated genes were associated with combined STAT3-IRF4 sites rather than pure STAT3 sites. Correspondingly, ChIP-Seq analysis of Irf4(-/-) T cells showed greatly diminished STAT3 binding after IL-21 treatment, and Irf4(-/-) mice showed impaired IL-21-induced Tfh cell differentiation in vivo. These results reveal broad cooperative gene regulation by STAT3 and IRF4.
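To illustrate the motif named in this abstract, the short Python sketch below scans a DNA string for the noncanonical GAS pattern TTCnnnTAA (TTC, any three bases, then TAA). It is not the authors' ChIP-Seq analysis: the function name `find_gas_sites` and the example sequence are hypothetical, and only the strand given as input is searched.

```python
import re

# Noncanonical GAS motif from the abstract: TTC + any 3 bases + TAA.
# A lookahead is used so that overlapping occurrences are all reported.
NONCANONICAL_GAS = re.compile(r"(?=(TTC[ACGT]{3}TAA))")


def find_gas_sites(sequence):
    """Return 0-based start positions of TTCnnnTAA on the given strand."""
    seq = sequence.upper()
    return [match.start() for match in NONCANONICAL_GAS.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical example sequence, for illustration only.
    example = "aattcaggtaacc"
    print(find_gas_sites(example))
```

A fuller scan would also search the reverse complement (TTAnnnGAA on the given strand), which this sketch leaves out for brevity.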


Subjects
Gene Expression Regulation , Interferon Regulatory Factors/metabolism , Interleukins/metabolism , STAT3 Transcription Factor/metabolism , Transcription Factors/genetics , Animals , B-Lymphocytes/immunology , Base Sequence , Binding Sites , CD4-Positive T-Lymphocytes/immunology , Cell Differentiation , Genome-Wide Association Study , Interferon Regulatory Factors/genetics , Introns , Mice , Mice, Inbred C57BL , Mice, Knockout , Molecular Sequence Data , Positive Regulatory Domain I-Binding Factor 1 , STAT3 Transcription Factor/genetics
4.
Cephalalgia ; 38(5): 912-932, 2018 04.
Article in English | MEDLINE | ID: mdl-28699403

ABSTRACT

BACKGROUND: The trigeminal ganglion contains neurons that relay sensations of pain, touch, pressure, and many other somatosensory modalities to the central nervous system. The ganglion is also a reservoir for latent herpes virus 1 infection. To gain a better understanding of molecular factors contributing to migraine and headache, transcriptome analyses were performed on postmortem human trigeminal ganglia. METHODS: RNA-Seq measurements of gene expression were conducted on small sub-regions of 16 human trigeminal ganglia. The samples were also characterized for transcripts derived from viral and microbial genomes. Herpes simplex virus 1 (HSV-1) antibodies in blood were measured using the luciferase immunoprecipitation assay. RESULTS: Observed molecular heterogeneity could be explained by sampling of anatomically distinct sub-regions of the excised ganglia consistent with neurally-enriched and non-neural, i.e. Schwann cell, enriched subregions. The levels of HSV-1 transcripts detected in trigeminal ganglia correlated with blood levels of HSV-1 antibodies. Multiple migraine susceptibility genes were strongly expressed in neurally-enriched trigeminal samples, while others were expressed in blood vessels. CONCLUSIONS: These data provide a comprehensive human trigeminal transcriptome and a framework for evaluation of inhomogeneous post-mortem tissues through extensive quality control and refined downstream analyses for RNA-Seq methodologies. Expression profiling of migraine susceptibility genes identified by genetic association appears to emphasize the blood vessel component of the trigeminovascular system. Other genes displayed enriched expression in the trigeminal compared to dorsal root ganglion, and in-depth transcriptomic analysis of the KCNK18 gene underlying familial migraine shows selective neural expression within two specific populations of ganglionic neurons. These data suggest that expression profiling of migraine-associated genes can extend and amplify the underlying neurobiological insights obtained from genetic association studies.


Subjects
Herpesvirus 1, Human/genetics , Potassium Channels/genetics , RNA/genetics , Sequence Analysis, RNA/methods , Trigeminal Ganglion/pathology , Adolescent , Adult , Autopsy , Female , Humans , Male , Middle Aged , Trigeminal Ganglion/physiology , Trigeminal Ganglion/virology , Young Adult
5.
Nucleic Acids Res ; 43(Database issue): D737-42, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25392405

ABSTRACT

The non-human primate reference transcriptome resource (NHPRTR, available online at http://nhprtr.org/) aims to generate comprehensive RNA-seq data from a wide variety of non-human primates (NHPs), from lemurs to hominids. In the 2012 Phase I of the NHPRTR project, 19 billion fragments or 3.8 terabases of transcriptome sequences were collected from pools of ∼ 20 tissues in 15 species and subspecies. Here we describe a major expansion of NHPRTR by adding 10.1 billion fragments of tissue-specific RNA-seq data. For this effort, we selected 11 of the original 15 NHP species and subspecies and constructed total RNA libraries for the same ∼ 15 tissues in each. The sequence quality is such that 88% of the reads align to human reference sequences, allowing us to compute the full list of expression abundance across all tissues for each species, using the reads mapped to human genes. This update also includes improved transcript annotations derived from RNA-seq data for rhesus and cynomolgus macaques, two of the most commonly used NHP models and additional RNA-seq data compiled from related projects. Together, these comprehensive reference transcriptomes from multiple primates serve as a valuable community resource for genome annotation, gene dynamics and comparative functional analysis.


Subjects
Databases, Genetic , Gene Expression Profiling , Primates/genetics , Sequence Analysis, RNA , Animals , Internet , Macaca , Molecular Sequence Annotation , Organ Specificity , Reference Standards , Sequence Alignment/standards
6.
Nucleic Acids Res ; 41(Database issue): D906-14, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23203872

ABSTRACT

RNA-based next-generation sequencing (RNA-Seq) provides a tremendous amount of new information regarding gene and transcript structure, expression and regulation. This is particularly true for non-coding RNAs where whole transcriptome analyses have revealed that much of the genome is transcribed and that many non-coding transcripts have widespread functionality. However, uniform resources for raw, cleaned and processed RNA-Seq data are sparse for most organisms and this is especially true for non-human primates (NHPs). Here, we describe a large-scale RNA-Seq data and analysis infrastructure, the NHP reference transcriptome resource (http://nhprtr.org); it presently hosts data from 12 species of primates, to be expanded to 15 species/subspecies spanning great apes, old world monkeys, new world monkeys and prosimians. Data are collected for each species using pools of RNA from comparable tissues. We provide data access in advance of its deposition at NCBI, as well as browsable tracks of alignments against the human genome using the UCSC genome browser. This resource will continue to host additional RNA-Seq data, alignments and assemblies as they are generated over the coming years and provide a key resource for the annotation of NHP genomes as well as informing primate studies on evolution, reproduction, infection, immunity and pharmacology.


Subjects
Databases, Nucleic Acid , Genomics , Primates/genetics , Transcriptome , Animals , Genome, Human , Humans , Internet , Primates/metabolism , Sequence Alignment , Sequence Analysis, RNA
7.
Mol Pain ; 10: 44, 2014 Aug 14.
Article in English | MEDLINE | ID: mdl-25123163

ABSTRACT

BACKGROUND: Three neuropeptides, gastrin releasing peptide (GRP), natriuretic precursor peptide B (NPPB), and neuromedin B (NMB) have been proposed to play roles in itch sensation. However, the tissues in which these peptides are expressed and their positions in the itch circuit have recently become the subject of debate. Here we used next-gen RNA-Seq to examine the expression of transcripts coding for GRP, NPPB, NMB, and other peptides in DRG, trigeminal ganglion, and the spinal cord as well as expression levels for their cognate receptors in these tissues. RESULTS: RNA-Seq demonstrates that GRP is not transcribed in mouse, rat, or human sensory ganglia. NPPB, which activates natriuretic peptide receptor 1 (NPR1), is well expressed in mouse DRG and less so in rat and human, whereas NPPA, which also acts on the NPR1 receptor, is expressed in all three species. Analysis of transcripts expressed in the spinal cord of mouse, rat, and human reveals no expression of Nppb, but unambiguously detects expression of Grp and the GRP-receptor (Grpr). The transcripts coding for NMB and tachykinin peptides are among the most highly expressed in DRG. Bioinformatics comparisons using the sequence of the peptides used to produce GRP-antibodies with proteome databases revealed that the C-terminal primary sequence of NMB and Substance P can potentially account for results from previous studies which showed GRP-immunostaining in the DRG. CONCLUSIONS: RNA-Seq corroborates a primary itch afferent role for NPPB in mouse and potentially NPPB and NPPA in rats and humans, but does not support GRP as a primary itch neurotransmitter in mouse, rat, or humans. As such, our results are at odds with the initial proposal of Sun and Chen (2007) that GRP is expressed in DRG. By contrast, our data strongly support an itch pathway where the itch-inducing actions of GRP are exerted through its release from spinal cord neurons.


Subjects
Ganglia, Spinal/metabolism , Gastrin-Releasing Peptide/metabolism , Natriuretic Peptide, Brain/metabolism , Spinal Cord/cytology , Trigeminal Ganglion/metabolism , Animals , Base Sequence , Computational Biology , Gastrin-Releasing Peptide/genetics , Humans , Mice , Natriuretic Peptide, Brain/genetics , Rats , Receptors, Neuropeptide/genetics , Receptors, Neuropeptide/metabolism , Species Specificity
8.
Nat Commun ; 15(1): 4950, 2024 Jun 11.
Article in English | MEDLINE | ID: mdl-38862496

ABSTRACT

The advent of civilian spaceflight challenges scientists to precisely describe the effects of spaceflight on human physiology, particularly at the molecular and cellular level. Newer, nanopore-based sequencing technologies can quantitatively map changes in chemical structure and expression at single molecule resolution across entire isoforms. We perform long-read, direct RNA nanopore sequencing, as well as Ultima high-coverage RNA-sequencing, of whole blood sampled longitudinally from four SpaceX Inspiration4 astronauts at seven timepoints, spanning pre-flight, day of return, and post-flight recovery. We report key genetic pathways, including changes in erythrocyte regulation, stress induction, and immune changes affected by spaceflight. We also present the first m6A methylation profiles for a human space mission, suggesting a significant spike in m6A levels immediately post-flight. These data and results represent the first longitudinal long-read RNA profiles and RNA modification maps for each gene for astronauts, improving our understanding of the human transcriptome's dynamic response to spaceflight.


Subjects
Astronauts , Sequence Analysis, RNA , Space Flight , Humans , Sequence Analysis, RNA/methods , Transcriptome/genetics , Weightlessness , Male , Hematopoiesis/genetics , Nanopore Sequencing/methods , Adult , RNA/genetics , RNA/blood , Methylation , Middle Aged
9.
Sci Data ; 11(1): 892, 2024 Aug 16.
Article in English | MEDLINE | ID: mdl-39152166

ABSTRACT

Next-generation sequencing (NGS) has revolutionized genomic research by enabling high-throughput, cost-effective genome and transcriptome sequencing accelerating personalized medicine for complex diseases, including cancer. Whole genome/transcriptome sequencing (WGS/WTS) provides comprehensive insights, while targeted sequencing is more cost-effective and sensitive. In comparison to short-read sequencing, which still dominates the field due to high speed and cost-effectiveness, long-read sequencing can overcome alignment limitations and better discriminate similar sequences from alternative transcripts or repetitive regions. Hybrid sequencing combines the best strengths of different technologies for a more comprehensive view of genomic/transcriptomic variations. Understanding each technology's strengths and limitations is critical for translating cutting-edge technologies into clinical applications. In this study, we sequenced DNA and RNA libraries of reference samples using various targeted DNA and RNA panels and the whole transcriptome on both short-read and long-read platforms. This study design enables a comprehensive analysis of sequencing technologies, targeting protocols, and library preparation methods. Our expanded profiling landscape establishes a reference point for assessing current sequencing technologies, facilitating informed decision-making in genomic research and precision medicine.


Subjects
High-Throughput Nucleotide Sequencing , Humans , RNA-Seq , Sequence Analysis, DNA/methods , Transcriptome , Sequence Analysis, RNA , Precision Medicine
10.
Nat Biotechnol ; 2023 Sep 07.
Article in English | MEDLINE | ID: mdl-37679545

ABSTRACT

Certified RNA reference materials are indispensable for assessing the reliability of RNA sequencing to detect intrinsically small biological differences in clinical settings, such as molecular subtyping of diseases. As part of the Quartet Project for quality control and data integration of multi-omics profiling, we established four RNA reference materials derived from immortalized B-lymphoblastoid cell lines from four members of a monozygotic twin family. Additionally, we constructed ratio-based transcriptome-wide reference datasets between two samples, providing cross-platform and cross-laboratory 'ground truth'. Investigation of the intrinsically subtle biological differences among the Quartet samples enables sensitive assessment of cross-batch integration of transcriptomic measurements at the ratio level. The Quartet RNA reference materials, combined with the ratio-based reference datasets, can serve as unique resources for assessing and improving the quality of transcriptomic data in clinical and biological settings.

11.
Genome Res ; 19(12): 2288-99, 2009 Dec.
Article in English | MEDLINE | ID: mdl-19767418

ABSTRACT

The organization of mammalian DNA replication is poorly understood. We have produced high-resolution dynamic maps of the timing of replication in human erythroid, mesenchymal, and embryonic stem (ES) cells using TimEX, a method that relies on gaussian convolution of massive, highly redundant determinations of DNA copy-number variations during S phase to produce replication timing profiles. We first obtained timing maps of 3% of the genome using high-density oligonucleotide tiling arrays and then extended the TimEX method genome-wide using massively parallel sequencing. We show that in untransformed human cells, timing of replication is highly regulated and highly synchronous, and that many genomic segments are replicated in temporal transition regions devoid of initiation, where replication forks progress unidirectionally from origins that can be hundreds of kilobases away. Absence of initiation in one transition region is shown at the molecular level by single molecule analysis of replicated DNA (SMARD). Comparison of ES and erythroid cells replication patterns revealed that these cells replicate about 20% of their genome in different quarters of S phase. Importantly, we detected a strong inverse relationship between timing of replication and distance to the closest expressed gene. This relationship can be used to predict tissue-specific timing of replication profiles from expression data and genomic annotations. We also provide evidence that early origins of replication are preferentially located near highly expressed genes, that mid-firing origins are located near moderately expressed genes, and that late-firing origins are located far from genes.
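The abstract describes TimEX as relying on Gaussian convolution of many redundant copy-number determinations. The Python sketch below shows only that generic smoothing step, as a minimal illustration. It is not the TimEX implementation: the bin layout, the kernel width `sigma_bins`, the edge handling and the function names are all assumptions made for this example.

```python
import math


def gaussian_kernel(sigma_bins):
    """Build a normalized Gaussian kernel truncated at about 3 sigma (assumed cutoff)."""
    radius = max(1, int(3 * sigma_bins))
    weights = [math.exp(-0.5 * (k / sigma_bins) ** 2)
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(values, sigma_bins):
    """Convolve per-bin copy-number values with a Gaussian kernel.

    Positions past either end are clamped to the nearest edge bin, so the
    output has the same length as the input.
    """
    kernel = gaussian_kernel(sigma_bins)
    radius = len(kernel) // 2
    n = len(values)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index at the edges
            acc += weight * values[j]
        smoothed.append(acc)
    return smoothed


if __name__ == "__main__":
    # Hypothetical noisy copy-number ratios along consecutive genomic bins.
    noisy = [1.0, 1.4, 0.9, 1.6, 1.1, 1.8, 1.5, 2.0, 1.7, 1.9]
    print([round(v, 3) for v in smooth(noisy, sigma_bins=1.5)])
```

A wider kernel gives a smoother profile at the cost of blurring sharp transitions between early- and late-replicating segments, so the width would need to be chosen against the resolution of the data.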


Subjects
DNA Replication Timing , DNA Replication , Embryonic Stem Cells , Erythroid Cells , Gene Expression Profiling , Mesenchymal Stem Cells , S Phase , Cell Differentiation , DNA/biosynthesis , DNA/genetics , Embryonic Stem Cells/cytology , Embryonic Stem Cells/metabolism , Erythroid Cells/cytology , Erythroid Cells/metabolism , Gene Dosage , Humans , Mesenchymal Stem Cells/cytology , Mesenchymal Stem Cells/metabolism , Normal Distribution
12.
Proc Natl Acad Sci U S A ; 106(44): 18674-9, 2009 Nov 03.
Article in English | MEDLINE | ID: mdl-19846761

ABSTRACT

Endogenous small interfering RNAs (endo-siRNAs) regulate diverse gene expression programs in eukaryotes by either binding and cleaving mRNA targets or mediating heterochromatin formation; however, the mechanisms of endo-siRNA biogenesis, sorting, and target regulation remain poorly understood. Here we report the identification and function of a specific class of germline-generated endo-siRNAs in Caenorhabditis elegans that are 26 nt in length and contain a guanine at the first nucleotide position (i.e., 26G RNAs). 26G RNAs regulate gene expression during spermatogenesis and zygotic development, and their biogenesis requires the ERI-1 exonuclease and the RRF-3 RNA-dependent RNA polymerase (RdRP). Remarkably, we identified two nonoverlapping subclasses of 26G RNAs that sort into specific RNA-induced silencing complexes (RISCs) and differentially regulate distinct mRNA targets. Class I 26G RNAs target genes that are expressed during spermatogenesis, whereas class II 26G RNAs are maternally inherited and silence gene expression during zygotic development. These findings implicate a class of endo-siRNAs in the global regulation of transcriptional programs required for fertility and development.
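The definition of a 26G RNA given in this abstract (26 nt long, guanine at the first position) can be written as a simple read filter. The Python sketch below illustrates that definition only. It is not the authors' pipeline, and the names `is_26g` and `select_26g` and the example reads are hypothetical.

```python
def is_26g(read):
    """True if the read is exactly 26 nt long and begins with guanine."""
    seq = read.strip().upper()
    return len(seq) == 26 and seq.startswith("G")


def select_26g(reads):
    """Keep only reads matching the 26G definition, preserving input order."""
    return [read for read in reads if is_26g(read)]


if __name__ == "__main__":
    # Hypothetical small-RNA reads, for illustration only.
    reads = [
        "G" + "A" * 25,   # 26 nt, starts with G: kept
        "A" + "C" * 25,   # 26 nt, starts with A: dropped
        "G" + "U" * 21,   # 22 nt, starts with G: dropped
    ]
    print(select_26g(reads))
```

A filter like this selects candidates by sequence features alone. Assigning reads to the class I or class II subclasses described above would need additional information, such as target genes or associated complexes, that is not encoded in the read itself.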


Subjects
Caenorhabditis elegans/embryology , Caenorhabditis elegans/genetics , Gene Expression Regulation, Developmental , Guanine/metabolism , RNA, Small Interfering/metabolism , Spermatogenesis/genetics , Zygote/metabolism , Animals , Caenorhabditis elegans Proteins/metabolism , Exoribonucleases/metabolism , Gene Silencing , Germ Cells/metabolism , Male , RNA, Helminth/classification , RNA, Helminth/metabolism , RNA, Small Interfering/biosynthesis , RNA, Small Interfering/classification , Sequence Analysis, DNA
13.
J Pain ; 21(9-10): 988-1004, 2020.
Article in English | MEDLINE | ID: mdl-31931229

ABSTRACT

Understanding molecular alterations associated with peripheral inflammation is a critical factor in selectively controlling acute and persistent pain. The present report employs in situ hybridization of the 2 opioid precursor mRNAs coupled with quantitative measurements of 2 peptides derived from the prodynorphin and proenkephalin precursor proteins: dynorphin A 1-8 and [Met5]-enkephalin-Arg6-Gly7-Leu8. In dorsal spinal cord ipsilateral to the inflammation, dynorphin A 1-8 was elevated after inflammation, and persisted as long as the inflammation was sustained. Qualitative identification by high performance liquid chromatography and gel permeation chromatography revealed the major immunoreactive species in control and inflamed extracts to be dynorphin A 1-8. In situ hybridization in spinal cord after administration of the inflammatory agent, carrageenan, showed increased expression of prodynorphin (Pdyn) mRNA somatotopically in medial superficial dorsal horn neurons. The fold increase in preproenkephalin mRNA (Penk) was comparatively lower, although the basal expression is substantially higher than Pdyn. While Pdyn is not expressed in the dorsal root ganglion (DRG) in basal conditions, it can be induced by nerve injury, but not by inflammation alone. A bioinformatic meta-analysis of multiple nerve injury datasets confirmed Pdyn upregulation in DRG across different nerve injury models. These data support the idea that activation of endogenous opioids, notably dynorphin, is a dynamic indicator of persistent pain states in spinal cord and of nerve injury in DRG. PERSPECTIVE: This is a systematic, quantitative assessment of dynorphin and enkephalin peptides and mRNA in dorsal spinal cord and DRG neurons in response to peripheral inflammation and axotomy. These studies form the foundational framework for understanding how endogenous spinal opioid peptides are involved in nociceptive circuit modulation.


Subjects
Dynorphins/metabolism , Enkephalins/metabolism , Ganglia, Spinal/metabolism , Hyperalgesia/metabolism , Inflammation Mediators/metabolism , Spinal Cord/metabolism , Animals , Dynorphins/analysis , Enkephalins/analysis , Ganglia, Spinal/chemistry , Inflammation Mediators/analysis , Male , Opioid Peptides/analysis , Opioid Peptides/metabolism , Peptide Fragments/analysis , Peptide Fragments/metabolism , RNA, Messenger/analysis , RNA, Messenger/metabolism , Rats , Rats, Sprague-Dawley , Spinal Cord/chemistry
14.
bioRxiv ; 2020 May 01.
Article in English | MEDLINE | ID: mdl-32511352

ABSTRACT

The Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) has caused thousands of deaths worldwide, including >18,000 in New York City (NYC) alone. The sudden emergence of this pandemic has highlighted a pressing clinical need for rapid, scalable diagnostics that can detect infection, interrogate strain evolution, and identify novel patient biomarkers. To address these challenges, we designed a fast (30-minute) colorimetric test (LAMP) for SARS-CoV-2 infection from naso/oropharyngeal swabs, plus a large-scale shotgun metatranscriptomics platform (total-RNA-seq) for host, bacterial, and viral profiling. We applied both technologies across 857 SARS-CoV-2 clinical specimens and 86 NYC subway samples, providing a broad molecular portrait of the COVID-19 NYC outbreak. Our results define new features of SARS-CoV-2 evolution, nominate a novel, NYC-enriched viral subclade, reveal specific host responses in interferon, ACE, hematological, and olfaction pathways, and examine risks associated with use of ACE inhibitors and angiotensin receptor blockers. Together, these findings have immediate applications to SARS-CoV-2 diagnostics, public health, and new therapeutic targets.

15.
BMC Genomics ; 10: 264, 2009 Jun 12.
Article in English | MEDLINE | ID: mdl-19523228

ABSTRACT

BACKGROUND: Transcriptome sequencing using next-generation sequencing platforms will soon be competing with DNA microarray technologies for global gene expression analysis. As a preliminary evaluation of these promising technologies, we performed deep sequencing of cDNA synthesized from the Microarray Quality Control (MAQC) reference RNA samples using Roche's 454 Genome Sequencer FLX. RESULTS: We generated more than 3.6 million sequence reads of average length 250 bp for the MAQC A and B samples and introduced a data analysis pipeline for translating cDNA read counts into gene expression levels. Using BLAST, 90% of the reads mapped to the human genome and 64% of the reads mapped to the RefSeq database of well annotated genes with e-values

Subjects
Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Sequence Analysis, RNA/methods , DNA, Complementary/genetics , Databases, Genetic , Gene Library , Genome, Human , Humans , Quality Control , Reference Standards , Sensitivity and Specificity , Sequence Alignment , Software
16.
Nat Biotechnol ; 24(9): 1123-31, 2006 Sep.
Article in English | MEDLINE | ID: mdl-16964226

ABSTRACT

We have assessed the utility of RNA titration samples for evaluating microarray platform performance and the impact of different normalization methods on the results obtained. As part of the MicroArray Quality Control project, we investigated the performance of five commercial microarray platforms using two independent RNA samples and two titration mixtures of these samples. Focusing on 12,091 genes common across all platforms, we determined the ability of each platform to detect the correct titration response across the samples. Global deviations from the response predicted by the titration ratios were observed. These differences could be explained by variations in relative amounts of messenger RNA as a fraction of total RNA between the two independent samples. Overall, both the qualitative and quantitative correspondence across platforms was high. In summary, titration samples may be regarded as a valuable tool, not only for assessing microarray platform performance and different analysis methods, but also for determining some underlying biological features of the samples.


Subjects
Equipment Failure Analysis/methods , Gene Expression Profiling/instrumentation , Gene Expression Profiling/standards , Oligonucleotide Array Sequence Analysis/instrumentation , Oligonucleotide Array Sequence Analysis/standards , RNA/analysis , RNA/genetics , Algorithms , Reference Values , Reproducibility of Results , Sensitivity and Specificity , United States
17.
Front Cell Dev Biol ; 7: 299, 2019.
Article in English | MEDLINE | ID: mdl-31824949

ABSTRACT

Secreted proteins (SPs) play important roles in diverse biological processes; however, a comprehensive and high-quality list of human SPs is still lacking. Here we identified 6,943 high-confidence human SPs (3,522 of them novel) based on 330,427 human proteins derived from the UniProt, Ensembl, AceView, and RefSeq databases. Notably, 6,267 of 6,943 (90.3%) SPs have supporting evidence from a large amount of mass spectrometry (MS) and RNA-seq data. We found that the SPs were broadly expressed in diverse tissues as well as human body fluids, and a significant portion of them exhibited tissue-specific expression. Moreover, we identified 14 cancer-specific SPs whose expression levels were significantly associated with patient survival in eight different tumors, and which could be potential prognostic biomarkers. Strikingly, 89.21% of 6,943 SPs (2,927 novel SPs) contain known protein domains. These novel SPs were mainly enriched in known domains related to immunity, such as the Immunoglobulin V-set and C1-set domains. Specifically, we constructed a user-friendly and freely accessible database, SPRomeDB (www.unimd.org/SPRomeDB), to catalog those SPs. Our comprehensive SP identification and characterization provide insights into the human secretome and a valuable resource for future research.

18.
Nucleic Acids Res ; 34(14): 3917-28, 2006.
Article in English | MEDLINE | ID: mdl-16914452

ABSTRACT

We report the first genome-wide identification and characterization of alternative splicing in human gene transcripts based on analysis of the full-length cDNAs. Applying both manual and computational analyses for 56,419 completely sequenced and precisely annotated full-length cDNAs selected for the H-Invitational human transcriptome annotation meetings, we identified 6,877 alternative splicing genes with 18,297 different alternative splicing variants. A total of 37,670 exons were involved in these alternative splicing events. The encoded protein sequences were affected in 6,005 of the 6,877 genes. Notably, alternative splicing affected protein motifs in 3,015 genes, subcellular localizations in 2,982 genes and transmembrane domains in 1,348 genes. We also identified interesting patterns of alternative splicing, in which two distinct genes seemed to be bridged, nested or to have overlapping protein coding sequences (CDSs) of different reading frames (multiple CDS). In these cases, completely unrelated proteins are encoded by a single locus. Genome-wide annotations of alternative splicing, relying on full-length cDNAs, should lay firm groundwork for exploring in detail the diversification of protein function, which is mediated by the fast expanding universe of alternative splicing variants.


Subjects
Alternative Splicing , Complementary DNA/chemistry , Human Genome , Proteins/genetics , Messenger RNA/chemistry , Amino Acid Motifs , Amino Acid Sequence , Base Sequence , Computational Biology/methods , Exons , Genetic Variation , Genomics/methods , Humans , Proteins/chemistry , Proteins/physiology , Messenger RNA/metabolism , DNA Sequence Analysis
20.
PLoS Biol ; 2(6): e162, 2004 Jun.
Article in English | MEDLINE | ID: mdl-15103394

ABSTRACT

The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. 
In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology.


Subjects
Computational Biology/methods , Complementary DNA/genetics , Genetic Databases , Genes/physiology , Human Genome , Alternative Splicing/genetics , Genes/genetics , Humans , Internet , Microsatellite Repeats/genetics , Open Reading Frames/genetics , Genetic Polymorphism , Single Nucleotide Polymorphism , Protein Tertiary Structure