Results 1 - 20 of 4,067
1.
Medicine (Baltimore) ; 98(34): e16916, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31441872

ABSTRACT

BACKGROUND: Colorectal cancer (CRC) is a highly heterogeneous disease. RNA profiles of bulk tumors have enabled transcriptional classification of CRC. However, bulk sequencing measures a pooled population of cells and obscures the signatures of distinct cell populations. In contrast, single-cell RNA sequencing (scRNA-seq), which can provide unbiased analysis of all cell types, makes it possible to map the cellular heterogeneity of CRC. METHODS: In this study, we used scRNA-seq to profile cells from the cancer tissue of a CRC patient. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses were performed to understand the roles of genes within the clusters. RESULTS AND CONCLUSION: A total of 2824 cells were analyzed and grouped into 5 distinct clusters by scRNA-seq. Each cluster had specific cell markers, indicating that the clusters differ from one another. The CRC tumor displayed clear heterogeneity, and genes within each cluster serve different functions. GO term analysis also indicated that the clusters differ in how they relate to the CRC tumor: three clusters participate in peripheral cellular functions, including energy transport and extracellular matrix generation, whereas genes in the other 2 clusters participate more in immunological processes. Lastly, trajectory analysis supports this view, in that some clusters are present in several states and pseudotimes, while others are present in a single state or pseudotime. Our analysis provides more insight into the heterogeneity of CRC and can assist further research on this topic.


Subjects
Colorectal Neoplasms/genetics , Gene Expression Profiling/methods , Genetic Heterogeneity , Sequence Analysis, RNA/methods , Aged , Biomarkers, Tumor/genetics , Female , Humans
2.
BMC Bioinformatics ; 20(1): 418, 2019 Aug 13.
Article in English | MEDLINE | ID: mdl-31409293

ABSTRACT

BACKGROUND: Standard RNAseq methods using bulk RNA and recent single-cell RNAseq methods use DNA barcodes to identify samples and cells, and the barcoded cDNAs are pooled into a library pool before high throughput sequencing. In cases of single-cell and low-input RNAseq methods, the library is further amplified by PCR after the pooling. Preparation of hundreds or more samples for a large study often requires multiple library pools. However, sometimes correlation between expression profiles among the libraries is low and batch effect biases make integration of data between library pools difficult. RESULTS: We investigated 166 technical replicates in 14 RNAseq libraries made using the STRT method. The patterns of the library biases differed by genes, and uneven library yields were associated with library biases. The former bias was corrected using the NBGLM-LBC algorithm, which we present in the current study. The latter bias could not be corrected directly, but could be solved by omitting libraries with particularly low yields. A simulation experiment suggested that the library bias correction using NBGLM-LBC requires a consistent sample layout. The NBGLM-LBC correction method was applied to an expression profile for a cohort study of childhood acute respiratory illness, and the library biases were resolved. CONCLUSIONS: The R source code for the library bias correction named NBGLM-LBC is available at https://shka.github.io/NBGLM-LBC and https://shka.bitbucket.io/NBGLM-LBC . This method is applicable to correct the library biases in various studies that use highly multiplexed sequencing-based profiling methods with a consistent sample layout with samples to be compared (e.g., "cases" and "controls") equally distributed in each library.


Subjects
Gene Library , Sequence Analysis, RNA/methods , Transcriptome , Cell Line , Cluster Analysis , High-Throughput Nucleotide Sequencing , Humans , Principal Component Analysis , RNA/chemistry , RNA/metabolism , User-Computer Interface
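
The abstract of record 2 states that the correction needs a consistent sample layout, with the groups to be compared distributed equally across libraries. The following is a minimal sketch of how such a layout could be checked before sequencing. It is not part of NBGLM-LBC (which is R code); the function name `layout_is_consistent` and the reading of "equally distributed" as "identical group proportions in every library" are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from fractions import Fraction


def layout_is_consistent(assignments):
    """Return True if every library holds the groups in the same proportions.

    assignments: iterable of (library_id, group) pairs, one per sample.
    Hypothetical helper; the proportion-based definition is an assumption.
    """
    per_library = defaultdict(Counter)
    for library, group in assignments:
        per_library[library][group] += 1

    # All groups seen anywhere, in a fixed order, so proportions are comparable.
    groups = sorted({g for counts in per_library.values() for g in counts})

    reference = None
    for counts in per_library.values():
        total = sum(counts.values())
        # Exact fractions avoid floating-point comparison problems.
        proportions = tuple(Fraction(counts[g], total) for g in groups)
        if reference is None:
            reference = proportions
        elif proportions != reference:
            return False
    return True


balanced = [("lib1", "case"), ("lib1", "control"),
            ("lib2", "case"), ("lib2", "control")]
skewed = [("lib1", "case"), ("lib1", "case"),
          ("lib2", "control"), ("lib2", "control")]
print(layout_is_consistent(balanced), layout_is_consistent(skewed))
```

A layout such as `skewed`, where each library contains only one group, confounds library with group, which is the situation the abstract warns cannot be corrected.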
3.
BMC Bioinformatics ; 20(1): 379, 2019 Jul 08.
Article in English | MEDLINE | ID: mdl-31286861

ABSTRACT

BACKGROUND: Unsupervised machine learning methods (deep learning) have shown their usefulness with noisy single cell mRNA-sequencing data (scRNA-seq), where the models generalize well, despite the zero-inflation of the data. A class of neural networks, namely autoencoders, has been useful for denoising of single cell data, imputation of missing values and dimensionality reduction. RESULTS: Here, we present a striking feature with the potential to greatly increase the usability of autoencoders: with specialized training, the autoencoder is not only able to generalize over the data, but also to tease apart biologically meaningful modules, which we found encoded in the representation layer of the network. Our model can, from scRNA-seq data, delineate biologically meaningful modules that govern a dataset, as well as give information as to which modules are active in each single cell. Importantly, most of these modules can be explained by known biological functions, as provided by the Hallmark gene sets. CONCLUSIONS: We discover that tailored training of an autoencoder makes it possible to deconvolute biological modules inherent in the data, without any assumptions. By comparison with gene signatures of canonical pathways, we see that the modules are directly interpretable. The scope of this discovery has important implications, as it makes it possible to outline the drivers behind a given effect of a cell. In comparison with other dimensionality reduction methods, or supervised models for classification, our approach has the benefit of both handling the zero-inflated nature of scRNA-seq well and validating that the model captures relevant information, by establishing a link between input and decoded data. In perspective, our model in combination with clustering methods is able to provide information about which subtype a given single cell belongs to, as well as which biological functions determine that membership.


Subjects
Gene Expression Profiling/methods , Neural Networks, Computer , RNA, Messenger/chemistry , Sequence Analysis, RNA/methods , Unsupervised Machine Learning , Cluster Analysis , RNA, Messenger/metabolism , Single-Cell Analysis
4.
BMC Bioinformatics ; 20(1): 369, 2019 Jul 01.
Article in English | MEDLINE | ID: mdl-31262249

ABSTRACT

BACKGROUND: Single cell RNA sequencing (scRNA-seq) brings unprecedented opportunities for mapping the heterogeneity of complex cellular environments such as bone marrow, and provides insight into many cellular processes. Single cell RNA-seq has a far larger fraction of missing data reported as zeros (dropouts) than traditional bulk RNA-seq, and unsupervised clustering combined with Principal Component Analysis (PCA) can be used to overcome this limitation. After clustering, however, one has to interpret the average expression of markers on each cluster to identify the corresponding cell types, and this is normally done by hand by an expert curator. RESULTS: We present a computational tool for processing single cell RNA-seq data that uses a voting algorithm to automatically identify cells based on approval votes received by known molecular markers. Using a stochastic procedure that accounts for imbalances in the number of known molecular signatures for different cell types, the method computes the statistical significance of the final approval score and automatically assigns a cell type to clusters without an expert curator. We demonstrate the utility of the tool in the analysis of eight samples of bone marrow from the Human Cell Atlas. The tool provides a systematic identification of cell types in bone marrow based on a list of markers of immune cell types, and incorporates a suite of visualization tools that can be overlaid on a t-SNE representation. The software is freely available as a Python package at https://github.com/sdomanskyi/DigitalCellSorter . CONCLUSIONS: This methodology ensures that extensive marker-to-cell-type matching information is taken into account in a systematic way when assigning cell clusters to cell types. Moreover, the method allows for high-throughput processing of multiple scRNA-seq datasets, since it does not involve an expert curator, and it can be applied recursively to obtain cell sub-types. The software is designed to allow the user to substitute the marker-to-cell-type matching information and apply the methodology to different cellular environments.


Subjects
Bone Marrow Cells/cytology , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Software , Algorithms , Bone Marrow Cells/metabolism , Cluster Analysis , Humans , Principal Component Analysis , Single-Cell Analysis
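
The approval-voting idea in the abstract of record 4 can be illustrated with a minimal sketch: each marker found in a cluster casts one vote for every cell type it is known to support, and the type with the most votes wins. This is not the DigitalCellSorter implementation. The marker table, the function names and the tie handling are hypothetical, and the stochastic significance step described in the abstract is left out.

```python
from collections import Counter

# Hypothetical marker table: marker gene -> cell types it is taken to support.
MARKER_TO_TYPES = {
    "CD3E": ["T cell"],
    "CD19": ["B cell"],
    "MS4A1": ["B cell"],
    "CD14": ["Monocyte"],
    "PTPRC": ["T cell", "B cell", "Monocyte"],
}


def approval_votes(cluster_markers, marker_to_types):
    """Count one approval vote per (marker, supported cell type) pair."""
    votes = Counter()
    for marker in cluster_markers:
        for cell_type in marker_to_types.get(marker, []):
            votes[cell_type] += 1
    return votes


def assign_cell_type(cluster_markers, marker_to_types):
    """Return the uniquely top-voted cell type, or None if there is no clear winner."""
    votes = approval_votes(cluster_markers, marker_to_types)
    if not votes:
        return None  # no known marker was found in this cluster
    best = max(votes.values())
    winners = [t for t, v in votes.items() if v == best]
    return winners[0] if len(winners) == 1 else None  # ties stay unassigned


print(assign_cell_type(["CD19", "MS4A1", "PTPRC"], MARKER_TO_TYPES))
```

Raw vote counts favour cell types with many known markers, which is the imbalance the abstract says the published method corrects for; a fuller version would compare each score against scores obtained from randomized marker tables.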
5.
BMC Bioinformatics ; 20(1): 388, 2019 Jul 12.
Article in English | MEDLINE | ID: mdl-31299886

ABSTRACT

BACKGROUND: Single-cell RNA-sequencing technologies provide a powerful tool for systematic dissection of cellular heterogeneity. However, the prevalence of dropout events imposes complications during data analysis and, despite numerous efforts from the community, this challenge has yet to be solved. RESULTS: Here we present a computational method, called RESCUE, to mitigate the dropout problem by imputing gene expression levels using information from other cells with similar patterns. Unlike existing methods, we use an ensemble-based approach to minimize the feature selection bias on imputation. By comparative analysis of simulated and real single-cell RNA-seq datasets, we show that RESCUE outperforms existing methods in terms of imputation accuracy which leads to more precise cell-type identification. CONCLUSIONS: Taken together, these results suggest that RESCUE is a useful tool for mitigating dropouts in single-cell RNA-seq data. RESCUE is implemented in R and available at https://github.com/seasamgo/rescue .


Subjects
Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Software , Bias , Cells/metabolism , Computer Simulation , Gene Expression Regulation , Humans , RNA/genetics , RNA/metabolism
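
The abstract of record 5 describes imputing dropouts from cells with similar expression patterns and using an ensemble to reduce feature selection bias. The sketch below illustrates that general idea only. It is not the RESCUE algorithm (which is implemented in R); the nearest-neighbour averaging, the function name `impute_dropouts` and the caller-supplied gene subsets are assumptions chosen to keep the example small and deterministic.

```python
def impute_dropouts(cells, gene_subsets, k=1):
    """Fill zero entries with neighbour averages, averaged over gene subsets.

    cells: list of equal-length expression vectors (one per cell).
    gene_subsets: list of lists of gene indices; each subset defines one
        notion of cell similarity (the "ensemble" in this sketch).
    k: number of nearest neighbours used per subset.
    """
    n_cells = len(cells)
    n_genes = len(cells[0])
    sums = [[0.0] * n_genes for _ in range(n_cells)]
    rounds = [0] * n_cells  # how many subsets contributed an estimate per cell

    for subset in gene_subsets:
        for i in range(n_cells):
            def distance(j, i=i, subset=subset):
                # Squared Euclidean distance restricted to the subset genes.
                return sum((cells[i][g] - cells[j][g]) ** 2 for g in subset)

            others = sorted((j for j in range(n_cells) if j != i),
                            key=lambda j: (distance(j), j))  # index breaks ties
            neighbours = others[:k]
            if not neighbours:
                continue  # a single cell has nothing to borrow from
            for g in range(n_genes):
                sums[i][g] += sum(cells[j][g] for j in neighbours) / len(neighbours)
            rounds[i] += 1

    imputed = [list(row) for row in cells]  # the input is left unmodified
    for i in range(n_cells):
        for g in range(n_genes):
            if cells[i][g] == 0 and rounds[i] > 0:
                imputed[i][g] = sums[i][g] / rounds[i]
    return imputed


cells = [[5, 5, 0], [5, 5, 4], [0, 0, 9]]
result = impute_dropouts(cells, gene_subsets=[[0], [1]], k=1)
print(result[0])
```

Note that this sketch treats every zero as a dropout, so true biological zeros are overwritten as well; distinguishing the two is part of what makes the real problem hard.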
6.
BMC Bioinformatics ; 20(1): 405, 2019 Jul 25.
Article in English | MEDLINE | ID: mdl-31345161

ABSTRACT

BACKGROUND: Next-generation sequencing technologies can produce tens of millions of reads, often paired-end, from transcripts or genomes. But few programs can align RNA on the genome and accurately discover introns, especially with long reads. We introduce Magic-BLAST, a new aligner based on ideas from the Magic pipeline. RESULTS: Magic-BLAST uses innovative techniques that include the optimization of a spliced alignment score and selective masking during seed selection. We evaluate the performance of Magic-BLAST to accurately map short or long sequences and its ability to discover introns on real RNA-seq data sets from PacBio, Roche and Illumina runs, and on six benchmarks, and compare it to other popular aligners. Additionally, we look at alignments of human idealized RefSeq mRNA sequences perfectly matching the genome. CONCLUSIONS: We show that Magic-BLAST is the best at intron discovery over a wide range of conditions and the best at mapping reads longer than 250 bases, from any platform. It is versatile and robust to high levels of mismatches or extreme base composition, and reasonably fast. It can align reads to a BLAST database or a FASTA file. It can accept a FASTQ file as input or automatically retrieve an accession from the SRA repository at the NCBI.


Subjects
RNA/genetics , Sequence Alignment , Sequence Analysis, RNA/methods , Software , Algorithms , Base Sequence , Databases, Nucleic Acid , Humans , Introns/genetics , ROC Curve , Time Factors
7.
Nature ; 571(7765): 355-360, 2019 07.
Article in English | MEDLINE | ID: mdl-31270458

ABSTRACT

Defining the transcriptomic identity of malignant cells is challenging in the absence of surface markers that distinguish cancer clones from one another, or from admixed non-neoplastic cells. To address this challenge, here we developed Genotyping of Transcriptomes (GoT), a method to integrate genotyping with high-throughput droplet-based single-cell RNA sequencing. We apply GoT to profile 38,290 CD34+ cells from patients with CALR-mutated myeloproliferative neoplasms to study how somatic mutations corrupt the complex process of human haematopoiesis. High-resolution mapping of malignant versus normal haematopoietic progenitors revealed an increasing fitness advantage with myeloid differentiation of cells with mutated CALR. We identified the unfolded protein response as a predominant outcome of CALR mutations, with a considerable dependency on cell identity, as well as upregulation of the NF-κB pathway specifically in uncommitted stem cells. We further extended the GoT toolkit to genotype multiple targets and loci that are distant from transcript ends. Together, these findings reveal that the transcriptional output of somatic mutations in myeloproliferative neoplasms is dependent on the native cell identity.


Subjects
Genotype , Mutation , Myeloproliferative Disorders/genetics , Myeloproliferative Disorders/pathology , Neoplasms/genetics , Neoplasms/pathology , Transcriptome/genetics , Animals , Antigens, CD34/metabolism , Calreticulin/genetics , Cell Line , Cell Proliferation , Clone Cells/classification , Clone Cells/metabolism , Clone Cells/pathology , Endoribonucleases/metabolism , Hematopoiesis/genetics , Hematopoietic Stem Cells/classification , Hematopoietic Stem Cells/metabolism , Hematopoietic Stem Cells/pathology , High-Throughput Nucleotide Sequencing/methods , Humans , Mice , Models, Molecular , Myeloproliferative Disorders/classification , NF-kappa B/metabolism , Neoplasms/classification , Neoplastic Stem Cells/cytology , Neoplastic Stem Cells/metabolism , Neoplastic Stem Cells/pathology , Primary Myelofibrosis/genetics , Primary Myelofibrosis/pathology , Protein Serine-Threonine Kinases/metabolism , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Unfolded Protein Response/genetics
8.
Nature ; 571(7765): 419-423, 2019 07.
Article in English | MEDLINE | ID: mdl-31292545

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) has highlighted the important role of intercellular heterogeneity in phenotype variability in both health and disease [1]. However, current scRNA-seq approaches provide only a snapshot of gene expression and convey little information on the true temporal dynamics and stochastic nature of transcription. A further key limitation of scRNA-seq analysis is that the RNA profile of each individual cell can be analysed only once. Here we introduce single-cell, thiol-(SH)-linked alkylation of RNA for metabolic labelling sequencing (scSLAM-seq), which integrates metabolic RNA labelling [2], biochemical nucleoside conversion [3] and scRNA-seq to record transcriptional activity directly by differentiating between new and old RNA for thousands of genes per single cell. We use scSLAM-seq to study the onset of infection with lytic cytomegalovirus in single mouse fibroblasts. The cell-cycle state and dose of infection deduced from old RNA enable dose-response analysis based on new RNA. scSLAM-seq thereby both visualizes and explains differences in transcriptional activity at the single-cell level. Furthermore, it depicts 'on-off' switches and transcriptional burst kinetics in host gene expression with extensive gene-specific differences that correlate with promoter-intrinsic features (TBP-TATA-box interactions and DNA methylation). Thus, gene-specific, and not cell-specific, features explain the heterogeneity in transcriptomes between individual cells and the transcriptional response to perturbations.


Subjects
Gene Expression Regulation/genetics , Sequence Analysis, RNA/methods , Single-Cell Analysis , Transcription, Genetic/genetics , Alkylation , Animals , Cell Cycle , Cytomegalovirus/physiology , DNA Methylation , Fibroblasts/metabolism , Fibroblasts/virology , Kinetics , Mice , Promoter Regions, Genetic/genetics , RNA/analysis , RNA/chemistry , Sulfhydryl Compounds/chemistry
9.
Nat Commun ; 10(1): 2760, 2019 06 24.
Article in English | MEDLINE | ID: mdl-31235787

ABSTRACT

Heart failure is a leading cause of mortality, yet our understanding of the genetic interactions underlying this disease remains incomplete. Here, we harvest 1352 healthy and failing human hearts directly from transplant center operating rooms, and obtain genome-wide genotyping and gene expression measurements for a subset of 313. We build failing and non-failing cardiac regulatory gene networks, revealing important regulators and cardiac expression quantitative trait loci (eQTLs). PPP1R3A emerges as a regulator whose network connectivity changes significantly between health and disease. RNA sequencing after PPP1R3A knockdown validates network-based predictions, and highlights metabolic pathway regulation associated with increased cardiomyocyte size and perturbed respiratory metabolism. Mice lacking PPP1R3A are protected against pressure-overload heart failure. We present a global gene interaction map of the human heart failure transition, identify previously unreported cardiac eQTLs, and demonstrate the discovery potential of disease-specific networks through the description of PPP1R3A as a central regulator in heart failure.


Subjects
Gene Regulatory Networks/genetics , Heart Failure/genetics , Myocytes, Cardiac/pathology , Phosphoprotein Phosphatases/metabolism , Animals , Benzeneacetamides , Cells, Cultured , Datasets as Topic , Disease Models, Animal , Female , Gene Expression Profiling/methods , Gene Expression Regulation , Gene Knockdown Techniques , Genome-Wide Association Study , Heart Failure/etiology , Heart Failure/metabolism , Heart Failure/pathology , Humans , Male , Metabolic Networks and Pathways/genetics , Mice , Mice, Knockout , Middle Aged , Phosphoprotein Phosphatases/genetics , Primary Cell Culture , Pyridines , Quantitative Trait Loci/genetics , Rats , Rats, Sprague-Dawley , Sequence Analysis, RNA/methods
10.
Semin Ophthalmol ; 34(4): 223-231, 2019.
Article in English | MEDLINE | ID: mdl-31170015

ABSTRACT

Purpose: To review the value of next-generation sequencing (NGS) in identifying the pathogens that cause ocular infections, thereby facilitating prompt initiation of treatment with an optimal anti-microbial regimen. Both contemporary and futuristic approaches to identifying pathogens in ocular infections are covered in this brief overview. Methods: Review of the peer-reviewed literature on conventional and advanced methods as applied to the diagnosis of infectious diseases of the eye. Conclusion: NGS is a novel technology for identifying the pathogens responsible for ocular infections, with the potential to improve the accuracy and speed of diagnosis and to hasten the selection of the best therapy.


Subjects
Eye Infections/diagnosis , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Sequence Analysis, RNA/methods , DNA, Bacterial/genetics , DNA, Fungal/genetics , DNA, Ribosomal/genetics , Humans , Polymerase Chain Reaction
11.
Nat Methods ; 16(7): 619-626, 2019 07.
Article in English | MEDLINE | ID: mdl-31209384

ABSTRACT

Sample multiplexing facilitates scRNA-seq by reducing costs and identifying artifacts such as cell doublets. However, universal and scalable sample barcoding strategies have not been described. We therefore developed MULTI-seq: multiplexing using lipid-tagged indices for single-cell and single-nucleus RNA sequencing. MULTI-seq reagents can barcode any cell type or nucleus from any species with an accessible plasma membrane. The method involves minimal sample processing, thereby preserving cell viability and endogenous gene expression patterns. When cells are classified into sample groups using MULTI-seq barcode abundances, data quality is improved through doublet identification and recovery of cells with low RNA content that would otherwise be discarded by standard quality-control workflows. We use MULTI-seq to track the dynamics of T-cell activation, perform a 96-plex perturbation experiment with primary human mammary epithelial cells and multiplex cryopreserved tumors and metastatic sites isolated from a patient-derived xenograft mouse model of triple-negative breast cancer.


Subjects
Lipids/chemistry , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Animals , Base Sequence , HEK293 Cells , High-Throughput Nucleotide Sequencing , Humans
12.
Gene ; 710: 258-264, 2019 Aug 20.
Article in English | MEDLINE | ID: mdl-31176731

ABSTRACT

OBJECTIVE: Increasing evidence indicates that cancer-secreted exosomes play an important role in tumor metastasis. However, the function of exosomes in breast cancer pulmonary metastasis remains unknown. The aim of this study was to investigate the role of long non-coding RNAs (LncRNAs) carried by breast cancer-secreted exosomes in pre-metastatic niche formation in pulmonary metastasis. METHODS: Breast cancer-derived exosomes were separated by ultracentrifugation. High-throughput sequencing, together with Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses, was used to detect and evaluate the differential expression of LncRNAs in lung fibroblasts treated with exosomes. Quantitative real-time polymerase chain reaction (qRT-PCR) was performed to verify the expression of candidate LncRNAs. RESULTS: We found that breast cancer-derived exosomes induced lung fibroblast proliferation and migration. In addition, a large number of abnormally expressed LncRNAs were involved in the breast cancer lung metastasis microenvironment. CONCLUSION: Our findings suggest that exosomal LncRNAs facilitate tumor pre-metastatic niche formation and provide novel insight into the molecular mechanism of the cancer metastasis microenvironment.


Subjects
Breast Neoplasms/genetics , Exosomes/genetics , High-Throughput Nucleotide Sequencing/methods , Lung Neoplasms/secondary , RNA, Long Noncoding/genetics , Cell Line, Tumor , Female , Gene Expression Regulation, Neoplastic , Gene Ontology , Gene Regulatory Networks , Humans , Lung Neoplasms/genetics , Sequence Analysis, RNA/methods , Tumor Microenvironment
13.
Gene ; 710: 375-386, 2019 Aug 20.
Article in English | MEDLINE | ID: mdl-31200084

ABSTRACT

Cynanchum thesioides are upright, xerophytic shrubs that are widely distributed in arid and semi-arid areas of China, North Korea, Mongolia and Siberia. To date, little is known about the molecular mechanisms of drought resistance in C. thesioides. To better understand drought resistance, we applied transcriptome analysis with Illumina sequencing technology to C. thesioides to identify drought-responsive genes. Using de novo assembly, 55,268 unigenes were identified from 207.58 Gb of clean data. Among these, 36,265 were annotated with gene descriptions, conserved domains, gene ontology terms and metabolic pathways. The sequencing results showed that differentially expressed genes (DEGs) under drought stress were enriched in pathways such as carbon metabolism, starch and sucrose metabolism, amino acid biosynthesis, phenylpropanoid biosynthesis and plant hormone signal transduction. Moreover, many functional genes were up-regulated under severe drought stress to enhance tolerance. Weighted gene co-expression network analysis showed that there were key hub genes related to drought stress. Hundreds of candidate genes were identified under severe drought stress, including transcription factors such as MYB, G2-like, ERF, C2H2, NAC, NF-X1, GRF, HD-ZIP, HB-other, HSF, C3H, GRAS, WRKY, bHLH and Trihelix. These data are a valuable resource for further investigation into the molecular mechanism of the drought stress response in C. thesioides and will facilitate exploration of drought resistance genes.


Subjects
Cynanchum/genetics , Droughts , Gene Expression Profiling/methods , Gene Regulatory Networks , Gene Expression Regulation, Plant , High-Throughput Nucleotide Sequencing/methods , Molecular Sequence Annotation , Plant Proteins/genetics , Sequence Analysis, RNA/methods , Stress, Physiological
14.
BMC Bioinformatics ; 20(1): 331, 2019 Jun 13.
Article in English | MEDLINE | ID: mdl-31195976

ABSTRACT

BACKGROUND: Principal component analysis (PCA) is frequently used in genomics applications for quality assessment and exploratory analysis in high-dimensional data, such as RNA sequencing (RNA-seq) gene expression assays. Despite the availability of many software packages developed for this purpose, an interactive and comprehensive interface for performing these operations is lacking. RESULTS: We developed the pcaExplorer software package to enhance commonly performed analysis steps with an interactive and user-friendly application, which provides state saving as well as the automated creation of reproducible reports. pcaExplorer is implemented in R using the Shiny framework and exploits data structures from the open-source Bioconductor project. Users can easily generate a wide variety of publication-ready graphs, while assessing the expression data in the different modules available, including a general overview, dimension reduction on samples and genes, as well as functional interpretation of the principal components. CONCLUSION: pcaExplorer is distributed as an R package in the Bioconductor project ( http://bioconductor.org/packages/pcaExplorer/ ), and is designed to assist a broad range of researchers in the critical step of interactive data exploration.


Subjects
Principal Component Analysis , Sequence Analysis, RNA/methods , Software , Base Sequence , Data Curation , Humans , RNA/genetics , Reproducibility of Results
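
For readers unfamiliar with the computation behind the PCA mentioned in record 14, here is a minimal sketch that extracts the first principal component of a samples-by-genes matrix with power iteration on the covariance matrix. It uses only the Python standard library and is not pcaExplorer code (pcaExplorer is an R/Shiny package); the function name and the toy data are made up for illustration.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (direction, scores) of the first principal component.

    rows: list of samples, each a list of feature values (e.g. expression).
    Power iteration is used for brevity; it assumes at least two samples and
    a start vector that is not orthogonal to the leading eigenvector.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]

    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]

    # Deterministic, normalized start vector.
    v = [float(j + 1) for j in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]

    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # the data have no variance along v
        v = [x / norm for x in w]

    # Fix the sign so the first non-zero loading is positive.
    for x in v:
        if x != 0.0:
            if x < 0.0:
                v = [-y for y in v]
            break

    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Toy data: four samples lying exactly on the line y = x.
direction, scores = first_principal_component([[0, 0], [1, 1], [2, 2], [3, 3]])
print(direction, scores)
```

In practice RNA-seq counts would be normalized and variance-stabilized before this step, and a library routine based on singular value decomposition would replace the hand-written iteration.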
15.
Nat Commun ; 10(1): 2837, 2019 06 28.
Article in English | MEDLINE | ID: mdl-31253775

ABSTRACT

The diagnostic yield of exome and genome sequencing remains low (8-70%), due to incomplete knowledge on the genes that cause disease. To improve this, we use RNA-seq data from 31,499 samples to predict which genes cause specific disease phenotypes, and develop GeneNetwork Assisted Diagnostic Optimization (GADO). We show that this unbiased method, which does not rely upon specific knowledge on individual genes, is effective in both identifying previously unknown disease gene associations, and flagging genes that have previously been incorrectly implicated in disease. GADO can be run on www.genenetwork.nl by supplying HPO-terms and a list of genes that contain candidate variants. Finally, applying GADO to a cohort of 61 patients for whom exome-sequencing analysis had not resulted in a genetic diagnosis, yields likely causative genes for ten cases.


Subjects
Gene Expression Regulation/physiology , Genetic Predisposition to Disease , Sequence Analysis, RNA/methods , Transcriptome , Databases, Nucleic Acid , Humans , Models, Genetic , Principal Component Analysis , Software , User-Computer Interface
16.
BMC Bioinformatics ; 20(Suppl 8): 284, 2019 Jun 10.
Article in English | MEDLINE | ID: mdl-31182005

ABSTRACT

BACKGROUND: Single cell RNA sequencing (scRNA-seq) is applied to assay the individual transcriptomes of large numbers of cells. Gene expression at the single-cell level provides an opportunity for better understanding of cell function and for new discoveries in biomedical areas. To ensure that single-cell gene expression data are interpreted appropriately, it is crucial to develop new computational methods. RESULTS: In this article, we attempt to reconstruct a neural network based on Gene Ontology (GO) for dimension reduction of scRNA-seq data. By integrating GO with both unsupervised and supervised models, we propose two novel methods, named GOAE (Gene Ontology AutoEncoder) and GONN (Gene Ontology Neural Network), respectively. CONCLUSIONS: The evaluation results show that the proposed models outperform some state-of-the-art dimensionality reduction approaches. Furthermore, by incorporating GO, we provide an opportunity to interpret the underlying biological mechanism behind the neural network-based model.


Subjects
Gene Ontology , Neural Networks, Computer , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Algorithms , Animals , Base Sequence , Cluster Analysis , Databases, Genetic , Humans , Mice , RNA/genetics
17.
BMC Bioinformatics ; 20(Suppl 11): 275, 2019 Jun 06.
Article in English | MEDLINE | ID: mdl-31167661

ABSTRACT

BACKGROUND: The advent of single cell RNA sequencing (scRNA-seq) enabled researchers to study transcriptomic activity within individual cells and identify inherent cell types in the sample. Although numerous computational tools have been developed to analyze single cell transcriptomes, there are no published studies or analytical packages available to guide experimental design and to devise a suitable analysis procedure for cell type identification. RESULTS: We have developed an empirical methodology to address this important gap in single cell experimental design and analysis, and implemented it in an easy-to-use tool called SCEED (Single Cell Empirical Experimental Design and analysis). With SCEED, users can choose among a variety of combinations of tools for analysis, conduct performance analysis of analytical procedures and choose the best procedure, and estimate the sample size (number of cells to be profiled) required for a given analytical procedure at varying levels of cell type rarity and other experimental parameters. Using SCEED, we examined 3 single cell algorithms using 48 simulated single cell datasets that were generated for varying numbers of cell types and their proportions, numbers of genes expressed per cell, numbers of marker genes and their fold change, and numbers of single cells successfully profiled in the experiment. CONCLUSIONS: Based on our study, we found that when marker genes are expressed at a fold change of 4 or more, either the Seurat or the SIMLR algorithm can be used to analyze a single cell dataset for any number of single cells isolated (a minimum of 1000 single cells was tested). However, when marker genes are expected to reach a fold change of only up to 2, the choice of single cell algorithm depends on the number of single cells isolated and the rarity of the cell types to be identified. In conclusion, our work allows the assessment of various single cell methods and also aids in the design of single cell experiments.


Subjects
Computational Biology/methods , Research Design , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Algorithms , Computer Simulation , Databases, Genetic , Gene Expression Profiling/methods , Humans , Sample Size
18.
Nat Commun ; 10(1): 2832, 2019 06 27.
Article in English | MEDLINE | ID: mdl-31249312

ABSTRACT

Defining cellular and molecular identities within the kidney is necessary to understand its organization and function in health and disease. Here we demonstrate a reproducible method with minimal artifacts for single-nucleus Droplet-based RNA sequencing (snDrop-Seq) that we use to resolve thirty distinct cell populations in human adult kidney. We define molecular transition states along more than ten nephron segments spanning two major kidney regions. We further delineate cell type-specific expression of genes associated with chronic kidney disease, diabetes and hypertension, providing insight into possible targeted therapies. This includes expression of a hypertension-associated mechano-sensory ion channel in mesangial cells, and identification of proximal tubule cell populations defined by pathogenic expression signatures. Our fully optimized, quality-controlled transcriptomic profiling pipeline constitutes a tool for the generation of healthy and diseased molecular atlases applicable to clinical samples.


Subjects
Cell Nucleus/genetics; Kidney Diseases/genetics; Kidney/metabolism; Kidney/pathology; Sequence Analysis, RNA/methods; Aged; Cell Nucleus/metabolism; Female; Gene Expression Profiling; Humans; Kidney Diseases/diagnosis; Kidney Diseases/metabolism; Kidney Diseases/pathology; Male; Mesangial Cells/metabolism; Middle Aged; Single-Cell Analysis/methods
19.
Nat Commun ; 10(1): 2611, 2019 06 13.
Article in English | MEDLINE | ID: mdl-31197158

ABSTRACT

The abundance of new computational methods for processing and interpreting transcriptomes at a single cell level raises the need for in silico platforms for evaluation and validation. Here, we present SymSim, a simulator that explicitly models the processes that give rise to data observed in single cell RNA-Seq experiments. The components of the SymSim pipeline pertain to the three primary sources of variation in single cell RNA-Seq data: noise intrinsic to the process of transcription, extrinsic variation indicative of different cell states (both discrete and continuous), and technical variation due to low sensitivity and measurement noise and bias. We demonstrate how SymSim can be used for benchmarking methods for clustering, differential expression and trajectory inference, and for examining the effects of various parameters on their performance. We also show how SymSim can be used to evaluate the number of cells required to detect a rare population under various scenarios.


Subjects
Gene Expression Profiling/methods; Models, Genetic; Sequence Analysis, RNA/methods; Single-Cell Analysis/methods; Cluster Analysis; Computer Simulation; Datasets as Topic; High-Throughput Nucleotide Sequencing/methods; Humans; Kinetics; Transcriptome/genetics
20.
Gene ; 710: 246-257, 2019 Aug 20.
Article in English | MEDLINE | ID: mdl-31176732

ABSTRACT

Osteosarcoma is the most common primary bone tumor during childhood and adolescence. Several reports have presented data on serum biomarkers for osteosarcoma, but few reports have analyzed circulating microRNAs (miRNAs). In this study, we used next-generation miRNA sequencing to examine miRNAs isolated from microvesicle-depleted extracellular vesicles (EVs) derived from six different human osteosarcoma or osteoblastic cell lines with different degrees of metastatic potential (i.e., SAOS2, MG63, HOS, 143B, U2OS and hFOB1.19). EVs from each cell line contain on average ~300 miRNAs, and ~70 of these miRNAs are present at very high levels (i.e., >1000 reads per million). The most prominent miRNAs are miR-21-5p, miR-143-3p, miR-148a-3p and miR-181a-5p, which are enriched between 3- and 100-fold and relatively abundant in EVs derived from metastatic SAOS2 cells compared to non-metastatic MG63 cells. Gene ontology analysis of predicted targets reveals that miRNAs present in EVs may regulate the metastatic potential of osteosarcoma cell lines by potentially inhibiting a network of genes (e.g., MAPK1, NRAS, FRS2, PRCKE, BCL2 and QKI) involved in apoptosis and/or cell adhesion. Our data indicate that osteosarcoma cell lines may selectively package miRNAs as molecular cargo of EVs that could function as paracrine agents to modulate the tumor micro-environment.


Subjects
Bone Neoplasms/genetics; Extracellular Vesicles/genetics; High-Throughput Nucleotide Sequencing/methods; MicroRNAs/genetics; Osteosarcoma/genetics; Apoptosis; Cell Adhesion; Cell Line, Tumor; Gene Regulatory Networks; Humans; Neoplasm Metastasis; Sequence Analysis, RNA/methods