1.
J Immunol ; 2023 Nov 15.
Article in English | MEDLINE | ID: mdl-37966257

ABSTRACT

Identification of neoepitopes that can control tumor growth in vivo remains a challenge even 10 years after the first genomics-defined cancer neoepitopes were identified. In this study, we identify a neoepitope, resulting from a mutation in the junction plakoglobin (Jup) gene (chromosome 11), from the mouse colon cancer line MC38-FABF (C57BL/6). This neoepitope, Jup mutant (JupMUT), was detected during mass spectrometry of MHC class I-eluted peptides from the tumor. JupMUT has a predicted binding affinity of 564 nM for the Kb molecule and a higher predicted affinity of 82 nM for Db. However, whereas structural modeling of JupMUT and its unmutated counterpart, Jup wild-type, indicates that there are few conformational differences between the two epitopes bound to Db, large structural divergences are predicted between the two epitopes bound to Kb. Together with in vitro binding data with RMA-S cells, these data suggest that Kb rather than Db is the relevant MHC class I molecule of JupMUT. Immunization of naive C57BL/6 mice with JupMUT elicits CD8-dependent tumor control of an MC38-FABF challenge. Despite the CD8 dependence of JupMUT-mediated tumor control in vivo, CD8+ T cells from JupMUT-immunized mice do not produce higher levels of IFN-γ than do those from naive mice. The structural and immunological characteristics of JupMUT are substantially different from those of many other neoepitopes that have been shown to mediate tumor control.

2.
BMC Bioinformatics ; 21(Suppl 18): 498, 2020 Dec 30.
Article in English | MEDLINE | ID: mdl-33375939

ABSTRACT

BACKGROUND: Personalized cancer vaccines are emerging as one of the most promising approaches to immunotherapy of advanced cancers. However, only a small proportion of the neoepitopes generated by somatic DNA mutations in cancer cells lead to tumor rejection. Since it is impractical to experimentally assess all candidate neoepitopes prior to vaccination, developing accurate methods for predicting tumor-rejection mediating neoepitopes (TRMNs) is critical for enabling routine clinical use of cancer vaccines. RESULTS: In this paper we introduce Positive-unlabeled Learning using AuTOml (PLATO), a general semi-supervised approach to improving accuracy of model-based classifiers. PLATO generates a set of high confidence positive calls by applying a stringent filter to model-based predictions, then rescores remaining candidates by using positive-unlabeled learning. To achieve robust performance on clinical samples with large patient-to-patient variation, PLATO further integrates AutoML hyper-parameter tuning, classification threshold selection based on spies, and support for bootstrapping. CONCLUSIONS: Experimental results on real datasets demonstrate that PLATO has improved performance compared to model-based approaches for two key steps in TRMN prediction, namely somatic variant calling from exome sequencing data and peptide identification from MS/MS data.


Subjects
Immunotherapy; Neoplasms/therapy; Peptides/analysis; Precision Medicine; Supervised Machine Learning; Epitopes/immunology; Epitopes/metabolism; Humans; Polymorphism, Single Nucleotide; Tandem Mass Spectrometry; Exome Sequencing
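
The abstract above mentions classification threshold selection based on spies. The sketch below illustrates the general spy idea from positive-unlabeled learning and is not taken from PLATO itself: a few known positives ("spies") are hidden among the unlabeled candidates, and the cutoff is set so that a chosen fraction of the spies is still recovered. The function names, the `recall` parameter, and the example scores are all hypothetical.

```python
import math


def spy_threshold(spy_scores, recall=0.9):
    """Largest score cutoff that still recovers `recall` of the spies."""
    if not spy_scores:
        raise ValueError("need at least one spy score")
    ranked = sorted(spy_scores, reverse=True)
    # Number of spies that must score at or above the cutoff.
    needed = max(1, math.ceil(recall * len(ranked)))
    return ranked[needed - 1]


def call_positives(unlabeled_scores, threshold):
    """Flag unlabeled candidates scoring at least as high as the cutoff."""
    return [score >= threshold for score in unlabeled_scores]


# Hypothetical classifier scores for four spies and three unlabeled items.
spies = [0.9, 0.4, 0.7, 0.8]
cutoff = spy_threshold(spies, recall=0.75)
calls = call_positives([0.95, 0.5, 0.7], cutoff)
```

Lowering `recall` raises the cutoff and makes the positive calls more stringent; how PLATO actually balances this trade-off is not stated in the abstract.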
3.
BMC Genomics ; 19(Suppl 6): 569, 2018 Aug 13.
Article in English | MEDLINE | ID: mdl-30367575

ABSTRACT

BACKGROUND: Single cell transcriptomics is critical for understanding cellular heterogeneity and identification of novel cell types. Leveraging the recent advances in single cell RNA sequencing (scRNA-Seq) technology requires novel unsupervised clustering algorithms that are robust to high levels of technical and biological noise and scale to datasets of millions of cells. RESULTS: We present novel computational approaches for clustering scRNA-seq data based on the Term Frequency - Inverse Document Frequency (TF-IDF) transformation that has been successfully used in the field of text analysis. CONCLUSIONS: Empirical experimental results show that TF-IDF methods consistently outperform commonly used scRNA-Seq clustering approaches.


Subjects
Gene Expression Profiling/methods; Sequence Analysis, RNA/methods; Algorithms; Cluster Analysis; Single-Cell Analysis
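
As a rough illustration of the TF-IDF transformation named in the abstract above, the following sketch treats each cell as a "document" and each gene as a "term". This is one common textbook variant (relative frequency times log inverse document frequency); the exact weighting used by the authors is not given in the abstract, and the function name and toy count matrix are assumptions.

```python
import math


def tfidf(counts):
    """TF-IDF scores for a cells-by-genes count matrix (list of lists)."""
    n_cells = len(counts)
    n_genes = len(counts[0])
    # Document frequency: number of cells in which each gene is detected.
    df = [sum(1 for cell in counts if cell[g] > 0) for g in range(n_genes)]
    idf = [math.log(n_cells / d) if d else 0.0 for d in df]
    scores = []
    for cell in counts:
        total = sum(cell)
        # Term frequency: the gene's share of the cell's total counts.
        scores.append(
            [(c / total if total else 0.0) * idf[g] for g, c in enumerate(cell)]
        )
    return scores


# Two hypothetical cells, three genes.
matrix = [[2, 0, 2], [1, 1, 0]]
scores = tfidf(matrix)
```

A gene detected in every cell receives an IDF of zero in this variant, so it contributes nothing to distinguishing cells, which is the intuition behind using these scores to select informative genes before clustering.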
4.
Cancer Immunol Immunother ; 67(9): 1449-1459, 2018 Sep.
Article in English | MEDLINE | ID: mdl-30030558

ABSTRACT

Dendritic cells play a critical role in initiating T-cell responses. In spite of this recognition, they have not been used widely as adjuvants, nor is the mechanism of their adjuvanticity fully understood. Here, using a mutated neoepitope of a mouse fibrosarcoma as the antigen, and tumor rejection as the end point, we show that dendritic cells but not macrophages possess superior adjuvanticity. Several types of dendritic cells, such as bone marrow-derived dendritic cells (GM-CSF cultured or FLT3-ligand induced) or monocyte-derived ones, are powerful adjuvants, although GM-CSF-cultured cells show the highest activity. Among these, the CD11c+ MHCIIlo sub-set, distinguishable by a distinct transcriptional profile including a higher expression of heat shock protein receptors CD91 and LOX1, mannose receptors and TLRs, is significantly superior to the CD11c+ MHCIIhi sub-set. Finally, dendritic cells exert their adjuvanticity by acting as both antigen donor cells (i.e., antigen reservoirs) as well as antigen presenting cells.


Subjects
CD11c Antigen/immunology; Dendritic Cells/immunology; Dendritic Cells/transplantation; Fibrosarcoma/therapy; Granulocyte-Macrophage Colony-Stimulating Factor/pharmacology; Histocompatibility Antigens Class II/immunology; Immunotherapy, Adoptive/methods; Animals; Antigens, Neoplasm/immunology; Bone Marrow Cells/drug effects; Bone Marrow Cells/immunology; Dendritic Cells/drug effects; Epitopes/immunology; Female; Fibrosarcoma/immunology; Mice; Mice, Inbred BALB C; Mice, Inbred C57BL; T-Lymphocytes/immunology
5.
Bioinformatics ; 33(20): 3302-3304, 2017 Oct 15.
Article in English | MEDLINE | ID: mdl-28605502

ABSTRACT

SUMMARY: This note presents IsoEM2 and IsoDE2, new versions with enhanced features and faster runtime of the IsoEM and IsoDE packages for expression level estimation and differential expression. IsoEM2 estimates fragments per kilobase of transcript per million mapped reads (FPKM) and transcripts per million (TPM) levels for genes and isoforms with confidence intervals through bootstrapping, while IsoDE2 performs differential expression analysis using the bootstrap samples generated by IsoEM2. Both tools are available with a command line interface as well as a graphical user interface (GUI) through wrappers for the Galaxy platform. AVAILABILITY AND IMPLEMENTATION: The source code of this software suite is available at https://github.com/mandricigor/isoem2. The Galaxy wrappers are available at https://toolshed.g2.bx.psu.edu/view/saharlcc/isoem2_isode2/. CONTACT: imandric1@student.gsu.edu or ion@engr.uconn.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology/methods; Confidence Intervals; Gene Expression Profiling/methods; Sequence Analysis, RNA/methods; Software
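
For readers unfamiliar with the units mentioned in the abstract above, here is a minimal sketch of the standard FPKM and TPM formulas, plus a simple percentile interval over bootstrap estimates. It does not reproduce IsoEM2's estimation or resampling procedure; the function names, the percentile method, and the toy numbers are assumptions made for illustration.

```python
import math


def tpm(counts, lengths):
    """Transcripts per million from fragment counts and transcript lengths."""
    rates = [c / l for c, l in zip(counts, lengths)]
    total = sum(rates)
    return [r / total * 1e6 for r in rates]


def fpkm(counts, lengths):
    """Fragments per kilobase of transcript per million mapped fragments."""
    n_fragments = sum(counts)
    return [c * 1e9 / (l * n_fragments) for c, l in zip(counts, lengths)]


def percentile_interval(estimates, level=0.95):
    """Central interval covering `level` of the sorted bootstrap estimates."""
    ranked = sorted(estimates)
    n = len(ranked)
    tail = (1.0 - level) / 2.0
    # Rounding guards against floating-point noise such as 4.999999999.
    lo = ranked[math.floor(round(tail * (n - 1), 9))]
    hi = ranked[math.ceil(round((1.0 - tail) * (n - 1), 9))]
    return lo, hi


# Hypothetical data: two transcripts of equal length.
tpm_values = tpm([10, 20], [1000, 1000])
fpkm_values = fpkm([10, 20], [1000, 1000])
# Hypothetical bootstrap estimates 0..100 for one transcript.
interval = percentile_interval(list(range(101)), level=0.5)
```

TPM values always sum to one million within a sample, which makes them easier to compare across samples than FPKM.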
6.
BMC Genomics ; 17 Suppl 5: 495, 2016 Aug 31.
Article in English | MEDLINE | ID: mdl-27586787

ABSTRACT

BACKGROUND: The retina as a model system with extensive information on genes involved in development/maintenance is of great value for investigations employing deep sequencing to capture transcriptome change over time. This in turn could enable us to find patterns in gene expression across time to reveal transitions in biological processes. METHODS: We developed a bioinformatics pipeline to categorize genes based on their differential expression and their alternative splicing status across time by binning genes based on their transcriptional kinetics. Genes within the same bins were then leveraged to query gene annotation databases to discover molecular programs employed by the developing retina. RESULTS: Using our pipeline on RNA-Seq data obtained from fractionated (nucleus/cytoplasm) developing retina at embryonic day (E) 16 and postnatal day (P) 0, we captured differences at high resolution, such as those between the cytoplasm and the nucleus at the same developmental time. We found de novo transcription of genes whose transcripts were exclusively found in the nuclear transcriptome at P0. Further analysis showed that these genes were enriched for functions that are known to be executed during postnatal development, thus showing that the P0 nuclear transcriptome is temporally ahead of that of its cytoplasm. We extended our strategy to perform temporal analysis comparing P0 data to either P21-Nrl-wildtype (WT) or P21-Nrl-knockout (KO) retinae, which predicted that the KO retina would have compromised vasculature. Indeed, histological manifestation of vasodilation has been reported at a later time point (P60). CONCLUSIONS: Thus, our approach was predictive of a phenotype before it presented histologically. Our strategy can be extended to investigating the development and/or disease progression of other tissue types.


Subjects
Retina/metabolism; Transcriptome; Alternative Splicing; Animals; Computational Biology; Disease Progression; Gene Expression Profiling; Kinetics; Mice; Mice, Knockout; Retina/abnormalities; Retina/embryology; Sequence Analysis, RNA; Spatio-Temporal Analysis
7.
J Chem Inf Model ; 55(3): 709-18, 2015 Mar 23.
Article in English | MEDLINE | ID: mdl-25668446

ABSTRACT

Metabolic pathways are composed of a series of chemical reactions occurring within a cell. In each pathway, enzymes catalyze the conversion of substrates into structurally similar products. Thus, structural similarity provides a potential means for mapping newly identified biochemical compounds to known metabolic pathways. In this paper, we present TrackSM, a cheminformatics tool designed to associate a chemical compound with a known metabolic pathway based on molecular structure matching techniques. Validation experiments show that TrackSM is capable of associating 93% of tested structures with their correct KEGG pathway class and 88% with their correct individual KEGG pathway. This suggests that TrackSM may be a valuable tool to aid in associating previously unknown small molecules with known biochemical pathways and improve our ability to link metabolomic, proteomic, and genomic data sets. TrackSM is freely available at http://metabolomics.pharm.uconn.edu/?q=Software.html.


Subjects
Algorithms; Metabolic Networks and Pathways; Metabolomics/methods; Molecular Structure; Reproducibility of Results; Software
8.
BMC Bioinformatics ; 15 Suppl 13: S4, 2014.
Article in English | MEDLINE | ID: mdl-25434802

ABSTRACT

BACKGROUND: There is an ever-expanding range of technologies that generate very large numbers of biomarkers for research and clinical applications. Choosing the most informative biomarkers from a high-dimensional data set, combined with identifying the most reliable and accurate classification algorithms to use with that biomarker set, can be a daunting task. Existing surveys of feature selection and classification algorithms typically focus on a single data type, such as gene expression microarrays, and rarely explore the model's performance across multiple biological data types. RESULTS: This paper presents the results of a large scale empirical study whereby a large number of popular feature selection and classification algorithms are used to identify the tissue of origin for the NCI-60 cancer cell lines. A computational pipeline was implemented to maximize predictive accuracy of all models at all parameters on five different data types available for the NCI-60 cell lines. A validation experiment was conducted using external data in order to demonstrate robustness. CONCLUSIONS: As expected, the data type and number of biomarkers have a significant effect on the performance of the predictive models. Although no model or data type uniformly outperforms the others across the entire range of tested numbers of markers, several clear trends are visible. At low numbers of biomarkers gene and protein expression data types are able to differentiate between cancer cell lines significantly better than the other three data types, namely SNP, array comparative genome hybridization (aCGH), and microRNA data.


Subjects
Algorithms; Biomarkers, Tumor/analysis; Computational Biology/methods; Databases, Factual; Models, Biological; Neoplasms/classification; DNA Copy Number Variations/genetics; Data Mining; Humans; MicroRNAs/genetics; Neoplasms/genetics; Neoplasms/metabolism; Oligonucleotide Array Sequence Analysis/methods; Polymorphism, Single Nucleotide/genetics; Proteins/metabolism; RNA, Messenger/genetics; Tumor Cells, Cultured
9.
BMC Genomics ; 15 Suppl 8: S2, 2014.
Article in English | MEDLINE | ID: mdl-25435284

ABSTRACT

A major application of RNA-Seq is to perform differential gene expression analysis. Many tools exist to analyze differentially expressed genes in the presence of biological replicates. Frequently, however, RNA-Seq experiments have no or very few biological replicates and development of methods for detecting differentially expressed genes in these scenarios is still an active research area. In this paper we introduce a novel method, called IsoDE, for differential gene expression analysis based on bootstrapping. We compared IsoDE against four existing methods (Fisher's exact test, GFOLD, edgeR and Cuffdiff) on RNA-Seq datasets generated using three different sequencing technologies, both with and without replicates. Experiments on MAQC RNA-Seq datasets without replicates show that IsoDE has consistently high accuracy as defined by the qPCR ground truth, frequently higher than that of the compared methods, particularly for low coverage data and at lower fold change thresholds. In experiments on RNA-Seq datasets with up to 7 replicates, IsoDE has also achieved high accuracy. Furthermore, unlike GFOLD and edgeR, IsoDE accuracy varies smoothly with the number of replicates, and is relatively uniform across the entire range of gene expression levels. The proposed non-parametric method based on bootstrapping has practical running time, and achieves robust performance over a broad range of technologies, number of replicates, sequencing depths, and minimum fold change thresholds.


Subjects
Databases, Genetic; Gene Expression Profiling/methods; Sequence Analysis, RNA/methods; Computational Biology; Software
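
The abstract above describes differential expression testing based on bootstrapping with a minimum fold change threshold. One plausible way to combine the two, sketched below, is to compare every bootstrap estimate from one condition against every estimate from the other and report the fraction of pairs that exceed the fold change. This is an illustrative assumption, not a description of IsoDE's actual decision rule; the function name, the all-pairs comparison, and the example values are hypothetical.

```python
def fold_change_support(boot_a, boot_b, min_fold=2.0):
    """Fraction of bootstrap pairs supporting a change of at least `min_fold`.

    Returns the larger of the support for up-regulation in A and the
    support for up-regulation in B.
    """
    pairs = [(a, b) for a in boot_a for b in boot_b]
    if not pairs:
        raise ValueError("both conditions need bootstrap estimates")
    # Multiplying instead of dividing avoids division by zero expression.
    up_in_a = sum(1 for a, b in pairs if a >= min_fold * b)
    up_in_b = sum(1 for a, b in pairs if b >= min_fold * a)
    return max(up_in_a, up_in_b) / len(pairs)


# Hypothetical bootstrap expression estimates for one gene.
strong = fold_change_support([8, 10, 12], [2, 3, 4], min_fold=2.0)
none = fold_change_support([5, 5], [5, 5], min_fold=2.0)
```

A gene could then be called differentially expressed when the support exceeds a user-chosen cutoff, which gives a confidence measure without requiring biological replicates.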
10.
J Chem Inf Model ; 53(3): 601-12, 2013 Mar 25.
Article in English | MEDLINE | ID: mdl-23330685

ABSTRACT

The structural identification of unknown biochemical compounds in complex biofluids continues to be a major challenge in metabolomics research. Using LC/MS, there are currently two major options for solving this problem: searching small biochemical databases, which often do not contain the unknown of interest, or searching large chemical databases, which include large numbers of nonbiochemical compounds. Searching larger chemical databases (larger chemical space) increases the odds of identifying an unknown biochemical compound, but only if nonbiochemical structures can be eliminated from consideration. In this paper we present BioSM, a cheminformatics tool that uses known endogenous mammalian biochemical compounds (as scaffolds) and graph matching methods to identify endogenous mammalian biochemical structures in chemical structure space. The results of a comprehensive set of empirical experiments suggest that BioSM identifies endogenous mammalian biochemical structures with high accuracy. In a leave-one-out cross validation experiment, BioSM correctly predicted 95% of 1388 Kyoto Encyclopedia of Genes and Genomes (KEGG) compounds as endogenous mammalian biochemicals using 1565 scaffolds. Analysis of two additional biological data sets containing 2330 human metabolites (HMDB) and 2416 plant secondary metabolites (KEGG) resulted in biochemical annotations of 89% and 72% of the compounds, respectively. When a data set of 3895 drugs (DrugBank and USAN) was tested, 48% of these structures were predicted to be biochemical. However, when a set of synthetic chemical compounds (Chembridge and Chemsynthesis databases) was examined, only 29% of the 458,207 structures were predicted to be biochemical. Moreover, BioSM predicted that 34% of 883,199 randomly selected compounds from PubChem were biochemical. We then expanded the scaffold list to 3927 biochemical compounds and reevaluated the above data sets to determine whether scaffold number influenced model performance. Although there were significant improvements in model sensitivity and specificity using the larger scaffold list, the data set comparison results were very similar. These results suggest that additional biochemical scaffolds will not further improve our representation of biochemical structure space and that the model is reasonably robust. BioSM provides a qualitative (yes/no) and quantitative (ranking) method for endogenous mammalian biochemical annotation of chemical space and, thus, will be useful in the identification of unknown biochemical structures in metabolomics. BioSM is freely available at http://metabolomics.pharm.uconn.edu.


Subjects
Mammals/metabolism; Metabolomics/methods; Algorithms; Animals; Artificial Intelligence; Body Fluids/chemistry; Cytochromes; Databases, Protein; Humans; Models, Chemical; Models, Molecular; Reproducibility of Results; Small Molecule Libraries
11.
J Chem Inf Model ; 53(9): 2483-92, 2013 Sep 23.
Article in English | MEDLINE | ID: mdl-23991755

ABSTRACT

Current methods of structure identification in mass-spectrometry-based nontargeted metabolomics rely on matching experimentally determined features of an unknown compound to those of candidate compounds contained in biochemical databases. A major limitation of this approach is the relatively small number of compounds currently included in these databases. If the correct structure is not present in a database, it cannot be identified, and if it cannot be identified, it cannot be included in a database. Thus, there is an urgent need to augment metabolomics databases with rationally designed biochemical structures using alternative means. Here we present the In Vivo/In Silico Metabolites Database (IIMDB), a database of in silico enzymatically synthesized metabolites, to partially address this problem. The database, which is available at http://metabolomics.pharm.uconn.edu/iimdb/, includes ~23,000 known compounds (mammalian metabolites, drugs, secondary plant metabolites, and glycerophospholipids) collected from existing biochemical databases plus more than 400,000 computationally generated human phase-I and phase-II metabolites of these known compounds. IIMDB features a user-friendly web interface and a programmer-friendly RESTful web service. Ninety-five percent of the computationally generated metabolites in IIMDB were not found in any existing database. However, 21,640 were identical to compounds already listed in PubChem, HMDB, KEGG, or HumanCyc. Furthermore, the vast majority of these in silico metabolites were scored as biological using BioSM, a software program that identifies biochemical structures in chemical structure space. These results suggest that in silico biochemical synthesis represents a viable approach for significantly augmenting biochemical databases for nontargeted metabolomics applications.


Subjects
Databases, Factual; Enzymes/metabolism; Metabolomics/methods; Animals; Glycerophospholipids/metabolism; Humans; Internet; Pharmaceutical Preparations/metabolism; Plants/metabolism; User-Computer Interface
12.
J Comput Biol ; 30(4): 538-551, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36999902

ABSTRACT

High-throughput DNA and RNA sequencing are revolutionizing precision oncology, enabling personalized therapies such as cancer vaccines designed to target tumor-specific neoepitopes generated by somatic mutations expressed in cancer cells. Identification of these neoepitopes from next-generation sequencing data of clinical samples remains challenging and requires the use of complex bioinformatics pipelines. In this paper, we present GeNeo, a bioinformatics toolbox for genomics-guided neoepitope prediction. GeNeo includes a comprehensive set of tools for somatic variant calling and filtering, variant validation, and neoepitope prediction and filtering. For ease of use, GeNeo tools can be accessed via web-based interfaces deployed on a Galaxy portal publicly accessible at https://neo.engr.uconn.edu/. A virtual machine image for running GeNeo locally is also available to academic users upon request.


Subjects
Neoplasms; Humans; Neoplasms/genetics; Neoplasms/therapy; Precision Medicine; Genomics/methods; Computational Biology; Immunotherapy; High-Throughput Nucleotide Sequencing
13.
BMC Genomics ; 13 Suppl 2: S6, 2012 Apr 12.
Article in English | MEDLINE | ID: mdl-22537301

ABSTRACT

BACKGROUND: Massively parallel transcriptome sequencing (RNA-Seq) is becoming the method of choice for studying functional effects of genetic variability and establishing causal relationships between genetic variants and disease. However, RNA-Seq poses new technical and computational challenges compared to genome sequencing. In particular, mapping transcriptome reads onto the genome is more challenging than mapping genomic reads due to splicing. Furthermore, detection and genotyping of single nucleotide variants (SNVs) requires statistical models that are robust to variability in read coverage due to unequal transcript expression levels. RESULTS: In this paper we present a strategy to more reliably map transcriptome reads by taking advantage of the availability of both the genome reference sequence and transcript databases such as CCDS. We also present a novel Bayesian model for SNV discovery and genotyping based on quality scores. CONCLUSIONS: Experimental results on RNA-Seq data generated from blood cell tissue of three Hapmap individuals show that our methods yield increased accuracy compared to several widely used methods. The open source code implementing our methods, released under the GNU General Public License, is available at http://dna.engr.uconn.edu/software/NGSTools/.


Subjects
Genotyping Techniques/methods; Polymorphism, Single Nucleotide; Sequence Analysis, RNA/methods; Transcriptome; Bayes Theorem; High-Throughput Nucleotide Sequencing; Humans; Software
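
To make the idea of Bayesian genotyping from quality scores in the abstract above more concrete, the sketch below computes posterior probabilities for the three diploid genotypes at a biallelic site, converting each Phred quality into a base-call error probability. It is a generic textbook model and not the authors' model; the uniform prior, the error cap, the function name, and the example reads are assumptions.

```python
import math


def genotype_posteriors(bases, quals, ref, alt, priors=(1 / 3, 1 / 3, 1 / 3)):
    """Posterior over genotypes ref/ref, ref/alt, alt/alt at one site.

    `bases` are the read bases covering the site and `quals` their Phred
    quality scores. The uniform prior is a placeholder assumption.
    """
    genotypes = [(ref, ref), (ref, alt), (alt, alt)]
    log_post = []
    for (g1, g2), prior in zip(genotypes, priors):
        total = math.log(prior)
        for base, q in zip(bases, quals):
            # Phred score to error probability, capped to avoid log(0).
            err = min(10 ** (-q / 10), 0.75)
            p1 = 1 - err if base == g1 else err / 3
            p2 = 1 - err if base == g2 else err / 3
            # Each read is equally likely to come from either allele.
            total += math.log(0.5 * p1 + 0.5 * p2)
        log_post.append(total)
    # Normalize in log space for numerical stability.
    peak = max(log_post)
    weights = [math.exp(v - peak) for v in log_post]
    norm = sum(weights)
    return {g1 + g2: w / norm for (g1, g2), w in zip(genotypes, weights)}


# Hypothetical pileups: six reads at base quality 30.
hom = genotype_posteriors("AAAAAA", [30] * 6, ref="A", alt="C")
het = genotype_posteriors("ACACAC", [30] * 6, ref="A", alt="C")
```

In RNA-Seq data, read coverage varies with transcript expression, so a real model would also need to handle low-coverage sites and allele-specific expression, neither of which this sketch addresses.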
14.
Article in English | MEDLINE | ID: mdl-34255632

ABSTRACT

The inference of disease transmission networks is an important problem in epidemiology. One popular approach for building transmission networks is to reconstruct a phylogenetic tree using sequences from disease strains sampled from infected hosts and infer transmissions based on this tree. However, most existing phylogenetic approaches for transmission network inference are highly computationally intensive and cannot take within-host strain diversity into account. Here, we introduce a new phylogenetic approach for inferring transmission networks, TNet, that addresses these limitations. TNet uses multiple strain sequences from each sampled host to infer transmissions and is simpler and more accurate than existing approaches. Furthermore, TNet is highly scalable and able to distinguish between ambiguous and unambiguous transmission inferences. We evaluated TNet on a large collection of 560 simulated transmission networks of various sizes and diverse host, sequence, and transmission characteristics, as well as on 10 real transmission datasets with known transmission histories. Our results show that TNet outperforms two other recently developed methods, phyloscanner and SharpTNI, that also consider within-host strain diversity. We also applied TNet to a large collection of SARS-CoV-2 genomes sampled from infected individuals in many countries around the world, demonstrating how our inference framework can be adapted to accurately infer geographical transmission networks. TNet is freely available from https://compbio.engr.uconn.edu/software/TNet/.


Subjects
COVID-19; Genome; Humans; Phylogeny; SARS-CoV-2
15.
BMC Bioinformatics ; 12 Suppl 1: S53, 2011 Feb 15.
Article in English | MEDLINE | ID: mdl-21342586

ABSTRACT

BACKGROUND: Recent technology advances have enabled sequencing of individual genomes, promising to revolutionize biomedical research. However, deep sequencing remains more expensive than microarrays for performing whole-genome SNP genotyping. RESULTS: In this paper we introduce a new multi-locus statistical model and computationally efficient genotype calling algorithms that integrate shotgun sequencing data with linkage disequilibrium (LD) information extracted from reference population panels such as Hapmap or the 1000 genomes project. Experiments on publicly available 454, Illumina, and ABI SOLiD sequencing datasets suggest that integration of LD information results in genotype calling accuracy comparable to that of microarray platforms from sequencing data of low-coverage. A software package implementing our algorithm, released under the GNU General Public License, is available at http://dna.engr.uconn.edu/software/GeneSeq/. CONCLUSIONS: Integration of LD information leads to significant improvements in genotype calling accuracy compared to prior LD-oblivious methods, rendering low-coverage sequencing as a viable alternative to microarrays for conducting large-scale genome-wide association studies.


Subjects
Algorithms; Genotype; Linkage Disequilibrium; Models, Statistical; Sequence Analysis, DNA/methods; Software; Computational Biology/methods; Genome-Wide Association Study; Genomics/methods; Polymorphism, Single Nucleotide
16.
Nucleic Acids Res ; 37(8): 2483-92, 2009 May.
Article in English | MEDLINE | ID: mdl-19264805

ABSTRACT

Rapid and reliable virus subtype identification is critical for accurate diagnosis of human infections, effective response to epidemic outbreaks and global-scale surveillance of highly pathogenic viral subtypes such as avian influenza H5N1. The polymerase chain reaction (PCR) has become the method of choice for virus subtype identification. However, designing subtype-specific PCR primer pairs is a very challenging task: on one hand, selected primer pairs must result in robust amplification in the presence of a significant degree of sequence heterogeneity within subtypes, on the other, they must discriminate between the subtype of interest and closely related subtypes. In this article, we present a new tool, called PrimerHunter, that can be used to select highly sensitive and specific primers for virus subtyping. Our tool takes as input sets of both target and nontarget sequences. Primers are selected such that they efficiently amplify any one of the target sequences, and none of the nontarget sequences. PrimerHunter ensures the desired amplification properties by using accurate estimates of melting temperature with mismatches, computed based on the nearest neighbor model via an efficient fractional programming algorithm. Validation experiments with three avian influenza HA subtypes confirm that primers selected by PrimerHunter have high sensitivity and specificity for target sequences.


Subjects
DNA Primers/chemistry; Influenza A virus/classification; Polymerase Chain Reaction; Software; Influenza A virus/genetics; Nucleic Acid Denaturation; Phylogeny; Reproducibility of Results
17.
J Comput Biol ; 28(8): 820-841, 2021 Aug.
Article in English | MEDLINE | ID: mdl-34115950

ABSTRACT

Single-cell RNA-Seq (scRNA-Seq) is critical for studying cellular function and phenotypic heterogeneity as well as the development of tissues and tumors. In this study, we present SC1, a web-based, highly interactive scRNA-Seq data analysis tool publicly accessible at https://sc1.engr.uconn.edu. The tool presents an integrated workflow for scRNA-Seq analysis, implements a novel method of selecting informative genes based on term-frequency inverse-document-frequency scores, and provides a broad range of methods for clustering, differential expression analysis, gene enrichment, interactive visualization, and cell cycle analysis. The tool integrates other single-cell omics data modalities such as T-cell receptor (TCR)-Seq and supports several single-cell sequencing technologies. In just a few steps, researchers can generate a comprehensive analysis and gain powerful insights from their scRNA-Seq data.


Subjects
Computational Biology/methods; Gene Expression Profiling/methods; Single-Cell Analysis/methods; Gene Expression Regulation; Humans; Internet; Sequence Analysis, RNA; Software
18.
J Comput Biol ; 28(8): 842-855, 2021 Aug.
Article in English | MEDLINE | ID: mdl-34264744

ABSTRACT

In this article, we present our novel pipeline for analysis of metabolic activity using a microbial community's metatranscriptome sequence data set for validation. Our method is based on the expectation-maximization (EM) algorithm and provides enzyme expression and pathway activity levels. Further expanding our analysis, we consider individual enzymatic activity and compute enzyme participation coefficients to approximate the metabolic pathway activity more accurately. We apply our EM pathways pipeline to a metatranscriptomic data set of a plankton community from surface waters of the Northern Gulf of Mexico. The data set consists of RNA-seq data and respective environmental parameters, which were sampled at two depths, six times a day over multiple 24-hour cycles. Furthermore, we discuss microbial dependence on the day-night cycle within our findings based on a three-way correlation of the enzyme expression during antipodal times, midnight and noon. We show that the enzyme participation levels strongly affect the metabolic activity estimates: that is, marginal and multiple linear regression of enzymatic and metabolic pathway activity correlated significantly with the recorded environmental parameters. Our analysis statistically validates that EM-based methods produce meaningful results, as our method confirms statistically significant dependence of metabolic pathway activity on the environmental parameters, such as salinity, temperature, brightness, and a few others.


Subjects
Bacteria/genetics; Gene Expression Profiling/methods; Metabolic Networks and Pathways; Plankton/microbiology; Algorithms; Gulf of Mexico; Linear Models; Metagenomics; Sequence Analysis, RNA
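
The abstract above says the pipeline is based on the expectation-maximization (EM) algorithm. As a generic illustration of how EM can estimate relative abundances when some reads are compatible with more than one target (for example, more than one enzyme), consider the sketch below. It is the classic mixture-proportion EM and is not the authors' pipeline; the function name, the input encoding, and the toy reads are assumptions.

```python
def em_abundance(read_compat, n_targets, n_iter=100):
    """Estimate relative abundances from reads with ambiguous assignments.

    `read_compat` holds, for each read, the list of target indices the
    read is compatible with.
    """
    if not read_compat:
        raise ValueError("need at least one read")
    theta = [1.0 / n_targets] * n_targets
    for _ in range(n_iter):
        expected = [0.0] * n_targets
        # E-step: split each read among its targets by current abundance.
        for compat in read_compat:
            norm = sum(theta[i] for i in compat)
            if norm == 0.0:
                continue
            for i in compat:
                expected[i] += theta[i] / norm
        # M-step: abundances are the normalized expected read counts.
        total = sum(expected)
        theta = [e / total for e in expected]
    return theta


# Hypothetical reads: two unique to target 0, one unique to target 1,
# and one compatible with both.
theta = em_abundance([[0], [0], [0, 1], [1]], n_targets=2)
```

For this toy input the estimate converges to two thirds for the first target, because the ambiguous read is shared in proportion to the evidence from the uniquely assigned reads.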
19.
Sci Rep ; 11(1): 3552, 2021 Feb 11.
Article in English | MEDLINE | ID: mdl-33574458

ABSTRACT

Oligodendrocyte precursor cells (NG2 glia) are uniformly distributed proliferative cells in the mammalian central nervous system and generate myelinating oligodendrocytes throughout life. A subpopulation of OPCs in the neocortex arises from progenitor cells in the embryonic ganglionic eminences that also produce inhibitory neurons. The neuronal fate of some progenitor cells is sealed before birth as they become committed to the oligodendrocyte lineage, marked by sustained expression of the oligodendrocyte transcription factor Olig2, which represses the interneuron transcription factor Dlx2. Here we show that misexpression of Dlx2 alone in postnatal mouse OPCs caused them to switch their fate to GABAergic neurons within 2 days by downregulating Olig2 and upregulating a network of inhibitory neuron transcripts. After two weeks, some OPC-derived neurons generated trains of action potentials and formed clusters of GABAergic synaptic proteins. Our study revealed that the developmental molecular logic can be applied to promote neuronal reprogramming from OPCs.


Subjects
Embryonic Development/genetics; GABAergic Neurons/metabolism; Homeodomain Proteins/genetics; Oligodendrocyte Precursor Cells/metabolism; Oligodendrocyte Transcription Factor 2/genetics; Transcription Factors/genetics; Cell Proliferation/genetics; Cellular Reprogramming/genetics; Central Nervous System; Gene Expression Regulation/genetics; Homeodomain Proteins/metabolism; Neuroglia/metabolism; Synapses/genetics; Transcription Factors/metabolism
20.
J Clin Invest ; 131(3), 2021 Feb 01.
Article in English | MEDLINE | ID: mdl-33320837

ABSTRACT

Identification of neoepitopes that are effective in cancer therapy is a major challenge in creating cancer vaccines. Here, using an entirely unbiased approach, we queried all possible neoepitopes in a mouse cancer model and asked which of those are effective in mediating tumor rejection and, independently, in eliciting a measurable CD8 response. This analysis uncovered a large trove of effective anticancer neoepitopes that have strikingly different properties from conventional epitopes and suggested an algorithm to predict them. It also revealed that our current methods of prediction discard the overwhelming majority of true anticancer neoepitopes. These results from a single mouse model were validated in another antigenically distinct mouse cancer model and are consistent with data reported in human studies. Structural modeling showed how the MHC I-presented neoepitopes had an altered conformation, higher stability, or increased exposure to T cell receptors as compared with the unmutated counterparts. T cells elicited by the active neoepitopes identified here demonstrated a stem-like early dysfunctional phenotype associated with effective responses against viruses and tumors of transgenic mice. These abundant anticancer neoepitopes, which have not been tested in human studies thus far, can be exploited for generation of personalized human cancer vaccines.


Subjects
Antigens, Neoplasm; Cancer Vaccines; Epitopes, T-Lymphocyte; Immunotherapy; Neoplasms; Animals; Antigens, Neoplasm/immunology; Antigens, Neoplasm/pharmacology; Cancer Vaccines/immunology; Cancer Vaccines/pharmacology; Cell Line, Tumor; Epitopes, T-Lymphocyte/immunology; Epitopes, T-Lymphocyte/pharmacology; Female; Mice; Neoplasms/immunology; Neoplasms/therapy