ABSTRACT
Inflammation is a beneficial host response to infection but can contribute to inflammatory disease if unregulated. The Th17 lineage of T helper (Th) cells can cause severe human inflammatory diseases. These cells exhibit both instability (they can cease to express their signature cytokine, IL-17A) and plasticity (they can start expressing cytokines typical of other lineages) upon in vitro re-stimulation. However, technical limitations have prevented the transcriptional profiling of pre- and post-conversion Th17 cells ex vivo during immune responses. Thus, it is unknown whether Th17 cell plasticity merely reflects change in expression of a few cytokines, or if Th17 cells physiologically undergo global genetic reprogramming driving their conversion from one T helper cell type to another, a process known as transdifferentiation. Furthermore, although Th17 cell instability/plasticity has been associated with pathogenicity, it is unknown whether this could present a therapeutic opportunity, whereby formerly pathogenic Th17 cells could adopt an anti-inflammatory fate. Here we used two new fate-mapping mouse models to track Th17 cells during immune responses to show that CD4⁺ T cells that formerly expressed IL-17A go on to acquire an anti-inflammatory phenotype. The transdifferentiation of Th17 into regulatory T cells was illustrated by a change in their signature transcriptional profile and the acquisition of potent regulatory capacity. Comparisons of the transcriptional profiles of pre- and post-conversion Th17 cells also revealed a role for canonical TGF-β signalling and consequently for the aryl hydrocarbon receptor (AhR) in conversion. Thus, Th17 cells transdifferentiate into regulatory cells, and contribute to the resolution of inflammation. Our data suggest that Th17 cell instability and plasticity is a therapeutic opportunity for inflammatory diseases.
Subjects
Cell Transdifferentiation , T-Lymphocytes, Regulatory/cytology , T-Lymphocytes, Regulatory/immunology , Th17 Cells/cytology , Th17 Cells/immunology , Animals , Female , Gene Expression Profiling , Gene Expression Regulation , Helminthiasis/immunology , Male , Mice , Nippostrongylus/immunology , Staphylococcal Infections/immunology , Staphylococcus aureus/immunology
ABSTRACT
BACKGROUND: The clonoSEQ® Assay (Adaptive Biotechnologies Corporation, Seattle, USA) identifies and tracks unique disease-associated immunoglobulin (Ig) sequences by next-generation sequencing of IgH, IgK, and IgL rearrangements and IgH-BCL1/2 translocations in malignant B cells. Here, we describe studies to validate the analytical performance of the assay using patient samples and cell lines. METHODS: Sensitivity and specificity were established by defining the limit of detection (LoD), limit of quantitation (LoQ) and limit of blank (LoB) in genomic DNA (gDNA) from 66 patients with multiple myeloma (MM), acute lymphoblastic leukemia (ALL), or chronic lymphocytic leukemia (CLL), and three cell lines. Healthy donor gDNA was used as a diluent to contrive samples with specific DNA masses and malignant-cell frequencies. Precision was validated using a range of samples contrived from patient gDNA, healthy donor gDNA, and 9 cell lines to generate measurable residual disease (MRD) frequencies spanning clinically relevant thresholds. Linearity was determined using samples contrived from cell line gDNA spiked into healthy gDNA to generate 11 MRD frequencies for each DNA input, then confirmed using clinical samples. Quantitation accuracy was assessed by (1) comparing clonoSEQ and multiparametric flow cytometry (mpFC) measurements of ALL and MM cell lines diluted in healthy mononuclear cells, and (2) analyzing precision study data for bias between clonoSEQ MRD results in diluted gDNA and those expected from mpFC based on original, undiluted samples. Repeatability of nucleotide base calls was assessed via the assay's ability to recover malignant clonotype sequences across several replicates, process features, and MRD levels. RESULTS: LoD and LoQ were estimated at 1.903 cells and 2.390 malignant cells, respectively. LoB was zero in healthy donor gDNA. Precision ranged from 18% CV (coefficient of variation) at higher DNA inputs to 68% CV near the LoD. 
Variance component analysis showed MRD results were robust, with expected laboratory process variations contributing ≤3% CV. Linearity and accuracy were demonstrated for each disease across orders of magnitude of clonal frequencies. Nucleotide sequence error rates were extremely low. CONCLUSIONS: These studies validate the analytical performance of the clonoSEQ Assay and demonstrate its potential as a highly sensitive diagnostic tool for selected lymphoid malignancies.
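The precision figures above are coefficients of variation (CV), that is, the standard deviation of replicate measurements divided by their mean. A minimal sketch of that calculation follows; the function name and the replicate values are invented for illustration and are not data or code from the study.

```python
import statistics

def percent_cv(values):
    """Return the coefficient of variation (sample SD / mean) as a percentage."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * statistics.stdev(values) / mean

# Hypothetical replicate MRD frequencies (illustrative only, not study data).
replicates = [8.0e-5, 1.0e-4, 1.2e-4]
cv = percent_cv(replicates)
print(f"CV = {cv:.1f}%")
```

Because the CV is scale-free, it allows precision at very low clonal frequencies to be compared with precision at high ones.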
Subjects
High-Throughput Nucleotide Sequencing/instrumentation , Leukemia, Lymphocytic, Chronic, B-Cell/diagnosis , Multiple Myeloma/diagnosis , Precursor Cell Lymphoblastic Leukemia-Lymphoma/diagnosis , Reagent Kits, Diagnostic , Bone Marrow/pathology , Cyclin D1/genetics , Gene Rearrangement , Humans , Immunoglobulin Heavy Chains/genetics , Immunoglobulin lambda-Chains/genetics , Immunoglobulins/genetics , Leukemia, Lymphocytic, Chronic, B-Cell/blood , Leukemia, Lymphocytic, Chronic, B-Cell/genetics , Leukemia, Lymphocytic, Chronic, B-Cell/therapy , Limit of Detection , Multiple Myeloma/blood , Multiple Myeloma/genetics , Multiple Myeloma/therapy , Neoplasm, Residual , Precursor Cell Lymphoblastic Leukemia-Lymphoma/blood , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics , Precursor Cell Lymphoblastic Leukemia-Lymphoma/therapy , Proto-Oncogene Proteins c-bcl-2/genetics , Translocation, Genetic
ABSTRACT
Artificial neural networks (ANN) are computing architectures with many interconnections of simple neural-inspired computing elements, and have been applied to biomedical fields such as imaging analysis and diagnosis. We have developed a new ANN framework called Cox-nnet to predict patient prognosis from high throughput transcriptomics data. In 10 TCGA RNA-Seq data sets, Cox-nnet achieves the same or better predictive accuracy compared to other methods, including Cox-proportional hazards regression (with LASSO, ridge, and minimax concave penalty), Random Survival Forests and CoxBoost. Cox-nnet also reveals richer biological information, at both the pathway and gene levels. The outputs from the hidden layer node provide an alternative approach for survival-sensitive dimension reduction. In summary, we have developed a new method for accurate and efficient prognosis prediction on high throughput data, with functional biological insights. The source code is freely available at https://github.com/lanagarmire/cox-nnet.
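To make the link between a neural network and Cox regression concrete, here is a minimal sketch of the Cox negative log partial likelihood, assuming (as the name suggests, though the abstract does not spell it out) that the network's single output is treated as a log-risk score. The function name is hypothetical, ties are not specially handled, and this is not code from the Cox-nnet repository.

```python
import math

def neg_log_partial_likelihood(risk_scores, times, events):
    """Cox negative log partial likelihood (no special handling of tied times).

    risk_scores: one log-risk score per subject (e.g. a network's output)
    times:       follow-up time per subject
    events:      1/True if the event was observed, 0/False if censored
    """
    total = 0.0
    for score_i, time_i, event_i in zip(risk_scores, times, events):
        if not event_i:
            continue  # censored subjects contribute only through risk sets
        # Sum of exp(score) over everyone still at risk at this event time.
        risk_set = sum(
            math.exp(s) for s, t in zip(risk_scores, times) if t >= time_i
        )
        total += score_i - math.log(risk_set)
    return -total

# With equal scores and three observed events, the risk sets have sizes
# 3, 2 and 1, so the loss is log(3) + log(2) + log(1).
example_loss = neg_log_partial_likelihood([0.0, 0.0, 0.0], [1.0, 2.0, 3.0], [1, 1, 1])
```

Minimizing this quantity with respect to the scores rewards assigning higher risk to subjects who fail earlier, which is what lets a hidden layer learn survival-sensitive features.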
Subjects
Gene Expression Profiling/statistics & numerical data , High-Throughput Nucleotide Sequencing/statistics & numerical data , Neural Networks, Computer , Prognosis , Proportional Hazards Models , Computational Biology , Databases, Nucleic Acid/statistics & numerical data , Female , Gene Regulatory Networks , Humans , Kaplan-Meier Estimate , Male , Metabolic Networks and Pathways/genetics , Neoplasms/genetics , Neoplasms/metabolism , Neoplasms/mortality , Sequence Analysis, RNA/statistics & numerical data , Survival Analysis
ABSTRACT
Plant secretory (Class III) peroxidases are redox enzymes that rely on N-glycosylation for full enzyme activity and stability. Peroxidases from palm tree leaves comprise the most stable and active plant peroxidases characterized to date. Herein, site-specific glycosylation and microheterogeneity of windmill palm tree (Trachycarpus fortunei) peroxidase are reported. The workflow developed in this study includes novel tools, written in R, to aid plant glycan identification (pGlycoFilter), annotation of glycopeptide fragmentation spectra (gPSMvalidator), and relative quantitation of glycoforms (glycoRQ). Mass spectrometry analysis provided a detailed glycosylation profile at the 13 sites of N-linked glycosylation on windmill palm tree peroxidase. Glycan microheterogeneity was observed at each site. Site Asn211 was the most heterogeneous and contained 30 different glycans. Relative quantitation revealed 90% of each glycosylation site was occupied by three or fewer glycans, and two of the 13 sites were partially unoccupied. Although complex and hybrid glycans were identified, the majority of glycans were paucimannosidic, characteristic of plant vacuolar glycoproteins. Further studies pertaining to the glycan structure-activity relationships in plant peroxidases can benefit from the work outlined here.
Subjects
Arecaceae/enzymology , Databases, Protein , Glycopeptides/analysis , Peroxidase/metabolism , Polysaccharides/analysis , Glycosylation , Mass Spectrometry , Plant Proteins/metabolism , Workflow
ABSTRACT
It is crucial for researchers to optimize RNA-Seq experimental designs for differential expression detection. Currently, the field lacks general methods to estimate power and sample size for RNA-Seq in complex experimental designs, under the assumption of the negative binomial distribution. We simulate RNA-Seq count data based on parameters estimated from six widely different public data sets (including cell line comparison, tissue comparison, and cancer data sets) and calculate the statistical power in paired and unpaired sample experiments. We comprehensively compare five differential expression analysis packages (DESeq, edgeR, DESeq2, sSeq, and EBSeq) and evaluate their performance by power, receiver operating characteristic (ROC) curves, and other metrics including areas under the curve (AUC), Matthews correlation coefficient (MCC), and F-measures. DESeq2 and edgeR tend to give the best performance in general. Increasing sample size or sequencing depth increases power; however, increasing sample size is more effective than increasing sequencing depth, especially once the sequencing depth reaches 20 million reads. Long intergenic noncoding RNAs (lincRNAs) yield lower power than protein-coding mRNAs, given their lower expression level in the same RNA-Seq experiment. On the other hand, paired-sample RNA-Seq significantly enhances the statistical power, confirming the importance of considering the multifactor experimental design. Finally, a local optimal power is achievable for a given budget constraint, and the dominant contributing factor is sample size rather than the sequencing depth. In conclusion, we provide a power analysis tool (http://www2.hawaii.edu/~lgarmire/RNASeqPowerCalculator.htm) that captures the dispersion in the data and can serve as a practical reference under the budget constraint of RNA-Seq experiments.
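One way to see why sample size can matter more than depth under a negative binomial model is the standard mean-variance relation, variance = mean + dispersion × mean². The sketch below uses that relation only; the function name and the example numbers are hypothetical, and this is neither the authors' simulation nor their power calculator.

```python
def squared_cv_of_group_mean(mean_count, dispersion, n_samples):
    """Squared CV of the average of n independent negative binomial counts.

    For one count, variance = mean + dispersion * mean**2, so its squared CV
    is 1/mean + dispersion; averaging n replicates divides that by n.
    """
    if mean_count <= 0 or n_samples <= 0:
        raise ValueError("mean_count and n_samples must be positive")
    return (1.0 / mean_count + dispersion) / n_samples

# Hypothetical gene: mean count 100, dispersion 0.1, 3 samples per group.
baseline = squared_cv_of_group_mean(100, 0.1, 3)
doubled_depth = squared_cv_of_group_mean(200, 0.1, 3)    # twice the reads per sample
doubled_samples = squared_cv_of_group_mean(100, 0.1, 6)  # twice the samples per group

print(baseline, doubled_depth, doubled_samples)
```

Extra depth shrinks only the 1/mean term, so the dispersion sets a floor on the noise, whereas extra samples shrink the whole expression. This is consistent with the diminishing returns from depth described above, and it also shows why low-count transcripts such as lincRNAs are noisier.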
Subjects
Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing/methods , RNA, Messenger/chemistry , Sequence Analysis, RNA/methods , Animals , Computer Simulation , Databases, Genetic , Humans , Mice , Regression Analysis , Reproducibility of Results , Research Design , Sample Size , Software
ABSTRACT
BACKGROUND: Epigenetic alterations are known to correlate with changes in gene expression among various diseases including cancers. However, quantitative models that accurately predict the up- or down-regulation of gene expression are currently lacking. METHODS: A new machine learning-based method of gene expression prediction is developed in the context of lung cancer. This method uses the Illumina Infinium HumanMethylation450K Beadchip CpG methylation array data from paired lung cancer and adjacent normal tissues in The Cancer Genome Atlas (TCGA) and histone modification marker ChIP-Seq data from the ENCODE project, to predict the differential expression of RNA-Seq data in TCGA lung cancers. It considers a comprehensive list of 1424 features spanning the four categories of CpG methylation, histone H3 methylation modification, nucleotide composition, and conservation. Various feature selection and classification methods are compared to select the best model over 10-fold cross-validation in the training data set. RESULTS: A best model comprising 67 features is chosen by ReliefF-based feature selection and random forest classification, with AUC = 0.864 from the 10-fold cross-validation of the training set and AUC = 0.836 from the testing set. The selected features cover all four data types, with histone H3 methylation modification (32 features) and CpG methylation (15 features) being most abundant. Among the dropping-off tests of individual data-type based features, removal of CpG methylation features leads to the largest reduction in model performance. In the best model, 19 selected features are from the promoter regions (TSS200 and TSS1500), highest among all locations relative to transcripts. Sequential dropping-off of CpG methylation features relative to different regions on the protein coding transcripts shows that promoter regions contribute most significantly to the accurate prediction of gene expression.
CONCLUSIONS: By considering a comprehensive list of epigenomic and genomic features, we have constructed an accurate model to predict transcriptomic differential expression, exemplified in lung cancer.
Subjects
CpG Islands/genetics , DNA Methylation , Epigenomics/methods , Gene Expression Regulation, Neoplastic , Genome, Human , Lung Neoplasms/genetics , Artificial Intelligence , Genomics/methods , High-Throughput Nucleotide Sequencing , Histones/genetics , Humans
ABSTRACT
Breast cancer is the most common malignancy in women worldwide. With the increasing awareness of heterogeneity in breast cancers, better prediction of breast cancer prognosis is much needed for more personalized treatment and disease management. Towards this goal, we have developed a novel computational model for breast cancer prognosis by combining the Pathway Deregulation Score (PDS) based pathifier algorithm, Cox regression and L1-LASSO penalization method. We trained the model on a set of 236 patients with gene expression data and clinical information, and validated the performance on three diversified testing data sets of 606 patients. To evaluate the performance of the model, we conducted survival analysis of the dichotomized groups, and compared the areas under the curve based on the binary classification. The resulting prognosis genomic model is composed of fifteen pathways (e.g., P53 pathway) that had previously reported cancer relevance, and it successfully differentiated relapse in the training set (log rank p-value = 6.25e-12) and three testing data sets (log rank p-value < 0.0005). Moreover, the pathway-based genomic models consistently performed better than gene-based models on all four data sets. We also find strong evidence that combining genomic information with clinical information improved the p-values of prognosis prediction by at least three orders of magnitude in comparison to using either genomic or clinical information alone. In summary, we propose a novel prognosis model that harnesses the pathway-based dysregulation as well as valuable clinical information. The selected pathways in our prognosis model are promising targets for therapeutic intervention.
Subjects
Breast Neoplasms/genetics , Breast Neoplasms/pathology , Genomics/methods , Models, Statistical , Algorithms , Breast Neoplasms/metabolism , Disease Progression , Female , Gene Expression Profiling , Humans , Middle Aged , Prognosis , Proportional Hazards Models , ROC Curve , Transcriptome
ABSTRACT
Horizontal gene transfers (HGT) between four Crenarchaeota species (Metallosphaera cuprina Ar-4T, Acidianus hospitalis W1T, Vulcanisaeta moutnovskia 768-28T, and Pyrobaculum islandicum DSM 4184T) were investigated with quartet analysis. Strong support was found for individual genes that disagree with the phylogeny of the majority, implying genomic mosaicism. One such gene, a ferredoxin-related gene, was investigated further and incorporated into a larger phylogeny, which provided evidence for HGT of this gene from the Vulcanisaeta lineage to the Acidianus lineage. This is the first application of quartet analysis of HGT for the phylum Crenarchaeota. The results have shown that quartet analysis is a powerful technique to screen homologous sequences for putative HGTs and is useful in visually describing genomic mosaicism and HGT within four taxa.
Subjects
Crenarchaeota/genetics , Gene Transfer, Horizontal , Computational Biology , Crenarchaeota/classification , Ferredoxins/genetics , Phylogeny , RNA, Ribosomal/genetics , RNA, Ribosomal, 16S/genetics
ABSTRACT
Pre-eclampsia is the leading cause of fetal and maternal morbidity and mortality. Early onset pre-eclampsia (EOPE) is a disorder that has severe maternal and fetal outcomes, whilst its etiology is poorly understood. We hypothesize that epigenetics plays an important role in mediating the development of EOPE and conducted a case-control study to compare the genome-wide methylome difference between chorioamniotic membranes from 30 EOPE and 17 full-term pregnancies using the Infinium Human Methylation 450 BeadChip arrays. Bioinformatics analysis tested differential methylation (DM) at CpG site level, gene level, and pathway and network level. A striking genome-wide hypermethylation pattern coupled with hypomethylation in promoters was observed. Out of 385 184 CpG sites, 9995 showed DM (2.6%). Of those DM sites, 91.9% showed hypermethylation (9186 of 9995). Over 900 genes had DM associated with promoters. Promoter-based DM analysis revealed that genes in canonical cancer-related pathways such as Rac, Ras, PI3K/Akt, NFκB and ErbB4 were enriched, and represented biological functional alterations that involve cell cycle, apoptosis, cancer signaling and inflammation. A group of genes previously found to be up-regulated in pre-eclampsia, including GRB2, ATF3, NFKB2, as well as genes encoding proteasome subunits (PSMA1, PSME1, PSMD1 and PSMD8), harbored hypomethylated promoters. Conversely, a cluster of microRNAs, including mir-519a1, mir-301a, mir-487a, mir-185, mir-329, mir-194, mir-376a1, mir-486 and mir-744, were all hypermethylated in their promoters in the EOPE samples. These findings collectively reveal new avenues of research regarding the vast epigenetic modifications in EOPE.
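The proportions quoted above can be rechecked directly from the three counts given in the abstract. The short sketch below does only that arithmetic; the variable names are illustrative, and the hypomethylated count is derived by subtraction rather than stated in the source.

```python
# Counts as reported in the abstract.
total_sites = 385_184   # CpG sites tested
dm_sites = 9_995        # differentially methylated (DM) sites
hyper_sites = 9_186     # DM sites that were hypermethylated

dm_percent = 100 * dm_sites / total_sites
hyper_percent = 100 * hyper_sites / dm_sites
hypo_sites = dm_sites - hyper_sites  # derived, not stated in the abstract

print(f"DM sites: {dm_percent:.1f}% of all CpG sites")
print(f"Hypermethylated: {hyper_percent:.1f}% of DM sites")
print(f"Hypomethylated DM sites (by subtraction): {hypo_sites}")
```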
Subjects
Amnion/metabolism , Chorion/metabolism , DNA Methylation , Epigenesis, Genetic , Pre-Eclampsia/metabolism , Promoter Regions, Genetic , Adult , Case-Control Studies , Computational Biology , DNA/metabolism , Down-Regulation , Female , Genome-Wide Association Study , Humans , MicroRNAs/metabolism , Oligonucleotide Array Sequence Analysis , Pregnancy , Retrospective Studies , Up-Regulation , Young Adult
ABSTRACT
TruAB Discovery is an approach that integrates cellular immunology, high-throughput immunosequencing, bioinformatics, and computational biology in order to discover naturally occurring human antibodies for prophylactic or therapeutic use. We adapted our previously described pairSEQ technology to pair B cell receptor heavy and light chains of SARS-CoV-2 spike protein-binding antibodies derived from enriched antigen-specific memory B cells and bulk antibody-secreting cells. We identified approximately 60,000 productive, in-frame, paired antibody sequences, from which 2,093 antibodies were selected for functional evaluation based on abundance, isotype and patterns of somatic hypermutation. The exceptionally diverse antibodies included RBD-binders with broad neutralizing activity against SARS-CoV-2 variants, and S2-binders with broad specificity against betacoronaviruses and the ability to block membrane fusion. A subset of these RBD- and S2-binding antibodies demonstrated robust protection against challenge in hamster and mouse models. This high-throughput approach can accelerate discovery of diverse, multifunctional antibodies against any target of interest.
Subjects
COVID-19 , SARS-CoV-2 , Animals , Mice , Humans , Antibodies, Neutralizing , Broadly Neutralizing Antibodies , Antibodies, Viral
ABSTRACT
Glycerol is a biodiesel byproduct. In the present study, glycerol was used as a co-substrate during biodegradation of dibenzothiophene (DBT) by Paraburkholderia sp. C3. Polycyclic aromatic hydrocarbons (PAHs) are a group of persistent, ubiquitous and carcinogenic chemicals found in the environment. DBT is a major sulfur-containing PAH. The chemical properties of DBT make it an ideal model pollutant for examining the bioremediation of higher molecular weight PAHs. Bioremediation uses microbial catalysis for removal of environmental pollutants. Environmental microorganisms that encounter aromatic substrates such as heterocyclic PAHs develop unique characteristics that allow the uptake and assimilation of these cytotoxic substrates. Microbial adaptations include changes in membrane lipid composition, secretion of surface-active compounds and accumulation of lipid granules to withstand chemical toxicity. Biostimulation using more readily metabolized substrates can increase the biodegradation rate of PAHs, but the molecular mechanisms are not well understood. We analyzed the DBT biodegradation kinetics in C3, proteome changes and TEM micrographs in different culturing conditions. We utilized 2-bromoalkanoic lipid metabolic inhibitors to establish a correlation between polyhydroxyalkanoate (PHA) granule formation and the enhancement of DBT biodegradation induced by glycerol. This is the first description linking PHA biosynthesis, DBT biodegradation and 2-bromoalkanoic acids in a Paraburkholderia species.
Subjects
Polycyclic Aromatic Hydrocarbons , Polyhydroxyalkanoates , Biodegradation, Environmental , Glycerol , Thiophenes
ABSTRACT
PURPOSE: The CLL14 study has established one-year fixed-duration treatment of venetoclax and obinutuzumab (Ven-Obi) for patients with previously untreated chronic lymphocytic leukemia. With all patients off treatment for at least three years, we report a detailed analysis of minimal residual disease (MRD) kinetics and long-term outcome of patients treated in the CLL14 study. PATIENTS AND METHODS: Patients were randomly assigned to receive six cycles of obinutuzumab with 12 cycles of venetoclax or 12 cycles of chlorambucil (Clb-Obi). Progression-free survival (PFS) was the primary end point. Key secondary end points included rates of undetectable MRD and overall survival. To analyze MRD kinetics, a population-based growth model with nonlinear mixed effects approach was developed. RESULTS: Of 432 patients, 216 were assigned to Ven-Obi and 216 to Clb-Obi. Three months after treatment completion, 40% of patients in the Ven-Obi arm (7% in the Clb-Obi arm) had undetectable MRD levels < 10⁻⁶ by next-generation sequencing in peripheral blood. Median MRD doubling time was longer after Ven-Obi than Clb-Obi therapy (median 80 v 69 days). At a median follow-up of 52.4 months, a sustained significant PFS improvement was observed in the Ven-Obi arm compared with Clb-Obi (median not reached v 36.4 months; hazard ratio 0.33; 95% CI, 0.25 to 0.45; P < .0001). The estimated 4-year PFS rate was 74.0% in the Ven-Obi and 35.4% in the Clb-Obi arm. No difference in overall survival was observed (hazard ratio 0.85; 95% CI, 0.54 to 1.35; P = .49). No new safety signals occurred. CONCLUSION: Appearance of MRD after Ven-Obi is significantly slower than that after Clb-Obi with more effective MRD reduction. These findings translate into a superior long-term efficacy with the majority of Ven-Obi-treated patients remaining in remission.
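To give a feel for what the two median doubling times imply, the sketch below assumes simple exponential regrowth, in which the time to climb between two MRD levels is the number of doublings multiplied by the doubling time. The start and target levels are illustrative assumptions, the function name is hypothetical, and this is a far cruder model than the nonlinear mixed effects approach used in the study.

```python
import math

def days_to_reach(start_level, target_level, doubling_time_days):
    """Days for MRD to grow from start_level to target_level under pure
    exponential growth with a constant doubling time."""
    if start_level <= 0 or target_level <= 0 or doubling_time_days <= 0:
        raise ValueError("levels and doubling time must be positive")
    doublings = math.log2(target_level / start_level)
    return doublings * doubling_time_days

# Doubling times (80 and 69 days) are the medians from the abstract; the
# 1e-6 -> 1e-4 window is an assumption chosen purely for illustration.
ven_obi_days = days_to_reach(1e-6, 1e-4, 80)
clb_obi_days = days_to_reach(1e-6, 1e-4, 69)

print(f"Ven-Obi: {ven_obi_days:.0f} days, Clb-Obi: {clb_obi_days:.0f} days")
```

Under this simplification the gap between the arms scales with the number of doublings, so a modest difference in doubling time compounds over a 100-fold rise.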
Subjects
Antibodies, Monoclonal, Humanized/adverse effects , Antineoplastic Combined Chemotherapy Protocols/adverse effects , Bridged Bicyclo Compounds, Heterocyclic/adverse effects , Neoplasm, Residual/chemically induced , Sulfonamides/adverse effects , Antibodies, Monoclonal, Humanized/pharmacology , Antineoplastic Combined Chemotherapy Protocols/pharmacology , Biosimilar Pharmaceuticals , Bridged Bicyclo Compounds, Heterocyclic/pharmacology , Female , Humans , Male , Sulfonamides/pharmacology
ABSTRACT
Although driver genes in hepatocellular carcinoma (HCC) have been investigated in various previous genetic studies, prevalence of key driver genes among heterogeneous populations is unknown. Moreover, the phenotypic associations of these driver genes are poorly understood. This report aims to reveal the phenotypic impacts of a group of consensus driver genes in HCC. We used MutSigCV and OncodriveFM modules implemented in the IntOGen pipeline to identify consensus driver genes across six HCC cohorts comprising 1,494 samples in total. To assess their global impacts, we used The Cancer Genome Atlas (TCGA) mutations and copy-number variations to predict the transcriptomics data, under generalized linear models. We further investigated the associations of the consensus driver genes with patient survival, age, gender, race, and risk factors. We identify 10 consensus driver genes across six HCC cohorts in total. Integrative analysis of driver mutations, copy-number variations, and transcriptomic data reveals that these consensus driver mutations and their copy-number variations are associated with a majority (62.5%) of the mRNA transcriptome but only a small fraction (8.9%) of miRNAs. Genes associated with TP53, CTNNB1, and ARID1A mutations contribute to the tripod of most densely connected pathway clusters. These driver genes are significantly associated with patients' overall survival. Some driver genes are significantly linked to HCC gender (CTNNB1, ALB, TP53, and AXIN1), race (TP53 and CDKN2A), and age (RB1) disparities. This study prioritizes a group of consensus drivers in HCC, which collectively show vast impacts on the phenotypes. These driver genes may warrant consideration as valuable therapeutic targets in HCC.
Subjects
Carcinoma, Hepatocellular/diagnosis , Carcinoma, Hepatocellular/genetics , Genetic Predisposition to Disease , Liver Neoplasms/diagnosis , Liver Neoplasms/genetics , Oncogenes , Phenotype , Algorithms , Computational Biology/methods , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Genetic Association Studies , Humans , Models, Biological , Mutation , Transcriptome
ABSTRACT
INTRODUCTION: Preeclampsia is a medical condition characterized by hypertension and proteinuria during pregnancy. While preeclampsia affects approximately 5% of pregnancies, it remains without a cure. In addition, women who had preeclampsia during pregnancy have been reported to have an increased risk for cardiovascular disease later in life. However, the disease etiology and molecular mechanisms remain poorly understood. The paucity of literature on preeclampsia-associated maternal cardiovascular risk in different ethnic populations also presents a need for more research. Therefore, the objective of this study was to identify cardiovascular/metabolic single nucleotide polymorphisms (SNPs), genes, and regulatory pathways associated with early-onset preeclampsia. MATERIALS AND METHODS: We compared maternal DNAs from 31 women with early-onset preeclampsia with those from a control group of 29 women without preeclampsia who delivered full-term normal birthweight infants. Women with multiple gestations and/or known medical disorders associated with preeclampsia (pregestational diabetes, chronic hypertension, renal disease, hyperthyroidism, and lupus) were excluded. The MetaboChip genotyping array with approximately 197,000 SNPs associated with metabolic and cardiovascular traits was used. Single nucleotide polymorphism analysis was performed using the SNPAssoc program in R. The Truncated Product Method was used to identify significantly associated genes. Ingenuity Pathway Analysis and Ingenuity Causal Network Analysis were used to identify significantly associated disease processes and regulatory gene networks, respectively. RESULTS: The early-onset preeclampsia group included 45% Filipino, 26% White, 16% other Asian, and 13% Native Hawaiian and other Pacific Islanders, which did not differ from the control group. There were no SNPs associated with early-onset preeclampsia after correction for multiple comparisons.
However, through gene-based tests, 68 genes and 23 cardiovascular disease-related processes were found to be significantly associated. Associated gene regulatory networks involved cellular movement, cardiovascular disease, and inflammatory disease. CONCLUSIONS: Multiple cardiovascular genes and diseases demonstrate associations with early-onset preeclampsia. These findings open new areas of research regarding the genetic determinants of early-onset preeclampsia and their relation to future cardiovascular disease.
Subjects
Cardiovascular Diseases/genetics , Genes/genetics , Genetic Predisposition to Disease/genetics , Polymorphism, Single Nucleotide/genetics , Pre-Eclampsia/genetics , Adult , Case-Control Studies , Female , Humans , Pregnancy
ABSTRACT
Alternative splicing (AS) has been shown to participate in prostate cancer development and progression; however, a link between AS and prostate cancer health disparities has been largely unexplored. Here we report on the cloning of a novel splice variant of FGFR3 that is preferentially expressed in African American (AA) prostate cancer. This novel variant (FGFR3-S) omits exon 14, comprising 123 nucleotides that encode the activation loop in the intracellular split kinase domain. Ectopic overexpression of FGFR3-S in European American (EA) prostate cancer cell lines (PC-3 and LNCaP) led to enhanced receptor autophosphorylation and increased activation of the downstream signaling effectors AKT, STAT3, and ribosomal S6 compared with FGFR3-L (retains exon 14). The increased oncogenic signaling imparted by FGFR3-S was associated with a substantial gain in proliferative and antiapoptotic activities, as well as a modest but significant gain in cell motility. Moreover, the FGFR3-S-conferred proliferative and motility gains were highly resistant to the pan-FGFR small-molecule inhibitor dovitinib and the antiapoptotic gain was insensitive to the cytotoxic drug docetaxel, which stands in marked contrast with dovitinib- and docetaxel-sensitive FGFR3-L. In an in vivo xenograft model, mice injected with PC-3 cells overexpressing FGFR3-S exhibited significantly increased tumor growth and resistance to dovitinib treatment compared with cells overexpressing FGFR3-L. In agreement with our in vitro and in vivo findings, a high FGFR3-S/FGFR3-L expression ratio in prostate cancer specimens was associated with poor patient prognosis. IMPLICATIONS: This work identifies a novel FGFR3 splice variant and supports the hypothesis that differential AS participates in prostate cancer health disparities.
Subjects
Black or African American/genetics , Docetaxel/pharmacology , Prostatic Neoplasms/genetics , Receptor, Fibroblast Growth Factor, Type 3/genetics , Alternative Splicing , Animals , Cell Line, Tumor , Drug Resistance, Neoplasm , Humans , Male , PC-3 Cells , Phenotype , Prostatic Neoplasms/drug therapy , Prostatic Neoplasms/mortality , Prostatic Neoplasms/pathology , RNA Splicing , Rabbits , Signal Transduction , Survival Analysis , Transfection
ABSTRACT
Long intergenic non-coding RNAs have been shown to play important roles in cancer. However, because lincRNAs are a relatively new class of RNAs compared to protein-coding mRNAs, the mutational landscape of lincRNAs has not been as extensively studied. Here we characterize expressed somatic nucleotide variants within lincRNAs using 12 cancer RNA-Seq datasets in TCGA. We build machine-learning models to discriminate somatic variants from germline variants within lincRNA regions (AUC 0.987). We build another model to differentiate lincRNA somatic mutations from background regions (AUC 0.72) and find several molecular features that are strongly associated with lincRNA mutations, including copy number variation, conservation, substitution type and histone marker features.
Subjects
Neoplasms/genetics, RNA, Long Noncoding/genetics, RNA, Neoplasm/genetics, Algorithms, Computational Biology/methods, Conserved Sequence, DNA Copy Number Variations, Databases, Nucleic Acid/statistics & numerical data, Female, Genetic Variation, Germ-Line Mutation, Histones/genetics, Histones/metabolism, Humans, Likelihood Functions, Logistic Models, Machine Learning, Male, Models, Genetic, Models, Statistical, Mutation, Neoplasms/metabolism, Neural Networks, Computer, Nonlinear Dynamics, Sequence Analysis, RNA/statistics & numerical data
ABSTRACT
Despite its popularity, characterization of subpopulations with transcript abundance is subject to a significant amount of noise. We propose to use effective and expressed nucleotide variations (eeSNVs) from scRNA-seq as alternative features for tumor subpopulation identification. We develop a linear modeling framework, SSrGE, to link eeSNVs associated with gene expression. In all the datasets tested, eeSNVs achieve better accuracies than gene expression for identifying subpopulations. Previously validated cancer-relevant genes are also highly ranked, confirming the significance of the method. Moreover, SSrGE is capable of analyzing coupled DNA-seq and RNA-seq data from the same single cells, demonstrating its value in integrating multi-omics single cell techniques. In summary, SNV features from scRNA-seq data have merits for both subpopulation identification and linkage of genotype-phenotype relationship.
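To make the idea of a linear model linking SNVs to gene expression concrete, the sketch below fits a sparse regression of one gene's expression on binary SNV indicators across cells. The abstract does not specify the penalty or solver, so the L1 (lasso) penalty, the coordinate-descent solver, the function names, and the toy data are all assumptions made for illustration; this is not the SSrGE implementation.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimize (1/2n)*||y - Xw||^2 + alpha*||w||_1 without an intercept."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    for _ in range(n_iter):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # SNV absent in every cell: nothing to fit
            # Correlation of SNV j with the residual once its own
            # current contribution has been added back.
            rho = sum(col[i] * (residual[i] + w[j] * col[i]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / z
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * col[i]
                w[j] = new_w
    return w


# Toy data (hypothetical): 6 cells x 3 SNVs (1 = variant present), and the
# expression of one gene, which here tracks the first SNV only.
snv_matrix = [[1, 0, 1], [1, 1, 0], [1, 0, 0], [0, 1, 1], [0, 0, 1], [0, 1, 0]]
expression = [2.1, 1.9, 2.0, 0.1, 0.0, -0.1]
coefs = lasso_coordinate_descent(snv_matrix, expression, alpha=0.1)
selected_snvs = [j for j, c in enumerate(coefs) if c != 0.0]
```

In this toy case only the first SNV receives a non-zero coefficient, which mirrors the idea of keeping the SNVs that carry information about expression and discarding the rest.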
Subjects
Gene Expression Regulation, Neoplastic, High-Throughput Nucleotide Sequencing/methods, Neoplasms/genetics, Polymorphism, Single Nucleotide, Single-Cell Analysis/methods, Algorithms, Gene Expression Profiling/methods, Genotype, Humans, Models, Genetic, Neoplasms/pathology, Phenotype
ABSTRACT
plantGlycoMS is a set of tools, implemented in R, which is used to assess and validate glycopeptide spectrum matches (gPSMs). Validity of gPSMs is based on characteristic fragmentation patterns of glycopeptides (gPSMvalidator), adherence of the glycan moiety to the known N-glycan biosynthesis pathway in plants (pGlycoFilter), and elution of the glycopeptide within the observed retention time window of other glycopeptides sharing the same peptide backbone (rt.Restrict). plantGlycoMS also contains a tool for relative quantitation of glycoforms based on selected ion chromatograms of glycopeptide ion precursors in the mass spectrometry level 1 data (glycoRQ). This protocol walks the user through this workflow with example mass spectrometry data obtained for a plant glycoprotein.
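The retention-time criterion can be illustrated with a short sketch. It is written in Python for illustration only and does not reproduce the plantGlycoMS R functions or their arguments; the record fields (`id`, `peptide`, `rt`), the use of a separate set of reference matches, and the tolerance parameter are assumptions of this example.

```python
def restrict_by_retention_time(candidates, references, tolerance=0.0):
    """Keep candidates eluting inside the retention-time window observed for
    reference glycopeptides that share the same peptide backbone."""
    windows = {}
    for ref in references:
        low, high = windows.get(ref["peptide"], (ref["rt"], ref["rt"]))
        windows[ref["peptide"]] = (min(low, ref["rt"]), max(high, ref["rt"]))
    kept = []
    for cand in candidates:
        if cand["peptide"] not in windows:
            continue  # no window observed for this backbone (sketch's choice: drop)
        low, high = windows[cand["peptide"]]
        if low - tolerance <= cand["rt"] <= high + tolerance:
            kept.append(cand)
    return kept


# Toy records (hypothetical peptide names, retention times in minutes).
reference_matches = [
    {"id": "r1", "peptide": "PEPTIDE_A", "rt": 20.1},
    {"id": "r2", "peptide": "PEPTIDE_A", "rt": 20.8},
]
candidate_matches = [
    {"id": "c1", "peptide": "PEPTIDE_A", "rt": 20.5},  # inside the window
    {"id": "c2", "peptide": "PEPTIDE_A", "rt": 35.0},  # far outside the window
    {"id": "c3", "peptide": "PEPTIDE_B", "rt": 20.5},  # backbone has no window
]
kept_matches = restrict_by_retention_time(candidate_matches, reference_matches, tolerance=0.5)
kept_ids = [m["id"] for m in kept_matches]
```

The rationale is that glycoforms of one peptide tend to elute close together, so a match far outside the window is more likely to be a false assignment.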
Subjects
Glycopeptides/analysis, Glycoproteins/chemistry, Mass Spectrometry/methods, Plant Proteins/chemistry, Plants/chemistry, Polysaccharides/analysis, Amino Acid Sequence, Glycosylation, Software, Workflow
ABSTRACT
Deep learning describes a class of machine learning algorithms that are capable of combining raw inputs into layers of intermediate features. These algorithms have recently shown impressive results across a variety of domains. Biology and medicine are data-rich disciplines, but the data are complex and often ill-understood. Hence, deep learning techniques may be particularly well suited to solve problems of these fields. We examine applications of deep learning to a variety of biomedical problems (patient classification, fundamental biological processes and treatment of patients) and discuss whether deep learning will be able to transform these tasks or if the biomedical sphere poses unique challenges. Following from an extensive literature review, we find that deep learning has yet to revolutionize biomedicine or definitively resolve any of the most pressing challenges in the field, but promising advances have been made on the prior state of the art. Even though improvements over previous baselines have been modest in general, the recent progress indicates that deep learning methods will provide valuable means for speeding up or aiding human investigation. Though progress has been made linking a specific neural network's prediction to input features, understanding how users should interpret these models to make testable hypotheses about the system under study remains an open challenge. Furthermore, the limited amount of labelled data for training presents problems in some domains, as do legal and privacy constraints on work with sensitive health records. Nonetheless, we foresee deep learning enabling changes at both bench and bedside with the potential to transform several areas of biology and medicine.
Subjects
Biomedical Research/trends, Biomedical Technology/trends, Deep Learning/trends, Algorithms, Biomedical Research/methods, Decision Making, Delivery of Health Care/methods, Delivery of Health Care/trends, Disease/genetics, Drug Design, Electronic Health Records/trends, Humans, Terminology as Topic
ABSTRACT
Single-cell RNA-Sequencing (scRNA-Seq) is a fast-evolving technology that enables the understanding of biological processes at an unprecedentedly high resolution. However, well-suited bioinformatics tools to analyze the data generated from this new technology are still lacking. Here we investigate the performance of the non-negative matrix factorization (NMF) method in analyzing a wide variety of scRNA-Seq datasets, ranging from mouse hematopoietic stem cells to human glioblastoma data. In comparison to other unsupervised clustering methods, including K-means and hierarchical clustering, NMF has higher accuracy in separating similar groups in various datasets. We ranked genes by their importance scores (D-scores) in separating these groups, and discovered that NMF uniquely identifies genes expressed at intermediate levels as top-ranked genes. Finally, we show that in conjunction with the modularity detection method FEM, NMF reveals meaningful protein-protein interaction modules. In summary, we propose that NMF is a desirable method to analyze heterogeneous single-cell RNA-Seq data. The NMF-based subpopulation detection package is available at: https://github.com/lanagarmire/NMFEM.
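For readers unfamiliar with NMF-based clustering, the following minimal sketch factorizes a small non-negative genes-by-cells matrix with the classic multiplicative update rules and assigns each cell to its dominant factor. It is not the NMFEM package code: the update scheme, the rank, the cluster-assignment rule, the function names, and the toy expression matrix are all assumptions chosen for illustration.

```python
import random


def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def nmf(V, rank, n_iter=500, seed=0, eps=1e-9):
    """Approximate non-negative V (genes x cells) as W (genes x rank) @ H (rank x cells)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_rows)]
    H = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[H[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_cols)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_rows)]
    return W, H


def assign_clusters(W, H):
    """Assign each cell to the factor with the largest coefficient, after
    rescaling by the column sums of W so factors are comparable."""
    col_sums = [sum(row[k] for row in W) for k in range(len(H))]
    return [max(range(len(H)), key=lambda k: H[k][j] * col_sums[k])
            for j in range(len(H[0]))]


# Toy expression matrix (hypothetical): 4 genes x 6 cells with two cell groups.
toy_V = [
    [5, 4, 6, 0, 0, 1],
    [4, 5, 5, 1, 0, 0],
    [0, 1, 0, 6, 5, 4],
    [1, 0, 0, 5, 6, 5],
]
toy_W, toy_H = nmf(toy_V, rank=2)
toy_cell_labels = assign_clusters(toy_W, toy_H)
```

Because the factors are constrained to be non-negative, each one tends to capture an additive gene program, and the columns of H indicate how strongly each cell uses that program. Which factor receives which index depends on the random initialization, so only the grouping of cells is meaningful, not the label values.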