1.
PLoS Comput Biol ; 20(7): e1012142, 2024 Jul 24.
Article in English | MEDLINE | ID: mdl-39047024

ABSTRACT

Increasing genetic and phenotypic data size is critical for understanding the genetic determinants of diseases. Evidently, establishing practical means for collaboration and data sharing among institutions is a fundamental methodological challenge for performing high-powered studies. As the sample sizes become more heterogeneous, complex statistical approaches, such as generalized linear mixed effects models, must be used to correct for the confounders that may bias results. On another front, due to the privacy concerns around Protected Health Information (PHI), the sharing of genetic information is tightly restricted by regulations such as the Health Insurance Portability and Accountability Act (HIPAA). This limits data sharing among institutions and hampers efforts to execute high-powered collaborative studies. Federated approaches are promising to alleviate the issues around privacy and performance, since sensitive data never leaves the local sites. Motivated by these considerations, we developed FedGMMAT, a federated genetic association testing tool that utilizes a federated statistical testing approach for efficient association tests that can correct for confounding fixed and additive polygenic random effects among different collaborating sites. Genetic data is never shared among collaborating sites, and the intermediate statistics are protected by encryption. Using simulated and real datasets, we demonstrate that FedGMMAT can achieve virtually the same results as pooled analysis under a privacy-preserving framework with practical resource requirements.
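
The claim that a federated analysis can match a pooled one is easiest to see in a much simpler setting than the mixed model this abstract describes. The sketch below is only an illustration under stated assumptions: it uses ordinary least squares with a single genotype predictor, the function names `local_statistics` and `slope_from_statistics` are hypothetical, and the toy data are invented. It is not the FedGMMAT algorithm and it omits the encryption of intermediate statistics.

```python
# Illustrative sketch: plain least squares, NOT the mixed-model test of FedGMMAT.
# Each site shares only aggregate sums, never individual-level rows.

def local_statistics(genotypes, phenotypes):
    """Sufficient statistics one site would send to the coordinator."""
    return {
        "n": len(genotypes),
        "sum_x": sum(genotypes),
        "sum_y": sum(phenotypes),
        "sum_xx": sum(x * x for x in genotypes),
        "sum_xy": sum(x * y for x, y in zip(genotypes, phenotypes)),
    }


def slope_from_statistics(stats_list):
    """Regression slope computed from the element-wise sum of site statistics."""
    total = {key: sum(stats[key] for stats in stats_list) for key in stats_list[0]}
    n = total["n"]
    covariance = total["sum_xy"] - total["sum_x"] * total["sum_y"] / n
    variance = total["sum_xx"] - total["sum_x"] ** 2 / n
    return covariance / variance


# Invented toy data: (genotype dosages, phenotype values) held at two sites.
site_a = ([0, 1, 2, 1], [1.0, 1.9, 3.2, 2.1])
site_b = ([2, 0, 1], [2.8, 0.9, 2.0])

federated = slope_from_statistics(
    [local_statistics(*site_a), local_statistics(*site_b)]
)
pooled = slope_from_statistics(
    [local_statistics(site_a[0] + site_b[0], site_a[1] + site_b[1])]
)
print(federated, pooled)
```

Because the sums are additive across sites, the federated and pooled slopes agree up to floating-point rounding. Mixed models need iterative exchanges rather than a single round, which is where the approach in the abstract goes beyond this sketch.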

3.
J Neurooncol ; 168(3): 515-524, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38811523

ABSTRACT

PURPOSE: Accurate classification of cancer subgroups is essential for precision medicine, tailoring treatments to individual patients based on their cancer subtypes. In recent years, advances in high-throughput sequencing technologies have enabled the generation of large-scale transcriptomic data from cancer samples. These data have provided opportunities for developing computational methods that can improve cancer subtyping and enable better personalized treatment strategies. METHODS: In this study, we evaluated different feature selection schemes in the context of meningioma classification. To integrate interpretable features from the bulk (n = 77 samples) and single-cell profiling (∼10 K cells), we developed an algorithm named CLIPPR, which combines the top-performing single-cell models, RNA-inferred copy number variation (CNV) signals, and the initial bulk model to create a meta-model. RESULTS: While the scheme relying solely on bulk transcriptomic data showed good classification accuracy, it exhibited confusion between malignant and benign molecular classes in approximately 8% of meningioma samples. In contrast, models trained on features learned from meningioma single-cell data accurately resolved the sub-groups confused by bulk-transcriptomic data but showed limited overall accuracy. CLIPPR showed superior overall accuracy and resolved benign-malignant confusion as validated on n = 789 bulk meningioma samples gathered from multiple institutions. Finally, we showed the generalizability of our algorithm using our in-house single-cell (∼200 K cells) and bulk TCGA glioma data (n = 711 samples). CONCLUSION: Overall, our algorithm CLIPPR synergizes the resolution of single-cell data with the depth of bulk sequencing and enables improved cancer sub-group diagnoses and insights into their biology.


Subject(s)
Algorithms; Meningeal Neoplasms; Meningioma; Sequence Analysis, RNA; Single-Cell Analysis; Humans; Single-Cell Analysis/methods; Meningeal Neoplasms/genetics; Meningeal Neoplasms/pathology; Meningeal Neoplasms/classification; Meningioma/genetics; Meningioma/pathology; Meningioma/classification; Sequence Analysis, RNA/methods; DNA Copy Number Variations; Biomarkers, Tumor/genetics; High-Throughput Nucleotide Sequencing/methods; Transcriptome; Gene Expression Profiling/methods
4.
bioRxiv ; 2024 May 10.
Article in English | MEDLINE | ID: mdl-38496434

ABSTRACT

Prior studies have described the complex interplay that exists between glioma cells and neurons; however, the electrophysiological properties endogenous to tumor cells remain obscure. To address this, we employed Patch-sequencing on human glioma specimens and found that one third of patched cells in IDH-mutant (IDHmut) tumors demonstrate properties of both neurons and glia by firing single, short action potentials. To define these hybrid cells (HCs) and discern whether they are of tumor origin, we developed a computational tool, Single Cell Rule Association Mining (SCRAM), to annotate each cell individually. SCRAM revealed that HCs represent tumor and non-tumor cells that feature GABAergic neuron and oligodendrocyte precursor cell signatures. These studies are the first to characterize the combined electrophysiological and molecular properties of human glioma cells and describe a new cell type in human glioma with unique electrophysiological and transcriptomic properties that are likely also present in the non-tumor mammalian brain.

6.
Nat Med ; 29(12): 3067-3076, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37944590

ABSTRACT

Surgery is the mainstay of treatment for meningioma, the most common primary intracranial tumor, but improvements in meningioma risk stratification are needed and indications for postoperative radiotherapy are controversial. Here we develop a targeted gene expression biomarker that predicts meningioma outcomes and radiotherapy responses. Using a discovery cohort of 173 meningiomas, we developed a 34-gene expression risk score and performed clinical and analytical validation of this biomarker on independent meningiomas from 12 institutions across 3 continents (N = 1,856), including 103 meningiomas from a prospective clinical trial. The gene expression biomarker improved discrimination of outcomes compared with all other systems tested (N = 9) in the clinical validation cohort for local recurrence (5-year area under the curve (AUC) 0.81) and overall survival (5-year AUC 0.80). The increase in AUC compared with the standard of care, World Health Organization 2021 grade, was 0.11 for local recurrence (95% confidence interval 0.07 to 0.17, P < 0.001). The gene expression biomarker identified meningiomas benefiting from postoperative radiotherapy (hazard ratio 0.54, 95% confidence interval 0.37 to 0.78, P = 0.0001) and suggested postoperative management could be refined for 29.8% of patients. In sum, our results identify a targeted gene expression biomarker that improves discrimination of meningioma outcomes, including prediction of postoperative radiotherapy responses.


Subject(s)
Meningeal Neoplasms; Meningioma; Humans; Biomarkers; Gene Expression Profiling; Meningeal Neoplasms/genetics; Meningeal Neoplasms/radiotherapy; Meningeal Neoplasms/pathology; Meningioma/genetics; Meningioma/radiotherapy; Meningioma/pathology; Neoplasm Recurrence, Local/pathology; Prospective Studies
7.
Genome Biol ; 24(1): 204, 2023 Sep 11.
Article in English | MEDLINE | ID: mdl-37697426

ABSTRACT

Growing regulatory requirements set barriers around genetic data sharing and collaborations. Moreover, existing privacy-aware paradigms are challenging to deploy in collaborative settings. We present COLLAGENE, a tool base for building secure collaborative genomic data analysis methods. COLLAGENE protects data using shared-key homomorphic encryption and combines encryption with multiparty strategies for efficient privacy-aware collaborative method development. COLLAGENE provides ready-to-run tools for encryption/decryption, matrix processing, and network transfers, which can be immediately integrated into existing pipelines. We demonstrate the usage of COLLAGENE by building a practical federated GWAS protocol for binary phenotypes and a secure meta-analysis protocol. COLLAGENE is available at https://zenodo.org/record/8125935 .


Subject(s)
Genomics; Privacy; Data Analysis; Information Dissemination; Phenotype; Meta-Analysis as Topic
8.
bioRxiv ; 2023 Aug 26.
Article in English | MEDLINE | ID: mdl-37609241

ABSTRACT

Predictive models in biomedicine need to ensure equitable and reliable outcomes for the populations they are applied to. Unfortunately, biases in medical predictions can lead to unfair treatment and widening disparities, underscoring the need for effective techniques to address these issues. To enhance fairness, we introduce a framework based on a Multiple Domain Adversarial Neural Network (MDANN), which incorporates multiple adversarial components. In an MDANN, an adversarial module is applied to learn a fair pattern by back-propagating negated gradients across multiple sensitive features (i.e., characteristics of individuals that should not be used to discriminate unfairly between individuals when making predictions or decisions). We leverage loss functions based on the Area Under the Receiver Operating Characteristic Curve (AUC) to address the class imbalance, promoting equitable classification performance for minority groups (i.e., subsets of the population that are underrepresented or disadvantaged). Moreover, we utilize pre-trained convolutional autoencoders (CAEs) to extract deep representations of data, aiming to enhance prediction accuracy and fairness. Combining these mechanisms, we alleviate biases and disparities to provide reliable and equitable disease prediction. We empirically demonstrate that the MDANN approach leads to better accuracy and fairness in predicting disease progression using brain imaging data for Alzheimer's Disease and Autism populations than state-of-the-art techniques.
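
AUC is attractive under class imbalance because it depends only on how positives are ranked against negatives, not on how many of each there are. The following is a minimal sketch of the exact AUC statistic and a per-group breakdown. It assumes binary labels and invented scores, the helper names `auc` and `auc_by_group` are hypothetical, and it is not the differentiable AUC-based loss or the adversarial network that the abstract refers to.

```python
# Illustrative sketch: exact AUC as a pairwise ranking statistic.
# The abstract's training loss would be a differentiable surrogate of this.

def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def auc_by_group(scores, labels, groups):
    """AUC computed separately within each value of a sensitive attribute."""
    result = {}
    for group in sorted(set(groups)):
        idx = [i for i, g in enumerate(groups) if g == group]
        result[group] = auc([scores[i] for i in idx], [labels[i] for i in idx])
    return result


# Invented toy predictions for six individuals in two groups.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
groups = ["a", "b", "a", "a", "b", "b"]

overall = auc(scores, labels)
per_group = auc_by_group(scores, labels, groups)
print(overall, per_group)
```

A gap between the per-group values is one simple way to see the kind of disparity a fairness-aware method would try to reduce.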

9.
iScience ; 26(8): 107227, 2023 Aug 18.
Article in English | MEDLINE | ID: mdl-37529100

ABSTRACT

Federated association testing is a powerful approach to conduct large-scale association studies where sites share intermediate statistics through a central server. There are, however, several standing challenges. Confounding factors like population stratification should be carefully modeled across sites. In addition, it is crucial to consider disease etiology using flexible models to prevent biases. Privacy protections for participants pose another significant challenge. Here, we propose distributed Mixed Effects Genome-wide Association study (dMEGA), a method that enables federated generalized linear mixed model-based association testing across multiple sites without explicitly sharing genotype and phenotype data. dMEGA employs a reference projection to correct for population-stratification and utilizes efficient local-gradient updates among sites, incorporating both fixed and random effects. The accuracy and efficiency of dMEGA are demonstrated through simulated and real datasets. dMEGA is publicly available at https://github.com/Li-Wentao/dMEGA.
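
The idea of sites exchanging gradient updates instead of genotype and phenotype rows can be sketched for a plain logistic regression. This is a simplified illustration under stated assumptions: fixed effects only, no random effects, no reference projection for population stratification, and no protection of the shared gradients. The names `local_gradient` and `federated_step` and the toy data are hypothetical and are not taken from dMEGA.

```python
import math

# Illustrative sketch: federated gradient descent for fixed-effect logistic
# regression. NOT the dMEGA algorithm, which also models random effects.

def local_gradient(weights, rows, labels):
    """Unnormalized log-loss gradient computed on one site's private data."""
    grad = [0.0] * len(weights)
    for x, y in zip(rows, labels):
        z = sum(w * xi for w, xi in zip(weights, x))
        p = 1.0 / (1.0 + math.exp(-z))
        for j, xi in enumerate(x):
            grad[j] += (p - y) * xi
    return grad


def federated_step(weights, sites, learning_rate):
    """Server sums per-site gradients and applies one averaged update."""
    total = [0.0] * len(weights)
    n = 0
    for rows, labels in sites:
        site_grad = local_gradient(weights, rows, labels)
        total = [t + g for t, g in zip(total, site_grad)]
        n += len(rows)
    return [w - learning_rate * t / n for w, t in zip(weights, total)]


# Invented toy data: each row is [intercept, genotype dosage]; labels are 0/1.
site_1 = ([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]], [0, 0, 1])
site_2 = ([[1.0, 2.0], [1.0, 0.0]], [1, 0])
pooled_site = (site_1[0] + site_2[0], site_1[1] + site_2[1])

weights_fed = [0.0, 0.0]
weights_pooled = [0.0, 0.0]
for _ in range(50):
    weights_fed = federated_step(weights_fed, [site_1, site_2], 0.5)
    weights_pooled = federated_step(weights_pooled, [pooled_site], 0.5)
print(weights_fed, weights_pooled)
```

Since the gradient of a sum of per-sample losses is the sum of per-site gradients, the federated trajectory tracks the pooled one up to floating-point rounding.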

10.
Cancer Immunol Res ; : OF1-OF18, 2023 Jun 07.
Article in English | MEDLINE | ID: mdl-37285177

ABSTRACT

Comprehensive investigation of CD8+ T cells in acute myeloid leukemia (AML) is essential for developing immunotherapeutic strategies beyond immune checkpoint blockade. Herein, we performed single-cell RNA profiling of CD8+ T cells from 3 healthy bone marrow donors and 23 newly diagnosed (NewlyDx) and 8 relapsed/refractory (RelRef) patients with AML. Cells coexpressing canonical exhaustion markers formed a cluster constituting <1% of all CD8+ T cells. We identified two effector CD8+ T-cell subsets characterized by distinct cytokine and metabolic profiles that were differentially enriched in NewlyDx and RelRef patients. We refined a 25-gene CD8-derived signature correlating with therapy resistance, including genes associated with activation, chemoresistance, and terminal differentiation. Pseudotemporal trajectory analysis supported enrichment of a terminally differentiated state in CD8+ T cells with high CD8-derived signature expression at relapse or refractory disease. Higher expression of the 25-gene CD8 AML signature correlated with poorer outcomes in previously untreated patients with AML, suggesting that the bona fide state of CD8+ T cells and their degree of differentiation are clinically relevant. Immune clonotype tracking revealed more phenotypic transitions in CD8 clonotypes in NewlyDx than in RelRef patients. Furthermore, CD8+ T cells from RelRef patients had a higher degree of clonal hyperexpansion associated with terminal differentiation and higher CD8-derived signature expression. Clonotype-derived antigen prediction revealed that most previously unreported clonotypes were patient-specific, suggesting significant heterogeneity in AML immunogenicity. Thus, immunologic reconstitution in AML is likely to be most successful at earlier disease stages when CD8+ T cells are less differentiated and have greater capacity for clonotype transitions.

11.
J Neurooncol ; 163(2): 397-405, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37318677

ABSTRACT

INTRODUCTION: Meningiomas are the most common primary intracranial tumor. Recently, various genetic classification systems for meningioma have been described. We sought to identify clinical drivers of different molecular changes in meningioma. In particular, the clinical and genomic consequences of smoking in patients with meningiomas remain unexplored. METHODS: Eighty-eight tumor samples were analyzed in this study. Whole exome sequencing (WES) was used to assess somatic mutation burden. RNA sequencing data were used to identify differentially expressed genes (DEG) and gene sets (GSEA). RESULTS: Fifty-seven patients had no history of smoking, twenty-two were past smokers, and nine were current smokers. The clinical data showed no major differences in natural history across smoking status. WES revealed an absence of AKT1 mutations in current or past smokers compared to non-smokers (p = 0.046). Current smokers had an increased mutation rate in NOTCH2 compared to past and never smokers (p < 0.05). Mutational signatures from current and past smokers showed disrupted DNA mismatch repair (cosine similarity = 0.759 and 0.783). DEG analysis revealed that the xenobiotic metabolic genes UGT2A1 and UGT2A2 were both significantly downregulated in current smokers compared to past (Log2FC = -3.97, padj = 0.0347 and Log2FC = -4.18, padj = 0.0304) and never smokers (Log2FC = -3.86, padj = 0.0235 and Log2FC = -4.20, padj = 0.0149). GSEA of current smokers showed downregulation of xenobiotic metabolism and enrichment for G2M checkpoint, E2F targets, and mitotic spindle compared to past and never smokers (FDR < 25% each). CONCLUSION: In this study, we conducted a comparative analysis of meningioma patients based on their smoking history, examining both their clinical trajectories and molecular changes. Meningiomas from current smokers were more likely to harbor NOTCH2 mutations, and AKT1 mutations were absent in current or past smokers. Moreover, both current and past smokers exhibited a mutational signature associated with DNA mismatch repair. Meningiomas from current smokers demonstrate downregulation of the xenobiotic metabolic enzymes UGT2A1 and UGT2A2, which are downregulated in other smoking-related cancers. Furthermore, current smokers exhibited downregulation of xenobiotic metabolic gene sets, as well as enrichment in gene sets related to mitotic spindle, E2F targets, and G2M checkpoint, which are hallmark pathways involved in cell division and DNA replication control. In aggregate, our results demonstrate novel alterations in meningioma molecular biology in response to systemic carcinogens.
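
The cosine similarities quoted in this abstract compare an observed mutation spectrum with a reference signature. A minimal sketch of that comparison is given below. The six-category vectors are invented for illustration (published signatures are usually defined over many more mutation contexts), and the function name `cosine_similarity` is simply a descriptive choice, not a call from any particular package.

```python
import math

# Illustrative sketch: cosine similarity between two mutation-spectrum vectors.
# The vectors below are invented and do not reproduce the study's data.

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical counts per substitution class in one tumor group.
observed_counts = [12, 3, 5, 30, 8, 2]
# Hypothetical reference signature expressed as proportions.
reference_signature = [0.15, 0.05, 0.05, 0.55, 0.15, 0.05]

similarity = cosine_similarity(observed_counts, reference_signature)
print(round(similarity, 3))
```

The measure ignores overall scale, so raw counts can be compared directly with a signature given as proportions.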


Subject(s)
Meningeal Neoplasms; Meningioma; Humans; Meningioma/genetics; Meningioma/pathology; Xenobiotics; Smoking/adverse effects; Smoking/genetics; Mutation; Genomics; Meningeal Neoplasms/pathology; Glucuronosyltransferase/genetics
12.
Cancer Immunol Res ; 2023 May 10.
Article in English | MEDLINE | ID: mdl-37163233

ABSTRACT

Comprehensive investigation of CD8+ T cells in acute myeloid leukemia (AML) is essential for developing immunotherapeutic strategies beyond immune checkpoint blockade. Herein, we performed single-cell RNA profiling of CD8+ T cells from 3 healthy bone marrow donors and 23 newly diagnosed (NewlyDx) and 8 relapsed/refractory (RelRef) AML patients. Cells co-expressing canonical exhaustion markers formed a cluster constituting <1% of all CD8+ T cells. We identified two effector CD8+ T cell subsets characterized by distinct cytokine and metabolic profiles that were differentially enriched in NewlyDx and RelRef patients. We refined a 25-gene CD8-derived signature correlating with therapy resistance, including genes associated with activation, chemoresistance, and terminal differentiation. Pseudotemporal trajectory analysis supported enrichment of a terminally differentiated state in CD8+ T cells with high CD8-derived signature expression at relapse or refractory disease. Higher expression of the 25-gene CD8 AML signature correlated with poorer outcomes in previously untreated AML patients, suggesting that the bona fide state of CD8+ T cells and their degree of differentiation are clinically relevant. Immune clonotype tracking revealed more phenotypic transitions in CD8 clonotypes in NewlyDx than in RelRef patients. Furthermore, CD8+ T cells from RelRef patients had a higher degree of clonal hyperexpansion associated with terminal differentiation and higher CD8-derived signature expression. Clonotype-derived antigen prediction revealed that most previously unreported clonotypes were patient-specific, suggesting significant heterogeneity in AML immunogenicity. Thus, immunologic reconstitution in AML is likely to be most successful at earlier disease stages when CD8+ T cells are less differentiated and have greater capacity for clonotype transitions.

13.
Res Sq ; 2023 Mar 20.
Article in English | MEDLINE | ID: mdl-36993741

ABSTRACT

Background: Surgery is the mainstay of treatment for meningioma, the most common primary intracranial tumor, but improvements in meningioma risk stratification are needed and current indications for postoperative radiotherapy are controversial. Recent studies have proposed prognostic meningioma classification systems using DNA methylation profiling, copy number variants, DNA sequencing, RNA sequencing, histology, or integrated models based on multiple combined features. Targeted gene expression profiling has generated robust biomarkers integrating multiple molecular features for other cancers, but is understudied for meningiomas. Methods: Targeted gene expression profiling was performed on 173 meningiomas, and an optimized gene expression biomarker (34 genes) and risk score (0 to 1) were developed to predict clinical outcomes. Clinical and analytical validation was performed on independent meningiomas from 12 institutions across 3 continents (N = 1856), including 103 meningiomas from a prospective clinical trial. Gene expression biomarker performance was compared to 9 other classification systems. Results: The gene expression biomarker improved discrimination of postoperative meningioma outcomes compared to all other classification systems tested in the independent clinical validation cohort for local recurrence (5-year area under the curve [AUC] 0.81) and overall survival (5-year AUC 0.80). The increase in area under the curve compared to the current standard of care, World Health Organization 2021 grade, was 0.11 for local recurrence (95% confidence interval [CI] 0.07-0.17, P < 0.001). The gene expression biomarker identified meningiomas benefiting from postoperative radiotherapy (hazard ratio 0.54, 95% CI 0.37-0.78, P = 0.0001) and re-classified up to 52.0% of meningiomas compared to conventional clinical criteria, suggesting postoperative management could be refined for 29.8% of patients. Conclusions: A targeted gene expression biomarker improves discrimination of meningioma outcomes compared to recent classification systems and predicts postoperative radiotherapy responses.

15.
BMC Genomics ; 23(1): 841, 2022 Dec 20.
Article in English | MEDLINE | ID: mdl-36539717

ABSTRACT

BACKGROUND: RNA-sequencing has become a standard tool for analyzing gene activity in bulk samples and at the single-cell level. By increasing sample sizes and cell counts, this technique can uncover substantial information about cellular transcriptional states. Beyond quantification of gene expression, RNA-seq can be used for detecting variants, including single nucleotide polymorphisms, small insertions/deletions, and larger variants, such as copy number variants. Notably, joint analysis of variants with cellular transcriptional states may provide insights into the impact of mutations, especially for complex and heterogeneous samples. However, this analysis is often challenging due to a prohibitively high number of variants and cells, which are difficult to summarize and visualize. Further, there is a dearth of methods that assess and summarize the association between detected variants and cellular transcriptional states. RESULTS: Here, we introduce XCVATR (eXpressed Clusters of Variant Alleles in Transcriptome pRofiles), a method that identifies variants and detects local enrichment of expressed variants within embedding of samples and cells in single-cell and bulk RNA-seq datasets. XCVATR visualizes local "clumps" of small and large-scale variants and searches for patterns of association between each variant and cellular states, as described by the coordinates of cell embedding, which can be computed independently using any type of distance metrics, such as principal component analysis or t-distributed stochastic neighbor embedding. Through simulations and analysis of real datasets, we demonstrate that XCVATR can detect enrichment of expressed variants and provide insight into the transcriptional states of cells and samples. We next sequenced 2 new single cell RNA-seq tumor samples and applied XCVATR. XCVATR revealed subtle differences in CNV impact on tumors. CONCLUSIONS: XCVATR is publicly available to download from https://github.com/harmancilab/XCVATR .


Subject(s)
Gene Expression Profiling; High-Throughput Nucleotide Sequencing; Gene Expression Profiling/methods; High-Throughput Nucleotide Sequencing/methods; Transcriptome; RNA-Seq; Sequence Analysis, RNA/methods; RNA/genetics; Single-Cell Analysis/methods
16.
Brief Bioinform ; 23(6), 2022 Nov 19.
Article in English | MEDLINE | ID: mdl-36384083

ABSTRACT

BACKGROUND: Estimation of genetic relatedness, or kinship, is used occasionally for recreational purposes and in forensic applications. While numerous methods were developed to estimate kinship, they suffer from high computational requirements and often make an untenable assumption of homogeneous population ancestry of the samples. Moreover, genetic privacy is generally overlooked in the usage of kinship estimation methods. There can be ethical concerns about finding unknown familial relationships in third-party databases. Similar ethical concerns may arise while estimating and reporting sensitive population-level statistics such as inbreeding coefficients, owing to the risk of marginalization and stigmatization. RESULTS: Here, we present SIGFRIED, which makes use of existing reference panels with a projection-based approach that simplifies kinship estimation in admixed populations. We use simulated and real datasets to demonstrate the accuracy and efficiency of kinship estimation. We present a secure federated kinship estimation framework and implement a secure kinship estimator using homomorphic encryption-based primitives for computing relatedness between samples in two different sites while genotype data are kept confidential. Source code and documentation for our methods can be found at https://doi.org/10.5281/zenodo.7053352. CONCLUSIONS: Analysis of relatedness is fundamentally important for identifying relatives, in association studies, and for estimating population-level inbreeding. As the awareness of individual and group genomic privacy is growing, privacy-preserving methods for the estimation of relatedness are needed. The presented methods alleviate the ethical and privacy concerns in the analysis of relatedness in admixed, historically isolated and underrepresented populations.
SHORT ABSTRACT: Genetic relatedness is a central quantity used for finding relatives in databases, correcting biases in genome wide association studies and for estimating population-level statistics. Methods for estimating genetic relatedness have high computational requirements, and occasionally do not consider individuals from admixed ancestries. Furthermore, the ethical concerns around using genetic data and calculating relatedness are not considered. We present a projection-based approach that can efficiently and accurately estimate kinship. We implement our method using encryption-based techniques that provide provable security guarantees to protect genetic data while kinship statistics are computed among multiple sites.


Subject(s)
Genome-Wide Association Study; Privacy; Humans; Genotype; Genetic Privacy; Genome
17.
BMC Bioinformatics ; 23(1): 409, 2022 Oct 01.
Article in English | MEDLINE | ID: mdl-36182914

ABSTRACT

BACKGROUND: Sequencing of thousands of samples provides genetic variants with allele frequencies spanning a very large spectrum and gives invaluable insight into genetic determinants of diseases. Protecting the genetic privacy of participants is challenging as only a few rare variants can easily re-identify an individual among millions. In certain cases, there are policy barriers against sharing genetic data from indigenous populations and stigmatizing conditions. RESULTS: We present SVAT, a method for secure outsourcing of variant annotation and aggregation, which are two basic steps in variant interpretation and detection of causal variants. SVAT uses homomorphic encryption to encrypt the data at the client-side. The data always stays encrypted while it is stored, in-transit, and most importantly while it is analyzed. SVAT makes use of a vectorized data representation to convert annotation and aggregation into efficient vectorized operations in a single framework. Also, SVAT utilizes a secure re-encryption approach so that multiple disparate genotype datasets can be combined for federated aggregation and secure computation of allele frequencies on the aggregated dataset. CONCLUSIONS: Overall, SVAT provides a secure, flexible, and practical framework for privacy-aware outsourcing of annotation, filtering, and aggregation of genetic variants. SVAT is publicly available for download from https://github.com/harmancilab/SVAT .
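
Computing an allele frequency over data that stays encrypted relies on a scheme in which ciphertexts can be combined without decryption. The abstract does not say which homomorphic scheme SVAT uses, so the sketch below swaps in the textbook Paillier cryptosystem purely as an assumption, because it is the simplest additively homomorphic scheme to write down. The primes are toy-sized and insecure, and the function names and site counts are hypothetical.

```python
import math
import random

# Illustrative sketch: textbook Paillier with toy primes. NOT secure, and NOT
# necessarily the scheme used by SVAT; it only demonstrates adding encrypted
# allele counts without decrypting the individual contributions.

def keygen(p=1009, q=1013):
    """Return (public modulus n, private pair (lam, mu)) for small primes p, q."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; requires Python 3.8+
    return n, (lam, mu)


def encrypt(n, message):
    """Encrypt an integer 0 <= message < n using generator g = n + 1."""
    n_sq = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, message, n_sq) * pow(r, n, n_sq)) % n_sq


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts modulo n."""
    return (c1 * c2) % (n * n)


def decrypt(n, private_key, ciphertext):
    lam, mu = private_key
    u = pow(ciphertext, lam, n * n)
    return ((u - 1) // n * mu) % n


# Hypothetical alternate-allele counts for one variant at three sites.
site_alt_counts = [17, 5, 9]
total_chromosomes = 400  # assumed to be public in this sketch

n, private_key = keygen()
encrypted_total = encrypt(n, 0)
for count in site_alt_counts:
    encrypted_total = add_encrypted(n, encrypted_total, encrypt(n, count))

alt_total = decrypt(n, private_key, encrypted_total)
allele_frequency = alt_total / total_chromosomes
print(alt_total, allele_frequency)
```

Only the key holder learns the aggregate count; the party doing the multiplication sees ciphertexts alone.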


Subject(s)
Cloud Computing; Outsourced Services; Computer Security; Gene Frequency; Genotype; Humans
18.
J Am Med Inform Assoc ; 29(12): 2182-2190, 2022 Nov 14.
Article in English | MEDLINE | ID: mdl-36164820

ABSTRACT

Concerns regarding inappropriate leakage of sensitive personal information as well as unauthorized data use are increasing with the growth of genomic data repositories. Therefore, privacy and security of genomic data have become increasingly important and need to be studied. With many proposed protection techniques, their applicability in support of biomedical research should be well understood. For this purpose, we have organized a community effort in the past 8 years through the Integrating Data for Analysis, Anonymization and SHaring (iDASH) consortium to address this practical challenge. In this article, we summarize our experience from these competitions, report lessons learned from the events in 2020/2021 as examples, and discuss potential future research directions in this emerging field.


Subject(s)
Computer Security; Privacy; Data Analysis; Genomics; Genome
19.
BMC Bioinformatics ; 23(1): 356, 2022 Aug 29.
Article in English | MEDLINE | ID: mdl-36038834

ABSTRACT

BACKGROUND: The decreasing cost of DNA sequencing has led to a great increase in our knowledge about genetic variation. While population-scale projects bring important insight into genotype-phenotype relationships, the cost of performing whole-genome sequencing on large samples is still prohibitive. In-silico genotype imputation coupled with genotyping-by-arrays is a cost-effective and accurate alternative for genotyping of common and uncommon variants. Imputation methods compare the genotypes of the typed variants with large population-specific reference panels and estimate the genotypes of untyped variants by making use of the linkage disequilibrium patterns. The most accurate imputation methods are based on the Li-Stephens hidden Markov model (HMM), which treats the sequence of each chromosome as a mosaic of the haplotypes from the reference panel. RESULTS: Here we assess the accuracy of vicinity-based HMMs, where each untyped variant is imputed using the typed variants in a small window around itself (as small as 1 centimorgan). Locality-based imputation has recently been used by machine learning-based genotype imputation approaches. We assess how the parameters of the vicinity-based HMMs impact the imputation accuracy in a comprehensive set of benchmarks and show that vicinity-based HMMs can accurately impute common and uncommon variants. CONCLUSIONS: Our results indicate that locality-based imputation models can be effectively used for genotype imputation. The parameter settings that we identified can be used in future methods, and vicinity-based HMMs can be used for re-structuring and parallelizing new imputation methods. The source code for the vicinity-based HMM implementations is publicly available at https://github.com/harmancilab/LoHaMMer .
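
The mosaic idea behind the Li-Stephens model, and the effect of restricting it to a window, can be sketched with a small forward-backward pass. The code below is a simplified haploid illustration under stated assumptions: a constant switch probability between adjacent variants, a constant mismatch probability, and a window measured in number of variants rather than centimorgans. The panel, the target, and the function names are hypothetical and are not taken from LoHaMMer.

```python
# Illustrative sketch: haploid Li-Stephens forward-backward restricted to a
# window around one untyped variant. Parameters and data are invented.

def posterior_alt(panel, target, site, switch_prob=0.05, error_prob=0.01):
    """Posterior probability that the allele at `site` is 1.

    panel:  list of reference haplotypes (lists of 0/1 alleles)
    target: list of observed alleles, with None at untyped variants
    """
    k, m = len(panel), len(target)

    def emission(h, s):
        if target[s] is None:
            return 1.0  # untyped variants carry no evidence
        return 1.0 - error_prob if panel[h][s] == target[s] else error_prob

    forward = [[emission(h, 0) / k for h in range(k)]]
    for s in range(1, m):
        prev, total = forward[-1], sum(forward[-1])
        forward.append([
            ((1.0 - switch_prob) * prev[h] + switch_prob * total / k)
            * emission(h, s)
            for h in range(k)
        ])

    backward = [[1.0] * k for _ in range(m)]
    for s in range(m - 2, -1, -1):
        weighted = [backward[s + 1][h] * emission(h, s + 1) for h in range(k)]
        total = sum(weighted)
        backward[s] = [
            (1.0 - switch_prob) * weighted[h] + switch_prob * total / k
            for h in range(k)
        ]

    post = [forward[site][h] * backward[site][h] for h in range(k)]
    alt_mass = sum(p for p, hap in zip(post, panel) if hap[site] == 1)
    return alt_mass / sum(post)


def impute_in_window(panel, target, site, half_window, **hmm_params):
    """Run the HMM only on variants within `half_window` positions of `site`."""
    lo = max(0, site - half_window)
    hi = min(len(target), site + half_window + 1)
    sub_panel = [hap[lo:hi] for hap in panel]
    return posterior_alt(sub_panel, target[lo:hi], site - lo, **hmm_params)


panel = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 0, 1, 0],
    [1, 0, 0, 0, 1],
]
target = [1, 1, None, 1, 1]  # the middle variant is untyped

wide = impute_in_window(panel, target, site=2, half_window=2)
narrow = impute_in_window(panel, target, site=2, half_window=1)
print(wide, narrow)
```

In this toy panel the wider window identifies a single best-matching haplotype, while the narrower one cannot separate two haplotypes that agree on the nearest typed variants. That is the kind of parameter sensitivity a benchmark of window settings would measure.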


Subject(s)
Polymorphism, Single Nucleotide; Software; Genome-Wide Association Study/methods; Genotype; Haplotypes; Linkage Disequilibrium; Sequence Analysis, DNA/methods
20.
Front Oncol ; 12: 855167, 2022.
Article in English | MEDLINE | ID: mdl-35600406

ABSTRACT

The RE1 Silencing Transcription Factor (REST) is a major regulator of neurogenesis and brain development. Medulloblastoma (MB) is a pediatric brain cancer characterized by a blockade of neuronal specification. REST gene expression is aberrantly elevated in a subset of MBs that are driven by constitutive activation of sonic hedgehog (SHH) signaling in cerebellar granular progenitor cells (CGNPs), the cells of origin of this subgroup of tumors. To understand its transcriptional deregulation in MBs, we first studied control of Rest gene expression during neuronal differentiation of normal mouse CGNPs. Higher Rest expression was observed in proliferating CGNPs compared to differentiating neurons. Interestingly, two Rest isoforms were expressed in CGNPs, of which only one showed a significant reduction in expression during neurogenesis. In proliferating CGNPs, higher MLL4 and KDM7A activities, opposed by the polycomb repressive complex 2 (PRC2) and the G9A/G9A-like protein (GLP) complex function, allowed Rest homeostasis. During differentiation, reduction in MLL4 enrichment on chromatin, in conjunction with an increase in PRC2/G9A/GLP/KDM7A activities, promoted a decline in Rest expression. These findings suggest a lineage-context specific paradoxical role for KDM7A in the regulation of Rest expression in CGNPs. In human SHH-MBs (SHH-α and SHH-β) where elevated REST gene expression is associated with poor prognosis, up- or downregulation of KDM7A caused a significant worsening in patient survival. Our studies are the first to implicate KDM7A in REST regulation and in MB biology.

...