1.
NPJ Precis Oncol ; 8(1): 105, 2024 May 18.
Article in English | MEDLINE | ID: mdl-38762545

ABSTRACT

The diagnostic spectrum for AML patients is increasingly based on genetic abnormalities due to their prognostic and predictive value. However, information on the maturational arrest of the AML blast phenotype has started to regain importance due to its predictive power for drug responses. Here, we deconvolute 1350 bulk RNA-seq samples from five independent AML cohorts on a single-cell healthy BM reference and demonstrate that the morphological differentiation stages (FAB) can be faithfully reconstituted using estimated cell compositions (ECCs). Moreover, we show that the ECCs reliably predict ex-vivo drug resistances, as demonstrated for resistance to Venetoclax, a BCL-2 inhibitor, specifically in AML with a CD14+ monocyte phenotype. We validate these predictions using LUMC proteomics data by showing that BCL-2 protein abundance is split into two distinct clusters for NPM1-mutated AML at the extremes of CD14+ monocyte percentages, which could be crucial for Venetoclax dosing in patients. Our results suggest that Venetoclax resistance predictions can also be extended to AML without recurrent genetic abnormalities and possibly to MDS-related and secondary AML. Lastly, we show that Ven/Aza-treated patients with CD14+ monocyte-dominated AML have significantly lower overall survival. Collectively, we propose a framework for joint modeling of mutation and maturation stage that could be used as a blueprint for testing sensitivity to new agents across the various subtypes of AML.

2.
Alzheimers Dement ; 20(6): 3864-3875, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38634500

ABSTRACT

BACKGROUND: Alzheimer's disease (AD) prevalence increases with age, yet a small fraction of the population reaches ages > 100 years without cognitive decline. We studied the genetic factors associated with such resilience against AD. METHODS: Genome-wide association studies identified 86 single nucleotide polymorphisms (SNPs) associated with AD risk. We estimated SNP frequency in 2281 AD cases, 3165 age-matched controls, and 346 cognitively healthy centenarians. We calculated a polygenic risk score (PRS) for each individual and investigated the functional properties of SNPs enriched/depleted in centenarians. RESULTS: Cognitively healthy centenarians were enriched with the protective alleles of the SNPs associated with AD risk. The protective effect concentrated on the alleles in/near ANKH, GRN, TMEM106B, SORT1, PLCG2, RIN3, and APOE genes. This translated to >5-fold lower PRS in centenarians compared to AD cases (P = 7.69 × 10⁻⁷¹), and 2-fold lower compared to age-matched controls (P = 5.83 × 10⁻¹⁷). DISCUSSION: Maintaining cognitive health until extreme ages requires complex genetic protection against AD, which concentrates on the genes associated with the endolysosomal and immune systems. HIGHLIGHTS: Cognitively healthy centenarians are enriched with the protective alleles of genetic variants associated with Alzheimer's disease (AD). The protective effect is concentrated on variants involved in the immune and endolysosomal systems. Combining variants into a polygenic risk score (PRS) translated to > 5-fold lower PRS in centenarians compared to AD cases, and ≈ 2-fold lower compared to middle-aged healthy controls.


Subject(s)
Alzheimer Disease , Genome-Wide Association Study , Polymorphism, Single Nucleotide , Humans , Alzheimer Disease/genetics , Alzheimer Disease/prevention & control , Female , Male , Aged, 80 and over , Genetic Predisposition to Disease , Multifactorial Inheritance/genetics , Alleles , Case-Control Studies
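
To illustrate the polygenic risk score (PRS) mentioned in the abstract above: the abstract does not say how the score was computed, so the following is only a minimal sketch of the common definition, a weighted sum of allele dosages. The SNP identifiers, effect sizes, and function name are hypothetical and not taken from the study.

```python
from typing import Mapping


def polygenic_risk_score(dosages: Mapping[str, float],
                         weights: Mapping[str, float]) -> float:
    """Weighted sum of risk-allele dosages (0, 1 or 2 copies per SNP)."""
    # Only SNPs present in both the genotype and the weight table contribute.
    return sum(weights[snp] * dosages[snp] for snp in weights if snp in dosages)


# Hypothetical effect sizes (e.g. log odds ratios) and one hypothetical genotype.
weights = {"snp_a": 0.5, "snp_b": -0.25, "snp_c": 1.0}
person = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
score = polygenic_risk_score(person, weights)
print(score)
```

Under this definition, carrying more protective alleles (negative weights) lowers the score, which is the direction of effect the abstract reports for centenarians.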
3.
Front Bioinform ; 4: 1347276, 2024.
Article in English | MEDLINE | ID: mdl-38501113

ABSTRACT

Most regulatory elements, especially enhancer sequences, are cell population-specific. One could even argue that a distinct set of regulatory elements is what defines a cell population. However, discovering which non-coding regions of the DNA are essential in which context, and as a result, which genes are expressed, is a difficult task. Some computational models tackle this problem by predicting gene expression directly from the genomic sequence. These models are currently limited to predicting bulk measurements and mainly make tissue-specific predictions. Here, we present a model that leverages single-cell RNA-sequencing data to predict gene expression. We show that cell population-specific models outperform tissue-specific models, especially when the expression profile of a cell population and the corresponding tissue are dissimilar. Further, we show that our model can prioritize GWAS variants and learn motifs of transcription factor binding sites. We envision that our model can be useful for delineating cell population-specific regulatory elements.

4.
Int J Gynecol Cancer ; 34(5): 713-721, 2024 May 06.
Article in English | MEDLINE | ID: mdl-38388177

ABSTRACT

OBJECTIVE: To assess the feasibility of scalable, objective, and minimally invasive liquid biopsy-derived biomarkers such as cell-free DNA copy number profiles, human epididymis protein 4 (HE4), and cancer antigen 125 (CA125) for pre-operative risk assessment of early-stage ovarian cancer in a clinically representative and diagnostically challenging population and to compare the performance of these biomarkers with the Risk of Malignancy Index (RMI). METHODS: In this case-control study, we included 100 patients with an ovarian mass clinically suspected to be early-stage ovarian cancer. Of these 100 patients, 50 were confirmed to have a malignant mass (cases) and 50 had a benign mass (controls). Using WisecondorX, an algorithm used extensively in non-invasive prenatal testing, we calculated the benign-calibrated copy number profile abnormality score. This score represents how different a sample is from benign controls based on copy number profiles. We combined this score with HE4 serum concentration to separate cases and controls. RESULTS: Combining the benign-calibrated copy number profile abnormality score with HE4, we obtained a model with a significantly higher sensitivity (42% vs 0%; p<0.002) at 99% specificity as compared with the RMI that is currently employed in clinical practice. Investigating performance in subgroups, we observed especially large differences in the advanced stage and non-high-grade serous ovarian cancer groups. CONCLUSION: This study demonstrates that cell-free DNA can be successfully employed to perform pre-operative risk of malignancy assessment for ovarian masses; however, results warrant validation in a more extensive clinical study.


Subject(s)
Biomarkers, Tumor , Ovarian Neoplasms , WAP Four-Disulfide Core Domain Protein 2 , Humans , Female , Ovarian Neoplasms/blood , Ovarian Neoplasms/diagnosis , Ovarian Neoplasms/surgery , Ovarian Neoplasms/pathology , Case-Control Studies , Middle Aged , WAP Four-Disulfide Core Domain Protein 2/analysis , WAP Four-Disulfide Core Domain Protein 2/metabolism , Liquid Biopsy/methods , Biomarkers, Tumor/blood , Cell-Free Nucleic Acids/blood , Adult , Aged , CA-125 Antigen/blood
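
The abstract above reports sensitivity at a fixed 99% specificity. As a rough sketch of how such a figure can be obtained from any continuous score (this is not the study's code, and the function name and example scores are hypothetical), one can place the threshold so that at most the allowed fraction of controls exceeds it and then count the cases above that threshold.

```python
def sensitivity_at_specificity(case_scores, control_scores, specificity=0.99):
    """Fraction of cases above the lowest threshold that keeps the requested specificity."""
    ranked = sorted(control_scores)
    # Largest number of controls allowed to exceed the threshold (false positives).
    allowed_fp = int(len(ranked) * (1.0 - specificity) + 1e-9)
    allowed_fp = min(allowed_fp, len(ranked) - 1)
    # Threshold: score of the highest-ranked control that must remain negative.
    threshold = ranked[len(ranked) - 1 - allowed_fp]
    positives = sum(1 for s in case_scores if s > threshold)
    return positives / len(case_scores)


# Hypothetical abnormality scores.
controls = [0.1, 0.2, 0.3, 0.4]
cases = [0.35, 0.5, 0.9, 0.2]
print(sensitivity_at_specificity(cases, controls, specificity=0.99))
```

With 50 controls, as in the study design, a 99% specificity requirement allows no false positives under this rule, so the threshold is the highest control score. Ties between case and control scores are not handled specially in this sketch.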
5.
Leukemia ; 38(4): 751-761, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38360865

ABSTRACT

Subtyping of acute myeloid leukaemia (AML) is predominantly based on recurrent genetic abnormalities, but recent literature indicates that transcriptomic phenotyping holds immense potential to further refine AML classification. Here we integrated five AML transcriptomic datasets with corresponding genetic information to provide an overview (n = 1224) of the transcriptomic AML landscape. Consensus clustering identified 17 robust patient clusters which improved identification of CEBPA-mutated patients with favourable outcomes, and uncovered transcriptomic subtypes for KMT2A rearrangements (2), NPM1 mutations (5), and AML with myelodysplasia-related changes (AML-MRC) (5). Transcriptomic subtypes of KMT2A, NPM1 and AML-MRC showed distinct mutational profiles, cell type differentiation arrests and immune properties, suggesting differences in underlying disease biology. Moreover, our transcriptomic clusters show differences in ex-vivo drug responses, even when corrected for differentiation arrest, and capture differences in drug response better than genetic classification does. In conclusion, our findings underscore the importance of transcriptomics in AML subtyping and offer a basis for future research and personalised treatment strategies. Our transcriptomic compendium is publicly available and we supply an R package to project clusters to new transcriptomic studies.


Subject(s)
Leukemia, Myeloid, Acute , Nuclear Proteins , Humans , Nuclear Proteins/genetics , Transcriptome/genetics , Nucleophosmin , Leukemia, Myeloid, Acute/drug therapy , Leukemia, Myeloid, Acute/genetics , Mutation , Gene Expression Profiling , Prognosis
6.
Bioinform Adv ; 3(1): vbad171, 2023.
Article in English | MEDLINE | ID: mdl-38075479

ABSTRACT

Motivation: Single-cell technologies allow deep characterization of different molecular aspects of cells. Integrating these modalities provides a comprehensive view of cellular identity. Current integration methods rely on overlapping features or cells to link datasets measuring different modalities, limiting their application to experiments where different molecular layers are profiled in different subsets of cells. Results: We present scTopoGAN, a method for unsupervised manifold alignment of single-cell datasets with non-overlapping cells or features. We use topological autoencoders (topoAE) to obtain latent representations of each modality separately. A topology-guided Generative Adversarial Network then aligns these latent representations into a common space. We show that scTopoGAN outperforms state-of-the-art manifold alignment methods in complete unsupervised settings. Interestingly, the topoAE for individual modalities also showed better performance in preserving the original structure of the data in the low-dimensional representations when compared to other manifold projection methods. Taken together, we show that the concept of topology preservation might be a powerful tool to align multiple single modality datasets, unleashing the potential of multi-omic interpretations of cells. Availability and implementation: Implementation available on GitHub (https://github.com/AkashCiel/scTopoGAN). All datasets used in this study are publicly available.

7.
Eur Heart J Digit Health ; 4(6): 444-454, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38045440

ABSTRACT

Aims: Risk assessment tools are needed for timely identification of patients with heart failure (HF) with reduced ejection fraction (HFrEF) who are at high risk of adverse events. In this study, we aim to derive a small set out of 4210 repeatedly measured proteins, which, along with clinical characteristics and established biomarkers, carry optimal prognostic capacity for adverse events, in patients with HFrEF. Methods and results: In 382 patients, we performed repeated blood sampling (median follow-up: 2.1 years) and applied an aptamer-based multiplex proteomic approach. We used machine learning to select the optimal set of predictors for the primary endpoint (PEP: composite of cardiovascular death, heart transplantation, left ventricular assist device implantation, and HF hospitalization). The association between repeated measures of selected proteins and PEP was investigated by multivariable joint models. Internal validation (cross-validated c-index) and external validation (Henry Ford HF PharmacoGenomic Registry cohort) were performed. Nine proteins were selected in addition to the MAGGIC risk score, N-terminal pro-hormone B-type natriuretic peptide, and troponin T: suppression of tumourigenicity 2, tryptophanyl-tRNA synthetase cytoplasmic, histone H2A Type 3, angiotensinogen, deltex-1, thrombospondin-4, ADAMTS-like protein 2, anthrax toxin receptor 1, and cathepsin D. N-terminal pro-hormone B-type natriuretic peptide and angiotensinogen showed the strongest associations [hazard ratio (95% confidence interval): 1.96 (1.17-3.40) and 0.66 (0.49-0.88), respectively]. The multivariable model yielded a c-index of 0.85 upon internal validation and c-indices up to 0.80 upon external validation. The c-index was higher than that of a model containing established risk factors (P = 0.021). 
Conclusion: Nine serially measured proteins captured the most essential prognostic information for the occurrence of adverse events in patients with HFrEF, and provided incremental value for HF prognostication beyond established risk factors. These proteins could be used for dynamic, individual risk assessment in a prospective setting. These findings also illustrate the potential value of relatively 'novel' biomarkers for prognostication. Clinical Trial Registration: https://clinicaltrials.gov/ct2/show/NCT01851538?term=nCT01851538&draw=2&rank=1
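
The c-index quoted in the abstract above measures how well predicted risk orders patients by time to event. The study used multivariable joint models on repeated measures; the sketch below only shows the simpler Harrell-style concordance for a single static risk score with right-censored follow-up, with hypothetical data and a hypothetical function name.

```python
def concordance_index(times, events, risks):
    """Harrell-style c-index: share of comparable pairs ordered correctly by risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's event or censoring time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0      # earlier event, higher risk: correct order
                elif risks[i] == risks[j]:
                    concordant += 0.5      # tied risks count as half
    return concordant / comparable


# Hypothetical follow-up times (years), event flags (1 = event, 0 = censored), risk scores.
times = [1, 2, 3, 4]
events = [1, 1, 0, 1]
print(concordance_index(times, events, [0.9, 0.1, 0.4, 0.7]))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives a scale for reading the reported 0.85 (internal) and up to 0.80 (external).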

8.
Metabolites ; 13(12), 2023 Nov 30.
Article in English | MEDLINE | ID: mdl-38132863

ABSTRACT

1H-NMR metabolomics data is increasingly used to track health and disease. Nightingale Health, a major supplier of 1H-NMR metabolomics, has recently updated the quantification strategy to further align with clinical standards. Such updates, however, might influence backward replicability, particularly affecting studies with repeated measures. Using data from the BBMRI-NL consortium (~28,000 samples from 28 cohorts), we compared Nightingale data, originally released in 2014 and 2016, with a re-quantified version released in 2020, both of which were based on the same NMR spectra. Apart from two discontinued and twenty-three new analytes, we generally observe a high concordance between quantification versions with 73 out of 222 (33%) analytes showing a mean ρ > 0.9 across all cohorts. Conversely, five analytes consistently showed lower Spearman's correlations (ρ < 0.7) between versions, namely acetoacetate, LDL-L, saturated fatty acids, S-HDL-C, and sphingomyelins. Furthermore, previously trained multi-analyte scores, such as MetaboAge or MetaboHealth, might be particularly sensitive to platform changes. Whereas MetaboHealth replicated well, the MetaboAge score had to be retrained due to use of discontinued analytes. Notably, both scores in the re-quantified data recapitulated mortality associations observed previously. In conclusion, we urge caution when combining different platform versions, since analytes may be mixed up, reported in different units, or discontinued.

9.
Brief Bioinform ; 25(1), 2023 Nov 22.
Article in English | MEDLINE | ID: mdl-38018908

ABSTRACT

Multi-omic analyses are necessary to understand the complex biological processes taking place at the tissue and cell level, but also to make reliable predictions about, for example, disease outcome. Several linear methods exist that create a joint embedding using paired information per sample, but recently there has been a rise in the popularity of neural architectures that embed paired -omics into the same non-linear manifold. This work describes a head-to-head comparison of linear and non-linear joint embedding methods using both bulk and single-cell multi-modal datasets. We found that non-linear methods have a clear advantage with respect to linear ones for missing modality imputation. Performance comparisons in the downstream tasks of survival analysis for bulk tumor data and cell type classification for single-cell data lead to the following insights: First, concatenating the principal components of each modality is a competitive baseline and hard to beat if all modalities are available at test time. However, if we only have one modality available at test time, training a predictive model on the joint space of that modality can lead to performance improvements with respect to just using the unimodal principal components. Second, -omic profiles imputed by neural joint embedding methods are realistic enough to be used by a classifier trained on real data with limited performance drops. Taken together, our comparisons give hints to which joint embedding to use for which downstream task. Overall, product-of-experts performed well in most tasks and was reasonably fast, while early integration (concatenation) of modalities did quite poorly.


Subject(s)
Multiomics , Neoplasms , Humans
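
The abstract above singles out product-of-experts as a joint embedding approach. The abstract gives no implementation details, so the following is only a sketch of the standard rule for combining independent Gaussian posteriors over one latent dimension: precisions add, and the combined mean is the precision-weighted average. The function name and numbers are hypothetical, and implementations often also include a prior expert, which is omitted here.

```python
def product_of_gaussian_experts(means, variances):
    """Combine 1-D Gaussian experts; returns (mean, variance) of their normalised product."""
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    # Each expert pulls the joint mean towards its own mean in proportion to its precision.
    mean = sum(m * p for m, p in zip(means, precisions)) / total_precision
    return mean, 1.0 / total_precision


# Two hypothetical modality-specific posteriors for the same latent coordinate.
joint_mean, joint_var = product_of_gaussian_experts([0.0, 3.0], [1.0, 0.5])
print(joint_mean, joint_var)
```

One consequence of this rule is that a missing modality can simply be left out of the product, which fits the missing-modality setting the abstract evaluates.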
10.
PLoS One ; 18(10): e0292126, 2023.
Article in English | MEDLINE | ID: mdl-37796856

ABSTRACT

Deep generative models, such as variational autoencoders (VAE), have gained increasing attention in computational biology due to their ability to capture complex data manifolds which subsequently can be used to achieve better performance in downstream tasks, such as cancer type prediction or subtyping of cancer. However, these models are difficult to train due to the large number of hyperparameters that need to be tuned. To get a better understanding of the importance of the different hyperparameters, we examined six different VAE models when trained on TCGA transcriptomics data and evaluated on the downstream tasks of cluster agreement with cancer subtypes and survival analysis. We studied the effect of the latent space dimensionality, learning rate, optimizer, initialization and activation function on the quality of subsequent downstream tasks on the TCGA samples. We found β-TCVAE and DIP-VAE to have a good performance, on average, despite being more sensitive to hyperparameter selection. Based on these experiments, we derived recommendations for selecting the different hyperparameter settings. To ensure generalization, we tested all hyperparameter configurations on the GTEx dataset. We found a significant correlation (ρ = 0.7) between the hyperparameter effects on clustering performance in the TCGA and GTEx datasets. This highlights the robustness and generalizability of our recommendations. In addition, we examined whether the learned latent spaces capture biologically relevant information. To this end, we measured the correlation and mutual information of the different representations with various data characteristics such as gender, age, days to metastasis, immune infiltration, and mutation signatures. We found that for all models the latent factors, in general, do not uniquely correlate with one of the data characteristics nor capture separable information in the latent factors even for models specifically designed for disentanglement.


Subject(s)
Benchmarking , Neoplasms , Humans , Transcriptome , Neoplasms/genetics , Gene Expression Profiling , Cluster Analysis
11.
Bioinformatics ; 39(11), 2023 Nov 01.
Article in English | MEDLINE | ID: mdl-37847663

ABSTRACT

SUMMARY: T-cell receptors (TCRs) on T cells recognize and bind to epitopes presented by the major histocompatibility complex in case of an infection or cancer. However, the high diversity of TCRs, as well as their unique and complex binding mechanisms underlying epitope recognition, make it difficult to predict the binding between TCRs and epitopes. Here, we present the utility of transformers, a deep learning strategy that incorporates an attention mechanism that learns the informative features, and show that these models pre-trained on a large set of protein sequences outperform current strategies. We compared three pre-trained auto-encoder transformer models (ProtBERT, ProtAlbert, and ProtElectra) and one pre-trained auto-regressive transformer model (ProtXLNet) to predict the binding specificity of TCRs to 25 epitopes from the VDJdb database (human and murine). Two additional modifications were performed to incorporate gene usage of the TCRs in the four transformer models. Of all 12 transformer implementations (four models with three different modifications), a modified version of the ProtXLNet model could predict TCR-epitope pairs with the highest accuracy (weighted F1 score 0.55 simultaneously considering all 25 epitopes). The modification included additional features representing the gene names for the TCRs. We also showed that the basic implementation of transformers outperformed the previously available methods, i.e. TCRGP, TCRdist, and DeepTCR, developed for the same biological problem, especially for the hard-to-classify labels. We show that the proficiency of transformers in attention learning can be made operational in a complex biological setting like TCR binding prediction. Further ingenuity in utilizing the full potential of transformers, either through attention head visualization or introducing additional features, can extend T-cell research avenues. 
AVAILABILITY AND IMPLEMENTATION: Data and code are available on https://github.com/InduKhatri/tcrformer.


Subject(s)
Epitopes, T-Lymphocyte , Receptors, Antigen, T-Cell , Humans , Animals , Mice , Epitopes, T-Lymphocyte/metabolism , Receptors, Antigen, T-Cell/genetics , T-Lymphocytes/metabolism , Amino Acid Sequence , Major Histocompatibility Complex
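
The abstract above reports a weighted F1 score across 25 epitope classes. As a small sketch of that metric (the labels are hypothetical, and this is not the tcrformer code), the per-class F1 scores are averaged with weights equal to each class's number of true examples.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores."""
    support = Counter(y_true)
    total = 0.0
    for label, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        # Classes with more true examples contribute proportionally more.
        total += f1 * n_true
    return total / len(y_true)


# Hypothetical epitope labels for four TCR sequences.
y_true = ["epitope_a", "epitope_a", "epitope_a", "epitope_b"]
y_pred = ["epitope_a", "epitope_a", "epitope_b", "epitope_b"]
print(weighted_f1(y_true, y_pred))
```

Because frequent classes dominate the average, a weighted F1 can hide poor performance on rare, hard-to-classify epitopes, which is why the abstract's separate remark about hard-to-classify labels matters.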
12.
NAR Genom Bioinform ; 5(3): lqad070, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37502708

ABSTRACT

Single-cell genomics is now producing an ever-increasing number of datasets that, when integrated, could provide large-scale reference atlases of tissue in health and disease. Such large-scale atlases increase the scale and generalizability of analyses and enable combining knowledge generated by individual studies. Specifically, individual studies often differ regarding cell annotation terminology and depth, with different groups specializing in different cell type compartments, often using distinct terminology. Understanding how these distinct sets of annotations are related and complement each other would mark a major step towards a consensus-based cell-type annotation reflecting the latest knowledge in the field. Whereas recent computational techniques, referred to as 'reference mapping' methods, facilitate the usage and expansion of existing reference atlases by mapping new datasets (i.e. queries) onto an atlas, a systematic approach towards harmonizing dataset-specific cell-type terminology and annotation depth is still lacking. Here, we present 'treeArches', a framework to automatically build and extend reference atlases while enriching them with an updatable hierarchy of cell-type annotations across different datasets. We demonstrate various use cases for treeArches, from automatically resolving relations between reference and query cell types to identifying unseen cell types absent in the reference, such as disease-associated cell states. We envision treeArches enabling data-driven construction of consensus atlas-level cell-type hierarchies and facilitating efficient usage of reference atlases.

13.
Bioinformatics ; 39(39 Suppl 1): i404-i412, 2023 Jun 30.
Article in English | MEDLINE | ID: mdl-37387141

ABSTRACT

MOTIVATION: Knowing the relation between cell types is crucial for translating experimental results from mice to humans. Establishing cell type matches, however, is hindered by the biological differences between the species. A substantial amount of evolutionary information between genes that could be used to align the species is discarded by most of the current methods since they only use one-to-one orthologous genes. Some methods try to retain the information by explicitly including the relation between genes, however, not without caveats. RESULTS: In this work, we present a model to transfer and align cell types in cross-species analysis (TACTiCS). First, TACTiCS uses a natural language processing model to match genes using their protein sequences. Next, TACTiCS employs a neural network to classify cell types within a species. Afterward, TACTiCS uses transfer learning to propagate cell type labels between species. We applied TACTiCS on scRNA-seq data of the primary motor cortex of human, mouse, and marmoset. Our model can accurately match and align cell types on these datasets. Moreover, our model outperforms Seurat and the state-of-the-art method SAMap. Finally, we show that our gene matching method results in better cell type matches than BLAST in our model. AVAILABILITY AND IMPLEMENTATION: The implementation is available on GitHub (https://github.com/kbiharie/TACTiCS). The preprocessed datasets and trained models can be downloaded from Zenodo (https://doi.org/10.5281/zenodo.7582460).


Subject(s)
Biological Evolution , Genetic Techniques , Humans , Animals , Mice , Amino Acid Sequence , Natural Language Processing , Machine Learning
14.
J Gerontol A Biol Sci Med Sci ; 78(10): 1753-1762, 2023 Oct 09.
Article in English | MEDLINE | ID: mdl-37303208

ABSTRACT

Biological age captures a person's age-related risk of unfavorable outcomes using biophysiological information. Multivariate biological age measures include frailty scores and molecular biomarkers. These measures are often studied in isolation, but here we present a large-scale study comparing them. In 2 prospective cohorts (n = 3 222), we compared epigenetic (DNAm Horvath, DNAm Hannum, DNAm Lin, DNAm epiTOC, DNAm PhenoAge, DNAm DunedinPoAm, DNAm GrimAge, and DNAm Zhang) and metabolomic-based (MetaboAge and MetaboHealth) biomarkers in their reflection of biological age, as represented by 5 frailty measures and overall mortality. Biomarkers trained on outcomes with biophysiological and/or mortality information outperformed age-trained biomarkers in frailty reflection and mortality prediction. DNAm GrimAge and MetaboHealth, trained on mortality, showed the strongest association with these outcomes. The associations of DNAm GrimAge and MetaboHealth with frailty and mortality were independent of each other and of the frailty score mimicking clinical geriatric assessment. Epigenetic, metabolomic, and clinical biological age markers seem to capture different aspects of aging. These findings suggest that mortality-trained molecular markers may provide a novel phenotype reflecting biological age and strengthen current clinical geriatric health and well-being assessment.


Subject(s)
Frailty , Humans , Aged , Frailty/genetics , Prospective Studies , Biomarkers , Aging/genetics , Epigenesis, Genetic , DNA Methylation
15.
Sci Rep ; 13(1): 10424, 2023 Jun 27.
Article in English | MEDLINE | ID: mdl-37369746

ABSTRACT

Next generation sequencing of cell-free DNA (cfDNA) is a promising method for treatment monitoring and therapy selection in metastatic breast cancer (MBC). However, distinguishing tumor-specific variants from sequencing artefacts and germline variation with low false discovery rate is challenging when using large targeted sequencing panels covering many tumor suppressor genes. To address this, we built a machine learning model to remove false positive variant calls and augmented it with additional filters to ensure selection of tumor-derived variants. We used cfDNA of 70 MBC patients profiled with both the small targeted Oncomine breast panel (Thermofisher) and the much larger Qiaseq Human Breast Cancer Panel (Qiagen). The model was trained on the panels' common regions using Oncomine hotspot mutations as ground truth. Applied to Qiaseq data, it achieved 35% sensitivity and 36% precision, outperforming basic filtering. For 20 patients we used germline DNA to filter for somatic variants and obtained 245 variants in total, while our model found seven variants, of which six were also detected using the germline strategy. In ten tumor-free individuals, our method detected in total one (potentially germline) variant, in contrast to 521 variants detected without our model. These results indicate that our model largely detects somatic variants.


Subject(s)
Breast Neoplasms , Cell-Free Nucleic Acids , Humans , Female , Breast Neoplasms/genetics , Cell-Free Nucleic Acids/genetics , Mutation , Breast , High-Throughput Nucleotide Sequencing , Machine Learning
16.
NAR Genom Bioinform ; 5(2): lqad048, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37274121

ABSTRACT

Cell-free DNA (cfDNA) are DNA fragments originating from dying cells that are detectable in bodily fluids, such as the plasma. Accelerated cell death, for example caused by disease, induces an elevated concentration of cfDNA. As a result, determining the cell type origins of cfDNA molecules can provide information about an individual's health. In this work, we aim to increase the sensitivity of methylation-based cell type deconvolution by adapting an existing method, CelFiE, which uses the methylation beta values of individual CpG sites to estimate cell type proportions. Our new method, CelFEER, instead differentiates cell types by the average methylation values within individual reads. We additionally improved the originally reported performance of CelFiE by using a new approach for finding marker regions that are differentially methylated between cell types. We show that CelFEER estimates cell type proportions with a higher correlation (r = 0.94 ± 0.04) than CelFiE (r = 0.86 ± 0.09) on simulated mixtures of cell types. Moreover, we show that the cell type proportion estimated by CelFEER can differentiate between ALS patients and healthy controls, between pregnant women in their first and third trimester, and between pregnant women with and without gestational diabetes.

17.
medRxiv ; 2023 May 16.
Article in English | MEDLINE | ID: mdl-37292975

ABSTRACT

Understanding how genetic risk variants contribute to Alzheimer's Disease etiology remains a challenge. Single-cell RNA sequencing (scRNAseq) allows for the investigation of cell type specific effects of genomic risk loci on gene expression. Using seven scRNAseq datasets totalling >1.3 million cells, we investigated differential correlation of genes between healthy individuals and individuals diagnosed with Alzheimer's Disease. Using the number of differential correlations of a gene to estimate its involvement and potential impact, we present a prioritization scheme for identifying probable causal genes near genomic risk loci. Besides prioritizing genes, our approach pin-points specific cell types and provides insight into the rewiring of gene-gene relationships associated with Alzheimer's.
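
The abstract above relies on differential correlation of gene pairs between healthy and diagnosed individuals but does not say which test was used. As one common option, shown here purely as an assumption, the two Pearson correlations can be compared after Fisher's z-transformation. The expression values and function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def differential_correlation_z(x_a, y_a, x_b, y_b):
    """z-statistic for the difference between two correlations (Fisher transform).

    Requires more than three samples per group and |r| < 1 in both groups.
    """
    z_a = math.atanh(pearson_r(x_a, y_a))
    z_b = math.atanh(pearson_r(x_b, y_b))
    se = math.sqrt(1.0 / (len(x_a) - 3) + 1.0 / (len(x_b) - 3))
    return (z_a - z_b) / se


# Hypothetical expression of two genes in five healthy and five diagnosed samples.
gene1 = [1, 2, 3, 4, 5]
gene2_healthy = [1, 2, 3, 5, 4]     # positively correlated with gene1
gene2_disease = [4, 5, 3, 2, 1]     # negatively correlated with gene1
z = differential_correlation_z(gene1, gene2_healthy, gene1, gene2_disease)
print(z)
```

Counting, per gene, how many partner genes show a large absolute z would give one possible version of the "number of differential correlations" that the abstract uses for prioritization.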

18.
Neurol Genet ; 9(3): e200066, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37123987

ABSTRACT

Background and Objectives: With age, somatic mutations accumulated in human brain cells can lead to various neurologic disorders and brain tumors. Because the incidence rate of Alzheimer disease (AD) increases exponentially with age, investigating the association between AD and the accumulation of somatic mutations can help in understanding the etiology of AD. Methods: We designed a somatic mutation detection workflow by contrasting genotypes derived from whole-genome sequencing (WGS) data with genotypes derived from scRNA-seq data and applied this workflow to 76 participants from the Religious Order Study and the Rush Memory and Aging Project (ROSMAP) cohort. We focused only on excitatory neurons, the dominant cell type in the scRNA-seq data. Results: We identified 196 sites that harbored at least 1 individual with an excitatory neuron-specific somatic mutation (ENSM), and these 196 sites were mapped to 127 genes. The single base substitution (SBS) pattern of the putative ENSMs was best explained by signature SBS5 from the Catalogue of Somatic Mutations in Cancer (COSMIC) mutational signatures, a clock-like pattern correlating with the age of the individual. The count of ENSMs per individual also showed an increasing trend with age. Among the mutated sites, we found 2 sites that tended to have more mutations in older individuals (16:6899517 [RBFOX1], p = 0.04; 4:21788463 [KCNIP4], p < 0.05). In addition, 2 sites were found to have a higher odds ratio to detect a somatic mutation in AD samples (6:73374221 [KCNQ5], p = 0.01 and 13:36667102 [DCLK1], p = 0.02). Thirty-two genes that harbor somatic mutations unique to AD and the KCNQ5 and DCLK1 genes were used for gene ontology (GO)-term enrichment analysis. We found the AD-specific ENSMs to be enriched in the GO terms "vocalization behavior" and "intraspecies interaction between organisms." Of interest, we observed both age-specific and AD-specific ENSMs enriched in the K+ channel-associated genes. Discussion: Our results show that combining scRNA-seq and WGS data can successfully detect putative somatic mutations. The putative somatic mutations detected from the ROSMAP data set have provided new insights into the association of AD and aging with brain somatic mutagenesis.
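
The core idea of contrasting WGS-derived genotypes with scRNA-seq-derived genotypes can be illustrated with a minimal sketch. This is not the authors' workflow: the function name `putative_somatic_sites`, the genotype strings, and the site labels are hypothetical, and a real pipeline would also need coverage, quality, and RNA-editing filters that the abstract does not describe.

```python
def putative_somatic_sites(wgs_genotypes, scrna_genotypes):
    """Return sites where the scRNA-seq genotype differs from the WGS genotype.

    Both arguments are dicts mapping a site label (hypothetical format
    "chrom:position") to a genotype string such as "0/0" or "0/1".
    Sites missing from the scRNA-seq calls are skipped, because no
    contrast is possible without coverage in both data types.
    """
    sites = []
    for site, wgs_gt in wgs_genotypes.items():
        rna_gt = scrna_genotypes.get(site)
        if rna_gt is None:
            continue  # not covered in scRNA-seq, cannot be contrasted
        if rna_gt != wgs_gt:
            sites.append(site)  # genotype discordance: candidate somatic site
    return sorted(sites)


# Illustrative, made-up genotypes (not data from the study).
wgs = {"chr1:100": "0/0", "chr1:200": "0/1", "chr2:300": "0/0"}
rna = {"chr1:100": "0/1", "chr1:200": "0/1"}
candidates = putative_somatic_sites(wgs, rna)
```

In this toy input only `chr1:100` is discordant between the two data types, so it is the only candidate returned; `chr2:300` is ignored because it lacks scRNA-seq coverage.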

19.
Alzheimers Dement ; 19(11): 5036-5047, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37092333

ABSTRACT

INTRODUCTION: Neuropathological substrates associated with neurodegeneration occur in brains of the oldest old. How does this affect cognitive performance? METHODS: The 100-plus Study is an ongoing longitudinal cohort study of centenarians who self-report to be cognitively healthy; post mortem brain donation is optional. In 85 centenarian brains, we explored the correlations between the levels of 11 neuropathological substrates with ante mortem performance on 12 neuropsychological tests. RESULTS: Levels of neuropathological substrates varied: we observed levels up to Thal-amyloid beta phase 5, Braak-neurofibrillary tangle (NFT) stage V, Consortium to Establish a Registry for Alzheimer's Disease (CERAD)-neuritic plaque score 3, Thal-cerebral amyloid angiopathy stage 3, Tar-DNA binding protein 43 (TDP-43) stage 3, hippocampal sclerosis stage 1, Braak-Lewy bodies stage 6, atherosclerosis stage 3, cerebral infarcts stage 1, and cerebral atrophy stage 2. Granulovacuolar degeneration occurred in all centenarians. Some high performers had the highest neuropathology scores. DISCUSSION: Only Braak-NFT stage and limbic-predominant age-related TDP-43 encephalopathy (LATE) pathology associated significantly with performance across multiple cognitive domains. Of all cognitive tests, the clock-drawing test was particularly sensitive to levels of multiple neuropathologies.


Subject(s)
Alzheimer Disease , Amyloid beta-Peptides , Aged, 80 and over , Humans , Amyloid beta-Peptides/metabolism , Centenarians , Longitudinal Studies , Alzheimer Disease/pathology , Brain/pathology , Neurofibrillary Tangles/pathology , Neuropathology , Cognition
20.
PLoS One ; 18(4): e0284493, 2023.
Article in English | MEDLINE | ID: mdl-37058455

ABSTRACT

BACKGROUND: Non-Invasive Prenatal Testing is often performed by utilizing read coverage-based profiles obtained from shallow whole genome sequencing to detect fetal copy number variations. Such screening typically operates on a discretized binned representation of the genome, where (ab)normality of bins of a set size is judged relative to a reference panel of healthy samples. In practice such approaches are too costly given that for each tested sample they require the resequencing of the reference panel to avoid technical bias. Within-sample testing methods utilize the observation that bins on one chromosome can be judged relative to the behavior of similarly behaving bins on other chromosomes, allowing the bins of a sample to be compared among themselves, avoiding technical bias. RESULTS: We present a comprehensive performance analysis of the within-sample testing method Wisecondor and its variants, using both experimental and simulated data. We introduced alterations to Wisecondor to explicitly address and exploit paired-end sequencing data. Wisecondor was found to yield the most stable results across different bin size scales while producing more robust calls by assigning higher Z-scores at all fetal fraction ranges. CONCLUSIONS: Our findings show that the most recent available version of Wisecondor performs best.
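
The within-sample principle described above, judging a bin against similarly behaving bins on other chromosomes of the same sample, can be sketched as a simple Z-score. This is an illustrative sketch only and not Wisecondor's implementation: the function name `within_sample_z`, the data layout, and the coverage values are assumptions, and the selection of the reference bins is taken as given.

```python
from statistics import mean, stdev


def within_sample_z(coverage, target, reference_bins):
    """Z-score of one bin relative to reference bins in the same sample.

    coverage: dict mapping (chromosome, bin_index) to normalized read coverage.
    target: the (chromosome, bin_index) key of the bin being tested.
    reference_bins: keys of bins assumed to behave similarly to the target;
        they must lie on chromosomes other than the target's.
    """
    if any(chrom == target[0] for chrom, _ in reference_bins):
        raise ValueError("reference bins must be on other chromosomes")
    ref_values = [coverage[b] for b in reference_bins]
    # Sample standard deviation; needs at least two reference bins.
    return (coverage[target] - mean(ref_values)) / stdev(ref_values)


# Made-up normalized coverage values, for illustration only.
coverage = {
    ("chr21", 0): 1.3,
    ("chr1", 5): 1.0,
    ("chr2", 7): 1.1,
    ("chr3", 2): 0.9,
}
z = within_sample_z(coverage, ("chr21", 0), [("chr1", 5), ("chr2", 7), ("chr3", 2)])
```

Because the target and its reference bins come from the same sequencing run, run-specific technical bias affects both sides of the comparison, which is the motivation the abstract gives for within-sample testing.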


Subject(s)
DNA Copy Number Variations , Prenatal Diagnosis , Pregnancy , Female , Humans , Prenatal Diagnosis/methods , Prenatal Care , Sequence Analysis, DNA/methods , Whole Genome Sequencing