Results 1 - 20 of 162
1.
Nat Immunol ; 22(5): 654-665, 2021 05.
Article in English | MEDLINE | ID: mdl-33888898

ABSTRACT

Controlled human infections provide opportunities to study the interaction between the immune system and malaria parasites, which is essential for vaccine development. Here, we compared immune signatures of malaria-naive Europeans and of Africans with lifelong malaria exposure using mass cytometry, RNA sequencing and data integration, before and 5 and 11 days after venous inoculation with Plasmodium falciparum sporozoites. We observed differences in immune cell populations, antigen-specific responses and gene expression profiles between Europeans and Africans and among Africans with differing degrees of immunity. Before inoculation, an activated/differentiated state of both innate and adaptive cells, including elevated CD161+CD4+ T cells and interferon-γ production, predicted Africans capable of controlling parasitemia. After inoculation, the rapidity of the transcriptional response and clusters of CD4+ T cells, plasmacytoid dendritic cells and innate T cells were among the features distinguishing Africans capable of controlling parasitemia from susceptible individuals. These findings can guide the development of a vaccine effective in malaria-endemic regions.


Subject(s)
Adaptive Immunity/immunology , Disease Susceptibility/immunology , Malaria, Falciparum/immunology , Plasmodium falciparum/immunology , Adaptive Immunity/genetics , Adolescent , Adult , Antibodies, Protozoan/blood , Antibodies, Protozoan/immunology , Antigens, Protozoan/immunology , Black People/genetics , Dendritic Cells/immunology , Disease Susceptibility/blood , Disease Susceptibility/parasitology , Female , Healthy Volunteers , Host-Parasite Interactions/genetics , Host-Parasite Interactions/immunology , Humans , Immunity, Innate/genetics , Immunity, Innate/immunology , Interferon-gamma/metabolism , Malaria, Falciparum/blood , Malaria, Falciparum/parasitology , Male , RNA-Seq , Systems Analysis , T-Lymphocytes/immunology , T-Lymphocytes/metabolism , White People/genetics , Young Adult
2.
Genome Res ; 32(4): 656-670, 2022 04.
Article in English | MEDLINE | ID: mdl-35332097

ABSTRACT

Genome-wide association studies (GWAS) have been highly informative in discovering disease-associated loci but are not designed to capture all structural variations in the human genome. Using long-read sequencing data, we discovered widespread structural variation within SINE-VNTR-Alu (SVA) elements, a class of great ape-specific transposable elements with gene-regulatory roles, which represents a major source of structural variability in the human population. We highlight the presence of structurally variable SVAs (SV-SVAs) in neurological disease-associated loci, and we further associate SV-SVAs to disease-associated SNPs and differential gene expression using luciferase assays and expression quantitative trait loci data. Finally, we genetically deleted SV-SVAs in the BIN1 and CD2AP Alzheimer's disease-associated risk loci and in the BCKDK Parkinson's disease-associated risk locus and assessed multiple aspects of their gene-regulatory influence in a human neuronal context. Together, this study reveals a novel layer of genetic variation in transposable elements that may contribute to identification of the structural variants that are the actual drivers of disease associations of GWAS loci.


Subject(s)
DNA Transposable Elements , Genome-Wide Association Study , Alu Elements , DNA Transposable Elements/genetics , Genetic Predisposition to Disease , Genetic Variation , Genome, Human , Humans , Polymorphism, Single Nucleotide , Quantitative Trait Loci
3.
Bioinformatics ; 39(11)2023 11 01.
Article in English | MEDLINE | ID: mdl-37847663

ABSTRACT

SUMMARY: T-cell receptors (TCRs) on T cells recognize and bind to epitopes presented by the major histocompatibility complex in case of an infection or cancer. However, the high diversity of TCRs, as well as their unique and complex binding mechanisms underlying epitope recognition, make it difficult to predict the binding between TCRs and epitopes. Here, we present the utility of transformers, a deep learning strategy that incorporates an attention mechanism that learns the informative features, and show that these models pre-trained on a large set of protein sequences outperform current strategies. We compared three pre-trained auto-encoder transformer models (ProtBERT, ProtAlbert, and ProtElectra) and one pre-trained auto-regressive transformer model (ProtXLNet) to predict the binding specificity of TCRs to 25 epitopes from the VDJdb database (human and murine). Two additional modifications were performed to incorporate gene usage of the TCRs in the four transformer models. Of all 12 transformer implementations (four models with three different modifications), a modified version of the ProtXLNet model could predict TCR-epitope pairs with the highest accuracy (weighted F1 score 0.55 simultaneously considering all 25 epitopes). The modification included additional features representing the gene names for the TCRs. We also showed that the basic implementation of transformers outperformed the previously available methods, i.e. TCRGP, TCRdist, and DeepTCR, developed for the same biological problem, especially for the hard-to-classify labels. We show that the proficiency of transformers in attention learning can be made operational in a complex biological setting like TCR binding prediction. Further ingenuity in utilizing the full potential of transformers, either through attention head visualization or introducing additional features, can extend T-cell research avenues. 
AVAILABILITY AND IMPLEMENTATION: Data and code are available on https://github.com/InduKhatri/tcrformer.
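
The weighted F1 score quoted in the abstract above can be illustrated with a short sketch. It is not taken from the tcrformer repository; it assumes the usual definition, in which a per-class F1 is averaged with weights proportional to the number of true examples of each class. The function name `weighted_f1` and the toy epitope labels are hypothetical.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores (illustrative sketch)."""
    support = Counter(y_true)          # number of true examples per class
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Each class contributes in proportion to its share of true labels.
        score += f1 * n_true / total
    return score


# Toy example with two hypothetical epitope labels.
truth = ["epitope_A", "epitope_A", "epitope_A", "epitope_B"]
preds = ["epitope_A", "epitope_A", "epitope_B", "epitope_B"]
print(round(weighted_f1(truth, preds), 4))
```

Because the weights follow class frequency, frequent epitopes dominate the score, which is one reason the abstract separately mentions hard-to-classify labels.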


Subject(s)
Epitopes, T-Lymphocyte , Receptors, Antigen, T-Cell , Humans , Animals , Mice , Epitopes, T-Lymphocyte/metabolism , Receptors, Antigen, T-Cell/genetics , T-Lymphocytes/metabolism , Amino Acid Sequence , Major Histocompatibility Complex
4.
Bioinformatics ; 39(39 Suppl 1): i404-i412, 2023 06 30.
Article in English | MEDLINE | ID: mdl-37387141

ABSTRACT

MOTIVATION: Knowing the relation between cell types is crucial for translating experimental results from mice to humans. Establishing cell type matches, however, is hindered by the biological differences between the species. A substantial amount of evolutionary information between genes that could be used to align the species is discarded by most of the current methods since they only use one-to-one orthologous genes. Some methods try to retain the information by explicitly including the relation between genes, but not without caveats. RESULTS: In this work, we present a model to transfer and align cell types in cross-species analysis (TACTiCS). First, TACTiCS uses a natural language processing model to match genes using their protein sequences. Next, TACTiCS employs a neural network to classify cell types within a species. Afterward, TACTiCS uses transfer learning to propagate cell type labels between species. We applied TACTiCS on scRNA-seq data of the primary motor cortex of human, mouse, and marmoset. Our model can accurately match and align cell types on these datasets. Moreover, our model outperforms Seurat and the state-of-the-art method SAMap. Finally, we show that our gene matching method results in better cell type matches than BLAST in our model. AVAILABILITY AND IMPLEMENTATION: The implementation is available on GitHub (https://github.com/kbiharie/TACTiCS). The preprocessed datasets and trained models can be downloaded from Zenodo (https://doi.org/10.5281/zenodo.7582460).


Subject(s)
Biological Evolution , Genetic Techniques , Humans , Animals , Mice , Amino Acid Sequence , Natural Language Processing , Machine Learning
5.
Int J Gynecol Cancer ; 34(5): 713-721, 2024 May 06.
Article in English | MEDLINE | ID: mdl-38388177

ABSTRACT

OBJECTIVE: To assess the feasibility of scalable, objective, and minimally invasive liquid biopsy-derived biomarkers such as cell-free DNA copy number profiles, human epididymis protein 4 (HE4), and cancer antigen 125 (CA125) for pre-operative risk assessment of early-stage ovarian cancer in a clinically representative and diagnostically challenging population and to compare the performance of these biomarkers with the Risk of Malignancy Index (RMI). METHODS: In this case-control study, we included 100 patients with an ovarian mass clinically suspected to be early-stage ovarian cancer. Of these 100 patients, 50 were confirmed to have a malignant mass (cases) and 50 had a benign mass (controls). Using WisecondorX, an algorithm used extensively in non-invasive prenatal testing, we calculated the benign-calibrated copy number profile abnormality score. This score represents how different a sample is from benign controls based on copy number profiles. We combined this score with HE4 serum concentration to separate cases and controls. RESULTS: Combining the benign-calibrated copy number profile abnormality score with HE4, we obtained a model with a significantly higher sensitivity (42% vs 0%; p<0.002) at 99% specificity as compared with the RMI that is currently employed in clinical practice. Investigating performance in subgroups, we observed especially large differences in the advanced stage and non-high-grade serous ovarian cancer groups. CONCLUSION: This study demonstrates that cell-free DNA can be successfully employed to perform pre-operative risk of malignancy assessment for ovarian masses; however, results warrant validation in a more extensive clinical study.
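
Reporting sensitivity at a fixed specificity, as the abstract above does, can be sketched as follows. This is a generic illustration and not the study's pipeline: it assumes a single numeric score per patient where higher means more suspicious, sets the cut-off from the benign controls so that at least the requested fraction of them falls at or below it, and then counts the cases above that cut-off. The function name and the toy scores are hypothetical.

```python
import math


def sensitivity_at_specificity(case_scores, control_scores, specificity=0.99):
    """Return (sensitivity, threshold) for a score where higher = more suspicious."""
    ranked = sorted(control_scores)
    # Minimum number of controls that must be called benign; the small
    # epsilon guards against floating-point noise in the product.
    n_negative = max(1, math.ceil(specificity * len(ranked) - 1e-9))
    threshold = ranked[n_negative - 1]   # scores <= threshold are called benign
    detected = sum(1 for s in case_scores if s > threshold)
    return detected / len(case_scores), threshold


# Toy scores, not data from the study.
controls = [0.1, 0.2, 0.3, 0.4]
cases = [0.05, 0.35, 0.5, 0.6]
print(sensitivity_at_specificity(cases, controls, specificity=0.75))
```

With 50 controls and a 99% target, this rule places the threshold at the highest control score, so a single extreme benign sample can strongly affect the reported sensitivity.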


Subject(s)
Biomarkers, Tumor , Ovarian Neoplasms , WAP Four-Disulfide Core Domain Protein 2 , Humans , Female , Ovarian Neoplasms/blood , Ovarian Neoplasms/diagnosis , Ovarian Neoplasms/surgery , Ovarian Neoplasms/pathology , Case-Control Studies , Middle Aged , WAP Four-Disulfide Core Domain Protein 2/analysis , WAP Four-Disulfide Core Domain Protein 2/metabolism , Liquid Biopsy/methods , Biomarkers, Tumor/blood , Cell-Free Nucleic Acids/blood , Adult , Aged , CA-125 Antigen/blood
6.
Proc Natl Acad Sci U S A ; 118(49)2021 12 07.
Article in English | MEDLINE | ID: mdl-34873056

ABSTRACT

Preclinical models have been the workhorse of cancer research, producing massive amounts of drug response data. Unfortunately, translating response biomarkers derived from these datasets to human tumors has proven to be particularly challenging. To address this challenge, we developed TRANSACT, a computational framework that builds a consensus space to capture biological processes common to preclinical models and human tumors and exploits this space to construct drug response predictors that robustly transfer from preclinical models to human tumors. TRANSACT performs favorably compared to four competing approaches, including two deep learning approaches, on a set of 23 drug prediction challenges on The Cancer Genome Atlas and 226 metastatic tumors from the Hartwig Medical Foundation. We demonstrate that response predictions deliver a robust performance for a number of therapies of high clinical importance: platinum-based chemotherapies, gemcitabine, and paclitaxel. In contrast to other approaches, we demonstrate the interpretability of the TRANSACT predictors by correctly identifying known biomarkers of targeted therapies, and we propose potential mechanisms that mediate the resistance to two chemotherapeutic agents.


Subject(s)
Drug Screening Assays, Antitumor/methods , Gene Expression Profiling/methods , Animals , Antineoplastic Agents/therapeutic use , Biomarkers, Pharmacological/metabolism , Cell Line, Tumor/drug effects , Deep Learning , Disease Models, Animal , Forecasting/methods , Heterografts , Humans , Models, Theoretical
7.
FASEB J ; 36(11): e22578, 2022 11.
Article in English | MEDLINE | ID: mdl-36183353

ABSTRACT

The response to lifestyle intervention studies is often heterogeneous, especially in older adults. Subtle responses that may represent a health gain for individuals are not always detected by classical health variables, stressing the need for novel biomarkers that detect intermediate changes in metabolic, inflammatory, and immunity-related health. Here, our aim was to develop and validate a molecular multivariate biomarker maximally sensitive to the individual effect of a lifestyle intervention; the Personalized Lifestyle Intervention Status (PLIS). We used 1H-NMR fasting blood metabolite measurements from before and after the 13-week combined physical and nutritional Growing Old TOgether (GOTO) lifestyle intervention study in combination with a fivefold cross-validation and a bootstrapping method to train a separate PLIS score for men and women. The PLIS scores consisted of 14 and four metabolites for females and males, respectively. Performance of the PLIS score in tracking health gain was illustrated by association of the sex-specific PLIS scores with several classical metabolic health markers, such as BMI, trunk fat%, fasting HDL cholesterol, and fasting insulin, the primary outcome of the GOTO study. We also showed that the baseline PLIS score indicated which participants respond positively to the intervention. Finally, we explored PLIS in an independent physical activity lifestyle intervention study, showing similar, albeit remarkably weaker, associations of PLIS with classical metabolic health markers. To conclude, we found that the sex-specific PLIS score was able to track the individual short-term metabolic health gain of the GOTO lifestyle intervention study. The methodology used to train the PLIS score potentially provides a useful instrument to track personal responses and predict the participant's health benefit in lifestyle interventions similar to the GOTO study.


Subject(s)
Life Style , Obesity , Aged , Biomarkers , Cholesterol, HDL , Female , Humans , Insulin , Male
8.
Nucleic Acids Res ; 49(W1): W603-W612, 2021 07 02.
Article in English | MEDLINE | ID: mdl-34048563

ABSTRACT

Genetic association studies are frequently used to study the genetic basis of numerous human phenotypes. However, the rapid interrogation of how well a certain genomic region associates across traits as well as the interpretation of genetic associations is often complex and requires the integration of multiple sources of annotation, which involves advanced bioinformatic skills. We developed snpXplorer, an easy-to-use web-server application for exploring Single Nucleotide Polymorphisms (SNP) association statistics and to functionally annotate sets of SNPs. snpXplorer can superimpose association statistics from multiple studies, and displays regional information including SNP associations, structural variations, recombination rates, eQTL, linkage disequilibrium patterns, genes and gene-expressions per tissue. By overlaying multiple GWAS studies, snpXplorer can be used to compare levels of association across different traits, which may help the interpretation of variant consequences. Given a list of SNPs, snpXplorer can also be used to perform variant-to-gene mapping and gene-set enrichment analysis to identify molecular pathways that are overrepresented in the list of input SNPs. snpXplorer is freely available at https://snpxplorer.net. Source code, documentation, example files and tutorial videos are available within the Help section of snpXplorer and at https://github.com/TesiNicco/snpXplorer.


Subject(s)
Molecular Sequence Annotation , Polymorphism, Single Nucleotide , Software , Alzheimer Disease/genetics , Gene Expression , Genetic Association Studies , Genomics , Humans , Linkage Disequilibrium , Quantitative Trait Loci
9.
Alzheimers Dement ; 19(7): 2831-2841, 2023 07.
Article in English | MEDLINE | ID: mdl-36583547

ABSTRACT

INTRODUCTION: With increasing age, neuropathological substrates associated with Alzheimer's disease (AD) accumulate in brains of cognitively healthy individuals-are they resilient, or resistant to AD-associated neuropathologies? METHODS: In 85 centenarian brains, we correlated NIA (amyloid) stages, Braak (neurofibrillary tangle) stages, and CERAD (neuritic plaque) scores with cognitive performance close to death as determined by Mini-Mental State Examination (MMSE) scores. We assessed centenarian brains against 2131 brains from AD patients, non-AD demented, and non-demented individuals in an age continuum ranging from 16 to 100+ years. RESULTS: With age, brains from non-demented individuals reached the NIA and Braak stages observed in AD patients, while CERAD scores remained lower. In centenarians, NIA stages varied (22.4% were the highest stage 3), Braak stages rarely exceeded stage IV (5.9% were V), and CERAD scores rarely exceeded 2 (4.7% were 3); within these distributions, we observed no correlation with the MMSE (NIA: P = 0.60; Braak: P = 0.08; CERAD: P = 0.16). DISCUSSION: Cognitive health can be maintained despite the accumulation of high levels of AD-related neuropathological substrates. HIGHLIGHTS: Cognitively healthy elderly have AD neuropathology levels similar to AD patients. AD neuropathology loads do not correlate with cognitive performance in centenarians. Some centenarians are resilient to the highest levels of AD neuropathology.


Subject(s)
Alzheimer Disease , Neurofibrillary Tangles , Aged, 80 and over , Humans , Aged , Adolescent , Young Adult , Adult , Middle Aged , Neurofibrillary Tangles/pathology , Plaque, Amyloid/pathology , Centenarians , Alzheimer Disease/pathology , Brain/pathology
10.
Alzheimers Dement ; 19(11): 5036-5047, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37092333

ABSTRACT

INTRODUCTION: Neuropathological substrates associated with neurodegeneration occur in brains of the oldest old. How does this affect cognitive performance? METHODS: The 100-plus Study is an ongoing longitudinal cohort study of centenarians who self-report to be cognitively healthy; post mortem brain donation is optional. In 85 centenarian brains, we explored the correlations between the levels of 11 neuropathological substrates with ante mortem performance on 12 neuropsychological tests. RESULTS: Levels of neuropathological substrates varied: we observed levels up to Thal-amyloid beta phase 5, Braak-neurofibrillary tangle (NFT) stage V, Consortium to Establish a Registry for Alzheimer's Disease (CERAD)-neuritic plaque score 3, Thal-cerebral amyloid angiopathy stage 3, Tar-DNA binding protein 43 (TDP-43) stage 3, hippocampal sclerosis stage 1, Braak-Lewy bodies stage 6, atherosclerosis stage 3, cerebral infarcts stage 1, and cerebral atrophy stage 2. Granulovacuolar degeneration occurred in all centenarians. Some high performers had the highest neuropathology scores. DISCUSSION: Only Braak-NFT stage and limbic-predominant age-related TDP-43 encephalopathy (LATE) pathology associated significantly with performance across multiple cognitive domains. Of all cognitive tests, the clock-drawing test was particularly sensitive to levels of multiple neuropathologies.


Subject(s)
Alzheimer Disease , Amyloid beta-Peptides , Aged, 80 and over , Humans , Amyloid beta-Peptides/metabolism , Centenarians , Longitudinal Studies , Alzheimer Disease/pathology , Brain/pathology , Neurofibrillary Tangles/pathology , Neuropathology , Cognition
11.
Genes Immun ; 23(2): 99-110, 2022 04.
Article in English | MEDLINE | ID: mdl-35436998

ABSTRACT

The IMGT database profiles the TR germline alleles for all four TR loci (TRA, TRB, TRG and TRD); however, it does not include information on the population specificity and allelic frequencies of these germline alleles. The specificity of allelic variants to different human populations can, however, be a rich source of information when studying the genetic basis of population-specific immune responses in disease and in vaccination. Therefore, we meticulously identified true germline alleles enriched with complete TR allele sequences and their frequencies across 26 different human populations, profiled by "1000 Genomes data". We identified 205 TRAV, 249 TRBV, 16 TRGV and 5 TRDV germline alleles supported by at least four haplotypes. The diversity of germline allelic variants in the TR loci is the highest in Africans, while the majority of the non-African alleles are specific to the Asian populations, suggesting a diverse profile of TR germline alleles in different human populations. Interestingly, the alleles in the IMGT database are frequent and common across all five super-populations. We believe that this new set of germline TR sequences represents a valuable new resource which we have made available through the new population-matched TR (pmTR) database, accessible via https://pmtrig.lumc.nl/ .


Subject(s)
Germ Cells , Receptors, Antigen, T-Cell , Alleles , Humans , Receptors, Antigen, T-Cell/genetics
12.
BMC Genomics ; 23(1): 546, 2022 Jul 31.
Article in English | MEDLINE | ID: mdl-35907790

ABSTRACT

Population-scale expression profiling studies can provide valuable insights into biological and disease-underlying mechanisms. The availability of phenotypic traits is essential for studying clinical effects. Therefore, missing, incomplete, or inaccurate phenotypic information can make analyses challenging and prevent RNA-seq or other omics data from being reused. A possible solution is to use predictors that infer clinical or behavioral phenotypic traits from molecular data. While such predictors have been developed based on different omics data types and are being applied in various studies, metabolomics-based surrogates are less commonly used than predictors based on DNA methylation profiles. In this study, we inferred 17 traits, including diabetes status and exposure to lipid medication, using previously trained metabolomic predictors. We evaluated whether these metabolomic surrogates can be used as an alternative to reported information for studying the respective phenotypes using expression profiling data of four population cohorts. For the majority of the 17 traits, the metabolomic surrogates performed similarly to the reported phenotypes in terms of effect sizes, number of significant associations, replication rates, and significantly enriched pathways. The application of metabolomics-derived surrogate outcomes opens new possibilities for reuse of multi-omics data sets. In studies where availability of clinical metadata is limited, missing or incomplete information can be complemented by these surrogates, thereby increasing the size of available data sets. Additionally, the availability of such surrogates could be used to correct for potential biological confounding. In the future, it would be interesting to further investigate the use of molecular predictors across different omics types and cohorts.


Subject(s)
Metabolomics , Phenotype
13.
Bioinformatics ; 37(2): 162-170, 2021 04 19.
Article in English | MEDLINE | ID: mdl-32797179

ABSTRACT

MOTIVATION: Protein function prediction is a difficult bioinformatics problem. Many recent methods use deep neural networks to learn complex sequence representations and predict function from these. Deep supervised models require a lot of labeled training data which are not available for this task. However, a very large amount of protein sequences without functional labels is available. RESULTS: We applied an existing deep sequence model that had been pretrained in an unsupervised setting on the supervised task of protein molecular function prediction. We found that this complex feature representation is effective for this task, outperforming hand-crafted features such as one-hot encoding of amino acids, k-mer counts, secondary structure and backbone angles. Also, it partly negates the need for complex prediction models, as a two-layer perceptron was enough to achieve competitive performance in the third Critical Assessment of Functional Annotation benchmark. We also show that combining this sequence representation with protein 3D structure information does not lead to performance improvement, hinting that 3D structure is also potentially learned during the unsupervised pretraining. AVAILABILITY AND IMPLEMENTATION: Implementations of all used models can be found at https://github.com/stamakro/GCN-for-Structure-and-Function. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
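
Two of the hand-crafted baseline features named in the abstract above, one-hot encoding of amino acids and k-mer counts, can be sketched in a few lines. This is an illustrative sketch only, not code from the linked repository; the 20-letter alphabet ordering, the function names, and the choice to encode unknown residues as all-zero rows are assumptions.

```python
from collections import Counter

# Assumed ordering of the 20 standard amino acids (alphabetical by one-letter code).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(sequence):
    """Encode each residue as a 20-dimensional indicator vector."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    rows = []
    for residue in sequence:
        row = [0] * len(AMINO_ACIDS)
        if residue in index:          # unknown residues stay all-zero (assumption)
            row[index[residue]] = 1
        rows.append(row)
    return rows


def kmer_counts(sequence, k=3):
    """Count every overlapping substring of length k."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


print(one_hot("AC")[0][:3])
print(kmer_counts("MKMKM", k=2))
```

Both encodings are fixed functions of the sequence, which is the contrast the abstract draws with representations learned during unsupervised pretraining.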


Subject(s)
Proteins , Software , Amino Acid Sequence , Neural Networks, Computer , Proteins/genetics
14.
Mol Genet Metab ; 136(3): 199-218, 2022 07.
Article in English | MEDLINE | ID: mdl-35660124

ABSTRACT

The integration of metabolomics data with sequencing data is a key step towards improving the diagnostic process for finding the disease-causing genetic variant(s) in patients suspected of having an inborn error of metabolism (IEM). The measured metabolite levels could provide additional phenotypical evidence to elucidate the degree of pathogenicity for variants found in genes associated with metabolic processes. We present a computational approach, called Reafect, that calculates for each reaction in a metabolic pathway a score indicating whether that reaction is deficient or not. When calculating this score, Reafect takes multiple factors into account: the magnitude and sign of alterations in the metabolite levels, the reaction distances between metabolites and reactions in the pathway, and the biochemical directionality of the reactions. We applied Reafect to untargeted metabolomics data of 72 patient samples with a known IEM and found that in 81% of the cases the correct deficient enzyme was ranked within the top 5% of all considered enzyme deficiencies. Next, we integrated Reafect with Combined Annotation Dependent Depletion (CADD) scores (a measure for gene variant deleteriousness) and ranked the metabolic genes of 27 IEM patients. We observed that this integrated approach significantly improved the prioritization of the genes containing the disease-causing variant when compared with the two approaches individually. For 15/27 IEM patients the correct affected gene was ranked within the top 0.25% of the set of potentially affected genes. Together, our findings suggest that metabolomics data improves the identification of affected genes in patients suffering from IEM.
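
The evaluation quoted above ("ranked within the top 5%") amounts to locating the known deficient enzyme in a list sorted by score. The sketch below illustrates only that ranking step under stated assumptions; it does not reproduce how Reafect computes its scores, and the function name, enzyme labels, and score values are hypothetical.

```python
def rank_fraction(scores, target):
    """Fraction of the ranked list occupied up to and including the target.

    `scores` maps a candidate (e.g. an enzyme) to a deficiency score, where a
    higher score means the candidate is more likely to be deficient.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    position = ranked.index(target) + 1     # 1-based rank of the true candidate
    return position / len(ranked)


# Hypothetical scores for four candidate enzymes.
example = {"enzyme_1": 0.9, "enzyme_2": 0.5, "enzyme_3": 0.1, "enzyme_4": 0.7}
fraction = rank_fraction(example, "enzyme_4")
print(fraction, fraction <= 0.05)
```

Ties are broken here by insertion order, which is an arbitrary choice; a real evaluation would need an explicit tie-handling rule.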


Subject(s)
Metabolism, Inborn Errors , Metabolomics , Genomics , Humans , Metabolic Networks and Pathways/genetics , Metabolism, Inborn Errors/diagnosis
15.
Nucleic Acids Res ; 48(18): e107, 2020 10 09.
Article in English | MEDLINE | ID: mdl-32955565

ABSTRACT

Single-cell technologies are emerging fast due to their ability to unravel the heterogeneity of biological systems. While scRNA-seq is a powerful tool that measures whole-transcriptome expression of single cells, it lacks their spatial localization. Novel spatial transcriptomics methods do retain cells' spatial information, but some methods can only measure tens to hundreds of transcripts. To resolve this discrepancy, we developed SpaGE, a method that integrates spatial and scRNA-seq datasets to predict whole-transcriptome expressions in their spatial configuration. Using five dataset-pairs, SpaGE outperformed previously published methods and showed scalability to large datasets. Moreover, SpaGE predicted new spatial gene patterns that are confirmed independently using in situ hybridization data from the Allen Mouse Brain Atlas.


Subject(s)
RNA-Seq , Single-Cell Analysis , Software , Transcriptome , Animals , Databases, Genetic , Datasets as Topic , Mice
16.
Genomics ; 113(4): 2229-2239, 2021 07.
Article in English | MEDLINE | ID: mdl-34022350

ABSTRACT

The genotype-phenotype link is a major research topic in the life sciences but remains highly complex to disentangle. Part of the complexity arises from the number of genes contributing to the observed phenotype. Despite the vast increase of molecular data, pinpointing the causal variant underlying a phenotype of interest is still challenging. In this study, we present an approach to map causal variation and molecular pathways underlying important phenotypes in pigs. We prioritize variation by utilizing and integrating predicted variant impact scores (pCADD), functional genomic information, and associated phenotypes in other mammalian species. We demonstrate the efficacy of our approach by reporting known and novel causal variants, of which many affect non-coding sequences. Our approach allows the disentangling of the biology behind important phenotypes by accelerating the discovery of novel causal variants and molecular mechanisms affecting important phenotypes in pigs. This information on molecular mechanisms could be applicable in other mammalian species, including humans.


Subject(s)
Genetic Variation , Genomics , Animals , Genotype , Mammals , Phenotype , Swine/genetics
17.
Genes Immun ; 22(3): 172-186, 2021 07.
Article in English | MEDLINE | ID: mdl-34120151

ABSTRACT

Immunoglobulin (IG) loci harbor inter-individual allelic variants in many different germline IG variable, diversity and joining genes of the IG heavy (IGH), kappa (IGK) and lambda (IGL) loci, which together form the genetic basis of the highly diverse antigen-specific B-cell receptors. These allelic variants can be shared between or be specific to human populations. The current immunogenetics resources gather the germline alleles but lack their population specificity, which poses limitations for disease-association studies related to immune responses in different human populations. Therefore, we systematically identified germline alleles from 26 different human populations around the world, profiled by "1000 Genomes" data. We identified 409 IGHV, 179 IGKV, and 199 IGLV germline alleles supported by at least seven haplotypes. The diversity of germline alleles is the highest in Africans. Remarkably, the variants in the identified novel alleles show strikingly conserved patterns, the same as found in other IG databases, suggesting over-time evolutionary selection processes. We could relate the genetic variants to population-specific immune responses, e.g. IGHV1-69 for flu in Africans. The population matched IG (pmIG) resource will enhance our understanding of the SHM-related B-cell receptor selection processes in (infectious) diseases and vaccination within and between different human populations.


Subject(s)
Communicable Diseases , Immunoglobulins , Alleles , Genes, Immunoglobulin , Germ Cells , Humans , Vaccination
18.
Eur J Neurosci ; 53(11): 3727-3739, 2021 06.
Article in English | MEDLINE | ID: mdl-33792979

ABSTRACT

Structural covariance networks are able to identify functionally organized brain regions by gray matter volume covariance across a population. We examined the transcriptomic signature of such anatomical networks in the healthy brain using postmortem microarray data from the Allen Human Brain Atlas. A previous study revealed that a posterior cingulate network and anterior cingulate network showed decreased gray matter in brains of Parkinson's disease patients. Therefore, we examined these two anatomical networks to understand the underlying molecular processes that may be involved in Parkinson's disease. Whole brain transcriptomics from the healthy brain revealed upregulation of genes associated with serotonin, GPCR, GABA, glutamate, and RAS-signaling pathways. Our results also suggest involvement of the cholinergic circuit, in which genes NPPA, SOSTDC1, and TYRP1 may play a functional role. Finally, both networks were enriched for genes associated with neuropsychiatric disorders that overlap with Parkinson's disease symptoms. The identified genes and pathways contribute to healthy functions of the posterior and anterior cingulate networks and disruptions to these functions may in turn contribute to the pathological and clinical events observed in Parkinson's disease.


Subject(s)
Gray Matter , Parkinson Disease , Adaptor Proteins, Signal Transducing , Brain/diagnostic imaging , Cholinergic Agents , Gray Matter/diagnostic imaging , Humans , Magnetic Resonance Imaging , Parkinson Disease/genetics
19.
Bioinformatics ; 36(4): 1182-1190, 2020 02 15.
Article in English | MEDLINE | ID: mdl-31562759

ABSTRACT

MOTIVATION: Co-expression of two genes across different conditions is indicative of their involvement in the same biological process. However, when using RNA-Seq datasets with many experimental conditions from diverse sources, only a subset of the experimental conditions is expected to be relevant for finding genes related to a particular Gene Ontology (GO) term. Therefore, we hypothesize that when the purpose is to find similarly functioning genes, the co-expression of genes should not be determined on all samples but only on those samples informative for the GO term of interest. RESULTS: To address this, we developed Metric Learning for Co-expression (MLC), a fast algorithm that assigns a GO-term-specific weight to each expression sample. The goal is to obtain a weighted co-expression measure that is more suitable than the unweighted Pearson correlation for Guilt-By-Association-based function prediction. More specifically, if two genes are both annotated with a given GO term, MLC tries to maximize their weighted co-expression; if only one of them is annotated with that term, it tries to minimize their weighted co-expression. Our experiments on publicly available Arabidopsis thaliana RNA-Seq data demonstrate that MLC outperforms standard Pearson correlation in term-centric performance. Moreover, our method performs particularly well on more specific terms, which are the most interesting. Finally, by inspecting the sample weights for a particular GO term, one can identify which experiments are important for learning that term and potentially identify novel conditions that are relevant, as demonstrated by experiments in both A. thaliana and Pseudomonas aeruginosa. AVAILABILITY AND IMPLEMENTATION: MLC is available as a Python package at www.github.com/stamakro/MLC. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
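
The weighted co-expression measure described in this abstract can be illustrated with a sample-weighted Pearson correlation. The following is only a minimal sketch, not the MLC package's code: the function name `weighted_pearson`, the toy expression values and the hand-picked weights are all assumptions made for illustration, and the step in which MLC learns the weights is not shown.

```python
import math

def weighted_pearson(x, y, w):
    """Pearson correlation of x and y under non-negative per-sample weights w.

    Hypothetical helper for illustration; not part of the MLC package.
    """
    total = sum(w)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    w = [wi / total for wi in w]  # normalise weights to sum to 1
    mean_x = sum(wi * xi for wi, xi in zip(w, x))
    mean_y = sum(wi * yi for wi, yi in zip(w, y))
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(var_x * var_y)

# Toy data (invented): the two genes co-vary only in the first four samples.
gene_a = [1, 2, 3, 4, 9, 1, 7, 3]
gene_b = [2, 4, 6, 8, 1, 9, 2, 8]

uniform = [1] * 8                          # equivalent to the unweighted Pearson correlation
informative = [1, 1, 1, 1, 0, 0, 0, 0]     # assumed term-specific weights

r_uniform = weighted_pearson(gene_a, gene_b, uniform)
r_informative = weighted_pearson(gene_a, gene_b, informative)
```

With uniform weights the uninformative samples mask the relationship, whereas down-weighting them recovers the strong co-expression in the first four samples.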


Subject(s)
Algorithms , RNA-Seq , Gene Ontology , Phenotype
20.
Bioinformatics ; 36(Suppl_2): i849-i856, 2020 12 30.
Article in English | MEDLINE | ID: mdl-33381821

ABSTRACT

MOTIVATION: Single-cell data measure multiple cellular markers at the single-cell level for thousands to millions of cells. Identification of distinct cell populations is a key step for further biological understanding and is usually performed by clustering these data. Clustering tools based on dimensionality reduction either do not scale to large datasets containing millions of cells or are not fully automated, requiring an initial manual estimate of the number of clusters. Graph clustering tools provide automated and reliable clustering for single-cell data but scale poorly to large datasets. RESULTS: We developed SCHNEL, a scalable, reliable and automated clustering tool for high-dimensional single-cell data. SCHNEL transforms large high-dimensional data into a hierarchy of datasets containing subsets of data points that follow the original data manifold. The novel approach of SCHNEL combines this hierarchical representation of the data with graph clustering, making graph clustering scalable to millions of cells. Using seven different cytometry datasets, SCHNEL outperformed three popular clustering tools for cytometry data and was able to produce meaningful clustering results for datasets of 3.5 and 17.2 million cells within workable time frames. In addition, we show that SCHNEL is a general clustering tool by applying it to single-cell RNA sequencing data as well as to MNIST, a popular machine learning benchmark dataset. AVAILABILITY AND IMPLEMENTATION: Implementation is available on GitHub (https://github.com/biovault/SCHNELpy). All datasets used in this study are publicly available. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
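
One way to picture how clustering a small subset can yield labels for millions of cells is to cluster representative points and then pass each label on to the remaining cells. The sketch below assumes a simple nearest-landmark rule with Euclidean distance; the abstract does not state how SCHNEL propagates labels, so the function `propagate_labels`, the toy coordinates and the landmark labels are hypothetical and should not be read as the SCHNELpy implementation.

```python
import math

def propagate_labels(points, landmarks, landmark_labels):
    """Give each point the cluster label of its nearest landmark.

    Hypothetical illustration: nearest-landmark assignment is an assumption,
    not the documented SCHNEL procedure.
    """
    labels = []
    for p in points:
        nearest = min(range(len(landmarks)),
                      key=lambda i: math.dist(p, landmarks[i]))
        labels.append(landmark_labels[nearest])
    return labels

# Toy 2-D "cells" (invented) and two landmarks that were clustered beforehand.
cells = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.8)]
landmarks = [(0.1, 0.0), (5.0, 5.0)]
landmark_labels = [0, 1]  # assumed output of graph clustering on the landmarks

cell_labels = propagate_labels(cells, landmarks, landmark_labels)
```

Only the landmarks need to go through the expensive graph clustering step; the full dataset is touched once, in the cheap assignment pass.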


Subject(s)
Algorithms , RNA , Cluster Analysis , Sequence Analysis, RNA , Single-Cell Analysis , Exome Sequencing