1.
Stat Appl Genet Mol Biol ; 23(1)2024 Jan 01.
Article in English | MEDLINE | ID: mdl-38563699

ABSTRACT

Simulation frameworks are useful to stress-test predictive models when data is scarce, or to assess model sensitivity to specific data distributions. Such frameworks often need to recapitulate several layers of data complexity, including emergent properties that arise implicitly from the interaction between simulation components. Antibody-antigen binding is a complex mechanism by which an antibody sequence wraps itself around an antigen with high affinity. In this study, we use a synthetic simulation framework for antibody-antigen folding and binding on a 3D lattice that includes full details on the spatial conformation of both molecules. We investigate how emergent properties arise in this framework, in particular the physical proximity of amino acids, their presence on the binding interface, or the binding status of a sequence, and relate that to the individual and pairwise contributions of amino acids in statistical models for binding prediction. We show that weights learnt from a simple logistic regression model align with some but not all features of amino acids involved in the binding, and that predictive sequence binding patterns can be enriched. In particular, main effects correlated with the capacity of a sequence to bind any antigen, while statistical interactions were related to sequence specificity.
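
As a rough illustration of what "individual and pairwise contributions" can mean as inputs to a logistic regression, the sketch below encodes one sequence into main-effect features (one per position and residue) and interaction features (one per pair of main-effect terms). The function name and the feature naming scheme are hypothetical and are not taken from the study.

```python
from itertools import combinations


def encode_sequence(seq):
    """Return two lists of active binary features for one amino acid sequence.

    Main effects: one feature per (position, residue).
    Pairwise interactions: one feature per pair of main-effect terms.
    Feature names are illustrative only.
    """
    main = [f"{i}{aa}" for i, aa in enumerate(seq)]
    pairs = [f"{a}*{b}" for a, b in combinations(main, 2)]
    return main, pairs


main, pairs = encode_sequence("CARW")
print(main)        # main-effect features, one per position
print(len(pairs))  # number of pairwise interaction features
```

A sequence of length n yields n main-effect features and n(n-1)/2 interaction features, so the interaction block grows much faster than the main-effect block.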


Subjects
Antibodies , Antifibrinolytic Agents , Feasibility Studies , Synthetic Vaccines , Amino Acids
2.
Int J Mol Sci ; 25(10)2024 May 20.
Article in English | MEDLINE | ID: mdl-38791593

ABSTRACT

Epidemiological evidence suggests existing comorbidity between postmenopausal osteoporosis (OP) and cardiovascular disease (CVD), but identification of possible shared genes is lacking. The skeletal global transcriptomes were analyzed in trans-iliac bone biopsies (n = 84) from clinically well-characterized postmenopausal women (50 to 86 years) without clinical CVD using microchips and RNA sequencing. One thousand transcripts highly correlated with areal bone mineral density (aBMD) were further analyzed using bioinformatics, and common genes overlapping with CVD and associated biological mechanisms, pathways and functions were identified. Fifty genes (45 mRNAs, 5 miRNAs) were discovered with established roles in oxidative stress, inflammatory response, endothelial function, fibrosis, dyslipidemia and osteoblastogenesis/calcification. These pleiotropic genes with possible CVD comorbidity functions were also present in transcriptomes of microvascular endothelial cells and cardiomyocytes and were differentially expressed between healthy and osteoporotic women with fragility fractures. The results were supported by a genetic pleiotropy-informed conditional False Discovery Rate approach identifying any overlap in single nucleotide polymorphisms (SNPs) within several genes encoding aBMD- and CVD-associated transcripts. The study provides transcriptional and genomic evidence for genes of importance for both BMD regulation and CVD risk in a large collection of postmenopausal bone biopsies. Most of the transcripts identified in the CVD risk categories have no previously recognized roles in OP pathogenesis and provide novel avenues for exploring the mechanistic basis for the biological association between CVD and OP.


Subjects
Bone Density , Cardiovascular Diseases , Postmenopausal Osteoporosis , Single Nucleotide Polymorphism , Transcriptome , Humans , Female , Postmenopausal Osteoporosis/genetics , Postmenopausal Osteoporosis/pathology , Aged , Middle Aged , Cardiovascular Diseases/genetics , Cardiovascular Diseases/pathology , Aged 80 and over , Bone Density/genetics , Gene Expression Profiling , Messenger RNA/genetics , Messenger RNA/metabolism , MicroRNAs/genetics
3.
Bioinformatics ; 36(11): 3594-3596, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32154832

ABSTRACT

SUMMARY: B- and T-cell receptor repertoires of the adaptive immune system have become a key target for diagnostics and therapeutics research. Consequently, there is a rapidly growing number of bioinformatics tools for immune repertoire analysis. Benchmarking of such tools is crucial for ensuring reproducible and generalizable computational analyses. Currently, however, it remains challenging to create standardized ground truth immune receptor repertoires for immunoinformatics tool benchmarking. Therefore, we developed immuneSIM, an R package that allows the simulation of native-like and aberrant synthetic full-length variable region immune receptor sequences by tuning the following immune receptor features: (i) species and chain type (BCR, TCR, single and paired), (ii) germline gene usage, (iii) occurrence of insertions and deletions, (iv) clonal abundance, (v) somatic hypermutation and (vi) sequence motifs. Each simulated sequence is annotated by the complete set of simulation events that contributed to its in silico generation. immuneSIM permits the benchmarking of key computational tools for immune receptor analysis, such as germline gene annotation, diversity and overlap estimation, sequence similarity, network architecture, clustering analysis and machine learning methods for motif detection. AVAILABILITY AND IMPLEMENTATION: The package is available via https://github.com/GreiffLab/immuneSIM and on CRAN at https://cran.r-project.org/web/packages/immuneSIM. The documentation is hosted at https://immuneSIM.readthedocs.io. CONTACT: sai.reddy@ethz.ch or victor.greiff@medisin.uio.no. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Benchmarking , Software , Computer Simulation , T-Cell Antigen Receptors/genetics
4.
BMC Genomics ; 21(1): 282, 2020 Apr 06.
Article in English | MEDLINE | ID: mdl-32252628

ABSTRACT

BACKGROUND: Graph-based reference genomes have become popular as they allow read mapping and follow-up analyses in settings where the exact haplotypes underlying a high-throughput sequencing experiment are not precisely known. Two recent papers show that mapping to graph-based reference genomes can improve accuracy as compared to methods using linear references. Both of these methods index the sequences for most paths up to a certain length in the graph in order to enable direct mapping of reads containing common variants. However, the combinatorial explosion of possible paths through nearby variants also leads to a huge search space and an increased chance of false positive alignments to highly variable regions. RESULTS: We here assess three prominent graph-based read mappers against a hybrid baseline approach that combines an initial path determination with a tuned linear read mapping method. We show, using a previously proposed benchmark, that this simple approach is able to improve overall accuracy of read-mapping to graph-based reference genomes. CONCLUSIONS: Our method is implemented in a tool, Two-step Graph Mapper, which is available at https://github.com/uio-bmi/two_step_graph_mapper along with data and scripts for reproducing the experiments. Our method highlights characteristics of the current generation of graph-based read mappers and shows potential for improvement for future graph-based read mappers.


Subjects
Computational Biology/methods , Human Genome , High-Throughput Nucleotide Sequencing/methods , Humans , Sequence Alignment
5.
PLoS Comput Biol ; 15(2): e1006731, 2019 02.
Article in English | MEDLINE | ID: mdl-30779737

ABSTRACT

Graph-based representations are considered to be the future for reference genomes, as they allow integrated representation of the steadily increasing data on individual variation. Currently available tools allow de novo assembly of graph-based reference genomes, alignment of new read sets to the graph representation as well as certain analyses like variant calling and haplotyping. We here present a first method for calling ChIP-Seq peaks on read data aligned to a graph-based reference genome. The method is a graph generalization of the peak caller MACS2, and is implemented in an open source tool, Graph Peak Caller. By using the existing tool vg to build a pan-genome of Arabidopsis thaliana, we validate our approach by showing that Graph Peak Caller with a pan-genome reference graph can trace variants within peaks that are not part of the linear reference genome, and find peaks that in general are more motif-enriched than those found by MACS2.


Subjects
Chromatin Immunoprecipitation/methods , Genomics/methods , DNA Sequence Analysis/methods , Algorithms , Arabidopsis/genetics , Genome/genetics , Protein Binding , Software , Transcription Factors
6.
Nucleic Acids Res ; 46(W1): W186-W193, 2018 07 02.
Article in English | MEDLINE | ID: mdl-29873782

ABSTRACT

Functional genomics assays produce sets of genomic regions as one of their main outputs. To biologically interpret such region-sets, researchers often use colocalization analysis, where the statistical significance of colocalization (overlap, spatial proximity) between two or more region-sets is tested. Existing colocalization analysis tools vary in the statistical methodology and analysis approaches, thus potentially providing different conclusions for the same research question. As the findings of colocalization analysis are often the basis for follow-up experiments, it is helpful to use several tools in parallel and to compare the results. We developed the Coloc-stats web service to facilitate such analyses. Coloc-stats provides a unified interface to perform colocalization analysis across various analytical methods and method-specific options (e.g. colocalization measures, resolution, null models). Coloc-stats helps the user to find a method that supports their experimental requirements and allows for a straightforward comparison across methods. Coloc-stats is implemented as a web server with a graphical user interface that assists users with configuring their colocalization analyses. Coloc-stats is freely available at https://hyperbrowser.uio.no/coloc-stats/.
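
To make the notion of testing colocalization between two region-sets concrete, here is a minimal sketch of one possible approach: base-pair overlap as the measure and uniform random relocation of regions as the null model. It is only one of many measure and null-model combinations, it does not reproduce any method offered by Coloc-stats, and all function names are hypothetical.

```python
import random


def overlap_bp(a, b):
    """Sum of pairwise base-pair overlaps between two lists of half-open (start, end) regions."""
    return sum(max(0, min(e1, e2) - max(s1, s2)) for s1, e1 in a for s2, e2 in b)


def shuffled(regions, genome_length, rng):
    """Place each region at a uniformly random start, keeping its length."""
    out = []
    for s, e in regions:
        length = e - s
        start = rng.randrange(genome_length - length + 1)
        out.append((start, start + length))
    return out


def colocalization_test(a, b, genome_length, n_perm=1000, seed=0):
    """Return observed overlap, mean overlap under the null, and a Monte Carlo p-value."""
    rng = random.Random(seed)
    observed = overlap_bp(a, b)
    null = [overlap_bp(shuffled(a, genome_length, rng), b) for _ in range(n_perm)]
    expected = sum(null) / n_perm
    p_value = (1 + sum(x >= observed for x in null)) / (n_perm + 1)
    return observed, expected, p_value


a = [(100, 200), (500, 600)]
b = [(150, 250), (550, 650)]
print(colocalization_test(a, b, genome_length=10_000))
```

The null model here is deliberately naive (shuffled regions may even overlap each other). A different null model or overlap measure can change the conclusion, which is the motivation for comparing several methods side by side.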


Subjects
Genomics/methods , Software , Chromatin Immunoprecipitation , GATA1 Transcription Factor/metabolism , Internet , DNA Sequence Analysis , User-Computer Interface
7.
BMC Bioinformatics ; 18(1): 263, 2017 May 18.
Article in English | MEDLINE | ID: mdl-28521770

ABSTRACT

BACKGROUND: It has been proposed that future reference genomes should be graph structures in order to better represent the sequence diversity present in a species. However, there is currently no standard method to represent genomic intervals, such as the positions of genes or transcription factor binding sites, on graph-based reference genomes. RESULTS: We formalize offset-based coordinate systems on graph-based reference genomes and introduce methods for representing intervals on these reference structures. We show the advantage of our methods by representing genes on a graph-based representation of the newest assembly of the human genome (GRCh38) and its alternative loci for regions that are highly variable. CONCLUSION: More complex reference genomes, containing alternative loci, require methods to represent genomic data on these structures. Our proposed notation for genomic intervals makes it possible to fully utilize the alternative loci of the GRCh38 assembly and potential future graph-based reference genomes. We have made a Python package for representing such intervals on offset-based coordinate systems, available at https://github.com/uio-cels/offsetbasedgraph. An interactive web-tool using this Python package to visualize genes on a graph created from GRCh38 is available at https://github.com/uio-cels/genomicgraphcoords.
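
One way to picture an offset-based coordinate is as a pair (block, offset), with an interval carrying the ordered list of blocks it passes through. The sketch below is an illustrative data structure under that assumption. The class and field names are hypothetical and do not reflect the API of the offsetbasedgraph package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    block: str   # identifier of a graph node (a stretch of sequence)
    offset: int  # 0-based offset within that block


@dataclass(frozen=True)
class Interval:
    start: Position
    end: Position  # exclusive end, located in the last block of the path
    path: tuple    # ordered block identifiers the interval traverses

    def length(self, block_lengths):
        """Number of bases covered, given a mapping block -> block length."""
        if len(self.path) == 1:
            return self.end.offset - self.start.offset
        first = block_lengths[self.path[0]] - self.start.offset
        middle = sum(block_lengths[b] for b in self.path[1:-1])
        return first + middle + self.end.offset


block_lengths = {"main1": 1000, "alt1": 200, "main2": 500}
gene = Interval(Position("main1", 900), Position("main2", 50), ("main1", "alt1", "main2"))
print(gene.length(block_lengths))
```

Storing the path explicitly is what lets the same start and end positions describe different intervals, for example one through an alternative locus and one through the main path.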


Subjects
Computer Graphics , Human Genome , Genomics/methods , Algorithms , Genetic Loci , Humans , Internet , Messenger RNA/genetics , Messenger RNA/metabolism , DNA Sequence Analysis , Software
8.
Nucleic Acids Res ; 41(Web Server issue): W133-41, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23632163

ABSTRACT

The immense increase in availability of genomic scale datasets, such as those provided by the ENCODE and Roadmap Epigenomics projects, presents unprecedented opportunities for individual researchers to pose novel falsifiable biological questions. With this opportunity, however, researchers are faced with the challenge of how to best analyze and interpret their genome-scale datasets. A powerful way of representing genome-scale data is as feature-specific coordinates relative to reference genome assemblies, i.e. as genomic tracks. The Genomic HyperBrowser (http://hyperbrowser.uio.no) is an open-ended web server for the analysis of genomic track data. Through the provision of several highly customizable components for processing and statistical analysis of genomic tracks, the HyperBrowser opens up a range of genomic investigations, related to, e.g., gene regulation, disease association or epigenetic modifications of the genome.


Subjects
Genomics/methods , Software , Statistical Data Interpretation , Genome , Internet
9.
BMC Med ; 11: 163, 2013 Jul 12.
Article in English | MEDLINE | ID: mdl-23849224

ABSTRACT

BACKGROUND: Vitamin D insufficiency has been implicated in autoimmunity. ChIP-seq experiments using immune cell lines have shown that vitamin D receptor (VDR) binding sites are enriched near regions of the genome associated with autoimmune diseases. We aimed to investigate VDR binding in primary CD4+ cells from healthy volunteers. METHODS: We extracted CD4+ cells from nine healthy volunteers. Each sample underwent VDR ChIP-seq. Our results were analyzed in relation to published ChIP-seq and RNA-seq data in the Genomic HyperBrowser. We used MEME-ChIP for de novo motif discovery. 25-Hydroxyvitamin D levels were measured using liquid chromatography-tandem mass spectrometry and samples were divided into vitamin D sufficient (25(OH)D ≥75 nmol/L) and insufficient/deficient (25(OH)D <75 nmol/L) groups. RESULTS: We found that the amount of VDR binding is correlated with the serum level of 25-hydroxyvitamin D (r = 0.92, P = 0.0005). In vivo VDR binding sites are enriched for autoimmune disease associated loci, especially when 25-hydroxyvitamin D levels (25(OH)D) were sufficient (25(OH)D ≥75: 3.13-fold, P < 0.0001; 25(OH)D <75: 2.76-fold, P < 0.0001; 25(OH)D ≥75 enrichment versus 25(OH)D <75 enrichment: P = 0.0002). VDR binding was also enriched near genes associated specifically with T-regulatory and T-helper cells in the 25(OH)D ≥75 group. MEME-ChIP did not identify any VDR-like motifs underlying our VDR ChIP-seq peaks. CONCLUSION: Our results show a direct correlation between in vivo 25-hydroxyvitamin D levels and the number of VDR binding sites, although our sample size is relatively small. Our study further implicates VDR binding as important in gene-environment interactions underlying the development of autoimmunity and provides a biological rationale for 25-hydroxyvitamin D sufficiency being based at 75 nmol/L. Our results also suggest that VDR binding in response to physiological levels of vitamin D occurs predominantly in a VDR motif-independent manner.


Subjects
Autoimmune Diseases/blood , CD4-Positive T-Lymphocytes/metabolism , Protein Array Analysis/methods , Calcitriol Receptors/blood , Vitamin D/analogs & derivatives , Amino Acid Motifs , Amino Acid Sequence , Autoimmune Diseases/genetics , Autoimmune Diseases/pathology , Binding Sites/genetics , CD4-Positive T-Lymphocytes/pathology , Genomics/methods , Humans , Primary Cell Culture , Calcitriol Receptors/genetics , Vitamin D/blood
10.
PLoS One ; 18(4): e0284443, 2023.
Article in English | MEDLINE | ID: mdl-37058511

ABSTRACT

Data simulation is fundamental for machine learning and causal inference, as it allows exploration of scenarios and assessment of methods in settings with full control of ground truth. Directed acyclic graphs (DAGs) are well established for encoding the dependence structure over a collection of variables in both inference and simulation settings. However, while modern machine learning is applied to data of an increasingly complex nature, DAG-based simulation frameworks are still confined to settings with relatively simple variable types and functional forms. We here present DagSim, a Python-based framework for DAG-based data simulation without any constraints on variable types or functional relations. A succinct YAML format for defining the simulation model structure promotes transparency, while separate user-provided functions for generating each variable based on its parents ensure simulation code modularization. We illustrate the capabilities of DagSim through use cases where metadata variables control shapes in an image and patterns in bio-sequences. DagSim is available as a Python package at PyPI. Source code and documentation are available at: https://github.com/uio-bmi/dagsim.
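
The general idea of DAG-based simulation with one user-provided function per variable can be sketched in a few lines of plain Python. The code below is a generic illustration that uses only the standard library. It does not use or imitate the DagSim API or its YAML format, and the names `simulate` and `model` are hypothetical.

```python
import random
from graphlib import TopologicalSorter


def simulate(nodes, n_samples, seed=0):
    """Draw samples from a DAG.

    `nodes` maps a variable name to (parents, function); the function receives
    the random generator followed by the parent values, in the listed order.
    """
    rng = random.Random(seed)
    # TopologicalSorter expects node -> predecessors, so parents come first.
    graph = {name: set(parents) for name, (parents, _) in nodes.items()}
    order = list(TopologicalSorter(graph).static_order())
    samples = []
    for _ in range(n_samples):
        row = {}
        for name in order:
            parents, func = nodes[name]
            row[name] = func(rng, *(row[p] for p in parents))
        samples.append(row)
    return samples


# Toy model: a binary label controls which motif is planted in a sequence.
model = {
    "disease": ((), lambda rng: rng.random() < 0.3),
    "motif": (("disease",), lambda rng, d: "GATTACA" if d else "CCCCCCC"),
    "sequence": (("motif",), lambda rng, m: "AA" + m + "TT"),
}
rows = simulate(model, 5)
print(rows[0])
```

Because each variable is produced by an arbitrary function of its parents, the values can be of any type (numbers, strings, images), which is the kind of flexibility the abstract describes.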


Subjects
Software , Computer Simulation
11.
Gigascience ; 11, 2022 05 25.
Article in English | MEDLINE | ID: mdl-35639633

ABSTRACT

BACKGROUND: Machine learning (ML) methodology development for the classification of immune states in adaptive immune receptor repertoires (AIRRs) has seen a recent surge of interest. However, so far, there does not exist a systematic evaluation of scenarios where classical ML methods (such as penalized logistic regression) already perform adequately for AIRR classification. This hinders investigative reorientation to those scenarios where method development of more sophisticated ML approaches may be required. RESULTS: To identify those scenarios where a baseline ML method is able to perform well for AIRR classification, we generated a collection of synthetic AIRR benchmark data sets encompassing a wide range of data set architecture-associated and immune state-associated sequence patterns (signal) complexity. We trained ≈1,700 ML models with varying assumptions regarding immune signal on ≈1,000 data sets with a total of ≈250,000 AIRRs containing ≈46 billion TCRβ CDR3 amino acid sequences, thereby surpassing the sample sizes of current state-of-the-art AIRR-ML setups by two orders of magnitude. We found that L1-penalized logistic regression achieved high prediction accuracy even when the immune signal occurs only in 1 out of 50,000 AIR sequences. CONCLUSIONS: We provide a reference benchmark to guide new AIRR-ML classification methodology by (i) identifying those scenarios characterized by immune signal and data set complexity, where baseline methods already achieve high prediction accuracy, and (ii) facilitating realistic expectations of the performance of AIRR-ML models given training data set properties and assumptions. Our study serves as a template for defining specialized AIRR benchmark data sets for comprehensive benchmarking of AIRR-ML methods.


Subjects
Machine Learning , Immunologic Receptors
12.
Gigascience ; 12, 2022 12 28.
Article in English | MEDLINE | ID: mdl-37848619

ABSTRACT

BACKGROUND: Machine learning (ML) has gained significant attention for classifying immune states in adaptive immune receptor repertoires (AIRRs) to support the advancement of immunodiagnostics and therapeutics. Simulated data are crucial for the rigorous benchmarking of AIRR-ML methods. Existing approaches to generating synthetic benchmarking datasets result in the generation of naive repertoires missing the key feature of many shared receptor sequences (selected for common antigens) found in antigen-experienced repertoires. RESULTS: We demonstrate that a common approach to generating simulated AIRR benchmark datasets can introduce biases, which may be exploited for undesired shortcut learning by certain ML methods. To mitigate undesirable access to true signals in simulated AIRR datasets, we devised a simulation strategy (simAIRR) that constructs antigen-experienced-like repertoires with a realistic overlap of receptor sequences. simAIRR can be used for constructing AIRR-level benchmarks based on a range of assumptions (or experimental data sources) for what constitutes receptor-level immune signals. This includes the possibility of making or not making any prior assumptions regarding the similarity or commonality of immune state-associated sequences that will be used as true signals. We demonstrate the real-world realism of our proposed simulation approach by showing that basic ML strategies perform similarly on simAIRR-generated and real-world experimental AIRR datasets. CONCLUSIONS: This study sheds light on the potential shortcut learning opportunities for ML methods that can arise with the state-of-the-art way of simulating AIRR datasets. simAIRR is available as a Python package: https://github.com/KanduriC/simAIRR.


Subjects
Benchmarking , Computer Simulation
13.
Cell Rep Methods ; 2(8): 100269, 2022 08 22.
Article in English | MEDLINE | ID: mdl-36046619

ABSTRACT

B and T cell receptor (immune) repertoires can represent an individual's immune history. While current repertoire analysis methods aim to discriminate between health and disease states, they are typically based on only a limited number of parameters. Here, we introduce immuneREF: a quantitative multidimensional measure of adaptive immune repertoire (and transcriptome) similarity that allows interpretation of immune repertoire variation by relying on both repertoire features and cross-referencing of simulated and experimental datasets. To quantify immune repertoire similarity landscapes across health and disease, we applied immuneREF to >2,400 datasets from individuals with varying immune states (healthy, [autoimmune] disease, and infection). We discovered, in contrast to the current paradigm, that blood-derived immune repertoires of healthy and diseased individuals are highly similar for certain immune states, suggesting that repertoire changes to immune perturbations are less pronounced than previously thought. In conclusion, immuneREF enables the population-wide study of adaptive immune response similarity across immune states.


Subjects
Adaptive Immunity , Autoimmune Diseases , Humans , T-Cell Antigen Receptors/genetics , Immunologic Receptors
14.
BMC Genomics ; 12: 353, 2011 Jul 07.
Article in English | MEDLINE | ID: mdl-21736759

ABSTRACT

BACKGROUND: Transcription factors in disease-relevant pathways represent potential drug targets, by impacting a distinct set of pathways that may be modulated through gene regulation. The influence of transcription factors is typically studied on a per disease basis, and no current resources provide a global overview of the relations between transcription factors and disease. Furthermore, existing pipelines for related large-scale analysis are tailored for particular sources of input data, and there is a need for generic methodology for integrating complementary sources of genomic information. RESULTS: We here present a large-scale analysis of multiple diseases versus multiple transcription factors, with a global map of over- and under-representation of 446 transcription factors in 1010 diseases. This map, referred to as the differential disease regulome, provides a first global statistical overview of the complex interrelationships between diseases, genes and controlling elements. The map is visualized using the Google map engine, due to its very large size, and provides a range of detailed information in a dynamic presentation format. The analysis is achieved through a novel methodology that performs a pairwise, genome-wide comparison on the cartesian product of two distinct sets of annotation tracks, e.g. all combinations of one disease and one TF. The methodology was also used to extend the resource with maps using alternative data sets related to transcription and disease, as well as data sets related to Gene Ontology classification and histone modifications. We provide a web-based interface that allows users to generate other custom maps, which could be based on precisely specified subsets of transcription factors and diseases, or, in general, on any categorical genome annotation tracks as they are improved or become available. CONCLUSION: We have created a first resource that provides a global overview of the complex relations between transcription factors and disease. As the accuracy of the disease regulome depends mainly on the quality of the input data, forthcoming ChIP-seq based binding data for many TFs will provide improved maps. We further believe our approach to genome analysis could allow an advance from the current typical situation of one-time integrative efforts to reproducible and upgradable integrative analysis. The differential disease regulome and its associated methodology is available at http://hyperbrowser.uio.no.


Subjects
Disease/genetics , Genomics/methods , Transcription Factors/genetics , Transcription Factors/metabolism , Computer Graphics , Humans , Internet , Molecular Sequence Annotation
15.
Cell Rep ; 34(11): 108856, 2021 03 16.
Article in English | MEDLINE | ID: mdl-33730590

ABSTRACT

Antibody-antigen binding relies on the specific interaction of amino acids at the paratope-epitope interface. The predictability of antibody-antigen binding is a prerequisite for de novo antibody and (neo-)epitope design. A fundamental premise for the predictability of antibody-antigen binding is the existence of paratope-epitope interaction motifs that are universally shared among antibody-antigen structures. In a dataset of non-redundant antibody-antigen structures, we identify structural interaction motifs, which together compose a commonly shared structure-based vocabulary of paratope-epitope interactions. We show that this vocabulary enables the machine learnability of antibody-antigen binding on the paratope-epitope level using generative machine learning. The vocabulary (1) is compact, less than 10^4 motifs; (2) distinct from non-immune protein-protein interactions; and (3) mediates specific oligo- and polyreactive interactions between paratope-epitope pairs. Our work leverages combined structure- and sequence-based learning to demonstrate that machine-learning-driven predictive paratope and epitope engineering is feasible.


Subjects
Antigen-Antibody Reactions/immunology , Antibody Binding Sites/immunology , Epitopes/immunology , Amino Acid Motifs , Amino Acid Sequence , Antibodies/chemistry , Antibodies/immunology , Complementarity Determining Regions/chemistry , Epitopes/chemistry , Machine Learning , Protein Binding
17.
BMC Bioinformatics ; 9: 123, 2008 Feb 26.
Article in English | MEDLINE | ID: mdl-18302777

ABSTRACT

BACKGROUND: Computational discovery of regulatory elements is an important area of bioinformatics research and more than a hundred motif discovery methods have been published. Traditionally, most of these methods have addressed the problem of single motif discovery - discovering binding motifs for individual transcription factors. In higher organisms, however, transcription factors usually act in combination with nearby bound factors to induce specific regulatory behaviours. Hence, recent focus has shifted from single motifs to the discovery of sets of motifs bound by multiple cooperating transcription factors, so-called composite motifs or cis-regulatory modules. Given the large number and diversity of methods available, independent assessment of methods becomes important. Although there have been several benchmark studies of single motif discovery, no similar studies have previously been conducted concerning composite motif discovery. RESULTS: We have developed a benchmarking framework for composite motif discovery and used it to evaluate the performance of eight published module discovery tools. Benchmark datasets were constructed based on real genomic sequences containing experimentally verified regulatory modules, and the module discovery programs were asked to predict both the locations of these modules and to specify the single motifs involved. To aid the programs in their search, we provided position weight matrices corresponding to the binding motifs of the transcription factors involved. In addition, selections of decoy matrices were mixed with the genuine matrices on one dataset to test the response of programs to varying levels of noise. CONCLUSION: Although some of the methods tested tended to score somewhat better than others overall, there were still large variations between individual datasets and no single method performed consistently better than the rest in all situations. The variation in performance on individual datasets also shows that the new benchmark datasets represent a suitable variety of challenges to most methods for module discovery.


Subjects
Algorithms , DNA/genetics , Sequence Alignment/methods , DNA Sequence Analysis/methods , Transcription Factors/genetics , Amino Acid Motifs , Base Sequence , Binding Sites , Molecular Sequence Data , Protein Binding
18.
J Clin Invest ; 128(6): 2642-2650, 2018 06 01.
Article in English | MEDLINE | ID: mdl-29757191

ABSTRACT

Little is known about the repertoire dynamics and persistence of pathogenic T cells in HLA-associated disorders. In celiac disease, a disorder with a strong association with certain HLA-DQ allotypes, presumed pathogenic T cells can be visualized and isolated with HLA-DQ:gluten tetramers, thereby enabling further characterization. Single and bulk populations of HLA-DQ:gluten tetramer-sorted CD4+ T cells were analyzed by high-throughput DNA sequencing of rearranged TCR-α and -β genes. Blood and gut biopsy samples from 21 celiac disease patients, taken at various stages of disease and in intervals of weeks to decades apart, were examined. Persistence of the same clonotypes was seen in both compartments over decades, with up to 53% overlap between samples obtained 16 to 28 years apart. Further, we observed that the recall response following oral gluten challenge was dominated by preexisting CD4+ T cell clonotypes. Public features were frequent among gluten-specific T cells, as 10% of TCR-α, TCR-β, or paired TCR-αβ amino acid sequences of total 1813 TCRs generated from 17 patients were observed in 2 or more patients. In established celiac disease, the T cell clonotypes that recognize gluten are persistent for decades, making up fixed repertoires that prevalently exhibit public features. These T cells represent an attractive therapeutic target.


Subjects
CD4-Positive T-Lymphocytes/immunology , Celiac Disease/immunology , Glutens/immunology , HLA-DQ Antigens/immunology , Alpha-Beta T-Cell Antigen Receptors/immunology , Celiac Disease/pathology , Female , Follow-Up Studies , Humans , Male
19.
PLoS One ; 10(4): e0119605, 2015.
Article in English | MEDLINE | ID: mdl-25853421

ABSTRACT

Epstein-Barr virus (EBV) is a non-heritable factor that associates with multiple sclerosis (MS). However, its causal relationship with the disease is still unclear. The virus establishes a complex co-existence with the host that includes regulatory influences on gene expression. Hence, if EBV contributes to the pathogenesis of MS it may do so by interacting with disease predisposing genes. To verify this hypothesis we evaluated EBV nuclear antigen 2 (EBNA2, a protein that recent works by our and other groups have implicated in disease development) binding inside MS associated genomic intervals. We found that EBNA2 binding occurs within MS susceptibility sites more than expected by chance (factor of observed vs expected overlap [O/E] = 5.392-fold, p < 2.0e-05). This remains significant after controlling for multiple genomic confounders. We then asked whether this observation is significant per se or should also be viewed in the context of other disease relevant gene-environment interactions, such as those attributable to vitamin D. We therefore verified the overlap between EBNA2 genomic occupancy and vitamin D receptor (VDR) binding sites. EBNA2 shows a striking overlap with VDR binding sites (O/E = 96.16-fold, p < 2.0e-05), even after controlling for the chromatin accessibility state of shared regions (p < 0.001). Furthermore, MS susceptibility regions are preferentially targeted by both EBNA2 and VDR rather than by EBNA2 alone (enrichment difference = 1.722-fold, p = 0.0267). Taken together, these findings demonstrate that EBV participates in the gene-environment interactions that predispose to MS.


Subjects
Epstein-Barr Virus Nuclear Antigens/metabolism , Human Genome/genetics , Multiple Sclerosis/genetics , Multiple Sclerosis/virology , Calcitriol Receptors/metabolism , Humans , Multiple Sclerosis/metabolism , Single Nucleotide Polymorphism , Protein Binding , Protein Transport
20.
Nat Genet ; 46(9): 964-72, 2014 Sep.
Article in English | MEDLINE | ID: mdl-25129143

ABSTRACT

Creating spontaneous yet genetically tractable human tumors from normal cells presents a fundamental challenge. Here we combined retroviral and transposon insertional mutagenesis to enable cancer gene discovery starting with human primary cells. We used lentiviruses to seed gain- and loss-of-function gene disruption elements, which were further deployed by Sleeping Beauty transposons throughout the genome of human bone explant mesenchymal cells. De novo tumors generated rapidly in this context were high-grade myxofibrosarcomas. Tumor insertion sites were enriched in recurrent somatic copy-number aberration regions from multiple cancer types and could be used to pinpoint new driver genes that sustain somatic alterations in patients. We identified HDLBP, which encodes the RNA-binding protein vigilin, as a candidate tumor suppressor deleted at 2q37.3 in greater than one out of ten tumors across multiple tissues of origin. Hybrid viral-transposon systems may accelerate the functional annotation of cancer genomes by enabling insertional mutagenesis screens in higher eukaryotes that are not amenable to germline transgenesis.


Subjects
Insertional Mutagenesis , Sarcoma/genetics , Cell Line , DNA Transposable Elements , Genetic Vectors/genetics , Human Genome , HEK293 Cells , Humans , RNA-Binding Proteins/genetics , Retroviridae/genetics