1.
Brief Bioinform ; 25(2)2024 01 22.
Article in English | MEDLINE | ID: mdl-38483255

ABSTRACT

Spatially resolved transcriptomics (SRT) is a pioneering method for simultaneously studying morphological contexts and gene expression at single-cell precision. Data emerging from SRT are multifaceted, presenting researchers with intricate gene expression matrices, precise spatial details and comprehensive histology visuals. Such rich and intricate datasets, unfortunately, render many conventional methods like traditional machine learning and statistical models ineffective. The unique challenges posed by the specialized nature of SRT data have led the scientific community to explore more sophisticated analytical avenues. Recent trends indicate an increasing reliance on deep learning algorithms, especially in areas such as spatial clustering, identification of spatially variable genes and data alignment tasks. In this manuscript, we provide a rigorous critique of these advanced deep learning methodologies, probing into their merits, limitations and avenues for further refinement. Our in-depth analysis underscores that while the recent innovations in deep learning tailored for SRT have been promising, there remains a substantial potential for enhancement. A crucial area that demands attention is the development of models that can incorporate intricate biological nuances, such as phylogeny-aware processing or in-depth analysis of minuscule histology image segments. Furthermore, addressing challenges like the elimination of batch effects, perfecting data normalization techniques and countering the overdispersion and zero inflation patterns seen in gene expression is pivotal. To support the broader scientific community in their SRT endeavors, we have meticulously assembled a comprehensive directory of readily accessible SRT databases, hoping to serve as a foundation for future research initiatives.


Subjects
Deep Learning , Algorithms , Databases, Factual , Gene Expression Profiling , Machine Learning
2.
Blood ; 142(17): 1448-1462, 2023 10 26.
Article in English | MEDLINE | ID: mdl-37595278

ABSTRACT

Hematopoietic stem and progenitor cells (HSPCs) rely on a complex interplay among transcription factors (TFs) to regulate differentiation into mature blood cells. A heptad of TFs (FLI1, ERG, GATA2, RUNX1, TAL1, LYL1, LMO2) bind regulatory elements in bulk CD34+ HSPCs. However, whether specific heptad-TF combinations have distinct roles in regulating hematopoietic differentiation remains unknown. We mapped genome-wide chromatin contacts (HiC, H3K27ac, HiChIP), chromatin modifications (H3K4me3, H3K27ac, H3K27me3) and 10 TF binding profiles (heptad, PU.1, CTCF, STAG2) in HSPC subsets (stem/multipotent progenitors plus common myeloid, granulocyte macrophage, and megakaryocyte erythrocyte progenitors) and found TF occupancy and enhancer-promoter interactions varied significantly across cell types and were associated with cell-type-specific gene expression. Distinct regulatory elements were enriched with specific heptad-TF combinations, including stem-cell-specific elements with ERG, and myeloid- and erythroid-specific elements with combinations of FLI1, RUNX1, GATA2, TAL1, LYL1, and LMO2. Furthermore, heptad-occupied regions in HSPCs were subsequently bound by lineage-defining TFs, including PU.1 and GATA1, suggesting that heptad factors may prime regulatory elements for use in mature cell types. We also found that enhancers with cell-type-specific heptad occupancy shared a common grammar with respect to TF binding motifs, suggesting that combinatorial binding of TF complexes was at least partially regulated by features encoded in DNA sequence motifs. Taken together, this study comprehensively characterizes the gene regulatory landscape in rare subpopulations of human HSPCs. The accompanying data sets should serve as a valuable resource for understanding adult hematopoiesis and a framework for analyzing aberrant regulatory networks in leukemic cells.


Subjects
Core Binding Factor Alpha 2 Subunit , Hematopoietic Stem Cells , Humans , Core Binding Factor Alpha 2 Subunit/genetics , Core Binding Factor Alpha 2 Subunit/metabolism , Hematopoietic Stem Cells/metabolism , Gene Expression Regulation , Hematopoiesis/genetics , Chromatin/metabolism
3.
PLoS Comput Biol ; 19(7): e1011249, 2023 07.
Article in English | MEDLINE | ID: mdl-37486921

ABSTRACT

The genetic etiology of brain disorders is highly heterogeneous, characterized by abnormalities in the development of the central nervous system that lead to diminished physical or intellectual capabilities. The process of determining which gene drives disease, known as "gene prioritization," is not entirely understood. Genome-wide searches for gene-disease associations are still underdeveloped due to reliance on previous discoveries and evidence sources with false positive or negative relations. This paper introduces DeepGenePrior, a model based on deep neural networks that prioritizes candidate genes in genetic diseases. Using the well-studied Variational AutoEncoder (VAE), we developed a score to measure the impact of genes on target diseases. Unlike other methods that use prior data to select candidate genes, based on the "guilt by association" principle and auxiliary data sources like protein networks, our study exclusively employs copy number variants (CNVs) for gene prioritization. By analyzing CNVs from 74,811 individuals with autism, schizophrenia, and developmental delay, we identified genes that best distinguish cases from controls. Our findings indicate a 12% increase in fold enrichment in brain-expressed genes compared to previous studies and a 15% increase in genes associated with mouse nervous system phenotypes. Furthermore, we identified common deletions in ZDHHC8, DGCR5, and CATG00000022283 among the top genes related to all three disorders, suggesting a common etiology among these clinically distinct conditions. DeepGenePrior is publicly available online at http://git.dml.ir/z_rahaie/DGP to address obstacles in existing gene prioritization studies identifying candidate genes.


Subjects
Autistic Disorder , Deep Learning , Animals , Mice , DNA Copy Number Variations/genetics , Autistic Disorder/genetics , Brain , Genetic Predisposition to Disease/genetics
4.
PLoS Comput Biol ; 18(6): e1010241, 2022 06.
Article in English | MEDLINE | ID: mdl-35749574

ABSTRACT

Hi-C is a genome-wide chromosome conformation capture technology that detects interactions between pairs of genomic regions and exploits higher order chromatin structures. Conceptually Hi-C data counts interaction frequencies between every position in the genome and every other position. Biologically functional interactions are expected to occur more frequently than transient background and artefactual interactions. To identify biologically relevant interactions, several background models that take biases such as distance, GC content and mappability into account have been proposed. Here we introduce MaxHiC, a background correction tool that deals with these complex biases and robustly identifies statistically significant interactions in both Hi-C and capture Hi-C experiments. MaxHiC uses a negative binomial distribution model and a maximum likelihood technique to correct biases in both Hi-C and capture Hi-C libraries. We systematically benchmark MaxHiC against major Hi-C background correction tools including Hi-C significant interaction callers (SIC) and Hi-C loop callers using published Hi-C, capture Hi-C, and Micro-C datasets. Our results demonstrate that 1) Interacting regions identified by MaxHiC have significantly greater levels of overlap with known regulatory features (e.g. active chromatin histone marks, CTCF binding sites, DNase sensitivity) and also disease-associated genome-wide association SNPs than those identified by currently existing models, 2) the pairs of interacting regions are more likely to be linked by eQTL pairs and 3) more likely to link known regulatory features including known functional enhancer-promoter pairs validated by CRISPRi than any of the existing methods. We also demonstrate that interactions between different genomic region types have distinct distance distributions only revealed by MaxHiC. MaxHiC is publicly available as a python package for the analysis of Hi-C, capture Hi-C and Micro-C data.
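
As a rough illustration of the kind of model this abstract describes, and not MaxHiC's actual implementation, the sketch below fits a negative binomial distribution to interaction read counts by maximum likelihood and then scores an observed count with an upper-tail probability. The mean/dispersion parameterization, the grid search over the dispersion, and all function names are assumptions made for illustration; the published tool additionally models distance and other biases.

```python
import math

def nb_logpmf(k, mu, r):
    """Log-probability of count k under a negative binomial with mean mu > 0 and dispersion r > 0."""
    p = r / (r + mu)  # success probability in the (r, p) parameterization
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(p) + k * math.log1p(-p))

def fit_nb(counts, r_grid=None):
    """Maximum likelihood fit: mu is the sample mean; r is picked by grid search (an assumed shortcut)."""
    mu = sum(counts) / len(counts)
    if mu <= 0:
        raise ValueError("counts must have a positive mean")
    if r_grid is None:
        r_grid = [10 ** (e / 10) for e in range(-20, 31)]  # 0.01 to 1000, log-spaced
    best_r = max(r_grid, key=lambda r: sum(nb_logpmf(k, mu, r) for k in counts))
    return mu, best_r

def nb_upper_tail(k, mu, r):
    """P(X >= k): chance of seeing at least k reads under the fitted background model."""
    below = sum(math.exp(nb_logpmf(i, mu, r)) for i in range(k))
    return max(0.0, 1.0 - below)
```

A pair of regions whose observed count has a small upper-tail probability under the fitted background would be a candidate significant interaction; in practice such p-values would also need multiple-testing correction.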


Subjects
Chromatin , Genome-Wide Association Study , Binding Sites , Chromatin/genetics , Genome , Genomics/methods
6.
Genomics ; 114(5): 110454, 2022 09.
Article in English | MEDLINE | ID: mdl-36030022

ABSTRACT

Cis-regulatory elements (CREs) are non-coding parts of the genome that play a critical role in gene expression regulation. Enhancers, as an important example of CREs, interact with genes to influence complex traits like disease, heat tolerance and growth rate. Much of what is known about enhancers comes from studies of humans and a few model organisms like mouse, with little known about other mammalian species. Previous studies have attempted to identify enhancers in less studied mammals using comparative genomics but with limited success. Recently, Machine Learning (ML) techniques have shown promising results in predicting enhancer regions. Here, we investigated the ability of ML methods to identify enhancers in three non-model mammalian species (cattle, pig and dog) using human and mouse enhancer data from VISTA and publicly available ChIP-seq. We tested nine models, using four different representations of the DNA sequences in cross-species prediction using both the VISTA dataset and species-specific ChIP-seq data. We identified between 809,399 and 877,278 enhancer-like regions (ELRs) in the study species (11.6-13.7% of each genome). These predictions were close to the ~8% proportion of ELRs that covered the human genome. We propose that our ML methods have predictive ability for identifying enhancers in non-model mammalian species. We have provided a list of high confidence enhancers at https://github.com/DaviesCentreInformatics/Cross-species-enhancer-prediction and believe these enhancers will be of great use to the community.


Subjects
Enhancer Elements, Genetic , Genomics , Animals , Base Sequence , Cattle , Dogs , Genome, Human , Genomics/methods , Humans , Machine Learning , Mammals/genetics , Mice , Swine
7.
Sensors (Basel) ; 23(3)2023 Jan 28.
Article in English | MEDLINE | ID: mdl-36772503

ABSTRACT

Continuous advancements of technologies such as machine-to-machine interactions and big data analysis have led to the internet of things (IoT) making information sharing and smart decision-making possible using everyday devices. On the other hand, swarm intelligence (SI) algorithms seek to establish constructive interaction among agents regardless of their intelligence level. In SI algorithms, multiple individuals run simultaneously and possibly in a cooperative manner to address complex nonlinear problems. In this paper, the application of SI algorithms in IoT is investigated with a special focus on the internet of medical things (IoMT). The role of wearable devices in IoMT is briefly reviewed. Existing works on applications of SI in addressing IoMT problems are discussed. Possible problems include disease prediction, data encryption, missing values prediction, resource allocation, network routing, and hardware failure management. Finally, research perspectives and future trends are outlined.


Subjects
Internet of Things , Wearable Electronic Devices , Humans , Algorithms , Cognition , Intelligence , Internet
8.
Int J Mol Sci ; 24(3)2023 Jan 27.
Article in English | MEDLINE | ID: mdl-36768794

ABSTRACT

Prostate cancer (PC) is the most frequently diagnosed non-skin cancer in the world. Previous studies have shown that genomic alterations represent the most common mechanism for molecular alterations responsible for the development and progression of PC. This highlights the importance of identifying functional genomic variants for early detection in high-risk PC individuals. Great efforts have been made to identify common protein-coding genetic variations; however, the impact of non-coding variations, including regulatory genetic variants, is not well understood. Identification of these variants and the underlying target genes will be a key step in improving the detection and treatment of PC. To gain an understanding of the functional impact of genetic variants, and in particular, regulatory variants in PC, we developed an integrative pipeline (AGV) that uses whole genome/exome sequences, GWAS SNPs, chromosome conformation capture data, and ChIP-Seq signals to investigate the potential impact of genomic variants on the underlying target genes in PC. We identified 646 putative regulatory variants, of which 30 significantly altered the expression of at least one protein-coding gene. Our analysis of chromatin interactions data (Hi-C) revealed that the 30 putative regulatory variants could affect 131 coding and non-coding genes. Interestingly, our study identified the 131 protein-coding genes that are involved in disease-related pathways, including Reactome and MSigDB, for most of which targeted treatment options are currently available. Notably, our analysis revealed several non-coding RNAs, including RP11-136K7.2 and RAMP2-AS1, as potential enhancer elements of the protein-coding genes CDH12 and EZH1, respectively. Our results provide a comprehensive map of genomic variants in PC and reveal their potential contribution to prostate cancer progression and development.


Subjects
Genome-Wide Association Study , Prostatic Neoplasms , Male , Humans , Genome-Wide Association Study/methods , Genetic Predisposition to Disease , Prostatic Neoplasms/genetics , Chromatin , Genomics , Polymorphism, Single Nucleotide
9.
BMC Bioinformatics ; 23(1): 138, 2022 Apr 19.
Article in English | MEDLINE | ID: mdl-35439935

ABSTRACT

BACKGROUND: Colorectal cancer (CRC) is one of the leading causes of cancer-related deaths worldwide. Recent studies have observed causative mutations in susceptible genes related to colorectal cancer in 10 to 15% of the patients. This highlights the importance of identifying mutations for early detection of this cancer for more effective treatments among high risk individuals. Mutation is considered as the key point in cancer research. Many studies have performed cancer subtyping based on the type of frequently mutated genes, or the proportion of mutational processes. However, to the best of our knowledge, combination of these features has never been used together for this task. This highlights the potential to introduce better and more inclusive subtype classification approaches using wider range of related features to enable biomarker discovery and thus inform drug development for CRC. RESULTS: In this study, we develop a new pipeline based on a novel concept called 'gene-motif', which merges mutated gene information with tri-nucleotide motif of mutated sites, for colorectal cancer subtype identification. We apply our pipeline to the International Cancer Genome Consortium (ICGC) CRC samples and identify, for the first time, 3131 gene-motif combinations that are significantly mutated in 536 ICGC colorectal cancer samples. Using these features, we identify seven CRC subtypes with distinguishable phenotypes and biomarkers, including unique cancer related signaling pathways, in which for most of them targeted treatment options are currently available. Interestingly, we also identify several genes that are mutated in multiple subtypes but with unique sequence contexts. CONCLUSION: Our results highlight the importance of considering both the mutation type and mutated genes in identification of cancer subtypes and cancer biomarkers. 
The new CRC subtypes presented in this study demonstrate distinct phenotypic properties that can be used to develop new treatments. By knowing the genes and phenotypes associated with each subtype, a personalized treatment plan can be developed that considers the specific phenotypes associated with a patient's genomic lesion.
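
To make the 'gene-motif' idea concrete, the following minimal sketch combines a mutated gene's name with the tri-nucleotide context of the mutated site and counts the resulting features for one sample. The label format, the 0-based coordinates, and the function names are illustrative assumptions and are not taken from the published pipeline.

```python
from collections import Counter

def trinucleotide_context(reference, pos):
    """Return the base at 0-based pos together with its two flanking bases."""
    if pos < 1 or pos > len(reference) - 2:
        raise ValueError("position needs one flanking base on each side")
    return reference[pos - 1:pos + 2].upper()

def gene_motif_label(gene, reference, pos, alt):
    """Combine a gene name with the mutated site's context, e.g. 'GENE1|A[C>T]G' (assumed format)."""
    ctx = trinucleotide_context(reference, pos)
    return f"{gene}|{ctx[0]}[{ctx[1]}>{alt.upper()}]{ctx[2]}"

def count_gene_motifs(mutations, reference):
    """Count gene-motif labels over (gene, pos, alt) tuples for one sample."""
    return Counter(gene_motif_label(g, reference, p, a) for g, p, a in mutations)
```

Under this encoding, the same gene mutated in two different sequence contexts yields two distinct features, which is the property that lets one gene appear in multiple subtypes with different contexts.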


Subjects
Colorectal Neoplasms , Biomarkers, Tumor/genetics , Colorectal Neoplasms/genetics , Colorectal Neoplasms/pathology , Genomics , Humans , Mutation , Phenotype
10.
BMC Bioinformatics ; 23(1): 298, 2022 Jul 25.
Article in English | MEDLINE | ID: mdl-35879674

ABSTRACT

BACKGROUND: The advent of high throughput sequencing has enabled researchers to systematically evaluate the genetic variations in cancer, identifying many cancer-associated genes. Although cancers in the same tissue are widely categorized in the same group, they demonstrate many differences concerning their mutational profiles. Hence, there is no definitive treatment for most cancer types. This reveals the importance of developing new pipelines to identify cancer-associated genes accurately and re-classify patients with similar mutational profiles. Classification of cancer patients with similar mutational profiles may help discover subtypes of cancer patients who might benefit from specific treatment types. RESULTS: In this study, we propose a new machine learning pipeline to identify protein-coding genes mutated in many samples to identify cancer subtypes. We apply our pipeline to 12,270 samples collected from the international cancer genome consortium, covering 19 cancer types. As a result, we identify 17 different cancer subtypes. Comprehensive phenotypic and genotypic analysis indicates distinguishable properties, including unique cancer-related signaling pathways. CONCLUSIONS: This new subtyping approach offers a novel opportunity for cancer drug development based on the mutational profile of patients. Additionally, we analyze the mutational signatures for samples in each subtype, which provides important insight into their active molecular mechanisms. Some of the pathways we identified in most subtypes, including the cell cycle and the Axon guidance pathways, are frequently observed in cancer disease. Interestingly, we also identified several mutated genes and different rates of mutation in multiple cancer subtypes. In addition, our study on "gene-motif" suggests the importance of considering both the context of the mutations and mutational processes in identifying cancer-associated genes. 
The source codes for our proposed clustering pipeline and analysis are publicly available at: https://github.com/bcb-sut/Pan-Cancer .


Subjects
Neoplasms , Point Mutation , Cluster Analysis , Genome, Human , Humans , Mutation , Neoplasms/genetics
11.
Bioinformatics ; 37(20): 3664-3666, 2021 Oct 25.
Article in English | MEDLINE | ID: mdl-34028497

ABSTRACT

MOTIVATION: CircRNAs are covalently closed RNA molecules that are particularly abundant in the brain. While circRNA expression data from the human brain is rapidly accumulating, integration of large-scale datasets remains challenging and time-consuming, and consequently an integrative view of circRNA expression in the human brain is currently lacking. RESULTS: NeuroCirc is a web-based resource that allows interactive exploration of multiple types of circRNA data from the human brain, including large-scale expression datasets, circQTL data and circRNA expression across neuronal differentiation and cellular maturation time-courses. NeuroCirc also allows users to upload their own circRNA expression data and explore it in the integrative platform, thereby supporting circRNA prioritization for experimental validation and functional studies. AVAILABILITY AND IMPLEMENTATION: NeuroCirc is freely available at: https://voineagulab.github.io/NeuroCirc/. The source code and user documentation are available at: https://github.com/Voineagulab/NeuroCirc. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

12.
J Autoimmun ; 127: 102781, 2022 02.
Article in English | MEDLINE | ID: mdl-34952359

ABSTRACT

To investigate the molecular mechanisms through which Epstein-Barr virus (EBV) may contribute to Systemic Lupus Erythematosus (SLE) pathogenesis, we interrogated SLE genetic risk loci for signatures of EBV infection. We first compared the gene expression profile of SLE risk genes across 459 different cell/tissue types. EBV-infected B cells (LCLs) had the strongest representation of highly expressed SLE risk genes. By determining an SLE risk allele effect on gene expression (expression quantitative trait loci, eQTL) in LCLs and 16 other immune cell types, we identified 79 SLE risk locus:gene pairs putatively interacting with EBV infection. A total of 10 SLE risk genes from this list (CD40, LYST, JAZF1, IRF5, BLK, IKZF2, IL12RB2, FAM167A, PTPRC and SLC15A) were targeted by the EBV transcription factor, EBNA2, differentially expressed between LCLs and B cells, and the majority were also associated with EBV DNA copy number, and expression level of EBV encoded genes. Our final gene network model based on these genes is suggestive of a nexus involving SLE risk loci and EBV latency III and B cell proliferation signalling pathways. Collectively, our findings provide further evidence to support the interaction between SLE risk loci and EBV infection that is in part mediated by EBNA2. This interplay may increase the tendency towards EBV lytic switching dependent on the presence of SLE risk alleles. These results support further investigation into targeting EBV as a therapeutic strategy for SLE.


Subjects
Epstein-Barr Virus Infections , Lupus Erythematosus, Systemic , B-Lymphocytes , Epstein-Barr Virus Infections/complications , Epstein-Barr Virus Infections/genetics , Herpesvirus 4, Human/genetics , Humans , Lupus Erythematosus, Systemic/genetics , Lupus Erythematosus, Systemic/metabolism , Transcriptome
13.
Int J Mol Sci ; 23(22)2022 Nov 20.
Article in English | MEDLINE | ID: mdl-36430895

ABSTRACT

Here we developed KARAJ, a fast and flexible Linux command-line tool to automate the end-to-end process of querying and downloading a wide range of genomic and transcriptomic sequence data types. The input to KARAJ is a list of PMCIDs or publication URLs or various types of accession numbers to automate four tasks as follows; firstly, it provides a summary list of accessible datasets generated by or used in these scientific articles, enabling users to select appropriate datasets; secondly, KARAJ calculates the size of files that users want to download and confirms the availability of adequate space on the local disk; thirdly, it generates a metadata table containing sample information and the experimental design of the corresponding study; and lastly, it enables users to download supplementary data tables attached to publications. Further, KARAJ provides a parallel downloading framework powered by Aspera connect which reduces the downloading time significantly.
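
The second task described above, checking that the local disk can hold the requested files before downloading, can be sketched with the Python standard library. This is only an illustration of the idea, not KARAJ's own code; the function name, the `safety_margin` parameter, and the assumption that file sizes are already known in bytes are all hypothetical.

```python
import shutil

def has_enough_space(file_sizes, target_dir=".", safety_margin=0.05):
    """Check whether target_dir has room for all files plus a safety margin.

    file_sizes: iterable of sizes in bytes (assumed to be known in advance).
    Returns (ok, total_bytes_needed, free_bytes).
    """
    total = sum(file_sizes)
    free = shutil.disk_usage(target_dir).free  # free bytes on the filesystem holding target_dir
    return total * (1 + safety_margin) <= free, total, free
```

Running such a check before starting a parallel download avoids partially written datasets when the disk fills up midway.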


Subjects
Software , Transcriptome , Genome , Genomics , Metadata
14.
Biochem Soc Trans ; 49(4): 1621-1631, 2021 08 27.
Article in English | MEDLINE | ID: mdl-34282824

ABSTRACT

Neurodevelopmental and neurodegenerative disorders (NNDs) are a group of conditions with a broad range of core and co-morbidities, associated with dysfunction of the central nervous system. Improvements in high throughput sequencing have led to the detection of putative risk genetic loci for NNDs, however, quantitative neurogenetic approaches need to be further developed in order to establish causality and underlying molecular genetic mechanisms of pathogenesis. Here, we discuss an approach for prioritizing the contribution of genetic risk loci to complex-NND pathogenesis by estimating the possible impacts of these loci on gene regulation. Furthermore, we highlight the use of a tissue-specificity gene expression index and the application of artificial intelligence (AI) to improve the interpretation of the role of genetic risk elements in NND pathogenesis. Given that NND symptoms are associated with brain dysfunction, risk loci with direct, causative actions would comprise genes with essential functions in neural cells that are highly expressed in the brain. Indeed, NND risk genes implicated in brain dysfunction are disproportionately enriched in the brain compared with other tissues, which we refer to as brain-specific expressed genes. In addition, the tissue-specificity gene expression index can be used as a handle to identify non-brain contexts that are involved in NND pathogenesis. Lastly, we discuss how using an AI approach provides the opportunity to integrate the biological impacts of risk loci to identify those putative combinations of causative relationships through which genetic factors contribute to NND pathogenesis.
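
The abstract does not say which tissue-specificity index is meant. As one plausible example, the sketch below computes the widely used tau index, which is 0 for a gene expressed equally in all tissues and 1 for a gene expressed in a single tissue. Treating tau as the index is an assumption, as are the function name and the input format (one non-negative expression value per tissue).

```python
def tau_index(expression):
    """Tau tissue-specificity index: 0 = uniformly expressed, 1 = expressed in one tissue only."""
    n = len(expression)
    if n < 2:
        raise ValueError("need expression values for at least two tissues")
    x_max = max(expression)
    if x_max <= 0:
        raise ValueError("need some nonzero expression")
    # Sum of (1 - normalized expression) over tissues, scaled by n - 1
    return sum(1 - x / x_max for x in expression) / (n - 1)
```

A brain-specific gene in the sense used above would have a high index with its maximum expression in brain tissue, while a high index with the maximum elsewhere would point to a non-brain context.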


Subjects
Genetic Predisposition to Disease , Neurodegenerative Diseases/genetics , Chromosome Mapping , Gene Expression , Humans
15.
J Biomed Inform ; 113: 103627, 2021 01.
Article in English | MEDLINE | ID: mdl-33259944

ABSTRACT

In the last few years, the application of Machine Learning approaches like Deep Neural Network (DNN) models has become more attractive in the healthcare system given the rising complexity of healthcare data. Machine Learning (ML) algorithms provide efficient and effective data analysis models to uncover hidden patterns and other meaningful information from the considerable amount of health data that conventional analytics are not able to discover in a reasonable time. In particular, Deep Learning (DL) techniques have been shown to be promising methods for pattern recognition in healthcare systems. Motivated by this consideration, the contribution of this paper is to investigate the deep learning approaches applied to healthcare systems by reviewing the cutting-edge network architectures, applications, and industrial trends. The goal is first to provide extensive insight into the application of deep learning models in healthcare solutions to bridge deep learning techniques and human healthcare interpretability, and then to present the existing open challenges and future directions.


Subjects
Deep Learning , Algorithms , Delivery of Health Care , Humans , Machine Learning , Neural Networks, Computer
16.
BMC Genomics ; 21(1): 225, 2020 Mar 12.
Article in English | MEDLINE | ID: mdl-32164554

ABSTRACT

BACKGROUND: Hi-C is a molecular biology technique for studying the spatial structure of the genome. However, data obtained from Hi-C experiments are biased. Therefore, several methods have been developed to model Hi-C data and identify significant interactions. Each method accepts its own Hi-C data structure and works only on specific operating systems. RESULTS: We introduce MHiC (Multi-function Hi-C data analysis tool), a tool to identify and visualize statistically significant interactions from Hi-C data. The MHiC tool (i) works on different operating systems, (ii) accepts various Hi-C data structures from different Hi-C analysis tools such as HiCUP or HiC-Pro, (iii) identifies significant Hi-C interactions with the GOTHiC, HiCNorm and Fit-Hi-C methods and (iv) visualizes interactions in arc or heatmap diagrams. MHiC is an open-source tool which is freely available for download at https://github.com/MHi-C. CONCLUSIONS: MHiC is an integrated tool for the analysis of high-throughput chromosome conformation capture (Hi-C) data.


Subjects
Chromatin/chemistry , Computational Biology/methods , Algorithms , Chromatin/genetics , Chromosomes/chemistry , Humans , Models, Molecular , Molecular Conformation , User-Computer Interface
17.
Mol Biol Evol ; 33(12): 3205-3212, 2016 12.
Article in English | MEDLINE | ID: mdl-27682824

ABSTRACT

The dinucleotide CpG is highly underrepresented in the genome of human immunodeficiency virus type 1 (HIV-1). To identify the source of CpG depletion in the HIV-1 genome, we investigated two biological mechanisms: (1) CpG methylation-induced transcriptional silencing and (2) CpG recognition by Toll-like receptors (TLRs). We hypothesized that HIV-1 has been under selective evolutionary pressure by these mechanisms leading to the reduction of CpG in its genome. A CpG depleted genome would enable HIV-1 to avoid methylation-induced transcriptional silencing and/or to avoid recognition by TLRs that identify foreign CpG sequences. We investigated these two hypotheses by determining the sequence context dependency of CpG depletion and comparing it with that of CpG methylation and TLR recognition. We found that in both human and HIV-1 genomes the CpG motifs flanked by T/A were depleted most and those flanked by C/G were depleted least. Similarly, our analyses of human methylome data revealed that the CpG motifs flanked by T/A were methylated most and those flanked by C/G were methylated least. Given that a similar CpG depletion pattern was observed for the human genome within which CpGs are not likely to be recognized by TLRs, we argue that the main source of CpG depletion in HIV-1 is likely host-induced methylation. Analyses of CpG motifs in over 100 viruses revealed that this unique CpG representation pattern is specific to the human and simian immunodeficiency viruses.
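
The two quantities discussed above, overall CpG depletion and its dependence on flanking bases, can be illustrated with a short sketch. It computes the standard observed/expected CpG ratio and tallies CpG sites by the class of their neighbors (W for A/T, S for C/G). The function names and the grouping into W/S classes are assumptions chosen for illustration, not the authors' exact procedure.

```python
from collections import Counter

def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G); values well below 1 indicate depletion."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)

def cpg_flank_counts(seq):
    """Count CpG sites by flanking class on each side: W = A/T, S = C/G."""
    seq = seq.upper()
    cls = {"A": "W", "T": "W", "C": "S", "G": "S"}
    counts = Counter()
    # Only CpGs with a base on both sides are considered
    for i in range(1, len(seq) - 2):
        if seq[i:i + 2] == "CG" and seq[i - 1] in cls and seq[i + 2] in cls:
            counts[cls[seq[i - 1]] + "CG" + cls[seq[i + 2]]] += 1
    return counts
```

Comparing the per-class counts against what base composition alone would predict gives a context-dependent depletion profile that can be set side by side with methylation levels for the same contexts.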


Subjects
CpG Islands , HIV-1/genetics , Repressor Proteins/genetics , Base Sequence , Biological Evolution , DNA Methylation , Databases, Nucleic Acid , Dinucleoside Phosphates/genetics , Genome, Human , Humans , Models, Statistical , Repressor Proteins/metabolism , Toll-Like Receptors/genetics , Toll-Like Receptors/metabolism
18.
J Immunol ; 194(9): 4112-21, 2015 May 01.
Article in English | MEDLINE | ID: mdl-25825438

ABSTRACT

CD8(+) T cells are important for the control of chronic HIV infection. However, the virus rapidly acquires "escape mutations" that reduce CD8(+) T cell recognition and viral control. The timing of when immune escape occurs at a given epitope varies widely among patients and also among different epitopes within a patient. The strength of the CD8(+) T cell response, as well as mutation rates, patterns of particular amino acids undergoing escape, and growth rates of escape mutants, may affect when escape occurs. In this study, we analyze the epitope-specific CD8(+) T cells in 25 SIV-infected pigtail macaques responding to three SIV epitopes. Two epitopes showed a variable escape pattern and one had a highly monomorphic escape pattern. Despite very different patterns, immune escape occurs with a similar delay of on average 18 d after the epitope-specific CD8(+) T cells reach 0.5% of total CD8(+) T cells. We find that the most delayed escape occurs in one of the highly variable epitopes, and that this is associated with a delay in the epitope-specific CD8(+) T cells responding to this epitope. When we analyzed the kinetics of immune escape, we found that multiple escape mutants emerge simultaneously during the escape, implying that a diverse population of potential escape mutants is present during immune selection. Our results suggest that the conservation or variability of an epitope does not appear to affect the timing of immune escape in SIV. Instead, timing of escape is largely determined by the kinetics of epitope-specific CD8(+) T cells.


Subjects
CD8-Positive T-Lymphocytes/immunology , Epitopes, T-Lymphocyte/immunology , Immune Evasion/immunology , Simian Acquired Immunodeficiency Syndrome/immunology , Simian Acquired Immunodeficiency Syndrome/virology , Simian Immunodeficiency Virus/immunology , Animals , Kinetics , Macaca , Simian Immunodeficiency Virus/genetics , Time Factors
19.
J Virol ; 88(24): 14310-25, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25275134

ABSTRACT

The influence of major histocompatibility complex class I (MHC-I) alleles on human immunodeficiency virus (HIV) diversity in humans has been well characterized at the population level. MHC-I alleles likely affect viral diversity in the simian immunodeficiency virus (SIV)-infected pig-tailed macaque (Macaca nemestrina) model, but this is poorly characterized. We studied the evolution of SIV in pig-tailed macaques with a range of MHC-I haplotypes. SIV(mac251) genomes were amplified from the plasma of 44 pig-tailed macaques infected with SIV(mac251) at 4 to 10 months after infection and characterized by Illumina deep sequencing. MHC-I typing was performed on cellular RNA using Roche/454 pyrosequencing. MHC-I haplotypes and viral sequence polymorphisms at both individual mutations and groups of mutations spanning 10-amino-acid segments were linked using in-house bioinformatics pipelines, since cytotoxic T lymphocyte (CTL) escape can occur at different amino acids within the same epitope in different animals. The approach successfully identified 6 known CTL escape mutations within 3 Mane-A1*084-restricted epitopes. The approach also identified over 70 new SIV polymorphisms linked to a variety of MHC-I haplotypes. Using functional CD8 T cell assays, we confirmed that one of these associations, a Mane-B028 haplotype-linked mutation in Nef, corresponded to a CTL epitope. We also identified mutations associated with the Mane-B017 haplotype that were previously described to be CTL epitopes restricted by Mamu-B*017:01 in rhesus macaques. This detailed study of pig-tailed macaque MHC-I genetics and SIV polymorphisms will enable a refined level of analysis for future vaccine design and strategies for treatment of HIV infection.

IMPORTANCE: Cytotoxic T lymphocytes select for virus escape mutants of HIV and SIV, and this limits the effectiveness of vaccines and immunotherapies against these viruses. Patterns of immune escape variants are similar in HIV type 1-infected human subjects that share the same MHC-I genes, but this has not been studied for SIV infection of macaques. By studying SIV sequence diversity in 44 MHC-typed SIV-infected pigtail macaques, we defined over 70 sites within SIV where mutations were common in macaques sharing particular MHC-I genes. Further, pigtail macaques sharing nearly identical MHC-I genes with rhesus macaques responded to the same CTL epitope and forced immune escape. This allows many reagents developed to study rhesus macaques to also be used to study pigtail macaques. Overall, our study defines sites of immune escape in SIV in pigtailed macaques, and this enables a more refined level of analysis of future vaccine design and strategies for treatment of HIV infection.
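
The abstract does not describe the in-house pipeline, so the following is only a minimal sketch of the general idea of linking a haplotype to mutations grouped in a 10-amino-acid window. It assumes a one-sided Fisher exact test as the association measure, which the abstract does not state. The function names, the animal records and the haplotype label `hapX` are all hypothetical.

```python
from math import comb

def window_has_mutation(mutated_positions, start, width=10):
    """True if any mutated amino-acid position falls in [start, start + width)."""
    return any(start <= p < start + width for p in mutated_positions)

def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value, P(X >= a), for the table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / comb(n, col1)

def link_window(animals, haplotype, start, width=10):
    """Build the carrier-by-mutated 2x2 table for one window and test it."""
    a = b = c = d = 0
    for animal in animals:
        carrier = haplotype in animal["haplotypes"]
        mutated = window_has_mutation(animal["mutations"], start, width)
        if carrier and mutated:
            a += 1
        elif carrier:
            b += 1
        elif mutated:
            c += 1
        else:
            d += 1
    return (a, b, c, d), fisher_one_sided(a, b, c, d)

# Hypothetical toy data: carriers escape at different residues of the same window.
toy_animals = [
    {"haplotypes": {"hapX"}, "mutations": [12]},
    {"haplotypes": {"hapX"}, "mutations": [15]},
    {"haplotypes": {"hapX"}, "mutations": [19]},
    {"haplotypes": set(), "mutations": []},
    {"haplotypes": set(), "mutations": [40]},
    {"haplotypes": set(), "mutations": []},
]
table, p_value = link_window(toy_animals, "hapX", start=10)
```

Grouping by window means the three carriers count as concordant even though no single position is mutated in more than one of them, which is the situation the abstract gives as the reason for analysing 10-amino-acid segments.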


Subjects
Epitopes, T-Lymphocyte/immunology , Histocompatibility Antigens Class I/immunology , Mutation, Missense , Simian Acquired Immunodeficiency Syndrome/immunology , Simian Acquired Immunodeficiency Syndrome/virology , Simian Immunodeficiency Virus/immunology , T-Lymphocytes, Cytotoxic/immunology , Animals , Epitopes, T-Lymphocyte/genetics , Haplotypes , Histocompatibility Antigens Class I/genetics , Immune Evasion , Macaca nemestrina , Simian Immunodeficiency Virus/classification , Simian Immunodeficiency Virus/genetics
20.
J Biomed Inform ; 58: 220-225, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26494601

ABSTRACT

The human genome encodes a family of editing enzymes known as APOBEC3 (apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3). They induce context-dependent G-to-A changes, referred to as "hypermutation", in the genomes of viruses such as HIV, SIV and HBV and of endogenous retroviruses. Hypermutation is characterized by aligning affected sequences to a reference sequence. We show that indels (insertions/deletions) in the sequences lead to an incorrect assignment of APOBEC3 target and non-target sites. This can result in incorrect identification of hypermutated sequences and in erroneous biological inferences based on hypermutation analysis.
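
To illustrate how an alignment gap can move a site between the target and non-target classes, here is a minimal sketch. It is not the authors' implementation. It assumes the commonly used convention in which a reference G is a target site when the two downstream bases in the query are R (A or G) followed by D (A, G or T), and a control site otherwise; that convention and the toy sequences are assumptions, not details from the abstract.

```python
def downstream(seq, i, n=2):
    """Return the next n non-gap bases after index i of an aligned sequence."""
    out = []
    for base in seq[i + 1:]:
        if base != "-":
            out.append(base)
            if len(out) == n:
                break
    return "".join(out)

def classify_sites(ref, query, gap_aware=True):
    """Count G-to-A changes at target and control sites of a pairwise alignment.

    gap_aware=True reads the downstream context while skipping gap characters.
    gap_aware=False reads the next two alignment columns as-is, so a gap in the
    context makes the site look like a control site.
    """
    counts = {"target_mut": 0, "target_total": 0,
              "control_mut": 0, "control_total": 0}
    for i, (r, q) in enumerate(zip(ref, query)):
        if r != "G" or q == "-":
            continue  # only reference G positions present in the query
        ctx = downstream(query, i) if gap_aware else query[i + 1:i + 3]
        if len(ctx) < 2:
            continue  # too close to the end to define a context
        is_target = ctx[0] in "AG" and ctx[1] in "AGT"
        kind = "target" if is_target else "control"
        counts[kind + "_total"] += 1
        if q == "A":
            counts[kind + "_mut"] += 1
    return counts

# Hypothetical toy alignment: the query carries a one-base deletion.
ref = "CGTGACC"
query = "CA-GACC"
aware = classify_sites(ref, query, gap_aware=True)
naive = classify_sites(ref, query, gap_aware=False)
```

In this toy case the gap-aware pass counts the G-to-A change at a target site, while the column-based pass assigns the same change to a control site, which would lower the apparent target-to-control mutation ratio.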


Subjects
Mutation , Sequence Alignment , Viruses/genetics , Humans