ABSTRACT
PURPOSE: A large proportion of common variable immunodeficiency (CVID) patients have duodenal inflammation with increased intraepithelial lymphocytes (IEL) of unknown aetiology. The histologic similarities to celiac disease lead to confusion regarding treatment (gluten-free diet) of these patients. We aimed to elucidate the role of epigenetic DNA methylation in the aetiology of duodenal inflammation in CVID and differentiate it from true celiac disease. METHODS: DNA was isolated from snap-frozen pieces of duodenal biopsies and analysed for differences in genome-wide epigenetic DNA methylation between CVID patients with increased IEL (CVID_IEL; n = 5), CVID patients without increased IEL (CVID_N; n = 3), celiac disease patients (n = 3) and healthy controls (n = 3). RESULTS: The DNA methylation data of 5-methylcytosine in CpG sites separated CVID and celiac disease from healthy controls. Genes with differentially methylated promoters were identified as potential novel mediators in CVID and celiac disease. There was limited overlap of methylation-associated genes between CVID_IEL and celiac disease. A high frequency of differentially methylated CpG sites was detected in over 100 genes near the transcription start site (TSS) in both CVID_IEL and celiac disease, compared to healthy controls. Differentially methylated genes involved in regulation of TNF/cytokine production were enriched in CVID_IEL, compared to healthy controls. CONCLUSION: This is the first study to reveal a role of epigenetic DNA methylation in the aetiology of duodenal inflammation of CVID patients, distinguishing CVID_IEL from celiac disease. We identified potential biomarkers and therapeutic targets within gene promoters and in high-frequency differentially methylated CpG regions proximal to TSS in both CVID_IEL and celiac disease.
Subjects
Celiac Disease , Common Variable Immunodeficiency , CpG Islands , DNA Methylation , Duodenum , Genetic Epigenesis , Humans , Common Variable Immunodeficiency/genetics , Duodenum/metabolism , Duodenum/pathology , Celiac Disease/genetics , Female , Male , Adult , Middle Aged , CpG Islands/genetics , Genetic Promoter Regions/genetics , Intraepithelial Lymphocytes/immunology , Young Adult , Genome-Wide Association Study , 5-Methylcytosine/metabolism
ABSTRACT
BACKGROUND: Shotgun metagenome sequencing data obtained from a host environment will usually be contaminated with sequences from the host organism. Host sequences should be removed before further analysis to avoid biases, reduce downstream computational load, or ensure privacy in the case of a human host. The tools we identified as specifically designed for host contamination removal were either outdated, unmaintained, or complicated to use. Consequently, we have developed HoCoRT, a fast and user-friendly tool that implements several methods for optimised host sequence removal. We have evaluated the speed and accuracy of these methods. RESULTS: HoCoRT is an open-source command-line tool for host contamination removal. It is designed to be easy to install and use, offering a one-step option for genome indexing. HoCoRT employs a variety of well-known mapping, classification, and alignment methods to classify reads. The user can select the underlying classification method and its parameters, allowing adaptation to different scenarios. Based on our investigation of various methods and parameters using synthetic human gut and oral microbiomes, and on assessment of publicly available data, we provide recommendations for typical datasets with short and long reads. CONCLUSIONS: To decontaminate a human gut microbiome with short reads using HoCoRT, we found the optimal combination of speed and accuracy with BioBloom, Bowtie2 in end-to-end mode, and HISAT2. Kraken2 consistently demonstrated the highest speed, albeit with a trade-off in accuracy. The same applies to an oral microbiome, but here Bowtie2 was notably slower than the other tools. For long reads, the detection of human host reads is more difficult. In this case, a combination of Kraken2 and Minimap2 achieved the highest accuracy and detected 59% of human reads.
In comparison to the dedicated DeconSeq tool, HoCoRT using Bowtie2 in end-to-end mode proved considerably faster and slightly more accurate. HoCoRT is available as a Bioconda package, and the source code can be accessed at https://github.com/ignasrum/hocort along with the documentation. It is released under the MIT licence and is compatible with Linux and macOS (except for the BioBloom module).
Subjects
Microbiota , Software , Humans , Metagenome , DNA Sequence Analysis/methods , Microbiota/genetics , High-Throughput Nucleotide Sequencing/methods
ABSTRACT
MOTIVATION: Adaptive immune receptor (AIR) repertoires (AIRRs) record past immune encounters with exquisite specificity. Therefore, identifying identical or similar AIR sequences across individuals is a key step in AIRR analysis for revealing convergent immune response patterns that may be exploited for diagnostics and therapy. Existing methods for quantifying AIRR overlap scale poorly with increasing dataset numbers and sizes. To address this limitation, we developed CompAIRR, which enables ultra-fast computation of AIRR overlap, based on either exact or approximate sequence matching. RESULTS: CompAIRR improves computational speed 1000-fold relative to the state of the art and uses only one-third of the memory: on the same machine, the exact pairwise AIRR overlap of 10^4 AIRRs with 10^5 sequences is found in ∼17 min, while the fastest alternative tool requires 10 days. CompAIRR has been integrated with the machine learning ecosystem immuneML to speed up commonly used AIRR-based machine learning applications. AVAILABILITY AND IMPLEMENTATION: CompAIRR code and documentation are available at https://github.com/uio-bmi/compairr. Docker images are available at https://hub.docker.com/r/torognes/compairr. The code to replicate the synthetic datasets, scripts for benchmarking and creating figures, and all raw data underlying the figures are available at https://github.com/uio-bmi/compairr-benchmarking. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
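To make the notion of exact pairwise AIRR overlap concrete, the following is a minimal Python sketch. It is not CompAIRR's implementation and says nothing about how the tool reaches its reported speed. It assumes each repertoire is simply a collection of sequence strings, and it counts the distinct sequences shared by every pair of repertoires. The function name `pairwise_overlap` and the toy sequences are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def pairwise_overlap(repertoires):
    """Count distinct sequences shared by each pair of repertoires.

    `repertoires` maps a repertoire name to an iterable of sequences.
    An inverted index (sequence -> names of repertoires containing it)
    means each sequence is looked at once instead of being compared
    against every sequence in every other repertoire.
    """
    index = defaultdict(set)
    for name, sequences in repertoires.items():
        for seq in set(sequences):  # ignore duplicates within a repertoire
            index[seq].add(name)

    overlap = defaultdict(int)
    for names in index.values():
        for a, b in combinations(sorted(names), 2):
            overlap[(a, b)] += 1
    return dict(overlap)


# Toy data, invented for illustration only.
toy = {
    "rep1": ["CASSLGTDTQYF", "CASSPGQGNEQFF", "CASSIRSSYEQYF"],
    "rep2": ["CASSLGTDTQYF", "CASSIRSSYEQYF", "CASRGDTEAFF"],
    "rep3": ["CASSLGTDTQYF"],
}
result = pairwise_overlap(toy)
print(result)
```

Pairs that share no sequence are simply absent from the returned dictionary. Approximate matching, which the abstract also mentions, is outside the scope of this sketch.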
Subjects
Ecosystem , Software , Humans , Machine Learning , Benchmarking
ABSTRACT
MOTIVATION: Previously we presented swarm, an open-source amplicon clustering programme that produces fine-scale molecular operational taxonomic units (OTUs) that are free of arbitrary global clustering thresholds. Here, we present swarm v3 to address issues of contemporary datasets that are growing towards terabyte sizes. RESULTS: Compared with previous swarm versions, swarm v3 has modernized C++ source code, a memory footprint reduced by up to 50%, and optimized CPU usage and multithreading (more than 7 times faster with default parameters), and it has been extensively tested for robustness and logic. AVAILABILITY AND IMPLEMENTATION: Source code and binaries are available at https://github.com/torognes/swarm. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Software , Cluster Analysis
ABSTRACT
C-type lectin-like domain family 16 member A (CLEC16A) is associated with autoimmune disorders, including multiple sclerosis (MS), but its functional relevance is not completely understood. CLEC16A is expressed in several immune cells, where it affects autophagic processes and receptor expression. Recently, we reported that the risk genotype of an MS-associated single nucleotide polymorphism in CLEC16A intron 19 is associated with higher expression of CLEC16A in CD4+ T cells. Here, we show that CLEC16A expression is induced in CD4+ T cells upon T cell activation. Using imaging flow cytometry and confocal microscopy, we demonstrate that CLEC16A is located in Rab4a-positive recycling endosomes in Jurkat TAg T cells. CLEC16A knock-down in Jurkat cells resulted in lower cell surface expression of the T cell receptor; however, this did not have a major impact on the T cell activation response in vitro in either Jurkat cells or primary human CD4+ T cells.
Subjects
CD4-Positive T-Lymphocytes/immunology , Genetic Predisposition to Disease/genetics , C-Type Lectins/genetics , Monosaccharide Transport Proteins/genetics , Multiple Sclerosis/genetics , T-Cell Antigen Receptors/biosynthesis , rab4 GTP-Binding Proteins/metabolism , Tumor Cell Line , Endosomes/metabolism , Flow Cytometry , Humans , Jurkat Cells , Lymphocyte Activation/immunology , Confocal Microscopy , Multiple Sclerosis/immunology , Single Nucleotide Polymorphism/genetics
ABSTRACT
BACKGROUND: Colorectal cancer (CRC) screening reduces CRC incidence and mortality. However, current screening methods are hampered by either invasiveness or suboptimal performance, limiting their effectiveness as primary screening methods. To aid in the development of a non-invasive screening test with improved sensitivity and specificity, we have initiated a prospective biomarker study (CRCbiome), nested within a large randomized CRC screening trial in Norway. We aim to develop a microbiome-based classification algorithm to identify advanced colorectal lesions in screening participants testing positive for an immunochemical fecal occult blood test (FIT). We will also examine interactions with host factors, diet, lifestyle and prescription drugs. The prospective nature of the study also enables the analysis of changes in the gut microbiome following the removal of precancerous lesions. METHODS: The CRCbiome study recruits participants enrolled in the Bowel Cancer Screening in Norway (BCSN) study, a randomized trial initiated in 2012 comparing once-only sigmoidoscopy to repeated biennial FIT, where women and men aged 50-74 years at study entry are invited to participate. Since 2017, participants randomized to FIT screening with a positive test result have been invited to join the CRCbiome study. Self-reported diet, lifestyle and demographic data are collected prior to colonoscopy after the positive FIT (baseline). Screening data, including colonoscopy findings, are obtained from the BCSN database. Fecal samples for gut microbiome analyses are collected before colonoscopy and at 2 and 12 months after it. Samples are analyzed using metagenome sequencing, with taxonomy profiles, and gene and pathway content as primary measures. CRCbiome data will also be linked to national registries to obtain information on prescription histories and cancer-relevant outcomes occurring during the 10-year follow-up period.
DISCUSSION: The CRCbiome study will increase our understanding of how the gut microbiome, in combination with lifestyle and environmental factors, influences the early stages of colorectal carcinogenesis. This knowledge will be crucial to develop microbiome-based screening tools for CRC. By evaluating biomarker performance in a screening setting, using samples from the target population, the generalizability of the findings to future screening cohorts is likely to be high. TRIAL REGISTRATION: ClinicalTrials.gov Identifier: NCT01538550.
Subjects
Colorectal Neoplasms/diagnosis , Early Detection of Cancer/methods , Gastrointestinal Microbiome , Life Style , Aged , Case-Control Studies , Colonoscopy , Colorectal Neoplasms/epidemiology , Colorectal Neoplasms/microbiology , Female , Follow-Up Studies , Humans , Incidence , Male , Middle Aged , Norway/epidemiology , Occult Blood , Prognosis , Prospective Studies , ROC Curve
ABSTRACT
BACKGROUND: Advances in whole genome sequencing strategies have provided the opportunity for genomic and comparative genomic analysis of a vast variety of organisms. The analysis results are highly dependent on the quality of the genome assemblies used. Assessment of the assembly accuracy may significantly increase the reliability of the analysis results and is therefore of great importance. RESULTS: Here, we present a new tool called NucBreak aimed at localizing structural errors in assemblies, including insertions, deletions, duplications, inversions, and different inter- and intra-chromosomal rearrangements. The approach taken by existing alternative tools is based on analysing reads that do not map properly to the assembly, for instance discordantly mapped reads, soft-clipped reads and singletons. NucBreak uses an entirely different and unique method to localize the errors. It is based on analysing the alignments of reads that are properly mapped to an assembly and exploits information about the alternative read alignments. It does not annotate detected errors. We have compared NucBreak with other existing assembly accuracy assessment tools, namely Pilon, REAPR, and FRCbam, as well as with several structural variant (SV) detection tools, including BreakDancer, Lumpy, and Wham, by using both simulated and real datasets. CONCLUSIONS: The benchmarking results have shown that NucBreak in general predicts assembly errors of different types and sizes with relatively high sensitivity and with a lower false discovery rate than the other tools. Such a balance between sensitivity and false discovery rate makes NucBreak a good alternative to the existing assembly accuracy assessment tools and SV detection tools. NucBreak is freely available at https://github.com/uio-bmi/NucBreak under the MPL license.
Subjects
Genomics/methods , High-Throughput Nucleotide Sequencing/methods , DNA Sequence Analysis/methods , Genome , Reproducibility of Results , Software
ABSTRACT
BACKGROUND: Comparing sets of sequences is a situation frequently encountered in bioinformatics, examples being comparing an assembly to a reference genome, or two genomes to each other. The purpose of the comparison is usually to find where the two sets differ, e.g. to find where a subsequence is repeated or deleted, or where insertions have been introduced. Such comparisons can be done using whole-genome alignments. Several tools for making such alignments exist, but none of them 1) provides detailed information about the types and locations of all differences between the two sets of sequences, 2) enables visualisation of alignment results at different levels of detail, and 3) carefully takes genomic repeats into consideration. RESULTS: We here present NucDiff, a tool aimed at locating and categorizing differences between two sets of closely related DNA sequences. NucDiff is able to deal with very fragmented genomes, repeated sequences, and various local differences and structural rearrangements. NucDiff determines differences by a rigorous analysis of alignment results obtained by the NUCmer, delta-filter and show-snps programs in the MUMmer sequence alignment package. All differences found are categorized according to a carefully defined classification scheme covering all possible differences between two sequences. Information about the differences is made available as GFF3 files, thus enabling visualisation using genome browsers as well as usage of the results as a component in an analysis pipeline. NucDiff was tested with varying parameters for the alignment step and compared with existing alternatives, called QUAST and dnadiff. CONCLUSIONS: We have developed a whole genome alignment difference classification scheme together with the program NucDiff for finding such differences. The proposed classification scheme is comprehensive and can be used by other tools. 
NucDiff performs comparably to QUAST and dnadiff but gives much more detailed results that can easily be visualized. NucDiff is freely available at https://github.com/uio-cels/NucDiff under the MPL license.
Subjects
DNA/chemistry , User-Computer Interface , Base Sequence , Genomics , Internet , Sequence Alignment
ABSTRACT
BACKGROUND: As an intracellular human pathogen, Mycobacterium tuberculosis (Mtb) is facing multiple stressful stimuli inside the macrophage and the granuloma. Understanding Mtb responses to stress is essential to identify new virulence factors and pathways that play a role in the survival of the tubercle bacillus. The main goal of this study was to map the regulatory networks of differentially expressed (DE) transcripts in Mtb upon various forms of genotoxic stress. We exposed Mtb cells to oxidative (H2O2 or paraquat), nitrosative (DETA/NO), or alkylation (MNNG) stress or mitomycin C, inducing double-strand breaks in the DNA. Total RNA was isolated from treated and untreated cells and subjected to high-throughput deep sequencing. The data generated was analysed to identify DE genes encoding mRNAs, non-coding RNAs (ncRNAs), and the genes potentially targeted by ncRNAs. RESULTS: The most significant transcriptomic alteration with more than 700 DE genes was seen under nitrosative stress. In addition to genes that belong to the replication, recombination and repair (3R) group, mainly found under mitomycin C stress, we identified DE genes important for bacterial virulence and survival, such as genes of the type VII secretion system (T7SS) and the proline-glutamic acid/proline-proline-glutamic acid (PE/PPE) family. By predicting the structures of hypothetical proteins (HPs) encoded by DE genes, we found that some of these HPs might be involved in mycobacterial genome maintenance. We also applied a state-of-the-art method to predict potential target genes of the identified ncRNAs and found that some of these could regulate several genes that might be directly involved in the response to genotoxic stress. CONCLUSIONS: Our study reflects the complexity of the response of Mtb in handling genotoxic stress. In addition to genes involved in genome maintenance, other potential key players, such as the members of the T7SS and PE/PPE gene family, were identified. 
This plethora of responses is detected not only at the level of DE genes encoding mRNAs but also at the level of ncRNAs and their potential targets.
Subjects
DNA Damage , Bacterial Gene Expression Regulation/drug effects , Mycobacterium tuberculosis/genetics , Transcriptome , Cluster Analysis , DNA Damage/drug effects , Gene Expression Profiling , Humans , Hydrogen Peroxide/toxicity , Methylnitronitrosoguanidine/toxicity , Mycobacterium tuberculosis/drug effects , Type VII Secretion Systems/genetics
ABSTRACT
BACKGROUND: With advances in next generation sequencing technology and analysis methods, single nucleotide variants (SNVs) and indels can be detected with high sensitivity and specificity in exome sequencing data. Recent studies have demonstrated the ability to detect disease-causing copy number variants (CNVs) in exome sequencing data. However, exonic CNV prediction programs have shown high false positive CNV counts, which is the major limiting factor for the applicability of these programs in clinical studies. RESULTS: We have developed a tool (cnvScan) to improve the clinical utility of computational CNV prediction in exome data. cnvScan can accept input from any CNV prediction program. cnvScan consists of two steps: CNV screening and CNV annotation. CNV screening evaluates CNV prediction using quality scores and refines this using an in-house CNV database, which greatly reduces the false positive rate. The annotation step provides functionally and clinically relevant information using multiple source datasets. We assessed the performance of cnvScan on CNV predictions from five different prediction programs using 64 exomes from Primary Immunodeficiency (PIDD) patients, and identified PIDD-causing CNVs in three individuals from two different families. CONCLUSIONS: In summary, cnvScan reduces the time and effort required to detect disease-causing CNVs by reducing the false positive count and providing annotation. This improves the clinical utility of CNV detection in exome data.
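The two-stage screening idea described above (a quality-score check followed by comparison against a database of previously seen CNVs) can be sketched as follows. This is an illustrative Python sketch only, not cnvScan's code. The `CNV` record, the reciprocal-overlap criterion, and the thresholds `min_quality` and `max_overlap` are all assumptions chosen for the example; the abstract does not specify how cnvScan scores or matches predictions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CNV:
    chrom: str
    start: int      # 0-based, inclusive
    end: int        # exclusive
    kind: str       # "DEL" or "DUP"
    quality: float = 0.0


def reciprocal_overlap(a, b):
    """Smaller of the two fractions of each CNV covered by the other."""
    if a.chrom != b.chrom or a.kind != b.kind:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def screen(predictions, in_house, min_quality=30.0, max_overlap=0.5):
    """Keep predictions that pass the quality cut and are not in the database.

    Both thresholds are hypothetical defaults for this sketch.
    """
    kept = []
    for cnv in predictions:
        if cnv.quality < min_quality:
            continue  # stage 1: low-confidence call
        if any(reciprocal_overlap(cnv, known) >= max_overlap for known in in_house):
            continue  # stage 2: matches a recurrent in-house CNV
        kept.append(cnv)
    return kept


# Toy data, invented for illustration only.
predictions = [
    CNV("chr1", 1000, 2000, "DEL", quality=50.0),
    CNV("chr1", 5000, 6000, "DEL", quality=10.0),
    CNV("chr2", 100, 1100, "DUP", quality=60.0),
]
in_house = [CNV("chr2", 0, 1000, "DUP")]
kept = screen(predictions, in_house)
print(kept)
```

In this toy run only the first prediction survives: the second fails the quality cut and the third overlaps a database entry by 90% in both directions.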
Subjects
DNA Copy Number Variations/genetics , Exome/genetics , High-Throughput Nucleotide Sequencing , Algorithms , Exons/genetics , Female , Humans , Male , Molecular Sequence Annotation , Mutation
ABSTRACT
The functions of several SOS regulated genes in Escherichia coli are still unknown, including dinQ. In this work we characterize dinQ and two small RNAs, agrA and agrB, with antisense complementarity to dinQ. Northern analysis revealed five dinQ transcripts, but only one transcript (+44) is actively translated. The +44 dinQ transcript translates into a toxic single transmembrane peptide localized in the inner membrane. AgrB regulates dinQ RNA by RNA interference to counteract DinQ toxicity. Thus the dinQ-agr locus shows the classical features of a type I TA system and has many similarities to the tisB-istR locus. DinQ overexpression depolarizes the cell membrane and decreases the intracellular ATP concentration, demonstrating that DinQ can modulate membrane-dependent processes. Augmented DinQ strongly inhibits marker transfer by Hfr conjugation, indicating a role in recombination. Furthermore, DinQ affects transformation of nucleoid morphology in response to UV damage. We hypothesize that DinQ is a transmembrane peptide that modulates membrane-dependent activities such as nucleoid compaction and recombination.
Subjects
Cell Membrane , Escherichia coli Proteins/genetics , Escherichia coli , Membrane Proteins/genetics , Bacterial RNA , Cell Membrane/genetics , Cell Membrane/metabolism , Cell Membrane/radiation effects , Cytoplasm , DNA Damage/radiation effects , Escherichia coli/genetics , Escherichia coli/metabolism , Escherichia coli Proteins/metabolism , Bacterial Gene Expression Regulation/radiation effects , Peptides/genetics , Peptides/metabolism , Antisense RNA/genetics , Antisense RNA/metabolism , Bacterial RNA/genetics , Bacterial RNA/metabolism , Genetic Recombination/genetics , SOS Response (Genetics)/radiation effects , Trans-Activators/genetics , Trans-Activators/metabolism , Ultraviolet Rays
ABSTRACT
Piwi proteins and Piwi-interacting small RNAs (piRNAs) have known functions in transposon silencing in the male germline of fetal and newborn mice. Both are also present in adult testes; however, their function here remains a mystery. Here, we confirm that most piRNAs in meiotic spermatocytes originate from clusters in non-repeat intergenic regions of DNA. The regulation of these piRNA clusters, including the processing of the precursor transcripts into individual piRNAs, is accomplished through mostly unknown processes. We present a possible regulatory mechanism for one such cluster, named cluster 1082B, located on chromosome 7 in the mouse genome. The 1082B precursor transcript and its 788 unique piRNAs are repressed by the Alkbh1 dioxygenase and the testis-specific transcription repressor Tzfp. We observe a remarkable >1000-fold upregulation of individual piRNAs in pachytene spermatocytes isolated from Alkbh1- and Tzfp-deficient murine testes. Repression of cluster 1082B is further supported by the identification of a 10-bp Tzfp recognition sequence contained within the precursor transcript. Downregulation of LINE1 and IAP transcripts in the Alkbh1- and Tzfp-deficient mice leads us to propose a potential role for the 1082B-encoded piRNAs in transposon control.
Subjects
DNA-(Apurinic or Apyrimidinic Site) Lyase/physiology , Gene Expression Regulation , Pachytene Stage/genetics , Small Interfering RNA/metabolism , Repressor Proteins/physiology , Spermatocytes/metabolism , AlkB Homolog 1, Histone H2a Dioxygenase , Animals , DNA-(Apurinic or Apyrimidinic Site) Lyase/genetics , Down-Regulation , Intracisternal A-Particle Genes , Long Interspersed Nucleotide Elements , Male , Mice , Inbred C57BL Mice , Knockout Mice , Molecular Sequence Data , Mutation , RNA Precursors/metabolism , Repressor Proteins/genetics , Testis/metabolism
ABSTRACT
The ability of epithelial monolayers to self-organize into a dynamic polarized state, where cells migrate in a uniform direction, is essential for tissue regeneration, development, and tumor progression. However, the mechanisms governing long-range polar ordering of motility direction in biological tissues remain unclear. Here, we investigate the self-organizing behavior of quiescent epithelial monolayers that transit to a dynamic state with long-range polar order upon growth factor exposure. We demonstrate that the heightened self-propelled activity of monolayer cells leads to formation of vortex-antivortex pairs that undergo sequential annihilation, ultimately driving the spread of long-range polar order throughout the system. A computational model, which treats the monolayer as an active elastic solid, accurately replicates this behavior, and weakening of cell-to-cell interactions impedes vortex-antivortex annihilation and polar ordering. Our findings uncover a mechanism in epithelia, where elastic solid material characteristics, activated self-propulsion, and topology-mediated guidance converge to fuel a highly efficient polar self-ordering activity.
Subjects
Cell Communication , Cell Movement , Epithelium
ABSTRACT
Stool samples for fecal immunochemical tests (FIT) are collected in large numbers worldwide as part of colorectal cancer screening programs. Employing FIT samples from 1034 CRCbiome participants, recruited from a Norwegian colorectal cancer screening study, we identify, annotate and characterize more than 18,000 DNA viruses, using shotgun metagenome sequencing. Only six percent of them are assigned to a known taxonomic family, with Microviridae being the most prevalent viral family. Linking individual profiles to comprehensive lifestyle and demographic data shows 17 of the 25 variables to be associated with the gut virome. Physical activity, smoking, and dietary fiber consumption exhibit strong and consistent associations with both diversity and relative abundance of individual viruses, as well as with enrichment for auxiliary metabolic genes. We demonstrate the suitability of FIT samples for virome analysis, opening an opportunity for large-scale studies of this enigmatic part of the gut microbiome. The diverse viral populations and their connections to individual lifestyle uncovered herein pave the way for further exploration of the role of the gut virome in health and disease.
Subjects
Colorectal Neoplasms , Viruses , Humans , Virome , DNA Viruses/genetics , Viruses/genetics , DNA , Colorectal Neoplasms/diagnosis , Colorectal Neoplasms/genetics
ABSTRACT
The recently discovered HEAT-like repeat (HLR) DNA glycosylase superfamily is widely distributed in all domains of life. The present bioinformatics and phylogenetic analysis shows that HLR DNA glycosylase superfamily members in the genus Bacillus form three subfamilies: AlkC, AlkD and AlkF/AlkG. The crystal structure of AlkF shows structural similarity with the DNA glycosylases AlkC and AlkD; however, neither AlkF nor AlkG displays any DNA glycosylase activity. Instead, both proteins have affinity for branched DNA structures such as three-way and Holliday junctions. A unique β-hairpin in the AlkF/AlkG subfamily is most likely inserted into the DNA major groove, and could be a structural determinant regulating DNA substrate affinity. We conclude that AlkF and AlkG represent a new family of HLR proteins with affinity for branched DNA structures.
Subjects
Bacillus cereus/enzymology , Bacterial Proteins/chemistry , DNA Glycosylases/chemistry , Binding Sites , High Pressure Liquid Chromatography , Cluster Analysis , Escherichia coli/genetics , Molecular Models , Site-Directed Mutagenesis , Nucleic Acid Conformation , Tertiary Protein Structure
ABSTRACT
AlkB homolog 1 (ALKBH1) is one of nine members of the family of mammalian AlkB homologs. Most Alkbh1(-/-) mice die during embryonic development, and survivors are characterized by defects in tissues originating from the ectodermal lineage. In this study, we show that deletion of Alkbh1 prolonged the expression of pluripotency markers in embryonic stem cells and delayed the induction of genes involved in early differentiation. In vitro differentiation to neural progenitor cells (NPCs) displayed an increased rate of apoptosis in the Alkbh1(-/-) NPCs when compared with wild-type cells. Whole-genome expression analysis and chromatin immunoprecipitation revealed that ALKBH1 regulates, both directly and indirectly, a subset of genes required for neural development. Furthermore, our in vitro enzyme activity assays demonstrate that ALKBH1 is a histone dioxygenase that acts specifically on histone H2A. Mass spectrometric analysis demonstrated that histone H2A from Alkbh1(-/-) mice is improperly methylated. Our results suggest that ALKBH1 is involved in neural development by modifying the methylation status of histone H2A.
Subjects
DNA-(Apurinic or Apyrimidinic Site) Lyase/metabolism , Embryonic Stem Cells/cytology , Embryonic Stem Cells/enzymology , Histones/metabolism , Neural Stem Cells/cytology , Neural Stem Cells/enzymology , AlkB Homolog 1, Histone H2a Dioxygenase , Animals , Apoptosis/genetics , Apoptosis/physiology , Cell Differentiation/genetics , Cell Differentiation/physiology , Cell Nucleus/enzymology , DNA Methylation , DNA-(Apurinic or Apyrimidinic Site) Lyase/deficiency , DNA-(Apurinic or Apyrimidinic Site) Lyase/genetics , Epigenomics , Histones/genetics , Mice , Microarray Analysis , Pluripotent Stem Cells/cytology , Pluripotent Stem Cells/enzymology , Transfection
ABSTRACT
Huntington's disease (HD) is one of several neurodegenerative disorders caused by expansion of CAG repeats in a coding gene. Somatic CAG expansion rates in HD vary between organs, and the greatest instability is observed in the brain, correlating with neuropathology. The fundamental mechanisms of somatic CAG repeat instability are poorly understood, but locally formed secondary DNA structures generated during replication and/or repair are believed to underlie triplet repeat expansion. Recent studies in HD mice have demonstrated that mismatch repair (MMR) and base excision repair (BER) proteins are expansion-inducing components in brain tissues. This study was designed to simultaneously investigate the rates and modes of expansion in different tissues of HD R6/1 mice in order to further understand the expansion mechanisms in vivo. We demonstrate continuous small expansions in most somatic tissues (exemplified by tail), which bear the signature of many short, probably single-repeat expansions and contractions occurring over time. In contrast, striatum and cortex display a dramatic, and apparently irreversible, periodic expansion. Expansion profiles displaying this kind of periodicity in the expansion process have not previously been reported. These in vivo findings imply that mechanistically distinct expansion processes occur in different tissues.
Subjects
Huntington Disease/genetics , Trinucleotide Repeat Expansion , Animals , Animal Disease Models , Huntingtin Protein , Huntington Disease/metabolism , Mice , Inbred C57BL Mice , Transgenic Mice , Nerve Tissue Proteins/genetics , Nerve Tissue Proteins/metabolism , Nuclear Proteins/genetics , Nuclear Proteins/metabolism , Trinucleotide Repeats
ABSTRACT
Many high-throughput sequencing datasets can be represented as objects with coordinates along a reference genome. Currently, biological investigations often involve a large number of such datasets, for example representing different cell types or epigenetic factors. Drawing overall conclusions from a large collection of results for individual datasets may be challenging and time-consuming. Meaningful interpretation often requires the results to be aggregated according to metadata that represents biological characteristics of interest. In this light, we here propose the hierarchical Genomic Suite HyperBrowser (hGSuite), an open-source extension to the GSuite HyperBrowser platform, which aims to provide a means for extracting key results from an aggregated collection of high-throughput DNA sequencing data. The hGSuite utilizes a metadata-informed data cube to calculate various statistics across the multiple dimensions of the datasets. With this work, we show that the hGSuite and its associated data cube methodology offer a quick and accessible way for exploratory analysis of large genomic datasets. The web-based toolkit named hGSuite HyperBrowser is available at https://hyperbrowser.uio.no/hgsuite under a GPLv3 license.
Subjects
Metadata , Software , Genomics/methods , Genome , Internet
ABSTRACT
DNA loop extrusion is emerging as a key process establishing genome structure and function. We introduce MoDLE, a computational tool for fast, stochastic modeling of molecular contacts from DNA loop extrusion, capable of simulating realistic contact patterns genome-wide in a few minutes. MoDLE accurately simulates contact maps in concordance with existing molecular dynamics approaches and with Micro-C data, and does so orders of magnitude faster than existing approaches. MoDLE runs efficiently on machines ranging from laptops to high-performance computing clusters and opens up exploratory and predictive modeling of 3D genome structure in a wide range of settings.
Subjects
DNA
ABSTRACT
BACKGROUND: The Smith-Waterman algorithm for local sequence alignment is more sensitive than heuristic methods for database searching, but also more time-consuming. The fastest approach to parallelisation with SIMD technology was previously described by Farrar in 2007. The aim of this study was to explore whether further speed could be gained by other approaches to parallelisation. RESULTS: A faster approach and its implementation are described and benchmarked. In the new tool SWIPE, residues from sixteen different database sequences are compared in parallel to one query residue. Using a 375-residue query sequence, a speed of 106 billion cell updates per second (GCUPS) was achieved on a dual Intel Xeon X5650 six-core processor system, which is over six times more rapid than software based on Farrar's 'striped' approach. SWIPE was about 2.5 times faster when the programs used only a single thread. For shorter queries, the increase in speed was larger. SWIPE was about twice as fast as BLAST when using the BLOSUM50 score matrix, while BLAST was about twice as fast as SWIPE for the BLOSUM62 matrix. The software is designed for 64-bit Linux on processors with SSSE3. Source code is available from http://dna.uio.no/swipe/ under the GNU Affero General Public License. CONCLUSIONS: Efficient parallelisation using SIMD on standard hardware makes it possible to run Smith-Waterman database searches more than six times faster than before. The approach described here could significantly widen the potential application of Smith-Waterman searches. Other applications that require optimal local alignment scores could also benefit from improved performance.
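For readers unfamiliar with the unit, a "cell update" is one evaluation of the Smith-Waterman recurrence for a single query residue against a single database residue. The scalar Python sketch below shows that recurrence and how a GCUPS figure is derived from it. It is a reference illustration only: it uses a simple match/mismatch score and a linear gap penalty as simplifying assumptions, not a BLOSUM matrix, and it does not reproduce SWIPE's sixteen-way SIMD parallelisation. The function names and default scores are hypothetical.

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-2):
    """Best local alignment score, keeping only two rows of the matrix."""
    previous = [0] * (len(target) + 1)
    best = 0
    for q in query:
        current = [0]
        for j, t in enumerate(target, start=1):
            # One "cell update": combine the three neighbouring cells.
            diagonal = previous[j - 1] + (match if q == t else mismatch)
            up = previous[j] + gap
            left = current[j - 1] + gap
            cell = max(0, diagonal, up, left)
            current.append(cell)
            best = max(best, cell)
        previous = current
    return best


def gcups(query_length, database_residues, seconds):
    """Billions of cell updates per second for one database search."""
    return query_length * database_residues / seconds / 1e9


print(smith_waterman_score("ACGT", "TTACGTTT"))
print(gcups(375, 1_000_000_000, 3.75))
```

The second call uses invented numbers purely to show the arithmetic: a 375-residue query searched against a database of one billion residues in 3.75 seconds corresponds to 100 GCUPS.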