Search | PAHO/WHO Uruguay

1.

Biopsy Proteome Scoring to Determine Mucosal Remodeling in Celiac Disease.

Johansen, Anette; Sandve, Geir Kjetil F; Ibsen, Jostein Holen; Lundin, Knut E A; Sollid, Ludvig M; Stamnaes, Jorunn.

Gastroenterology ; 167(3): 493-504.e10, 2024 Aug.

Article in English | MEDLINE | ID: mdl-38467384

ABSTRACT

BACKGROUND & AIMS: Histologic evaluation of gut biopsies is a cornerstone for diagnosis and management of celiac disease (CeD). Despite its wide use, the method depends on proper biopsy orientation, and it suffers from interobserver variability. Biopsy proteome measurement reporting on the tissue state can be obtained by mass spectrometry analysis of formalin-fixed paraffin-embedded tissue. Here we aimed to transform biopsy proteome data into numerical scores that give observer-independent measures of mucosal remodeling in CeD. METHODS: A pipeline using glass-mounted formalin-fixed paraffin-embedded sections for mass spectrometry-based proteome analysis was established. Proteome data were converted to numerical scores using 2 complementary approaches: a rank-based enrichment score and a score based on machine learning using logistic regression. The 2 scoring approaches were compared with each other and with histology analyzing 18 patients with CeD with biopsies collected before and after treatment with a gluten-free diet as well as biopsies from patients with CeD with varying degree of remission (n = 22). Biopsies from individuals without CeD (n = 32) were also analyzed. RESULTS: The method yielded reliable proteome scoring of both unstained and H&E-stained glass-mounted sections. The scores of the 2 approaches were highly correlated, reflecting that both approaches pick up proteome changes in the same biological pathways. The proteome scores correlated with villus height-to-crypt depth ratio. Thus, the method is able to score biopsies with poor orientation. CONCLUSIONS: Biopsy proteome scores give reliable observer and orientation-independent measures of mucosal remodeling in CeD. The proteomic method can readily be implemented by nonexpert laboratories in parallel to histology assessment and easily scaled for clinical trial settings.

Subject(s)

Celiac Disease; Diet, Gluten-Free; Intestinal Mucosa; Proteome; Proteomics; Celiac Disease/pathology; Celiac Disease/metabolism; Celiac Disease/diagnosis; Humans; Intestinal Mucosa/pathology; Intestinal Mucosa/metabolism; Biopsy; Proteome/analysis; Proteomics/methods; Female; Male; Adult; Machine Learning; Middle Aged; Mass Spectrometry; Observer Variation; Predictive Value of Tests; Paraffin Embedding; Reproducibility of Results; Case-Control Studies

2.

Individualized VDJ recombination predisposes the available Ig sequence space.

Slabodkin, Andrei; Chernigovskaya, Maria; Mikocziova, Ivana; Akbar, Rahmad; Scheffer, Lonneke; Pavlovic, Milena; Bashour, Habib; Snapkov, Igor; Mehta, Brij Bhushan; Weber, Cédric R; Gutierrez-Marcos, Jose; Sollid, Ludvig M; Haff, Ingrid Hobæk; Sandve, Geir Kjetil; Robert, Philippe A; Greiff, Victor.

Genome Res ; 31(12): 2209-2224, 2021 Dec.

Article in English | MEDLINE | ID: mdl-34815307

ABSTRACT

The process of recombination between variable (V), diversity (D), and joining (J) immunoglobulin (Ig) gene segments determines an individual's naive Ig repertoire and, consequently, (auto)antigen recognition. VDJ recombination follows probabilistic rules that can be modeled statistically. So far, it remains unknown whether VDJ recombination rules differ between individuals. If these rules differed, identical (auto)antigen-specific Ig sequences would be generated with individual-specific probabilities, signifying that the available Ig sequence space is individual specific. We devised a sensitivity-tested distance measure that enables inter-individual comparison of VDJ recombination models. We discovered, accounting for several sources of noise as well as allelic variation in Ig sequencing data, that not only unrelated individuals but also human monozygotic twins and even inbred mice possess statistically distinguishable immunoglobulin recombination models. This suggests that, in addition to genetic, there is also nongenetic modulation of VDJ recombination. We demonstrate that population-wide individualized VDJ recombination can result in orders of magnitude of difference in the probability to generate (auto)antigen-specific Ig sequences. Our findings have implications for immune receptor-based individualized medicine approaches relevant to vaccination, infection, and autoimmunity.

3.

TCRpower: quantifying the detection power of T-cell receptor sequencing with a novel computational pipeline calibrated by spike-in sequences.

Dahal-Koirala, Shiva; Balaban, Gabriel; Neumann, Ralf Stefan; Scheffer, Lonneke; Lundin, Knut Erik Aslaksen; Greiff, Victor; Sollid, Ludvig Magne; Qiao, Shuo-Wang; Sandve, Geir Kjetil.

Brief Bioinform ; 23(2), 2022 03 10.

Article in English | MEDLINE | ID: mdl-35062022

ABSTRACT

T-cell receptor (TCR) sequencing has enabled the development of innovative diagnostic tests for cancers, autoimmune diseases and other applications. However, the rarity of many T-cell clonotypes presents a detection challenge, which may lead to misdiagnosis if diagnostically relevant TCRs remain undetected. To address this issue, we developed TCRpower, a novel computational pipeline for quantifying the statistical detection power of TCR sequencing methods. TCRpower calculates the probability of detecting a TCR sequence as a function of several key parameters: in-vivo TCR frequency, T-cell sample count, read sequencing depth and read cutoff. To calibrate TCRpower, we selected unique TCRs of 45 T-cell clones (TCCs) as spike-in TCRs. We sequenced the spike-in TCRs from TCCs, together with TCRs from peripheral blood, using a 5' RACE protocol. The 45 spike-in TCRs covered a wide range of sample frequencies, ranging from 5 per 100 to 1 per 1 million. The resulting spike-in TCR read counts and ground truth frequencies allowed us to calibrate TCRpower. In our TCR sequencing data, we observed a consistent linear relationship between sample and sequencing read frequencies. We were also able to reliably detect spike-in TCRs with frequencies as low as one per million. By implementing an optimized read cutoff, we eliminated most of the falsely detected sequences in our data (TCR α-chain 99.0% and TCR β-chain 92.4%), thereby improving diagnostic specificity. TCRpower is publicly available and can be used to optimize future TCR sequencing experiments, and thereby enable reliable detection of disease-relevant TCRs for diagnostic applications.
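
The dependence of detection probability on these parameters can be illustrated with a deliberately simplified sketch. It is not TCRpower's calibrated model or API: it assumes that read counts follow a Poisson distribution whose mean is the in-vivo frequency times the sequencing depth (in line with the linear relationship between sample and read frequencies reported above), and it ignores the T-cell sample count. The function name `detection_probability` and its parameters are hypothetical.

```python
import math


def detection_probability(frequency, read_depth, read_cutoff):
    """P(read count >= read_cutoff) under an assumed Poisson read model.

    Assumption (not from the paper): reads ~ Poisson(frequency * read_depth).
    """
    lam = frequency * read_depth  # expected number of reads for this TCR
    # Probability of observing fewer reads than the cutoff.
    below_cutoff = sum(
        math.exp(-lam) * lam**k / math.factorial(k) for k in range(read_cutoff)
    )
    return 1.0 - below_cutoff


# A TCR at 1 per million, sequenced to one million reads, cutoff of one read.
p = detection_probability(frequency=1e-6, read_depth=1_000_000, read_cutoff=1)
print(round(p, 3))
```

Under this toy model, raising the read cutoff lowers the detection probability, which is the trade-off against falsely detected sequences that the abstract describes.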

Subject(s)

Receptors, Antigen, T-Cell; Humans; Receptors, Antigen, T-Cell/genetics; Receptors, Antigen, T-Cell, alpha-beta/genetics; T-Lymphocytes

4.

Adjustment of spurious correlations in co-expression measurements from RNA-Sequencing data.

Hsieh, Ping-Han; Lopes-Ramos, Camila Miranda; Zucknick, Manuela; Sandve, Geir Kjetil; Glass, Kimberly; Kuijjer, Marieke Lydia.

Bioinformatics ; 39(10), 2023 10 03.

Article in English | MEDLINE | ID: mdl-37802917

ABSTRACT

MOTIVATION: Gene co-expression measurements are widely used in computational biology to identify coordinated expression patterns across a group of samples. Coordinated expression of genes may indicate that they are controlled by the same transcriptional regulatory program, or involved in common biological processes. Gene co-expression is generally estimated from RNA-Sequencing data, which are commonly normalized to remove technical variability. Here, we demonstrate that certain normalization methods, in particular quantile-based methods, can introduce false-positive associations between genes. These false-positive associations can consequently hamper downstream co-expression network analysis. Quantile-based normalization can, however, be extremely powerful. In particular, when preprocessing large-scale heterogeneous data, quantile-based normalization methods such as smooth quantile normalization can be applied to remove technical variability while maintaining global differences in expression for samples with different biological attributes. RESULTS: We developed SNAIL (Smooth-quantile Normalization Adaptation for the Inference of co-expression Links), a normalization method based on smooth quantile normalization specifically designed for modeling of co-expression measurements. We show that SNAIL avoids formation of false-positive associations in co-expression as well as in downstream network analyses. Using SNAIL, one can avoid arbitrary gene filtering and retain associations to genes that only express in small subgroups of samples. This highlights the method's potential future impact on network modeling and other association-based approaches in large-scale heterogeneous data. AVAILABILITY AND IMPLEMENTATION: The implementation of the SNAIL algorithm and code to reproduce the analyses described in this work can be found in the GitHub repository https://github.com/kuijjerlab/PySNAIL.

Subject(s)

Gene Expression Profiling; RNA; Gene Expression Profiling/methods; Sequence Analysis, RNA/methods; Algorithms; Computational Biology

5.

CompAIRR: ultra-fast comparison of adaptive immune receptor repertoires by exact and approximate sequence matching.

Rognes, Torbjørn; Scheffer, Lonneke; Greiff, Victor; Sandve, Geir Kjetil.

Bioinformatics ; 38(17): 4230-4232, 2022 09 02.

Article in English | MEDLINE | ID: mdl-35852318

ABSTRACT

MOTIVATION: Adaptive immune receptor (AIR) repertoires (AIRRs) record past immune encounters with exquisite specificity. Therefore, identifying identical or similar AIR sequences across individuals is a key step in AIRR analysis for revealing convergent immune response patterns that may be exploited for diagnostics and therapy. Existing methods for quantifying AIRR overlap scale poorly with increasing dataset numbers and sizes. To address this limitation, we developed CompAIRR, which enables ultra-fast computation of AIRR overlap, based on either exact or approximate sequence matching. RESULTS: CompAIRR improves computational speed 1000-fold relative to the state of the art and uses only one-third of the memory: on the same machine, the exact pairwise AIRR overlap of 10^4 AIRRs with 10^5 sequences is found in ∼17 min, while the fastest alternative tool requires 10 days. CompAIRR has been integrated with the machine learning ecosystem immuneML to speed up commonly used AIRR-based machine learning applications. AVAILABILITY AND IMPLEMENTATION: CompAIRR code and documentation are available at https://github.com/uio-bmi/compairr. Docker images are available at https://hub.docker.com/r/torognes/compairr. The code to replicate the synthetic datasets, scripts for benchmarking and creating figures, and all raw data underlying the figures are available at https://github.com/uio-bmi/compairr-benchmarking. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
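
As an illustration of what exact AIRR overlap means, the following minimal sketch counts the sequences shared by each pair of repertoires using set intersection. It is a naive baseline for small inputs, not CompAIRR's algorithm or command-line interface; the function name `exact_overlap` and the toy repertoires are hypothetical.

```python
def exact_overlap(repertoires):
    """Count identical sequences shared by every pair of repertoires.

    `repertoires` maps a repertoire name to an iterable of sequences.
    Duplicate sequences within a repertoire are counted once.
    """
    unique = {name: set(seqs) for name, seqs in repertoires.items()}
    return {
        (a, b): len(unique[a] & unique[b])
        for a in unique
        for b in unique
    }


# Toy repertoires (made-up example sequences).
toy = {
    "donor1": ["CASSLGTDTQYF", "CASSPGQGNEQFF", "CASSIRSSYEQYF"],
    "donor2": ["CASSLGTDTQYF", "CASSIRSSYEQYF", "CASSQDRGYGYTF"],
}
overlap = exact_overlap(toy)
print(overlap[("donor1", "donor2")])  # 2 shared sequences
```

This all-against-all approach grows quadratically with the number of repertoires, which is the kind of scaling problem the abstract says motivated a dedicated tool.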

Subject(s)

Ecosystem; Software; Humans; Machine Learning; Benchmarking

6.

ANDA: an open-source tool for automated image analysis of in vitro neuronal cells.

Wæhler, Hallvard Austin; Labba, Nils-Anders; Paulsen, Ragnhild Elisabeth; Sandve, Geir Kjetil; Eskeland, Ragnhild.

BMC Neurosci ; 24(1): 56, 2023 10 24.

Article in English | MEDLINE | ID: mdl-37875799

ABSTRACT

BACKGROUND: Imaging of in vitro neuronal differentiation and measurements of cell morphologies have led to novel insights into neuronal development. Live-cell imaging techniques and large datasets of images have increased the demand for automated pipelines for quantitative analysis of neuronal morphological metrics. RESULTS: ANDA is an analysis workflow that quantifies various aspects of neuronal morphology from high-throughput live-cell imaging screens of in vitro neuronal cell types. This tool automates the analysis of neuronal cell numbers, neurite lengths and neurite attachment points. We used chicken, rat, mouse, and human in vitro models for neuronal differentiation and have demonstrated the accuracy, versatility, and efficiency of the tool. CONCLUSIONS: ANDA is an open-source tool that is easy to use and capable of automated processing from time-course measurements of neuronal cells. The strength of this pipeline is the capability to analyse high-throughput imaging screens.

Subject(s)

Neurites; Neurons; Mice; Rats; Animals; Humans; Neurites/physiology; Neurogenesis/physiology; Image Processing, Computer-Assisted/methods; Cell Count

7.

T cell receptor repertoire as a potential diagnostic marker for celiac disease.

Yao, Ying; Zia, Asima; Neumann, Ralf Stefan; Pavlovic, Milena; Balaban, Gabriel; Lundin, Knut E A; Sandve, Geir Kjetil; Qiao, Shuo-Wang.

Clin Immunol ; 222: 108621, 2021 01.

Article in English | MEDLINE | ID: mdl-33197618

ABSTRACT

An individual's T cell repertoire is skewed towards some specificities as a result of past antigen exposure and subsequent clonal expansion. Identifying T cell receptor signatures associated with a disease is challenging due to the overall complexity of antigens and polymorphic HLA allotypes. In celiac disease, the antigen epitopes are well characterised and the specific HLA-DQ2-restricted T-cell repertoire associated with the disease has been explored in depth. By investigating T cell receptor repertoires of unsorted lamina propria T cells from 15 individuals, we provide the first proof-of-concept study showing that it could be possible to infer disease state by matching against a priori known disease-associated T cell receptor sequences.

Subject(s)

Celiac Disease/diagnosis; Celiac Disease/immunology; Epitopes, T-Lymphocyte/immunology; Receptors, Antigen, T-Cell/immunology; Adolescent; Adult; Aged; Biomarkers; HLA-DQ Antigens/genetics; HLA-DQ Antigens/immunology; Humans; Lymphocyte Activation/immunology; Middle Aged; Mucous Membrane/cytology; Mucous Membrane/immunology; Young Adult

8.

A map of direct TF-DNA interactions in the human genome.

Gheorghe, Marius; Sandve, Geir Kjetil; Khan, Aziz; Chèneby, Jeanne; Ballester, Benoit; Mathelier, Anthony.

Nucleic Acids Res ; 47(4): e21, 2019 02 28.

Article in English | MEDLINE | ID: mdl-30517703

ABSTRACT

Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is the most popular assay to identify genomic regions, called ChIP-seq peaks, that are bound in vivo by transcription factors (TFs). These regions are derived from direct TF-DNA interactions, indirect binding of the TF to the DNA (through a co-binding partner), nonspecific binding to the DNA, and noise/bias/artifacts. Delineating the bona fide direct TF-DNA interactions within the ChIP-seq peaks remains challenging. We developed a dedicated software, ChIP-eat, that combines computational TF binding models and ChIP-seq peaks to automatically predict direct TF-DNA interactions. Our work culminated with predicted interactions covering >4% of the human genome, obtained by uniformly processing 1983 ChIP-seq peak data sets from the ReMap database for 232 unique TFs. The predictions were a posteriori assessed using protein binding microarray and ChIP-exo data, and were predominantly found in high quality ChIP-seq peaks. The set of predicted direct TF-DNA interactions suggested that high-occupancy target regions are likely not derived from direct binding of the TFs to the DNA. Our predictions derived co-binding TFs supported by protein-protein interaction data and defined cis-regulatory modules enriched for disease- and trait-associated SNPs. We provide this collection of direct TF-DNA interactions and cis-regulatory modules through the UniBind web-interface (http://unibind.uio.no).

Subject(s)

Computational Biology; DNA/genetics; Genome, Human/genetics; Transcription Factors/genetics; Algorithms; Binding Sites/genetics; Chromatin Immunoprecipitation; Chromosome Mapping/methods; High-Throughput Nucleotide Sequencing/methods; Humans; Protein Binding/genetics; Sequence Analysis, DNA/methods

9.

NucBreak: location of structural errors in a genome assembly by using paired-end Illumina reads.

Khelik, Ksenia; Sandve, Geir Kjetil; Nederbragt, Alexander Johan; Rognes, Torbjørn.

BMC Bioinformatics ; 21(1): 66, 2020 Feb 21.

Article in English | MEDLINE | ID: mdl-32085722

ABSTRACT

BACKGROUND: Advances in whole genome sequencing strategies have provided the opportunity for genomic and comparative genomic analysis of a vast variety of organisms. The analysis results are highly dependent on the quality of the genome assemblies used. Assessment of the assembly accuracy may significantly increase the reliability of the analysis results and is therefore of great importance. RESULTS: Here, we present a new tool called NucBreak aimed at localizing structural errors in assemblies, including insertions, deletions, duplications, inversions, and different inter- and intra-chromosomal rearrangements. The approach taken by existing alternative tools is based on analysing reads that do not map properly to the assembly, for instance discordantly mapped reads, soft-clipped reads and singletons. NucBreak uses an entirely different and unique method to localise the errors. It is based on analysing the alignments of reads that are properly mapped to an assembly and exploit information about the alternative read alignments. It does not annotate detected errors. We have compared NucBreak with other existing assembly accuracy assessment tools, namely Pilon, REAPR, and FRCbam as well as with several structural variant detection tools, including BreakDancer, Lumpy, and Wham, by using both simulated and real datasets. CONCLUSIONS: The benchmarking results have shown that NucBreak in general predicts assembly errors of different types and sizes with relatively high sensitivity and with lower false discovery rate than the other tools. Such a balance between sensitivity and false discovery rate makes NucBreak a good alternative to the existing assembly accuracy assessment tools and SV detection tools. NucBreak is freely available at https://github.com/uio-bmi/NucBreak under the MPL license.

Subject(s)

Genomics/methods; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Genome; Reproducibility of Results; Software

10.

Colocalization analyses of genomic elements: approaches, recommendations and challenges.

Kanduri, Chakravarthi; Bock, Christoph; Gundersen, Sveinung; Hovig, Eivind; Sandve, Geir Kjetil.

Bioinformatics ; 35(9): 1615-1624, 2019 05 01.

Article in English | MEDLINE | ID: mdl-30307532

ABSTRACT

MOTIVATION: Many high-throughput methods produce sets of genomic regions as one of their main outputs. Scientists often use genomic colocalization analysis to interpret such region sets, for example to identify interesting enrichments and to understand the interplay between the underlying biological processes. Although widely used, there is little standardization in how these analyses are performed. Different practices can substantially affect the conclusions of colocalization analyses. RESULTS: Here, we describe the different approaches and provide recommendations for performing genomic colocalization analysis, while also discussing common methodological challenges that may influence the conclusions. As illustrated by concrete example cases, careful attention to analysis details is needed in order to meet these challenges and to obtain a robust and biologically meaningful interpretation of genomic region set data. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Genome; Genomics

11.

High-Throughput Single-Cell Analysis of B Cell Receptor Usage among Autoantigen-Specific Plasma Cells in Celiac Disease.

Roy, Bishnudeo; Neumann, Ralf S; Snir, Omri; Iversen, Rasmus; Sandve, Geir Kjetil; Lundin, Knut E A; Sollid, Ludvig M.

J Immunol ; 199(2): 782-791, 2017 07 15.

Article in English | MEDLINE | ID: mdl-28600290

ABSTRACT

Characterization of Ag-specific BCR repertoires is essential for understanding disease mechanisms involving humoral immunity. This is optimally done by interrogation of paired H chain V region (VH) and L chain V region (VL) sequences of individual and Ag-specific B cells. By applying single-cell high-throughput sequencing on gut lesion plasma cells (PCs), we have analyzed the transglutaminase 2 (TG2)-specific VH:VL autoantibody repertoire of celiac disease (CD) patients. Autoantibodies against TG2 are a hallmark of CD, and anti-TG2 IgA-producing gut PCs accumulate in patients upon gluten ingestion. Altogether, we analyzed paired VH and VL sequences of 1482 TG2-specific and 1421 non-TG2-specific gut PCs from 10 CD patients. Among TG2-specific PCs, we observed a striking bias in IGHV and IGKV/IGLV gene usage, as well as pairing preferences with a particular presence of the IGHV5-51:IGKV1-5 pair. Selective and biased VH:VL pairing was particularly evident among expanded clones. In general, TG2-specific PCs had lower numbers of mutations both in VH and VL genes than in non-TG2-specific PCs. TG2-specific PCs using IGHV5-51 had particularly few mutations. Importantly, VL segments paired with IGHV5-51 displayed proportionally low mutation numbers, suggesting that the low mutation rate among IGHV5-51 PCs is dictated by the BCR specificity. Finally, we observed selective amino acid changes in VH and VL and striking CDR3 length and J segment selection among TG2-specific IGHV5-51:IGKV1-5 pairs. Hence this study reveals features of a disease- and Ag-specific autoantibody repertoire with preferred VH:VL usage and pairings, limited mutations, clonal dominance, and selection of particular CDR3 sequences.

Subject(s)

Autoantibodies/immunology; Autoantigens/immunology; Celiac Disease/immunology; GTP-Binding Proteins/immunology; Plasma Cells/immunology; Receptors, Antigen, B-Cell/immunology; Transglutaminases/immunology; Adult; Autoantigens/chemistry; Autoantigens/genetics; B-Lymphocytes/immunology; Female; GTP-Binding Proteins/blood; GTP-Binding Proteins/genetics; Glutens/immunology; High-Throughput Nucleotide Sequencing; Humans; Immunoglobulin A/immunology; Immunoglobulin Heavy Chains/genetics; Immunoglobulin Heavy Chains/immunology; Immunoglobulin Variable Region/genetics; Immunoglobulin Variable Region/immunology; Mutation; Protein Glutamine gamma Glutamyltransferase 2; Receptors, Antigen, B-Cell/genetics; Receptors, Antigen, B-Cell/metabolism; Single-Cell Analysis; Transglutaminases/blood; Transglutaminases/genetics; Young Adult

12.

Mind the gaps: overlooking inaccessible regions confounds statistical testing in genome analysis.

Domanska, Diana; Kanduri, Chakravarthi; Simovski, Boris; Sandve, Geir Kjetil.

BMC Bioinformatics ; 19(1): 481, 2018 Dec 14.

Article in English | MEDLINE | ID: mdl-30547739

ABSTRACT

BACKGROUND: The current versions of reference genome assemblies still contain gaps represented by stretches of Ns. Since high throughput sequencing reads cannot be mapped to those gap regions, the regions are depleted of experimental data. Moreover, several technology platforms assay a targeted portion of the genomic sequence, meaning that regions from the unassayed portion of the genomic sequence cannot be detected in those experiments. We here refer to all such regions as inaccessible regions, and hypothesize that ignoring these regions in the null model may increase false findings in statistical testing of colocalization of genomic features. RESULTS: Our explorative analyses confirm that the genomic regions in public genomic tracks intersect very little with assembly gaps of human reference genomes (hg19 and hg38). The little intersection was observed only at the beginning and end portions of the gap regions. Further, we simulated a set of synthetic tracks by matching the properties of real genomic tracks in a way that nullified any true association between them. This allowed us to test our hypothesis that not avoiding inaccessible regions (as represented by assembly gaps) in the null model would result in spurious inflation of statistical significance. We contrasted the distributions of test statistics and p-values of Monte Carlo-based permutation tests that either avoided or did not avoid assembly gaps in the null model when testing colocalization between a pair of tracks. We observed that the statistical tests that did not account for assembly gaps in the null model resulted in a distribution of the test statistic that is shifted to the right and a distribution of p-values that is shifted to the left (indicating inflated significance). We observed a similar level of inflated significance in hg19 and hg38, despite assembly gaps covering a smaller proportion of the latter reference genome. 
CONCLUSION: We provide empirical evidence demonstrating that inaccessible regions, even when covering only a few percentages of the genome, can lead to a substantial amount of false findings if not accounted for in statistical colocalization analysis.
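
The effect described here can be reproduced in miniature. The sketch below is a toy Monte Carlo test on point positions rather than the region tracks used in the study; the function names, the single artificial gap, and the hit-count test statistic are all assumptions made for illustration.

```python
import random


def sample_position(genome_length, gaps, rng, avoid_gaps):
    """Draw one uniform position; optionally reject draws that fall in a gap."""
    while True:
        pos = rng.randrange(genome_length)
        if not avoid_gaps or not any(start <= pos < end for start, end in gaps):
            return pos


def count_hits(points, regions):
    """Test statistic: number of points falling inside any region."""
    return sum(any(start <= p < end for start, end in regions) for p in points)


def mc_p_value(query, regions, genome_length, gaps, avoid_gaps,
               n_perm=999, seed=0):
    """One-sided Monte Carlo p-value for colocalization of query and regions."""
    rng = random.Random(seed)
    observed = count_hits(query, regions)
    extreme = 0
    for _ in range(n_perm):
        null = [sample_position(genome_length, gaps, rng, avoid_gaps)
                for _ in query]
        if count_hits(null, regions) >= observed:
            extreme += 1
    return (extreme + 1) / (n_perm + 1)


# Toy genome: the first half is an inaccessible gap, and the reference
# regions cover the entire accessible half, so there is no real association.
genome_length, gaps, regions = 1000, [(0, 500)], [(500, 1000)]
query = [510, 600, 700, 800, 990]
p_avoid = mc_p_value(query, regions, genome_length, gaps, avoid_gaps=True)
p_ignore = mc_p_value(query, regions, genome_length, gaps, avoid_gaps=False)
print(p_avoid, p_ignore)
```

In this toy setting the null model that avoids the gap gives a p-value of 1, whereas the null model that ignores it scatters points into the gap and yields a much smaller p-value, which mirrors the inflated significance reported in the abstract.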

Subject(s)

Confounding Factors, Epidemiologic; Genome, Human; High-Throughput Nucleotide Sequencing; Statistics as Topic; Genomics; Humans

13.

In the loop: promoter-enhancer interactions and bioinformatics.

Mora, Antonio; Sandve, Geir Kjetil; Gabrielsen, Odd Stokke; Eskeland, Ragnhild.

Brief Bioinform ; 17(6): 980-995, 2016 11.

Article in English | MEDLINE | ID: mdl-26586731

ABSTRACT

Enhancer-promoter regulation is a fundamental mechanism underlying differential transcriptional regulation. Spatial chromatin organization brings remote enhancers in contact with target promoters in cis to regulate gene expression. There is considerable evidence for promoter-enhancer interactions (PEIs). In the recent years, genome-wide analyses have identified signatures and mapped novel enhancers; however, being able to precisely identify their target gene(s) requires massive biological and bioinformatics efforts. In this review, we give a short overview of the chromatin landscape and transcriptional regulation. We discuss some key concepts and problems related to chromatin interaction detection technologies, and emerging knowledge from genome-wide chromatin interaction data sets. Then, we critically review different types of bioinformatics analysis methods and tools related to representation and visualization of PEI data, raw data processing and PEI prediction. Lastly, we provide specific examples of how PEIs have been used to elucidate a functional role of non-coding single-nucleotide polymorphisms. The topic is at the forefront of epigenetic research, and by highlighting some future bioinformatics challenges in the field, this review provides a comprehensive background for future PEI studies.

Subject(s)

Promoter Regions, Genetic; Chromatin; Computational Biology; Enhancer Elements, Genetic; Genome-Wide Association Study

14.

Access to ground truth at unconstrained size makes simulated data as indispensable as experimental data for bioinformatics methods development and benchmarking.

Sandve, Geir Kjetil; Greiff, Victor.

Bioinformatics ; 38(21): 4994-4996, 2022 10 31.

Article in English | MEDLINE | ID: mdl-36073940

Subject(s)

Benchmarking; Computational Biology; Algorithms; Computer Simulation

15.

The rainfall plot: its motivation, characteristics and pitfalls.

Domanska, Diana; Vodák, Daniel; Lund-Andersen, Christin; Salvatore, Stefania; Hovig, Eivind; Sandve, Geir Kjetil.

BMC Bioinformatics ; 18(1): 264, 2017 May 18.

Article in English | MEDLINE | ID: mdl-28521741

ABSTRACT

BACKGROUND: A visualization referred to as rainfall plot has recently gained popularity in genome data analysis. The plot is mostly used for illustrating the distribution of somatic cancer mutations along a reference genome, typically aiming to identify mutation hotspots. In general terms, the rainfall plot can be seen as a scatter plot showing the location of events on the x-axis versus the distance between consecutive events on the y-axis. Despite its frequent use, the motivation for applying this particular visualization and the appropriateness of its usage have never been critically addressed in detail. RESULTS: We show that the rainfall plot allows visual detection even for events occurring at high frequency over very short distances. In addition, event clustering at multiple scales may be detected as distinct horizontal bands in rainfall plots. At the same time, due to the limited size of standard figures, rainfall plots might suffer from inability to distinguish overlapping events, especially when multiple datasets are plotted in the same figure. We demonstrate the consequences of plot congestion, which results in obscured visual data interpretations. CONCLUSIONS: This work provides the first comprehensive survey of the characteristics and proper usage of rainfall plots. We find that the rainfall plot is able to convey a large amount of information without any need for parameterization or tuning. However, we also demonstrate how plot congestion and the use of a logarithmic y-axis may result in obscured visual data interpretations. To aid the productive utilization of rainfall plots, we demonstrate their characteristics and potential pitfalls using both simulated and real data, and provide a set of practical guidelines for their proper interpretation and usage.
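
The coordinates of a rainfall plot follow directly from this description. A minimal sketch, assuming that each event is paired with the distance to the preceding event (the abstract does not say whether the preceding or the following neighbour is used); the function name `rainfall_points` and the example positions are hypothetical.

```python
def rainfall_points(positions):
    """Return (position, distance to previous event) pairs.

    Positions are sorted first; the first event has no predecessor
    and therefore contributes no point.
    """
    ordered = sorted(positions)
    return [(cur, cur - prev) for prev, cur in zip(ordered, ordered[1:])]


# Two tight clusters of events separated by a long stretch without events.
points = rainfall_points([100, 105, 107, 5000, 5003, 90000])
for x, y in points:
    print(x, y)
```

Plotting these pairs, typically with a logarithmic y-axis, gives the rainfall plot; clustered events show up as points with small y values.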

Subject(s)

Motivation; Software; Genome, Human; Guidelines as Topic; Humans; Mutation/genetics; Pancreatic Neoplasms/genetics

16.

NucDiff: in-depth characterization and annotation of differences between two sets of DNA sequences.

Khelik, Ksenia; Lagesen, Karin; Sandve, Geir Kjetil; Rognes, Torbjørn; Nederbragt, Alexander Johan.

BMC Bioinformatics ; 18(1): 338, 2017 Jul 12.

Article in English | MEDLINE | ID: mdl-28701187

ABSTRACT

BACKGROUND: Comparing sets of sequences is a situation frequently encountered in bioinformatics, examples being comparing an assembly to a reference genome, or two genomes to each other. The purpose of the comparison is usually to find where the two sets differ, e.g. to find where a subsequence is repeated or deleted, or where insertions have been introduced. Such comparisons can be done using whole-genome alignments. Several tools for making such alignments exist, but none of them 1) provides detailed information about the types and locations of all differences between the two sets of sequences, 2) enables visualisation of alignment results at different levels of detail, and 3) carefully takes genomic repeats into consideration. RESULTS: We here present NucDiff, a tool aimed at locating and categorizing differences between two sets of closely related DNA sequences. NucDiff is able to deal with very fragmented genomes, repeated sequences, and various local differences and structural rearrangements. NucDiff determines differences by a rigorous analysis of alignment results obtained by the NUCmer, delta-filter and show-snps programs in the MUMmer sequence alignment package. All differences found are categorized according to a carefully defined classification scheme covering all possible differences between two sequences. Information about the differences is made available as GFF3 files, thus enabling visualisation using genome browsers as well as usage of the results as a component in an analysis pipeline. NucDiff was tested with varying parameters for the alignment step and compared with existing alternatives, called QUAST and dnadiff. CONCLUSIONS: We have developed a whole genome alignment difference classification scheme together with the program NucDiff for finding such differences. The proposed classification scheme is comprehensive and can be used by other tools. NucDiff performs comparably to QUAST and dnadiff but gives much more detailed results that can easily be visualized. NucDiff is freely available on https://github.com/uio-cels/NucDiff under the MPL license.

Subject(s)

DNA/chemistry , User-Computer Interface , Base Sequence , Genomics , Internet , Sequence Alignment

17.

Galaxy Portal: interacting with the galaxy platform through mobile devices.

Børnich, Claus; Grytten, Ivar; Hovig, Eivind; Paulsen, Jonas; Cech, Martin; Sandve, Geir Kjetil.

Bioinformatics ; 32(11): 1743-5, 2016 Jun 01.

Article in English | MEDLINE | ID: mdl-26819474

ABSTRACT

UNLABELLED: We present Galaxy Portal app, an open source interface to the Galaxy system through smart phones and tablets. The Galaxy Portal provides convenient and efficient monitoring of job completion, as well as opportunities for inspection of results and execution history. In addition to being useful to the Galaxy community, we believe that the app also exemplifies a useful way of exploiting mobile interfaces for research/high-performance computing resources in general. AVAILABILITY AND IMPLEMENTATION: The source is freely available under a GPL license on GitHub, along with user documentation and pre-compiled binaries and instructions for several platforms: https://github.com/Tarostar/QMLGalaxyPortal. It is available for iOS version 7 (and newer) through the Apple App Store, and for Android through Google Play for version 4.1 (API 16) or newer. CONTACT: geirksa@ifi.uio.no.

Subject(s)

Mobile Applications , Software

18.

Ten simple rules for quick and dirty scientific programming.

Balaban, Gabriel; Grytten, Ivar; Rand, Knut Dagestad; Scheffer, Lonneke; Sandve, Geir Kjetil.

PLoS Comput Biol ; 17(3): e1008549, 2021 Mar.

Article in English | MEDLINE | ID: mdl-33705383

Subject(s)

Software , Biomedical Research , Computational Biology , Humans , Software Design

19.

A map of direct TF-DNA interactions in the human genome.

Gheorghe, Marius; Sandve, Geir Kjetil; Khan, Aziz; Chèneby, Jeanne; Ballester, Benoit; Mathelier, Anthony.

Nucleic Acids Res ; 47(14): 7715, 2019 Aug 22.

Article in English | MEDLINE | ID: mdl-31251803

20.

HiBrowse: multi-purpose statistical analysis of genome-wide chromatin 3D organization.

Paulsen, Jonas; Sandve, Geir Kjetil; Gundersen, Sveinung; Lien, Tonje G; Trengereid, Kai; Hovig, Eivind.

Bioinformatics ; 30(11): 1620-2, 2014 Jun 01.

Article in English | MEDLINE | ID: mdl-24511080

ABSTRACT

UNLABELLED: Recently developed methods that couple next-generation sequencing with chromosome conformation capture-based techniques, such as Hi-C and ChIA-PET, allow for characterization of genome-wide chromatin 3D structure. Understanding the organization of chromatin in three dimensions is a crucial next step in the unraveling of global gene regulation, and methods for analyzing such data are needed. We have developed HiBrowse, a user-friendly web-tool consisting of a range of hypothesis-based and descriptive statistics, using realistic assumptions in null-models. AVAILABILITY AND IMPLEMENTATION: HiBrowse is supported by all major browsers, and is freely available at http://hyperbrowser.uio.no/3d. Software is implemented in Python, and source code is available for download by following instructions on the main site.

Subject(s)

Chromatin/chemistry , Software , Data Interpretation, Statistical , Genome , Genomics/methods , High-Throughput Nucleotide Sequencing
