Results 1 - 20 of 244
1.
Bioinformatics ; 40(6), 2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38796686

ABSTRACT

SUMMARY: The increasing development of sequence-based machine learning models has raised the demand for manipulating sequences for this application. However, existing approaches to edit and evaluate genome sequences using models have limitations, such as incompatibility with structural variants, challenges in identifying responsible sequence perturbations, and the need for vcf file inputs and phased data. To address these bottlenecks, we present Sequence Mutator for Predictive Models (SuPreMo), a scalable and comprehensive tool for performing and supporting in silico mutagenesis experiments. We then demonstrate how pairs of reference and perturbed sequences can be used with machine learning models to prioritize pathogenic variants or discover new functional sequences. AVAILABILITY AND IMPLEMENTATION: SuPreMo was written in Python, and can be run using only one line of code to generate both sequences and 3D genome disruption scores. The codebase, instructions for installation and use, and tutorials are on the GitHub page: https://github.com/ketringjoni/SuPreMo.


Subjects
Machine Learning , Software , Computer Simulation , Computational Biology/methods , Humans , Mutagenesis
2.
Science ; 384(6698): eadh0829, 2024 May 24.
Article in English | MEDLINE | ID: mdl-38781368

ABSTRACT

Neuropsychiatric genome-wide association studies (GWASs), including those for autism spectrum disorder and schizophrenia, show strong enrichment for regulatory elements in the developing brain. However, prioritizing risk genes and mechanisms is challenging without a unified regulatory atlas. Across 672 diverse developing human brains, we identified 15,752 genes harboring gene, isoform, and/or splicing quantitative trait loci, mapping 3739 to cellular contexts. Gene expression heritability drops during development, likely reflecting both increasing cellular heterogeneity and the intrinsic properties of neuronal maturation. Isoform-level regulation, particularly in the second trimester, mediated the largest proportion of GWAS heritability. Through colocalization, we prioritized mechanisms for about 60% of GWAS loci across five disorders, exceeding adult brain findings. Finally, we contextualized results within gene and isoform coexpression networks, revealing the comprehensive landscape of transcriptome regulation in development and disease.


Subjects
Alternative Splicing , Brain , Gene Expression Regulation, Developmental , Mental Disorders , Humans , Atlases as Topic , Autism Spectrum Disorder/genetics , Brain/metabolism , Brain/growth & development , Brain/embryology , Gene Regulatory Networks , Genome-Wide Association Study , Protein Isoforms/genetics , Protein Isoforms/metabolism , Quantitative Trait Loci , Schizophrenia/genetics , Transcriptome , Mental Disorders/genetics
3.
bioRxiv ; 2024 May 17.
Article in English | MEDLINE | ID: mdl-38798605

ABSTRACT

CellWalker2 is a graph diffusion-based method for single-cell genomics data integration. It extends the CellWalker model by incorporating hierarchical relationships between cell types, providing estimates of statistical significance, and adding data structures for analyzing multi-omics data so that gene expression and open chromatin can be jointly modeled. Our open-source software enables users to annotate cells using existing ontologies and to probabilistically match cell types between two or more contexts, including across species. CellWalker2 can also map genomic regions to cell ontologies, enabling precise annotation of elements derived from bulk data, such as enhancers, genetic variants, and sequence motifs. Through simulation studies, we show that CellWalker2 performs better than existing methods in cell type annotation and mapping. We then use data from the brain and immune system to demonstrate CellWalker2's ability to discover cell type-specific regulatory programs and both conserved and divergent cell type relationships in complex tissues.

4.
Science ; 384(6698): eadh0559, 2024 May 24.
Article in English | MEDLINE | ID: mdl-38781390

ABSTRACT

Nucleotide changes in gene regulatory elements are important determinants of neuronal development and diseases. Using massively parallel reporter assays in primary human cells from mid-gestation cortex and cerebral organoids, we interrogated the cis-regulatory activity of 102,767 open chromatin regions, including thousands of sequences with cell type-specific accessibility and variants associated with brain gene regulation. In primary cells, we identified 46,802 active enhancer sequences and 164 variants that alter enhancer activity. Activity was comparable in organoids and primary cells, suggesting that organoids provide an adequate model for the developing cortex. Using deep learning we decoded the sequence basis and upstream regulators of enhancer activity. This work establishes a comprehensive catalog of functional gene regulatory elements and variants in human neuronal development.


Subjects
Cerebral Cortex , Neurogenesis , Organoids , Humans , Cerebral Cortex/embryology , Cerebral Cortex/metabolism , Chromatin/metabolism , Chromatin/genetics , Deep Learning , Enhancer Elements, Genetic , Gene Expression Regulation, Developmental , Neurogenesis/genetics , Neurons/metabolism , Organoids/metabolism , Regulatory Sequences, Nucleic Acid , Promoter Regions, Genetic , Regulatory Elements, Transcriptional
5.
mSystems ; 9(6): e0032124, 2024 Jun 18.
Article in English | MEDLINE | ID: mdl-38742892

ABSTRACT

Ticks are increasingly important vectors of human and agricultural diseases. While many studies have focused on tick-borne bacteria, far less is known about tick-associated viruses and their roles in public health or tick physiology. To address this, we investigated patterns of bacterial and viral communities across two field populations of western black-legged ticks (Ixodes pacificus). Through metatranscriptomic analysis of 100 individual ticks, we quantified taxon prevalence, abundance, and co-occurrence with other members of the tick microbiome. In addition to commonly found tick-associated microbes, we assembled 11 novel RNA virus genomes from Rhabdoviridae, Chuviridae, Picornaviridae, Phenuiviridae, Reoviridae, Solemoviridae, Narnaviridae and two highly divergent RNA virus genomes lacking sequence similarity to any known viral families. We experimentally verified the presence of these in I. pacificus ticks across several life stages. We also unexpectedly identified numerous virus-like transcripts that are likely encoded by tick genomic DNA, and which are distinct from known endogenous viral element-mediated immunity pathways in invertebrates. Taken together, our work reveals that I. pacificus ticks carry a greater diversity of viruses than previously appreciated, in some cases resulting in evolutionarily acquired virus-like transcripts. Our findings highlight how pervasive and intimate tick-virus interactions are, with major implications for both the fundamental biology and vectorial capacity of I. pacificus ticks.
IMPORTANCE: Ticks are increasingly important vectors of disease, particularly in the United States where expanding tick ranges and intrusion into previously wild areas have resulted in increasing human exposure to ticks. Emerging human pathogens have been identified in ticks at an increasing rate, and yet little is known about the full community of microbes circulating in various tick species, a crucial first step to understanding how they interact with each other and their tick host, as well as their ability to cause disease in humans. We investigated the bacterial and viral communities of the Western blacklegged tick in California and found 11 previously uncharacterized viruses circulating in this population.


Subjects
Ixodes , Animals , Ixodes/virology , Ixodes/microbiology , Transcriptome , RNA, Messenger/genetics , Microbiota/genetics , Genome, Viral/genetics , RNA Viruses/genetics , RNA Viruses/isolation & purification , Bacteria/genetics , Bacteria/virology , Bacteria/isolation & purification
6.
Nat Genet ; 56(6): 1156-1167, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38811842

ABSTRACT

Cis-regulatory elements (CREs) interact with trans regulators to orchestrate gene expression, but how transcriptional regulation is coordinated in multi-gene loci has not been experimentally defined. We sought to characterize the CREs controlling dynamic expression of the adjacent costimulatory genes CD28, CTLA4 and ICOS, encoding regulators of T cell-mediated immunity. Tiling CRISPR interference (CRISPRi) screens in primary human T cells, both conventional and regulatory subsets, uncovered gene-, cell subset- and stimulation-specific CREs. Integration with CRISPR knockout screens and assay for transposase-accessible chromatin with sequencing (ATAC-seq) profiling identified trans regulators influencing chromatin states at specific CRISPRi-responsive elements to control costimulatory gene expression. We then discovered a critical CCCTC-binding factor (CTCF) boundary that reinforces CRE interaction with CTLA4 while also preventing promiscuous activation of CD28. By systematically mapping CREs and associated trans regulators directly in primary human T cell subsets, this work overcomes longstanding experimental limitations to decode context-dependent gene regulatory programs in a complex, multi-gene locus critical to immune homeostasis.


Subjects
CD28 Antigens , CTLA-4 Antigen , Chromatin , Gene Expression Regulation , Humans , CTLA-4 Antigen/genetics , CD28 Antigens/genetics , Chromatin/genetics , Chromatin/metabolism , T-Lymphocytes/immunology , T-Lymphocytes/metabolism , Inducible T-Cell Co-Stimulator Protein/genetics , Inducible T-Cell Co-Stimulator Protein/metabolism , CCCTC-Binding Factor/metabolism , CCCTC-Binding Factor/genetics , CRISPR-Cas Systems
7.
bioRxiv ; 2023 Nov 22.
Article in English | MEDLINE | ID: mdl-38045231

ABSTRACT

The investigation of chromatin organization in single cells holds great promise for identifying causal relationships between genome structure and function. However, analysis of single-molecule data is hampered by extreme yet inherent heterogeneity, making it challenging to determine the contributions of individual chromatin fibers to bulk trends. To address this challenge, we propose ChromaFactor, a novel computational approach based on non-negative matrix factorization that deconvolves single-molecule chromatin organization datasets into their most salient primary components. ChromaFactor provides the ability to identify trends accounting for the maximum variance in the dataset while simultaneously describing the contribution of individual molecules to each component. Applying our approach to two single-molecule imaging datasets across different genomic scales, we find that these primary components demonstrate significant correlation with key functional phenotypes, including active transcription, enhancer-promoter distance, and genomic compartment. ChromaFactor offers a robust tool for understanding the complex interplay between chromatin structure and function on individual DNA molecules, pinpointing which subpopulations drive functional changes and fostering new insights into cellular heterogeneity and its implications for bulk genomic phenomena.

8.
bioRxiv ; 2023 Nov 20.
Article in English | MEDLINE | ID: mdl-38045412

ABSTRACT

The most prevalent microbial eukaryote in the human gut is Blastocystis, an obligate commensal protist also common in many other vertebrates. Blastocystis is descended from free-living stramenopile ancestors; how it has adapted to thrive within humans and a wide range of hosts is unclear. Here, we cultivated six Blastocystis strains spanning the diversity of the genus and generated highly contiguous, annotated genomes with long-read DNA-seq, Hi-C, and RNA-seq. Comparative genomics between these strains and two closely related stramenopiles with different lifestyles, the lizard gut symbiont Proteromonas lacertae and the free-living marine flagellate Cafeteria burkhardae, reveal the evolutionary history of the Blastocystis genus. We find substantial gene content variability between Blastocystis strains. Blastocystis isolated from an herbivorous tortoise has many plant carbohydrate metabolizing enzymes, some horizontally acquired from bacteria, likely reflecting fermentation within the host gut. In contrast, human-isolated Blastocystis have gained many heat shock proteins, and we find numerous subtype-specific expansions of host-interfacing genes, including cell adhesion and cell surface glycan genes. In addition, we observe that human-isolated Blastocystis have substantial changes in gene structure, including shortened introns and intergenic regions, as well as genes lacking canonical termination codons. Finally, our data indicate that the common ancestor of Blastocystis lost nearly all ancestral genes for heterokont flagella morphology, including cilia proteins, microtubule motor proteins, and ion channel proteins. Together, these findings underscore the huge functional variability within the Blastocystis genus and provide candidate genes for the adaptations these lineages have undergone to thrive in the gut microbiomes of diverse vertebrates.

9.
bioRxiv ; 2023 Oct 26.
Article in English | MEDLINE | ID: mdl-37961120

ABSTRACT

Phenotypic divergence between closely related species, including bonobos and chimpanzees (genus Pan), is largely driven by variation in gene regulation. The 3D structure of the genome mediates gene expression; however, genome folding differences in Pan are not well understood. Here, we apply machine learning to predict genome-wide 3D genome contact maps from DNA sequence for 56 bonobos and chimpanzees, encompassing all five extant lineages. We use a pairwise approach to estimate 3D divergence between individuals from the resulting contact maps in 4,420 1 Mb genomic windows. While most pairs were similar, ∼17% were predicted to be substantially divergent in genome folding. The most dissimilar maps were largely driven by single individuals with rare variants that produce unique 3D genome folding in a region. We also identified 89 genomic windows where bonobo and chimpanzee contact maps substantially diverged, including several windows harboring genes associated with traits implicated in Pan phenotypic divergence. We used in silico mutagenesis to identify 51 3D-modifying variants in these bonobo-chimpanzee divergent windows, finding that 34 or 66.67% induce genome folding changes via CTCF binding motif disruption. Our results reveal 3D genome variation at the population-level and identify genomic regions where changes in 3D folding may contribute to phenotypic differences in our closest living relatives.

10.
bioRxiv ; 2023 Nov 05.
Article in English | MEDLINE | ID: mdl-37961123

ABSTRACT

Computationally editing genome sequences is a common bioinformatics task, but current approaches have limitations, such as incompatibility with structural variants, challenges in identifying responsible sequence perturbations, and the need for vcf file inputs and phased data. To address these bottlenecks, we present Sequence Mutator for Predictive Models (SuPreMo), a scalable and comprehensive tool for performing in silico mutagenesis. We then demonstrate how pairs of reference and perturbed sequences can be used with machine learning models to prioritize pathogenic variants or discover new functional sequences.
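
As an illustration of the reference/perturbed sequence pairs described above, the following minimal Python sketch applies a single variant to a sequence. It is not SuPreMo's code or API: the function names `apply_variant` and `sequence_pair`, and the 0-based coordinate convention, are assumptions made for this example only.

```python
def apply_variant(reference: str, pos: int, ref: str, alt: str) -> str:
    """Replace allele `ref` at 0-based position `pos` with `alt`.

    Handles substitutions, insertions and deletions written in the usual
    ref/alt style (hypothetical helper, not part of any published tool).
    """
    if reference[pos:pos + len(ref)] != ref:
        raise ValueError("reference allele does not match the sequence at pos")
    return reference[:pos] + alt + reference[pos + len(ref):]


def sequence_pair(reference: str, pos: int, ref: str, alt: str):
    """Return (reference, perturbed) so both can be scored by a model."""
    return reference, apply_variant(reference, pos, ref, alt)


if __name__ == "__main__":
    wild_type, mutant = sequence_pair("ACGTACGT", 2, "G", "T")
    print(wild_type, mutant)
```

Both sequences in the pair would then be passed to a predictive model, and the difference between the two predictions taken as the effect of the variant.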

11.
medRxiv ; 2023 Oct 28.
Article in English | MEDLINE | ID: mdl-37961381

ABSTRACT

In frontotemporal lobar degeneration (FTLD), pathological protein aggregation is associated with a decline in human-specialized social-emotional and language functions. Most disease protein aggregates contain either TDP-43 (FTLD-TDP) or tau (FTLD-tau). Here, we explored whether FTLD targets brain regions that express genes containing human accelerated regions (HARs), conserved sequences that have undergone positive selection during recent human evolution. To this end, we used structural neuroimaging from patients with FTLD and normative human regional transcriptomic data to identify genes expressed in FTLD-targeted brain regions. We then integrated primate comparative genomic data to test our hypothesis that FTLD targets brain regions expressing recently evolved genes. In addition, we asked whether genes expressed in FTLD-targeted brain regions are enriched for genes that undergo cryptic splicing when TDP-43 function is impaired. We found that FTLD-TDP and FTLD-tau subtypes target brain regions that express overlapping and distinct genes, including many linked to neuromodulatory functions. Genes whose normative brain regional expression pattern correlated with FTLD cortical atrophy were strongly associated with HARs. Atrophy-correlated genes in FTLD-TDP showed greater overlap with TDP-43 cryptic splicing genes compared with atrophy-correlated genes in FTLD-tau. Cryptic splicing genes were enriched for HAR genes, and vice versa, but this effect was due to the confounding influence of gene length. Analyses performed at the individual-patient level revealed that the expression of HAR genes and cryptically spliced genes within putative regions of disease onset differed across FTLD-TDP subtypes. Overall, our findings suggest that FTLD targets brain regions that have undergone recent evolutionary specialization and provide intriguing potential leads regarding the transcriptomic basis for selective vulnerability in distinct FTLD molecular-anatomical subtypes.

12.
bioRxiv ; 2023 Oct 24.
Article in English | MEDLINE | ID: mdl-37961712

ABSTRACT

Recent studies have highlighted the impact of both transcription and transcripts on 3D genome organization, particularly its dynamics. Here, we propose a deep learning framework, called AkitaR, that leverages both genome sequences and genome-wide RNA-DNA interactions to investigate the roles of chromatin-associated RNAs (caRNAs) on genome folding in HFFc6 cells. In order to disentangle the cis- and trans-regulatory roles of caRNAs, we compared models with nascent transcripts, trans-located caRNAs, open chromatin data, or DNA sequence alone. Both nascent transcripts and trans-located caRNAs improved the models' predictions, especially at cell-type-specific genomic regions. Analyses of feature importance scores revealed the contribution of caRNAs at TAD boundaries, chromatin loops and nuclear sub-structures such as nuclear speckles and nucleoli to the models' predictions. Furthermore, we identified non-coding RNAs (ncRNAs) known to regulate chromatin structures, such as MALAT1 and NEAT1, as well as several novel RNAs, RNY5, RPPH1, POLG-DT and THBS1-IT, that might modulate chromatin architecture through trans-interactions in HFFc6. Our modeling also suggests that transcripts from Alus and other repetitive elements may facilitate chromatin interactions through trans R-loop formation. Our findings provide new insights and generate testable hypotheses about the roles of caRNAs in shaping chromatin organization.

13.
Cell Genom ; 3(10): 100410, 2023 Oct 11.
Article in English | MEDLINE | ID: mdl-37868032

ABSTRACT

Natural and experimental genetic variants can modify DNA loops and insulating boundaries to tune transcription, but it is unknown how sequence perturbations affect chromatin organization genome wide. We developed a deep-learning strategy to quantify the effect of any insertion, deletion, or substitution on chromatin contacts and systematically scored millions of synthetic variants. While most genetic manipulations have little impact, regions with CTCF motifs and active transcription are highly sensitive, as expected. Our unbiased screen and subsequent targeted experiments also point to noncoding RNA genes and several families of repetitive elements as CTCF-motif-free DNA sequences with particularly large effects on nearby chromatin interactions, sometimes exceeding the effects of CTCF sites and explaining interactions that lack CTCF. We anticipate that our disruption tracks may be of broad interest and utility as a measure of 3D genome sensitivity, and our computational strategies may serve as a template for biological inquiry with deep learning.

14.
BMJ Open Sport Exerc Med ; 9(3): e001625, 2023.
Article in English | MEDLINE | ID: mdl-37654513

ABSTRACT

Demand modelling for the allied health professionals (AHPs) workforce showed that significant expansion would be needed to successfully deliver on the National Health Service (NHS) Long Term Plan. The aim was to explore the use of AHP support workers with exercise qualifications in AHP services and to understand their current and potential role in NHS commissioned AHP services in England. The project had two phases and took place between October 2020 and January 2021. In phase one, an electronic survey was carried out to identify the scope and variation of exercise professionals working in AHP support roles in NHS commissioned services. Semi-structured interviews were conducted in phase two to gain further understanding about the experiences of those involved in AHP commissioned services. Survey data were analysed using descriptive statistics and interview data were qualitatively analysed using thematic analysis. Recorded interviews were transcribed and initially coded. Coding was then refined and themes were identified. Support workers with exercise qualifications made a valued contribution to AHP services and were considered cost-effective in delivering a specialised exercise intervention. AHP support workers contributed to a range of tasks relating to clinical exercise prescription. Collated data highlighted inconsistency in the way AHP support workers with exercise qualifications identified themselves, despite similar roles. Variation existed in the level of autonomy for AHP support workers with exercise qualifications, even within the same NHS Agenda for Change band. Attempts to manage this disparity involved numerous governance processes to ensure safe, high-quality healthcare in the context of delegation to support workers. Limited training and development opportunities and the lack of career progression for support workers were consistently acknowledged as a source of frustration and hindrance to individuals fulfilling their potential. 
AHP support workers with exercise qualifications have potential to positively impact service delivery providing added value to the NHS workforce.

15.
Genome Biol ; 24(1): 186, 2023 Aug 10.
Article in English | MEDLINE | ID: mdl-37563669

ABSTRACT

Existing single nucleotide polymorphism (SNP) genotyping algorithms do not scale for species with thousands of sequenced strains, nor do they account for conspecific redundancy. Here we present a bioinformatics tool, Maast, which empowers population genetic meta-analysis of microbes at an unrivaled scale. Maast implements a novel algorithm to heuristically identify a minimal set of diverse conspecific genomes, then constructs a reliable SNP panel for each species, and enables rapid and accurate genotyping using a hybrid of whole-genome alignment and k-mer exact matching. We demonstrate Maast's utility by genotyping thousands of Helicobacter pylori strains and tracking SARS-CoV-2 diversification.
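
To make the idea of genotyping by k-mer exact matching concrete, here is a simplified Python sketch. It is not Maast's algorithm or interface: the panel format (SNP id, context sequence, SNP offset, single-base alleles), the function names, and the majority-vote call are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def build_kmer_index(snps, k):
    """Map each SNP-centred k-mer to (snp_id, allele).

    `snps` is a list of (snp_id, context, offset, alleles) tuples, where
    `offset` is the 0-based SNP position inside `context` and `alleles`
    is an iterable of single bases. This layout is hypothetical.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd so the SNP sits in the centre")
    half = k // 2
    index = {}
    for snp_id, context, offset, alleles in snps:
        if offset < half or offset + half >= len(context):
            raise ValueError("context too short around the SNP")
        for allele in alleles:
            kmer = (context[offset - half:offset] + allele
                    + context[offset + 1:offset + half + 1])
            index[kmer] = (snp_id, allele)
    return index


def genotype_reads(reads, index, k):
    """Count exact k-mer hits per SNP and call the most frequent allele."""
    counts = defaultdict(Counter)
    for read in reads:
        for i in range(len(read) - k + 1):
            hit = index.get(read[i:i + k])
            if hit is not None:
                snp_id, allele = hit
                counts[snp_id][allele] += 1
    return {snp_id: c.most_common(1)[0][0] for snp_id, c in counts.items()}


if __name__ == "__main__":
    idx = build_kmer_index([("snp1", "AACGTTA", 3, "GT")], k=5)
    print(genotype_reads(["GGACTTTAG", "CACTTTC"], idx, k=5))
```

Because lookups are exact dictionary hits, no read alignment is needed at genotyping time, which is the property that makes this style of approach fast; a real tool would also handle reverse complements and sequencing errors.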


Subjects
COVID-19 , SARS-CoV-2 , Humans , Genotype , SARS-CoV-2/genetics , Genome , Algorithms , Polymorphism, Single Nucleotide , Genotyping Techniques
16.
Mol Cell ; 83(15): 2624-2640, 2023 Aug 03.
Article in English | MEDLINE | ID: mdl-37419111

ABSTRACT

The four-dimensional nucleome (4DN) consortium studies the architecture of the genome and the nucleus in space and time. We summarize progress by the consortium and highlight the development of technologies for (1) mapping genome folding and identifying roles of nuclear components and bodies, proteins, and RNA, (2) characterizing nuclear organization with time or single-cell resolution, and (3) imaging of nuclear organization. With these tools, the consortium has provided over 2,000 public datasets. Integrative computational models based on these data are starting to reveal connections between genome structure and function. We then present a forward-looking perspective and outline current aims to (1) delineate dynamics of nuclear architecture at different timescales, from minutes to weeks as cells differentiate, in populations and in single cells, (2) characterize cis-determinants and trans-modulators of genome organization, (3) test functional consequences of changes in cis- and trans-regulators, and (4) develop predictive models of genome structure and function.


Subjects
Cell Nucleus , Genome , Genome/genetics , Cell Nucleus/genetics , Cell Nucleus/metabolism , Chromatin/metabolism
17.
Elife ; 12, 2023 Jun 12.
Article in English | MEDLINE | ID: mdl-37306300

ABSTRACT

Bacteria within the gut microbiota possess the ability to metabolize a wide array of human drugs, foods, and toxins, but the responsible enzymes for these chemical events remain largely uncharacterized due to the time-consuming nature of current experimental approaches. Attempts have been made in the past to computationally predict which bacterial species and enzymes are responsible for chemical transformations in the gut environment, but with low accuracy due to minimal chemical representation and sequence similarity search schemes. Here, we present an in silico approach that employs chemical and protein Similarity algorithms that Identify MicrobioMe Enzymatic Reactions (SIMMER). We show that SIMMER accurately predicts the responsible species and enzymes for a queried reaction, unlike previous methods. We demonstrate SIMMER use cases in the context of drug metabolism by predicting previously uncharacterized enzymes for 88 drug transformations known to occur in the human gut. We validate these predictions on external datasets and provide an in vitro validation of SIMMER's predictions for metabolism of methotrexate, an anti-arthritic drug. After demonstrating its utility and accuracy, we made SIMMER available as both a command-line and web tool, with flexible input and output options for determining chemical transformations within the human gut. We present SIMMER as a computational addition to the microbiome researcher's toolbox, enabling them to make informed hypotheses before embarking on the lengthy laboratory experiments required to characterize novel bacterial enzymes that can alter human ingested compounds.


Subjects
Gastrointestinal Microbiome , Microbiota , Humans , Bacteria/metabolism , Food , Algorithms
18.
Nat Commun ; 14(1): 3510, 2023 Jun 14.
Article in English | MEDLINE | ID: mdl-37316519

ABSTRACT

Microbial community function depends on both taxonomic composition and spatial organization. While composition of the human gut microbiome has been deeply characterized, less is known about the organization of microbes between regions such as lumen and mucosa and the microbial genes regulating this organization. Using a defined 117 strain community for which we generate high-quality genome assemblies, we model mucosa/lumen organization with in vitro cultures incorporating mucin hydrogel carriers as surfaces for bacterial attachment. Metagenomic tracking of carrier cultures reveals increased diversity and strain-specific spatial organization, with distinct strains enriched on carriers versus liquid supernatant, mirroring mucosa/lumen enrichment in vivo. A comprehensive search for microbial genes associated with this spatial organization identifies candidates with known adhesion-related functions, as well as novel links. These findings demonstrate that carrier cultures of defined communities effectively recapitulate fundamental aspects of gut spatial organization, enabling identification of key microbial strains and genes.


Subjects
Gastrointestinal Microbiome , Microbiota , Humans , Gastrointestinal Microbiome/genetics , Hydrogels , Metagenome , Microbiota/genetics , Mucins
19.
Res Sq ; 2023 May 23.
Article in English | MEDLINE | ID: mdl-37292728

ABSTRACT

Comparing chromatin contact maps is an essential step in quantifying how three-dimensional (3D) genome organization shapes development, evolution, and disease. However, no gold standard exists for comparing contact maps, and even simple methods often disagree. In this study, we propose novel comparison methods and evaluate them alongside existing approaches using genome-wide Hi-C data and 22,500 in silico predicted contact maps. We also quantify the robustness of methods to common sources of biological and technical variation, such as boundary size and noise. We find that simple difference-based methods such as mean squared error are suitable for initial screening, but biologically informed methods are necessary to identify why maps diverge and propose specific functional hypotheses. We provide a reference guide, codebase, and benchmark for rapidly comparing chromatin contact maps at scale to enable biological insights into the 3D organization of the genome.
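
The simplest difference-based comparison named above, mean squared error, can be sketched in a few lines of Python. This is an illustrative example only, not the authors' codebase: the function name `mean_squared_error` and the representation of a contact map as a list of equal-length rows are assumptions.

```python
def mean_squared_error(map_a, map_b):
    """Mean squared difference between two equally shaped contact maps.

    Each map is a list of rows of numeric contact frequencies
    (a hypothetical in-memory format chosen for this sketch).
    """
    if len(map_a) != len(map_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(map_a, map_b)
    ):
        raise ValueError("contact maps must have the same shape")
    squared = [
        (a - b) ** 2
        for row_a, row_b in zip(map_a, map_b)
        for a, b in zip(row_a, row_b)
    ]
    if not squared:
        raise ValueError("contact maps must not be empty")
    return sum(squared) / len(squared)


if __name__ == "__main__":
    reference = [[1.0, 0.0], [0.0, 1.0]]
    perturbed = [[1.0, 1.0], [1.0, 1.0]]
    print(mean_squared_error(reference, perturbed))
```

A single number like this can rank map pairs for screening, but it does not say where or why two maps differ, which is the gap the biologically informed methods are meant to fill.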

20.
bioRxiv ; 2023 Apr 04.
Article in English | MEDLINE | ID: mdl-37066196

ABSTRACT

Comparing chromatin contact maps is an essential step in quantifying how three-dimensional (3D) genome organization shapes development, evolution, and disease. However, no gold standard exists for comparing contact maps, and even simple methods often disagree. In this study, we propose novel comparison methods and evaluate them alongside existing approaches using genome-wide Hi-C data and 22,500 in silico predicted contact maps. We also quantify the robustness of methods to common sources of biological and technical variation, such as boundary size and noise. We find that simple difference-based methods such as mean squared error are suitable for initial screening, but biologically informed methods are necessary to identify why maps diverge and propose specific functional hypotheses. We provide a reference guide, codebase, and benchmark for rapidly comparing chromatin contact maps at scale to enable biological insights into the 3D organization of the genome.
