ABSTRACT
Determining the spatial organization and morphological characteristics of molecularly defined cell types is a major bottleneck for characterizing the architecture underpinning brain function. We developed Expansion-Assisted Iterative Fluorescence In Situ Hybridization (EASI-FISH) to survey gene expression in brain tissue, as well as a turnkey computational pipeline to rapidly process large EASI-FISH image datasets. EASI-FISH was optimized for thick brain sections (300 µm) to facilitate reconstruction of spatio-molecular domains that generalize across brains. Using the EASI-FISH pipeline, we investigated the spatial distribution of dozens of molecularly defined cell types in the lateral hypothalamic area (LHA), a brain region with poorly defined anatomical organization. Mapping cell types in the LHA revealed nine spatially and molecularly defined subregions. EASI-FISH also facilitates iterative reanalysis of scRNA-seq datasets to determine marker-genes that further dissociated spatial and morphological heterogeneity. The EASI-FISH pipeline democratizes mapping molecularly defined cell types, enabling discoveries about brain organization.
Subjects
Lateral Hypothalamic Area/metabolism , Fluorescence In Situ Hybridization , Animals , Biomarkers/metabolism , Gene Expression Profiling , Gene Expression Regulation , Lateral Hypothalamic Area/cytology , Three-Dimensional Imaging , Male , Inbred C57BL Mice , Neurons/metabolism , Neuropeptides/metabolism , Proto-Oncogene Proteins c-fos/metabolism , RNA/metabolism , RNA-Seq , Single-Cell Analysis , Genetic Transcription
ABSTRACT
Many proteins contain disordered regions of low-sequence complexity, which cause aging-associated diseases because they are prone to aggregate. Here, we study FUS, a prion-like protein containing intrinsically disordered domains associated with the neurodegenerative disease ALS. We show that, in cells, FUS forms liquid compartments at sites of DNA damage and in the cytoplasm upon stress. We confirm this by reconstituting liquid FUS compartments in vitro. Using an in vitro "aging" experiment, we demonstrate that liquid droplets of FUS protein convert with time from a liquid to an aggregated state, and this conversion is accelerated by patient-derived mutations. We conclude that the physiological role of FUS requires forming dynamic liquid-like compartments. We propose that liquid-like compartments carry the trade-off between functionality and risk of aggregation and that aberrant phase transitions within liquid-like compartments lie at the heart of ALS and, presumably, other age-related diseases.
Subjects
Aging/pathology , Amyotrophic Lateral Sclerosis/genetics , Amyotrophic Lateral Sclerosis/pathology , Mutation , RNA-Binding Protein FUS/chemistry , RNA-Binding Protein FUS/genetics , Aging/metabolism , Amyotrophic Lateral Sclerosis/metabolism , Cell Nucleus/chemistry , Cytoplasm/chemistry , Humans , Prions/chemistry , Protein Aggregates , Tertiary Protein Structure , RNA-Binding Protein FUS/metabolism
ABSTRACT
The genomes of living lungfishes can inform on the molecular-developmental basis of the Devonian sarcopterygian fish-tetrapod transition. We de novo sequenced the genomes of the African (Protopterus annectens) and South American lungfishes (Lepidosiren paradoxa). The Lepidosiren genome (about 91 Gb, roughly 30 times the human genome) is the largest animal genome sequenced so far and more than twice the size of the Australian (Neoceratodus forsteri) [1] and African [2] lungfishes owing to enlarged intergenic regions and introns with high repeat content (about 90%). All lungfish genomes continue to expand as some transposable elements (TEs) are still active today. In particular, Lepidosiren's genome grew extremely fast during the past 100 million years (Myr), adding the equivalent of one human genome every 10 Myr. This massive genome expansion seems to be related to a reduction of PIWI-interacting RNAs and C2H2 zinc-finger and Krüppel-associated box (KRAB)-domain protein genes that suppress TE expansions. Although TE abundance facilitates chromosomal rearrangements, lungfish chromosomes still conservatively reflect the ur-tetrapod karyotype. Neoceratodus' limb-like fins still resemble those of their extinct relatives and remained phenotypically static for about 100 Myr. We show that the secondary loss of limb-like appendages in the Lepidosiren-Protopterus ancestor was probably due to loss of sonic hedgehog limb-specific enhancers.
Subjects
Molecular Evolution , Fishes , Genome , Animals , Humans , Africa , Animal Fins/anatomy & histology , Australia , DNA Transposable Elements/genetics , Intergenic DNA/genetics , Genetic Enhancer Elements/genetics , Biological Extinction , Fishes/anatomy & histology , Fishes/classification , Fishes/genetics , Gene Rearrangement/genetics , Genome/genetics , Genome Size , Hedgehog Proteins/genetics , Introns , Karyotype , Phylogeny , Piwi-Interacting RNA/genetics , South America , Time Factors , Zinc Fingers/genetics
ABSTRACT
The concomitant occurrence of tissue growth and organization is a hallmark of organismal development [1-3]. This often means that proliferating and differentiating cells are found at the same time in a continuously changing tissue environment. How cells adapt to architectural changes to prevent spatial interference remains unclear. Here, to understand how cell movements that are key for growth and organization are orchestrated, we study the emergence of photoreceptor neurons that occur during the peak of retinal growth, using zebrafish, human tissue and human organoids. Quantitative imaging reveals that successful retinal morphogenesis depends on the active bidirectional translocation of photoreceptors, leading to a transient transfer of the entire cell population away from the apical proliferative zone. This pattern of migration is driven by cytoskeletal machineries that differ depending on the direction: microtubules are exclusively required for basal translocation, whereas actomyosin is involved in apical movement. Blocking the basal translocation of photoreceptors induces apical congestion, which hampers the apical divisions of progenitor cells and leads to secondary defects in lamination. Thus, photoreceptor migration is crucial to prevent competition for space, and to allow concurrent tissue growth and lamination. This shows that neuronal migration, in addition to its canonical role in cell positioning [4], can be involved in coordinating morphogenesis.
Subjects
Cell Movement , Morphogenesis , Photoreceptor Cells , Retina , Animals , Humans , Actomyosin/metabolism , Cell Competition , Cell Differentiation , Cell Movement/physiology , Cell Proliferation , Microtubules/metabolism , Morphogenesis/physiology , Organoids/cytology , Organoids/embryology , Photoreceptor Cells/cytology , Photoreceptor Cells/physiology , Retina/cytology , Retina/embryology , Zebrafish/embryology
ABSTRACT
The transition from 'well-marked varieties' of a single species into 'well-defined species', especially in the absence of geographic barriers to gene flow (sympatric speciation), has puzzled evolutionary biologists ever since Darwin [1,2]. Gene flow counteracts the buildup of genome-wide differentiation, which is a hallmark of speciation and increases the likelihood of the evolution of irreversible reproductive barriers (incompatibilities) that complete the speciation process [3]. Theory predicts that the genetic architecture of divergently selected traits can influence whether sympatric speciation occurs [4], but empirical tests of this theory are scant because comprehensive data are difficult to collect and synthesize across species, owing to their unique biologies and evolutionary histories [5]. Here, within a young species complex of neotropical cichlid fishes (Amphilophus spp.), we analysed genomic divergence among populations and species. By generating a new genome assembly and re-sequencing 453 genomes, we uncovered the genetic architecture of traits that have been suggested to be important for divergence. Species that differ in monogenic or oligogenic traits that affect ecological performance and/or mate choice show remarkably localized genomic differentiation. By contrast, differentiation among species that have diverged in polygenic traits is genomically widespread and much higher overall, consistent with the evolution of effective and stable genome-wide barriers to gene flow. Thus, we conclude that simple trait architectures are not always as conducive to speciation with gene flow as previously suggested, whereas polygenic architectures can promote rapid and stable speciation in sympatry.
Subjects
Cichlids/classification , Cichlids/genetics , Genetic Speciation , Genome/genetics , Genomics , Sympatry/genetics , Animals , Cichlids/anatomy & histology , Female , Gene Flow , Genetic Drift , Male , Animal Mating Preference , Multifactorial Inheritance/genetics , Phylogeny , Pigmentation/genetics , Genetic Polymorphism
ABSTRACT
Bats possess extraordinary adaptations, including flight, echolocation, extreme longevity and unique immunity. High-quality genomes are crucial for understanding the molecular basis and evolution of these traits. Here we incorporated long-read sequencing and state-of-the-art scaffolding protocols [1] to generate, to our knowledge, the first reference-quality genomes of six bat species (Rhinolophus ferrumequinum, Rousettus aegyptiacus, Phyllostomus discolor, Myotis myotis, Pipistrellus kuhlii and Molossus molossus). We integrated gene projections from our 'Tool to infer Orthologs from Genome Alignments' (TOGA) software with de novo and homology gene predictions as well as short- and long-read transcriptomics to generate highly complete gene annotations. To resolve the phylogenetic position of bats within Laurasiatheria, we applied several phylogenetic methods to comprehensive sets of orthologous protein-coding and noncoding regions of the genome, and identified a basal origin for bats within Scrotifera. Our genome-wide screens revealed positive selection on hearing-related genes in the ancestral branch of bats, which is indicative of laryngeal echolocation being an ancestral trait in this clade. We found selection and loss of immunity-related genes (including pro-inflammatory NF-κB regulators) and expansions of anti-viral APOBEC3 genes, which highlights molecular mechanisms that may contribute to the exceptional immunity of bats. Genomic integrations of diverse viruses provide a genomic record of historical tolerance to viral infection in bats. Finally, we found and experimentally validated bat-specific variation in microRNAs, which may regulate bat-specific gene-expression programs. Our reference-quality bat genomes provide the resources required to uncover and validate the genomic basis of adaptations of bats, and stimulate new avenues of research that are directly relevant to human health and disease [1].
Subjects
Physiological Adaptation/genetics , Chiroptera/genetics , Molecular Evolution , Genome/genetics , Genomics/standards , Physiological Adaptation/immunology , Animals , Chiroptera/classification , Chiroptera/immunology , DNA Transposable Elements/genetics , Immunity/genetics , Molecular Sequence Annotation/standards , Phylogeny , Untranslated RNA/genetics , Reference Standards , Reproducibility of Results , Virus Integration/genetics , Viruses/genetics
ABSTRACT
Some organisms in nature have developed the ability to enter a state of suspended metabolism called cryptobiosis when environmental conditions are unfavorable. This state-transition requires execution of a combination of genetic and biochemical pathways that enable the organism to survive for prolonged periods. Recently, nematode individuals have been reanimated from Siberian permafrost after remaining in cryptobiosis. Preliminary analysis indicates that these nematodes belong to the genera Panagrolaimus and Plectus. Here, we present precise radiocarbon dating indicating that the Panagrolaimus individuals have remained in cryptobiosis since the late Pleistocene (~46,000 years). Phylogenetic inference based on our genome assembly and a detailed morphological analysis demonstrate that they belong to an undescribed species, which we named Panagrolaimus kolymaensis. Comparative genome analysis revealed that the molecular toolkit for cryptobiosis in P. kolymaensis and in C. elegans is partly orthologous. We show that biochemical mechanisms employed by these two species to survive desiccation and freezing under laboratory conditions are similar. Our experimental evidence also reveals that C. elegans dauer larvae can remain viable for longer periods in suspended animation than previously reported. Altogether, our findings demonstrate that nematodes evolved mechanisms potentially allowing them to suspend life over geological time scales.
Subjects
Nematoda , Permafrost , Humans , Animals , Caenorhabditis elegans/genetics , Caenorhabditis elegans/metabolism , Larva/genetics , Larva/metabolism , Phylogeny
ABSTRACT
Sea turtles represent an ancient lineage of marine vertebrates that evolved from terrestrial ancestors over 100 Mya. The genomic basis of the unique physiological and ecological traits enabling these species to thrive in diverse marine habitats remains largely unknown. Additionally, many populations have drastically declined due to anthropogenic activities over the past two centuries, and their recovery is a high global conservation priority. We generated and analyzed high-quality reference genomes for the leatherback (Dermochelys coriacea) and green (Chelonia mydas) turtles, representing the two extant sea turtle families. These genomes are highly syntenic and homologous, but localized regions of noncollinearity were associated with higher copy numbers of immune, zinc-finger, and olfactory receptor (OR) genes in green turtles, with ORs related to waterborne odorants greatly expanded in green turtles. Our findings suggest that divergent evolution of these key gene families may underlie immunological and sensory adaptations assisting navigation, occupancy of neritic versus pelagic environments, and diet specialization. Reduced collinearity was especially prevalent in microchromosomes, with greater gene content, heterozygosity, and genetic distances between species, supporting their critical role in vertebrate evolutionary adaptation. Finally, diversity and demographic histories starkly contrasted between species, indicating that leatherback turtles have had a low yet stable effective population size, exhibit extremely low diversity compared with other reptiles, and harbor a higher genetic load compared with green turtles, reinforcing concern over their persistence under future climate scenarios. These genomes provide invaluable resources for advancing our understanding of evolution and conservation best practices in an imperiled vertebrate lineage.
Subjects
Turtles , Animals , Ecosystem , Population Dynamics
ABSTRACT
The escape of DNA from mitochondria into the nuclear genome (nuclear mitochondrial DNA, NUMT) is an ongoing process. Although pervasively observed in eukaryotic genomes, their evolutionary trajectories in a mammal-wide context are poorly understood. The main challenge lies in the orthology assignment of NUMTs across species due to their fast evolution and chromosomal rearrangements over the past 200 million years. To address this issue, we systematically investigated the characteristics of NUMT insertions in 45 mammalian genomes and established a novel, synteny-based method to accurately predict orthologous NUMTs and ascertain their evolution across mammals. With a series of comparative analyses across taxa, we revealed that NUMTs may originate from nonrandom regions in mtDNA, are likely found in transposon-rich and intergenic regions, and unlikely code for functional proteins. Using our synteny-based approach, we leveraged 630 pairwise comparisons of genome-wide microsynteny and predicted the NUMT orthology relationships across 36 mammals. With the phylogenetic patterns of NUMT presence-and-absence across taxa, we constructed the ancestral state of NUMTs given the mammal tree using a coalescent method. We found support on the ancestral node of Fereuungulata within Laurasiatheria, whose subordinal relationships are still controversial. This study broadens our knowledge on NUMT insertion and evolution in mammalian genomes and highlights the merit of NUMTs as alternative genetic markers in phylogenetic inference.
Subjects
Mitochondrial Genome , Genomics , Animals , Phylogeny , Mitochondria/genetics , Mitochondrial DNA/genetics , Mammals/genetics , DNA Sequence Analysis , Cell Nucleus/genetics , Molecular Evolution
ABSTRACT
Variant calling has been widely used for genotyping and for improving the consensus accuracy of long-read assemblies. Variant calls are commonly hard-filtered with user-defined cutoffs. However, it is impossible to define a single set of optimal cutoffs, as the calls heavily depend on the quality of the reads, the variant caller of choice and the quality of the unpolished assembly. Here, we introduce Merfin, a k-mer based variant-filtering algorithm for improved accuracy in genotyping and genome assembly polishing. Merfin evaluates each variant based on the expected k-mer multiplicity in the reads, independently of the quality of the read alignment and variant caller's internal score. Merfin increased the precision of genotyped calls in several benchmarks, improved consensus accuracy and reduced frameshift errors when applied to human and nonhuman assemblies built from Pacific Biosciences HiFi and continuous long reads or Oxford Nanopore reads, including the first complete human genome. Moreover, we introduce assembly quality and completeness metrics that account for the expected genomic copy numbers.
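The k-mer idea described in this abstract can be illustrated with a minimal Python sketch. It is not Merfin's implementation: it only checks whether the k-mers spanning a candidate edit are present in the reads, whereas the abstract describes scoring against expected k-mer multiplicity. The function names (`count_kmers`, `spanning_kmers`, `keep_variant`) and the decision rule are assumptions made for illustration.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer observed in a collection of read sequences."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def spanning_kmers(seq, start, length, k):
    """Return the k-mers of seq that overlap the interval [start, start + length)."""
    first = max(0, start - k + 1)
    last = min(len(seq) - k, start + length - 1)
    return [seq[i:i + k] for i in range(first, last + 1)]


def unsupported(kmers, read_counts):
    """Number of k-mers that never occur in the reads."""
    return sum(1 for kmer in kmers if read_counts[kmer] == 0)


def keep_variant(assembly, pos, ref, alt, read_counts, k):
    """Hypothetical rule: keep the edit only if it reduces unsupported k-mers."""
    if assembly[pos:pos + len(ref)] != ref:
        raise ValueError("ref allele does not match the assembly")
    edited = assembly[:pos] + alt + assembly[pos + len(ref):]
    before = unsupported(spanning_kmers(assembly, pos, len(ref), k), read_counts)
    after = unsupported(spanning_kmers(edited, pos, len(alt), k), read_counts)
    return after < before
```

Because the decision depends only on k-mers counted directly from the reads, it does not use alignment quality or a caller's internal score, which is the property the abstract emphasizes.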
Subjects
High-Throughput Nucleotide Sequencing , Nanopores , Genome , Genomics , Humans , DNA Sequence Analysis
ABSTRACT
A global international initiative, such as the Earth BioGenome Project (EBP), requires both agreement and coordination on standards to ensure that the collective effort generates rapid progress toward its goals. To this end, the EBP initiated five technical standards committees comprising volunteer members from the global genomics scientific community: Sample Collection and Processing, Sequencing and Assembly, Annotation, Analysis, and IT and Informatics. The current versions of the resulting standards documents are available on the EBP website, with the recognition that opportunities, technologies, and challenges may improve or change in the future, requiring flexibility for the EBP to meet its goals. Here, we describe some highlights from the proposed standards, and areas where additional challenges will need to be met.
Subjects
Base Sequence/genetics , Eukaryota/genetics , Genomics/standards , Animals , Biodiversity , Genomics/methods , Humans , Reference Standards , Reference Values , DNA Sequence Analysis/methods , DNA Sequence Analysis/standards
ABSTRACT
Horizontal transfer of transposable elements (TEs) is an important mechanism contributing to genetic diversity and innovation. Bats (order Chiroptera) have repeatedly been shown to experience horizontal transfer of TEs at what appears to be a high rate compared with other mammals. We investigated the occurrence of horizontally transferred (HT) DNA transposons involving bats. We found over 200 putative HT elements within bats; 16 transposons were shared across distantly related mammalian clades, and 2 other elements were shared with a fish and two lizard species. Our results indicate that bats are a hotspot for horizontal transfer of DNA transposons. These events broadly coincide with the diversification of several bat clades, supporting the hypothesis that DNA transposon invasions have contributed to genetic diversification of bats.
Subjects
Chiroptera , DNA Transposable Elements , Animals , DNA Transposable Elements/genetics , Chiroptera/genetics , Horizontal Gene Transfer , Molecular Evolution , Mammals/genetics , Phylogeny
ABSTRACT
The C. elegans cell lineage provides a unique opportunity to look at how cell lineage affects patterns of gene expression. We developed an automatic cell lineage analyzer that converts high-resolution images of worms into a data table showing fluorescence expression with single-cell resolution. We generated expression profiles of 93 genes in 363 specific cells from L1 stage larvae and found that cells with identical fates can be formed by different gene regulatory pathways. Molecular signatures identified repeating cell fate modules within the cell lineage and enabled the generation of a molecular differentiation map that reveals points in the cell lineage when developmental fates of daughter cells begin to diverge. These results demonstrate insights that become possible using computational approaches to analyze quantitative expression from many genes in parallel using a digital gene expression atlas.
Subjects
Caenorhabditis elegans/cytology , Caenorhabditis elegans/genetics , Cell Lineage , Gene Expression Profiling , Animals , Caenorhabditis elegans/metabolism , Caenorhabditis elegans Proteins , Cell Differentiation , Gene Expression Profiling/methods
ABSTRACT
The planarian Schmidtea mediterranea is an important model for stem cell research and regeneration, but adequate genome resources for this species have been lacking. Here we report a highly contiguous genome assembly of S. mediterranea, using long-read sequencing and a de novo assembler (MARVEL) enhanced for low-complexity reads. The S. mediterranea genome is highly polymorphic and repetitive, and harbours a novel class of giant retroelements. Furthermore, the genome assembly lacks a number of highly conserved genes, including critical components of the mitotic spindle assembly checkpoint, but planarians maintain checkpoint function. Our genome assembly provides a key model system resource that will be useful for studying regeneration and the evolutionary plasticity of core cell biological mechanisms.
Subjects
Molecular Evolution , Genome/genetics , Planarians/cytology , Planarians/genetics , Animals , Cell Cycle Proteins/deficiency , Genomics , M Phase Cell Cycle Checkpoints/genetics , M Phase Cell Cycle Checkpoints/physiology , Mad2 Proteins/deficiency , Planarians/physiology , Regeneration/genetics , Asexual Reproduction/genetics , Retroelements/genetics
ABSTRACT
In the originally published version of this Article, the sequenced axolotl strain (the homozygous white mutant) was denoted as 'D/D' rather than 'd/d' in Fig. 1a and the accompanying legend, the main text and the Methods section. The original Article has been corrected online.
ABSTRACT
Salamanders serve as important tetrapod models for developmental, regeneration and evolutionary studies. An extensive molecular toolkit makes the Mexican axolotl (Ambystoma mexicanum) a key representative salamander for molecular investigations. Here we report the sequencing and assembly of the 32-gigabase-pair axolotl genome using an approach that combined long-read sequencing, optical mapping and development of a new genome assembler (MARVEL). We observed a size expansion of introns and intergenic regions, largely attributable to multiplication of long terminal repeat retroelements. We provide evidence that intron size in developmental genes is under constraint and that species-restricted genes may contribute to limb regeneration. The axolotl genome assembly does not contain the essential developmental gene Pax3. However, mutation of the axolotl Pax3 paralogue Pax7 resulted in an axolotl phenotype that was similar to those seen in Pax3-/- and Pax7-/- mutant mice. The axolotl genome provides a rich biological resource for developmental and evolutionary studies.
Subjects
Ambystoma mexicanum/genetics , Molecular Evolution , Genome/genetics , Genomics , Animals , Intergenic DNA/genetics , Essential Genes/genetics , Homeodomain Proteins/genetics , Introns/genetics , Male , Mice , PAX3 Transcription Factor/genetics , PAX7 Transcription Factor/genetics , Picea/genetics , Pinus/genetics , Regeneration/genetics , Retroelements/genetics , Terminal Repeat Sequences/genetics
ABSTRACT
BACKGROUND: Smell abilities differ greatly among vertebrate species due to distinct sensory needs, with exceptional variability reported in the number of olfactory genes and the size of the odour-processing regions of the brain. However, key environmental factors shaping genomic and phenotypic changes linked to the olfactory system remain difficult to identify at macroevolutionary scales. Here, we investigate the association between diverse ecological traits and the number of olfactory chemoreceptors in approximately two hundred ray-finned fishes. RESULTS: We found independent expansions producing large gene repertoires in several lineages of nocturnal amphibious fishes, generally able to perform active terrestrial exploration. We reinforced this finding with on-purpose genomic and transcriptomic analysis of Channallabes apus, a catfish species from a clade with chemosensory-based aerial orientation. Furthermore, we also detected an augmented information-processing capacity in the olfactory bulb of nocturnal amphibious fishes by estimating the number of cells contained in this brain region in twenty-four actinopterygian species. CONCLUSIONS: Overall, we report a convergent genomic and phenotypic magnification of the olfactory system in nocturnal amphibious fishes. This finding suggests the possibility of an analogous evolutionary event in fish-like tetrapod ancestors during the first steps of the water-to-land transition, favouring terrestrial adaptation through enhanced aerial orientation.
Subjects
Biological Evolution , Vertebrates , Animals , Vertebrates/genetics , Physiological Adaptation , Acclimatization , Fishes/genetics
ABSTRACT
BACKGROUND: PacBio high fidelity (HiFi) sequencing reads are both long (15-20 kb) and highly accurate (> Q20). Because of these properties, they have revolutionised genome assembly, leading to more accurate and contiguous genomes. In eukaryotes the mitochondrial genome is sequenced alongside the nuclear genome, often at very high coverage. A dedicated tool for mitochondrial genome assembly using HiFi reads is still missing. RESULTS: MitoHiFi was developed within the Darwin Tree of Life Project to assemble mitochondrial genomes from the HiFi reads generated for target species. The input for MitoHiFi is either the raw reads or the assembled contigs, and the tool outputs a mitochondrial genome sequence FASTA file along with annotation of protein and RNA genes. Variants arising from heteroplasmy are assembled independently, and nuclear insertions of mitochondrial sequences are identified and not used in organellar genome assembly. MitoHiFi has been used to assemble 374 mitochondrial genomes (368 Metazoa and 6 Fungi species) for the Darwin Tree of Life Project, the Vertebrate Genomes Project and the Aquatic Symbiosis Genome Project. Inspection of 60 mitochondrial genomes assembled with MitoHiFi for species that already have reference sequences in public databases showed the widespread presence of previously unreported repeats. CONCLUSIONS: MitoHiFi is able to assemble mitochondrial genomes from a wide phylogenetic range of taxa from PacBio HiFi data. MitoHiFi is written in Python and is freely available on GitHub (https://github.com/marcelauliano/MitoHiFi). MitoHiFi is available with its dependencies as a Docker container on GitHub (ghcr.io/marcelauliano/mitohifi:master).
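To make the accuracy figure concrete, here is a small Python sketch that converts between quality scores and error probabilities. It assumes that "Q20" refers to the standard Phred scale, Q = -10 log10(p); the helper names are illustrative and are not part of MitoHiFi.

```python
import math


def phred_to_error(q):
    """Per-base error probability implied by a Phred quality score."""
    return 10 ** (-q / 10)


def error_to_phred(p):
    """Phred quality score implied by a per-base error probability."""
    return -10 * math.log10(p)


def expected_errors(read_length, q):
    """Expected number of erroneous bases in a read of the given length."""
    return read_length * phred_to_error(q)
```

On this scale, Q20 corresponds to a 1% per-base error probability, so a 15 kb read at exactly Q20 would be expected to carry about 150 errors; reads above Q20 carry fewer.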
Subjects
Mitochondrial Genome , Phylogeny , RNA , Eukaryota , DNA Sequence Analysis , High-Throughput Nucleotide Sequencing
ABSTRACT
MOTIVATION: Long tandem repeat expansions of more than 1000 nt have been suggested to be associated with diseases, but remain largely unexplored in individual human genomes because read lengths have been too short. However, new long-read sequencing technologies can produce single reads of 10 000 nt or more that can span such repeat expansions, although these long reads have high error rates, of 10-20%, which complicates the detection of repetitive elements. Moreover, most traditional algorithms for finding tandem repeats are designed to find short tandem repeats (<1000 nt) and cannot effectively handle the high error rate of long reads in a reasonable amount of time. RESULTS: Here, we report an efficient algorithm for solving this problem that takes advantage of the length of the repeat. Namely, a long tandem repeat has hundreds or thousands of approximate copies of the repeated unit, so despite the error rate, many short k-mers will be error-free in many copies of the unit. We exploited this characteristic to develop a method for first estimating regions that could contain a tandem repeat, by analyzing the k-mer frequency distributions of fixed-size windows across the target read, followed by an algorithm that assembles the k-mers of a putative region into the consensus repeat unit by greedily traversing a de Bruijn graph. Experimental results indicated that the proposed algorithm largely outperformed Tandem Repeats Finder, a widely used program for finding tandem repeats, in terms of sensitivity. AVAILABILITY AND IMPLEMENTATION: https://github.com/morisUtokyo/mTR.
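The two stages described in this abstract, flagging windows by k-mer frequency and then assembling a consensus unit by greedily walking a de Bruijn graph, can be sketched in Python as follows. This is a simplified illustration and not the mTR implementation: the scoring function, the threshold, the non-overlapping windows, and all function names are assumptions.

```python
from collections import Counter


def repeat_score(window, k):
    """Fraction of k-mer positions whose k-mer occurs more than once in the window."""
    kmers = [window[i:i + k] for i in range(len(window) - k + 1)]
    if not kmers:
        return 0.0
    counts = Counter(kmers)
    return sum(1 for kmer in kmers if counts[kmer] > 1) / len(kmers)


def candidate_windows(read, k, window, threshold):
    """Start offsets of non-overlapping fixed-size windows that look repetitive."""
    stop = max(1, len(read) - window + 1)
    return [s for s in range(0, stop, window)
            if repeat_score(read[s:s + window], k) >= threshold]


def greedy_unit(region, k):
    """Greedily follow the most frequent successor k-mer until the walk closes."""
    counts = Counter(region[i:i + k] for i in range(len(region) - k + 1))
    start = counts.most_common(1)[0][0]
    path, seen, node = [start], {start}, start
    while True:
        # Each k-mer has up to four successors in the de Bruijn graph.
        best = max((node[1:] + base for base in "ACGT"),
                   key=lambda kmer: counts[kmer])
        if counts[best] == 0 or best in seen:
            break  # dead end, or the cycle has closed
        path.append(best)
        seen.add(best)
        node = best
    # One base per visited node spells a single copy of the unit.
    return "".join(kmer[0] for kmer in path)
```

With sequencing errors, erroneous k-mers are rare while the true k-mers of the unit recur across many copies, so picking the most frequent successor tends to stay on the true cycle. The unit is returned up to rotation, since the walk starts from whichever k-mer is most frequent.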
Subjects
Algorithms , High-Throughput Nucleotide Sequencing , Human Genome , Humans , Microsatellite Repeats , DNA Sequence Analysis
ABSTRACT
Fluorescence microscopy is a key driver of discoveries in the life sciences, with observable phenomena being limited by the optics of the microscope, the chemistry of the fluorophores, and the maximum photon exposure tolerated by the sample. These limits necessitate trade-offs between imaging speed, spatial resolution, light exposure, and imaging depth. In this work we show how content-aware image restoration based on deep learning extends the range of biological phenomena observable by microscopy. We demonstrate on eight concrete examples how microscopy images can be restored even if 60-fold fewer photons are used during acquisition, how near isotropic resolution can be achieved with up to tenfold under-sampling along the axial direction, and how tubular and granular structures smaller than the diffraction limit can be resolved at 20-times-higher frame rates compared to state-of-the-art methods. All developed image restoration methods are freely available as open source software in Python, FIJI, and KNIME.