ABSTRACT
Structural variations are the greatest source of genetic variation, but they remain poorly understood because of technological limitations. Single-molecule long-read sequencing has the potential to dramatically advance the field, although high error rates are a challenge with existing methods. Addressing this need, we introduce open-source methods for long-read alignment (NGMLR; https://github.com/philres/ngmlr) and structural variant identification (Sniffles; https://github.com/fritzsedlazeck/Sniffles) that provide unprecedented sensitivity and precision for variant detection, even in repeat-rich regions and for complex nested events that can have substantial effects on human health. In several long-read datasets, including healthy and cancerous human genomes, we discovered thousands of novel variants and categorized systematic errors in short-read approaches. NGMLR and Sniffles can automatically filter false events and operate on low-coverage data, thereby reducing the high costs that have hindered the application of long reads in clinical and research settings.
Subject(s)
DNA Mutational Analysis/methods , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Genome, Human , Genomics/methods , Humans
ABSTRACT
Calling structural variations (SVs) is technically challenging, but using long reads remains the most accurate way to identify complex genomic alterations. Here we present Sniffles2, which improves over current methods by implementing repeat-aware clustering coupled with a fast consensus sequence and coverage-adaptive filtering. Sniffles2 is 11.8 times faster and 29% more accurate than state-of-the-art SV callers across different coverages (5-50×), sequencing technologies (ONT and HiFi) and SV types. Furthermore, Sniffles2 solves the problem of family-level to population-level SV calling to produce fully genotyped VCF files. Across 11 probands, we accurately identified causative SVs around MECP2, including highly complex alleles with three overlapping SVs. Sniffles2 also enables the detection of mosaic SVs in bulk long-read data. As a result, we identified multiple mosaic SVs in brain tissue from a patient with multiple system atrophy. The identified SVs showed a remarkable diversity within the cingulate cortex, impacting both genes involved in neuron function and repetitive elements.
Subject(s)
Mosaicism , Humans , High-Throughput Nucleotide Sequencing/methods , Genomic Structural Variation/genetics , Software , Sequence Analysis, DNA/methods
ABSTRACT
Rhabdomeric opsins (r-opsins) are light sensors in cephalic eye photoreceptors, but also function in additional sensory organs. This has prompted questions about the evolutionary relationship of these cell types, and whether ancient r-opsins were non-photosensory. A molecular profiling approach in the marine bristleworm Platynereis dumerilii revealed shared and distinct features of cephalic and non-cephalic r-opsin1-expressing cells. Non-cephalic cells possess a full set of phototransduction components, but also a mechanosensory signature. Prompted by the latter, we investigated putative Platynereis mechanotransducers and found by HCR RNA-FISH that nompc and pkd2.1 are co-expressed with r-opsin1 in TRE cells. To further assess the role of r-Opsin1 in these cells, we studied its signaling properties and found that r-Opsin1 is a Gαq-coupled blue light receptor. Profiling of cells from r-opsin1 mutants versus wild-types, and a comparison under different light conditions, reveals that in the non-cephalic cells light, mediated by r-Opsin1, adjusts the expression level of a calcium transporter relevant for auditory mechanosensation in vertebrates. We establish a deep-learning-based quantitative behavioral analysis for animal trunk movements and identify a light- and r-Opsin1-dependent fine-tuning of the worm's undulatory movements in headless trunks, which are known to require mechanosensory feedback. Our results provide new data on peripheral cell types of likely light sensory/mechanosensory nature. These results point towards a concept in which such a multisensory cell type evolved to allow for fine-tuning of mechanosensation by light. This implies that light-independent mechanosensory roles of r-opsins may have evolved secondarily.
Subject(s)
Biological Evolution , Mechanoreceptors/physiology , Photoreceptor Cells, Invertebrate/physiology , Polychaeta/physiology , Animals , Evolution, Molecular
ABSTRACT
In October 2020, 62 scientists from nine nations worked together remotely in the Second Baylor College of Medicine & DNAnexus hackathon, focusing on related topics in structural variation, pan-genomes, and SARS-CoV-2 research. The overarching focus was to assess the current status of the field, identify the remaining challenges, and determine how to combine the strengths of the different interests to drive research and method development forward. Over the four days, eight groups each designed and developed new open-source methods to improve the identification and analysis of variations among species, including humans and SARS-CoV-2. These included improvements in SV calling, genotyping, annotation and filtering, together with advancements in benchmarking existing methods. Furthermore, groups focused on the diversity of SARS-CoV-2. Daily discussion summaries and methods are publicly available at https://github.com/collaborativebioinformatics and provide valuable insights for both participants and the research community.
Subject(s)
COVID-19 , SARS-CoV-2 , Animals , Genome, Viral , Humans , Vertebrates
ABSTRACT
Mapping reads to a genome remains challenging, especially for non-model organisms with lower quality assemblies, or for organisms with higher mutation rates. While most research has focused on speeding up the mapping process, little attention has been paid to optimizing the choice of mapper and parameters for a user's dataset. Here, we present Teaser, software that assists in these choices through rapid automated benchmarking of different mappers and parameter settings for individualized data. Within minutes, Teaser completes a quantitative evaluation of an ensemble of mapping algorithms and parameters. We use Teaser to demonstrate how Bowtie2 can be optimized for different data.
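
The general idea of benchmarking mapper configurations against simulated reads with known origins can be illustrated with a minimal Python sketch. This is not Teaser's implementation: the names `Score`, `score_mapping` and `rank_configurations`, the 5 bp tolerance, the ranking rule and the toy positions are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Score:
    """Counts of simulated reads by mapping outcome (hypothetical structure)."""
    correct: int
    wrong: int
    unmapped: int

    @property
    def precision(self) -> float:
        # Fraction of mapped reads that were placed correctly.
        mapped = self.correct + self.wrong
        return self.correct / mapped if mapped else 0.0

    @property
    def recall(self) -> float:
        # Fraction of all simulated reads that were placed correctly.
        total = self.correct + self.wrong + self.unmapped
        return self.correct / total if total else 0.0


def score_mapping(truth: Dict[str, int],
                  reported: Dict[str, Optional[int]],
                  tolerance: int = 5) -> Score:
    """Compare reported positions with the simulated (true) positions.

    A read counts as correct if it maps within `tolerance` bases of its
    true origin; the tolerance value is an assumption of this sketch.
    """
    correct = wrong = unmapped = 0
    for read_id, true_pos in truth.items():
        pos = reported.get(read_id)
        if pos is None:
            unmapped += 1
        elif abs(pos - true_pos) <= tolerance:
            correct += 1
        else:
            wrong += 1
    return Score(correct, wrong, unmapped)


def rank_configurations(truth: Dict[str, int],
                        runs: Dict[str, Dict[str, Optional[int]]],
                        tolerance: int = 5) -> List[Tuple[str, Score]]:
    """Rank mapper/parameter configurations by recall, then precision."""
    scored = [(name, score_mapping(truth, reported, tolerance))
              for name, reported in runs.items()]
    return sorted(scored,
                  key=lambda item: (item[1].recall, item[1].precision),
                  reverse=True)


# Toy data: true origins of four simulated reads and the positions
# reported by two hypothetical parameter settings of one mapper.
truth = {"r1": 100, "r2": 250, "r3": 900, "r4": 1500}
runs = {
    "default": {"r1": 100, "r2": 700, "r3": None, "r4": 1502},
    "sensitive": {"r1": 101, "r2": 250, "r3": 640, "r4": 1500},
}

ranking = rank_configurations(truth, runs)
for name, score in ranking:
    print(f"{name}: precision={score.precision:.2f} recall={score.recall:.2f}")
```

In this toy example the hypothetical "sensitive" setting ranks first because it places three of the four reads correctly, whereas "default" places two correctly and leaves one unmapped. A real benchmark would also need to account for chromosome, strand and runtime.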