ABSTRACT
Metagenomic sequencing combined with Oxford Nanopore Technology has the potential to become a point-of-care test for infectious disease in public health and clinical settings, providing rapid diagnosis of infection, guiding individual patient management and treatment strategies, and informing infection prevention and control practices. However, publicly available, streamlined, and reproducible pipelines for analyzing Nanopore metagenomic sequencing data are still lacking. Here we introduce NanoSPC, a scalable, portable and cloud-compatible pipeline for analyzing Nanopore sequencing data. NanoSPC can identify potentially pathogenic viruses and bacteria simultaneously to provide comprehensive characterization of individual samples. The pipeline can also detect single nucleotide variants and assemble high-quality complete consensus genome sequences, permitting high-resolution inference of transmission. We implement NanoSPC using the Nextflow workflow manager within Docker images to allow reproducibility and portability of the analysis. Moreover, we deploy NanoSPC to our scalable pathogen pipeline platform, enabling elastic computing for high-throughput Nanopore data on HPC clusters as well as multiple cloud platforms, such as Google Cloud, Amazon Elastic Compute Cloud, Microsoft Azure and OpenStack. Users can either access our web interface (https://nanospc.mmmoxford.uk) to run cloud-based analyses, monitor processes, and visualize results, or download the Docker images and use the command line to analyze data locally.
Subjects
Genome, Viral , Metagenomics/methods , Nanopore Sequencing/methods , Software , Viruses/genetics , Bacteria/genetics , Bacteria/isolation & purification , Cloud Computing , Viruses/isolation & purification
ABSTRACT
There is a need to identify microbial sequences that may form part of transmission chains, or that may represent importations across national boundaries, amidst large numbers of SARS-CoV-2 and other bacterial or viral sequences. Reference-based compression is a sequence analysis technique that allows both compact storage of sequence data and comparisons between sequences. Published implementations of the approach are being challenged by the large sample collections now being generated. Our aim was to develop fast software for detecting highly similar sequences in large collections of microbial genomes, including millions of SARS-CoV-2 genomes. To do so, we developed Catwalk, a tool that bypasses bottlenecks in the generation, comparison and in-memory storage of microbial genomes generated by reference mapping. It is a compiled solution, coded in Nim to increase performance. It can be accessed via command line, REST API or web server interfaces. We tested Catwalk using both SARS-CoV-2 and Mycobacterium tuberculosis genomes generated by prospective public-health sequencing programmes. Pairwise sequence comparisons, using clinically relevant similarity cut-offs, took about 0.39 and 0.66 µs, respectively; in 1 s, between 1 and 2 million sequences can be searched. Catwalk operates about 1700 times faster than, and uses about 8% of the RAM of, a Python reference-based compression and comparison tool in current use for outbreak detection. Catwalk can rapidly identify close relatives of a SARS-CoV-2 or M. tuberculosis genome amidst millions of samples.
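To make the idea of reference-based compression concrete, here is a minimal Python sketch. It is not Catwalk's implementation (which the abstract describes as compiled Nim code); the function names `compress`, `snp_distance` and `neighbours`, the toy sequences, and the choice to ignore positions with uncertain base calls are all assumptions made for illustration.

```python
from typing import Dict, List, Set

BASES = "ACGT"
Compressed = Dict[str, Set[int]]


def compress(sequence: str, reference: str) -> Compressed:
    """Keep only the positions where a reference-mapped sequence differs from the reference."""
    if len(sequence) != len(reference):
        raise ValueError("sequence must have the same length as the reference")
    diff: Compressed = {key: set() for key in BASES + "N"}
    for pos, (base, ref_base) in enumerate(zip(sequence, reference)):
        if base != ref_base:
            # Anything that is not A/C/G/T is treated as an uncertain call ("N").
            diff[base if base in BASES else "N"].add(pos)
    return diff


def snp_distance(a: Compressed, b: Compressed) -> int:
    """Count positions where two compressed sequences differ, ignoring uncertain calls."""
    unknown = a["N"] | b["N"]
    differing: Set[int] = set()
    for base in BASES:
        # A position differs if exactly one sample carries this variant base there.
        differing |= a[base] ^ b[base]
    return len(differing - unknown)


def neighbours(query: Compressed, collection: Dict[str, Compressed], cutoff: int) -> List[str]:
    """Return the names of stored samples within `cutoff` SNPs of the query."""
    return [name for name, other in collection.items() if snp_distance(query, other) <= cutoff]


# Hypothetical toy data, not taken from the study.
reference = "ACGTACGTAC"
samples = {
    "s1": compress("ACGTACGTAC", reference),  # identical to the reference
    "s2": compress("ACCTACGTAC", reference),  # one substitution
    "s3": compress("ACTTACGNAC", reference),  # one substitution, one uncertain call
}
close_to_s1 = neighbours(samples["s1"], samples, cutoff=0)
```

Because each sample is stored as small sets of variant positions, a pairwise comparison touches only those positions rather than the whole genome, which is the property that makes this representation attractive for large collections.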
Subjects
COVID-19 , Mycobacterium tuberculosis , Databases, Nucleic Acid , Humans , Mycobacterium tuberculosis/genetics , Prospective Studies , SARS-CoV-2/genetics , Software
ABSTRACT
We conducted voluntary Covid-19 testing programmes for symptomatic and asymptomatic staff at a UK teaching hospital using naso-/oro-pharyngeal PCR testing and immunoassays for IgG antibodies. 1128/10,034 (11.2%) staff had evidence of Covid-19 at some time. Using questionnaire data provided on potential risk factors, staff with a confirmed household contact were at greatest risk (adjusted odds ratio [aOR] 4.82 [95% CI 3.45-6.72]). Higher rates of Covid-19 were seen in staff working in Covid-19-facing areas (22.6% vs. 8.6% elsewhere) (aOR 2.47 [1.99-3.08]). Controlling for Covid-19-facing status, risks were heterogeneous across the hospital, with higher rates in acute medicine (1.52 [1.07-2.16]) and sporadic outbreaks in areas with few or no Covid-19 patients. Covid-19 intensive care unit staff were relatively protected (0.44 [0.28-0.69]), likely by a bundle of PPE-related measures. Positive results were more likely in Black (1.66 [1.25-2.21]) and Asian (1.51 [1.28-1.77]) staff, independent of role or working location, and in porters and cleaners (2.06 [1.34-3.15]).
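As a worked check of the arithmetic, the short Python sketch below recomputes the overall prevalence from the reported counts and derives a crude (unadjusted) odds ratio from the two reported percentages. The helper name `odds` is an assumption for illustration, and the crude value is not expected to match the reported adjusted odds ratio of 2.47, which comes from a model controlling for other risk factors.

```python
def odds(proportion: float) -> float:
    """Convert a proportion into odds."""
    return proportion / (1.0 - proportion)


# Counts and percentages as reported in the abstract.
positive_staff = 1128
tested_staff = 10034
prevalence_percent = round(100 * positive_staff / tested_staff, 1)

rate_covid_facing = 0.226  # staff working in Covid-19-facing areas
rate_elsewhere = 0.086     # staff working elsewhere

# Crude odds ratio: no adjustment for role, ethnicity, household contact, etc.
crude_odds_ratio = odds(rate_covid_facing) / odds(rate_elsewhere)
```

The crude ratio comes out at roughly 3.1, which is larger than the adjusted estimate. That is consistent with part of the raw difference being explained by the other factors in the model.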
Subjects
Coronavirus Infections/epidemiology , Health Personnel/statistics & numerical data , Pneumonia, Viral/epidemiology , Adolescent , Adult , Age Factors , Aged , Asymptomatic Infections/epidemiology , Betacoronavirus/isolation & purification , COVID-19 , Coronavirus Infections/transmission , Coronavirus Infections/virology , Female , Hospitals, Teaching/statistics & numerical data , Humans , Incidence , Infectious Disease Transmission, Patient-to-Professional/statistics & numerical data , Intensive Care Units/statistics & numerical data , Male , Middle Aged , Pandemics , Pneumonia, Viral/transmission , Pneumonia, Viral/virology , Risk , SARS-CoV-2 , Surveys and Questionnaires , United Kingdom/epidemiology , Young Adult
ABSTRACT
Perceptual, motor and cognitive processes are based on rich interactions between remote regions in the human brain. Such interactions can be carried out through phase synchronization of oscillatory signals. Neuronal synchronization has been primarily studied within the same frequency range, e.g., within alpha or beta frequency bands. Yet, recent research shows that neuronal populations can also demonstrate phase synchronization between different frequency ranges. Extracting such cross-frequency interactions from EEG/MEG recordings remains, however, methodologically challenging. Here we present a new method for the robust extraction of cross-frequency phase-to-phase synchronized components. Generalized Cross-Frequency Decomposition (GCFD) reconstructs the time courses of synchronized neuronal components, their spatial filters and patterns. Our method extends the previous state of the art, Cross-Frequency Decomposition (CFD), to the whole range of frequencies: it works for any f1 and f2 whenever f1:f2 is a rational number. GCFD gives a compact description of non-linearly interacting neuronal sources on the basis of their cross-frequency phase coupling. We successfully validated the new method in simulations and tested it with real EEG recordings including resting state data and steady state visually evoked potentials (SSVEP).
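The notion of phase-to-phase coupling between two frequencies with a rational ratio can be illustrated with a small Python sketch. This is not GCFD itself, which recovers spatial filters and source time courses from multichannel recordings. It only computes a standard phase-locking index on simulated phase series. The function names, the 10 Hz and 15 Hz frequencies, the sampling rate and the jitter level are all assumptions chosen for illustration.

```python
import cmath
import math
import random
from fractions import Fraction


def coupling_ratio(f1: float, f2: float) -> tuple:
    """Reduce f1:f2 to the smallest integers p:q, assuming the ratio is rational."""
    ratio = Fraction(f1).limit_denominator(1000) / Fraction(f2).limit_denominator(1000)
    return ratio.numerator, ratio.denominator


def phase_locking(phase_1, phase_2, p: int, q: int) -> float:
    """Phase-locking index for f1:f2 = p:q, i.e. |mean(exp(i * (q*phase_1 - p*phase_2)))|."""
    terms = [cmath.exp(1j * (q * a - p * b)) for a, b in zip(phase_1, phase_2)]
    return abs(sum(terms) / len(terms))


random.seed(0)
sampling_rate = 500.0  # Hz, hypothetical
n_samples = 2000
f1, f2 = 10.0, 15.0    # hypothetical frequencies with ratio 2:3
p, q = coupling_ratio(f1, f2)

times = [k / sampling_rate for k in range(n_samples)]
phase_slow = [2 * math.pi * f1 * t for t in times]
# Coupled source: fixed phase offset plus a small amount of phase jitter.
phase_fast_coupled = [2 * math.pi * f2 * t + 0.7 + random.gauss(0.0, 0.1) for t in times]
# Uncoupled source: phases drawn independently of the slow oscillation.
phase_fast_random = [random.uniform(0.0, 2 * math.pi) for _ in times]

locked = phase_locking(phase_slow, phase_fast_coupled, p, q)
unlocked = phase_locking(phase_slow, phase_fast_random, p, q)
```

For the coupled pair the weighted phase difference q*phase_1 - p*phase_2 stays nearly constant, so the index is close to 1. For the independent pair it scatters around the circle and the index falls toward 0.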