ABSTRACT
The SARS-CoV-2 Omicron variant was first identified in November 2021 in Botswana and South Africa [1-3]. It has since spread to many countries and is expected to rapidly become dominant worldwide. The lineage is characterized by the presence of around 32 mutations in spike, located mostly in the N-terminal domain and the receptor-binding domain, that may enhance viral fitness and enable antibody evasion. Here we isolated an infectious Omicron virus in Belgium from a traveller returning from Egypt. We examined its sensitivity to nine monoclonal antibodies that have been clinically approved or are in development [4], and to antibodies present in 115 serum samples from COVID-19 vaccine recipients or individuals who have recovered from COVID-19. Omicron was completely or partially resistant to neutralization by all monoclonal antibodies tested. Sera from recipients of the Pfizer or AstraZeneca vaccine, sampled five months after complete vaccination, barely inhibited Omicron. Sera from COVID-19-convalescent patients collected 6 or 12 months after symptoms displayed low or no neutralizing activity against Omicron. Administration of a booster Pfizer dose as well as vaccination of previously infected individuals generated an anti-Omicron neutralizing response, with titres 6-fold to 23-fold lower against Omicron compared with those against Delta. Thus, Omicron escapes most therapeutic monoclonal antibodies and, to a large extent, vaccine-elicited antibodies. However, Omicron is neutralized by antibodies generated by a booster vaccine dose.
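The fold differences reported in this abstract are ratios of neutralization titres. The following is a minimal sketch of that arithmetic only. It assumes the ratio is taken between geometric mean titres (a common convention that the abstract does not confirm), the titre values are invented for illustration, and `fold_reduction` is a hypothetical helper name.

```python
from statistics import geometric_mean  # available since Python 3.8


def fold_reduction(titres_reference, titres_variant):
    """Ratio of geometric mean titres: reference strain over variant."""
    return geometric_mean(titres_reference) / geometric_mean(titres_variant)


# Invented example titres for the same three sera (NOT data from the study).
delta_titres = [800, 1600, 3200]
omicron_titres = [100, 200, 400]

# Geometric means are 1600 and 200, so the result is about 8-fold.
print(round(fold_reduction(delta_titres, omicron_titres), 2))
```

A value above 1 means that the sera neutralize the variant less efficiently than the reference strain.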
Subject(s)
Neutralizing Antibodies/immunology , Viral Antibodies/immunology , COVID-19/virology , Immune Evasion/immunology , Secondary Immunization , SARS-CoV-2/immunology , Adult , Monoclonal Antibodies/immunology , BNT162 Vaccine/administration & dosage , BNT162 Vaccine/immunology , Belgium , COVID-19/immunology , COVID-19/transmission , ChAdOx1 nCoV-19/administration & dosage , ChAdOx1 nCoV-19/immunology , Convalescence , Female , Humans , Male , Mutation , Neutralization Tests , Phylogeny , SARS-CoV-2/classification , SARS-CoV-2/genetics , SARS-CoV-2/isolation & purification , Travel
ABSTRACT
The ongoing degradation of natural systems and other environmental changes have put our society at a crossroads with respect to our future relationship with our planet. While the concept of One Health describes how human health is inextricably linked with environmental health, many of these complex interdependencies are still not well understood. Here, we describe how the advent of real-time genomic analyses can benefit One Health and how it can enable timely, in-depth ecosystem health assessments. We introduce nanopore sequencing as the only disruptive technology that currently allows for real-time genomic analyses and that is already being used worldwide to improve the accessibility and versatility of genomic sequencing. We showcase real-time genomic studies on zoonotic disease, food security, environmental microbiome, emerging pathogens, and their antimicrobial resistances, and on environmental health itself, from genomic resource creation for wildlife conservation to the monitoring of biodiversity, invasive species, and wildlife trafficking. We stress why equitable access to real-time genomics in the context of One Health will be paramount and discuss related practical, legal, and ethical limitations.
Subject(s)
Ecosystem , One Health , Humans , Genomics , Biodiversity , Genome
ABSTRACT
Since the start of the COVID-19 pandemic, an unprecedented number of genomic sequences of SARS-CoV-2 have been generated and shared with the scientific community. The unparalleled volume of available genetic data presents a unique opportunity to gain real-time insights into the virus transmission during the pandemic, but also a daunting computational hurdle if analyzed with gold-standard phylogeographic approaches. To tackle this practical limitation, we here describe and apply a rapid analytical pipeline to analyze the spatiotemporal dispersal history and dynamics of SARS-CoV-2 lineages. As a proof of concept, we focus on the Belgian epidemic, which has had one of the highest spatial densities of available SARS-CoV-2 genomes. Our pipeline has the potential to be quickly applied to other countries or regions, with key benefits in complementing epidemiological analyses in assessing the impact of intervention measures or their progressive easement.
Subject(s)
COVID-19/transmission , COVID-19/virology , Viral Genome , Phylogeography , SARS-CoV-2/genetics , Belgium , COVID-19/epidemiology , Molecular Evolution , Genomics , Humans , Likelihood Functions , Mutation , Patient Isolation , Phylogeny , Physical Distancing , Spatio-Temporal Analysis , Workflow
ABSTRACT
The first cluster of patients suffering from coronavirus disease 2019 (COVID-19) was identified on December 21, 2019. As of July 29, 2020, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections number at least 16,932,996 worldwide and have been linked with 664,333 deaths. Unprecedented in global societal impact, the COVID-19 pandemic has tested local, national, and international preparedness for viral outbreaks to the limit. Just as it will be vital to identify missed opportunities and improve contingency planning for future outbreaks, we must also highlight key successes and build on them. Concomitant to the emergence of a novel viral disease, there is a 'research and development gap' that poses a threat to the overall pace and quality of outbreak response during its most crucial early phase. Here, we outline key components of an adequate research response to novel viral outbreaks using the example of SARS-CoV-2. We highlight the exceptional recent progress made in fundamental science, resulting in the fastest scientific response to a major infectious disease outbreak or pandemic. We underline the vital role of the international research community, from the implementation of diagnostics and contact tracing procedures to the collective search for vaccines and antiviral therapies, sustained by unique information sharing efforts.
Subject(s)
Biomedical Research/trends , Coronavirus Infections/virology , International Cooperation , Viral Pneumonia/virology , Betacoronavirus/genetics , Betacoronavirus/physiology , Biomedical Research/organization & administration , COVID-19 , Contact Tracing , Coronavirus Infections/epidemiology , Coronavirus Infections/mortality , Coronavirus Infections/therapy , Humans , Pandemics , Viral Pneumonia/epidemiology , Viral Pneumonia/mortality , Viral Pneumonia/therapy , SARS-CoV-2
ABSTRACT
We investigated the genetic profiles of killer cell immunoglobulin-like receptors (KIRs) in Ebola virus-infected patients. We studied the relationship between KIR-human leukocyte antigen (HLA) combinations and the clinical outcomes of patients with Ebola virus disease (EVD). We genotyped KIRs and HLA class I alleles using DNA from uninfected controls, EVD survivors, and persons who died of EVD. The activating 2DS4-003 and inhibitory 2DL5 genes were significantly more common among persons who died of EVD; 2DL2 was more common among survivors. We used logistic regression analysis and Bayesian modeling to identify 2DL2, 2DL5, 2DS4-003, HLA-B-Bw4-Thr, and HLA-B-Bw4-Ile as probably having a significant relationship with disease outcome. Our findings highlight the importance of innate immune response against Ebola virus and show the association between KIRs and the clinical outcome of EVD.
Subject(s)
Ebola Hemorrhagic Fever , Alleles , Bayes Theorem , Genotype , HLA Antigens , Ebola Hemorrhagic Fever/epidemiology , Humans , KIR Receptors/genetics
ABSTRACT
The human cytomegalovirus (HCMV) genome was sequenced by hierarchical shotgun almost 30 years ago. Since then, low- and high-passage strains have been sequenced, improving the understanding of the coding potential, expression dynamics and diversity of wild-type HCMV strains, although this understanding is still far from complete. Next-generation sequencing (NGS) platforms have enabled a huge advance, facilitating the comparison of differentially passaged strains and challenging diagnostics and research based on genotyping a single gene or a reduced gene set. In addition, NGS has made it possible to link genetic features to different viral phenotypes, for example by correlating large genomic rearrangements with viral attenuation, or different mutations with antiviral resistance and cell tropism. NGS platforms provided the first high-resolution experiments on HCMV dynamics, allowing the study of intra-host viral population structures and the description of rare transcriptional events. Long-read sequencing has recently become available, helping to identify new genomic rearrangements that partially account for the genetic variability displayed in clinical isolates, and changing the understanding of the HCMV transcriptome. Better knowledge of the transcriptome has revealed a vast number of new splicing events and alternative transcripts, although most of them still need additional validation. This review summarizes the sequencing efforts made so far, discusses their approaches, and provides a revised and more nuanced view of HCMV sequence variability.
Subject(s)
Cytomegalovirus/genetics , Viral Drug Resistance/genetics , Genomics , Cytomegalovirus/drug effects , Cytomegalovirus/pathogenicity , Viral Genome/genetics , High-Throughput Nucleotide Sequencing , Humans , Mutation , Transcriptome/genetics
ABSTRACT
INTRODUCTION: Herpesviruses are a widespread family of double-stranded DNA viruses that establish life-long persistent infection in their hosts. Cumulative evidence tends to argue for the association of human herpesviruses, such as Kaposi's sarcoma herpesvirus (KSHV), Epstein-Barr virus (EBV), and human cytomegalovirus (HCMV), with various human disorders and diseases. The present study aims to investigate the presence of herpesviruses in colorectal cancer (CRC). METHODOLOGY: We investigated the presence of herpesviruses in 69 formalin-fixed paraffin-embedded (FFPE) tissue biopsies, using a pan-herpesvirus nested polymerase chain reaction (PCR) with degenerate primers and HCMV-specific primers to identify the presence of herpesviruses in CRC tissue. RESULTS: None of the samples we examined were positive for herpesviruses. CONCLUSIONS: Our results suggest that there is no (or very low) prevalence of lifelong herpesvirus infection in Algerian CRC patients. Larger cohorts may provide more insight into the prevalence of herpesviruses in Algerian CRC biopsies.
Subject(s)
Colorectal Neoplasms , Epstein-Barr Virus Infections , Herpesviridae Infections , Humans , Human Herpesvirus 4/genetics , Herpesviridae Infections/epidemiology , Cytomegalovirus , Colorectal Neoplasms/epidemiology
ABSTRACT
Coronavirus Disease 2019 (COVID-19) vaccination has resulted in excellent protection against fatal disease, including in older adults. However, risk factors for post-vaccination fatal COVID-19 are largely unknown. We comprehensively studied three large nursing home outbreaks (20-35% fatal cases among residents) by combining severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) aerosol monitoring, whole-genome phylogenetic analysis and immunovirological profiling of nasal mucosa by digital nCounter transcriptomics. Phylogenetic investigations indicated that each outbreak stemmed from a single introduction event, although with different variants (Delta, Gamma and Mu). SARS-CoV-2 was detected in aerosol samples up to 52 d after the initial infection. Combining demographic, immune and viral parameters, the best predictive models for mortality comprised IFNB1 or age, viral ORF7a and ACE2 receptor transcripts. Comparison with published pre-vaccine fatal COVID-19 transcriptomic and genomic signatures uncovered a unique IRF3 low/IRF7 high immune signature in post-vaccine fatal COVID-19 outbreaks. A multi-layered strategy, including environmental sampling, immunomonitoring and early antiviral therapy, should be considered to prevent post-vaccination COVID-19 mortality in nursing homes.
Subject(s)
COVID-19 , Humans , Aged , Phylogeny , COVID-19/epidemiology , SARS-CoV-2/genetics , Nursing Homes , Vaccination , Disease Outbreaks/prevention & control
ABSTRACT
BK polyomavirus (BKPyV) is a human DNA virus generally divided into twelve subgroups based on the genetic diversity of Viral Protein 1 (VP1). BKPyV can cause polyomavirus-associated nephropathy (PVAN) after kidney transplantation. Detection of BKPyV DNA in blood (viremia) is a source of concern and increase in plasma viral load is associated with a higher risk of developing PVAN. In this work, we looked for possible associations of specific BKPyV genetic features with higher plasma viral load in kidney transplant patients. We analyzed BKPyV complete genome in three-month samples from kidney recipients who developed viremia during their follow-up period. BKPyV sequences were obtained by next-generation sequencing and were de novo assembled using the new BKAnaLite pipeline. Based on the data from 72 patients, we identified 24 viral groups with unique amino acid sequences: three in the VP1 subgroup IVc2, six in Ib1, ten in Ib2, one in Ia, and four in II. In none of the groups did the mean plasma viral load reach a statistically significant difference from the overall mean observed at three months after transplantation. Further investigation is needed to better understand the link between the newly described BKPyV genetic variants and pathogenicity in kidney transplant recipients.
Subject(s)
BK Virus , Kidney Diseases , Kidney Transplantation , Polyomavirus Infections , Polyomavirus , Tumor Virus Infections , BK Virus/genetics , Viral DNA/genetics , Genetic Variation , Humans , Kidney Transplantation/adverse effects , Polyomavirus/genetics , Transplant Recipients , Viremia
ABSTRACT
We report the complete genome sequence of a severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) Omicron variant (lineage B.1.1.529) from a Belgian patient with a history of recent travel to Egypt. At the time of writing, this genome constituted the first confirmed case of an infection with the Omicron variant in Europe.
ABSTRACT
The emergence of drug-resistant strains of the parasite Leishmania infantum infecting dogs and humans represents an increasing threat. L. infantum genomes are complex and unstable with extensive structural variations, ranging from aneuploidies to multiple copy number variations (CNVs). These CNVs have recently been validated as biomarkers of Leishmania concerning virulence, tissue tropism, and drug resistance. As a proof-of-concept to develop a novel diagnosis platform (LeishGenApp), four L. infantum samples from humans and dogs were nanopore sequenced. Samples were epidemiologically typed within the Mediterranean L. infantum group, identifying members of the JCP5 and non-JCP5 subgroups, using the conserved region (CR) of the maxicircle kinetoplast. Aneuploidies were frequent and heterogenous between samples, yet only chromosome 31 tetrasomy was common between all the samples. A high frequency of aneuploidies was observed for samples with long passage history (MHOM/TN/80/IPT-1), whereas fewer were detected for samples maintained in vivo (MCRI/ES/2006/CATB033). Twenty-two genes were studied to generate a genetic pharmacoresistance profile against miltefosine, allopurinol, trivalent antimonials, amphotericin, and paromomycin. MHOM/TN/80/IPT-1 and MCRI/ES/2006/CATB033 displayed a genetic profile with potential resistance against miltefosine and allopurinol. Meanwhile, MHOM/ES/2016/CATB101 and LCAN/ES/2020/CATB102 were identified as potentially resistant against paromomycin. All four samples displayed a genetic profile for resistance against trivalent antimonials. Overall, this proof-of-concept revealed the potential of nanopore sequencing and LeishGenApp for the determination of epidemiological, drug resistance, and pathogenicity biomarkers in L. infantum.
ABSTRACT
Recent metagenomics studies have revealed several tick species to host a variety of previously undiscovered RNA viruses. Ixodes ricinus, which is known to be a vector for many viral, bacterial, and protozoan pathogens, is the most prevalent tick species in Europe. For this study, we decided to investigate the virosphere of Belgian I. ricinus ticks. High-throughput sequencing of tick pools collected from six different sampling sites revealed the presence of viruses belonging to many different viral orders and families, including Mononegavirales, Bunyavirales, Partitiviridae, and Reoviridae. Of particular interest was the detection of several new reoviruses, two of which cluster together with members of the genus Coltivirus. This includes a new strain of Eyach virus, a known causative agent of tick-borne encephalitis. All genome segments of this new strain are highly similar to those of previously published Eyach virus genomes, except for the fourth segment, encoding VP4, which is markedly more dissimilar, potentially indicating the occurrence of a genetic reassortment. Further polymerase chain reaction-based screening of over 230 tick pools for 14 selected viruses showed that most viruses could be found in all six sampling sites, indicating the wide spread of these viruses throughout the Belgian tick population. Taken together, these results illustrate the role of ticks as important virus reservoirs, highlighting the need for adequate tick control measures.
ABSTRACT
At the end of 2020, several new variants of SARS-CoV-2, designated variants of concern, were detected and quickly suspected to be associated with higher transmissibility and possible escape from vaccine-induced immunity. In Belgium, this discovery has motivated the initiation of a more ambitious genomic surveillance program, which is drastically increasing the number of SARS-CoV-2 genomes to analyse for monitoring the circulation of viral lineages and variants of concern. In order to efficiently analyse the massive collection of genomic data that results from such increased sequencing efforts, streamlined analytical strategies are crucial. In this study, we illustrate how to efficiently map the spatio-temporal dispersal of target mutations at a regional level. As a proof of concept, we focus on the Belgian province of Liège, which was consistently sampled throughout 2020 and was also one of the main epicenters of the second European epidemic wave. Specifically, we employ a recently developed phylogeographic workflow to infer the regional dispersal history of viral lineages associated with three specific mutations on the spike protein (S98F, A222V and S477N) and to quantify their relative importance through time. Our analytical pipeline enables analysing large data sets and has the potential to be quickly applied and updated to track target mutations in space and time throughout the course of an epidemic.
Subject(s)
Viral Genome , Mutation , SARS-CoV-2/genetics , Coronavirus Spike Glycoprotein/genetics , Belgium , Epidemiological Monitoring , Humans
ABSTRACT
The prevalence of human BK polyomavirus (BKPyV) has been increasing due to the introduction of more potent immunosuppressive agents in transplant recipients, and so has clinical interest in the virus. BKPyV has been linked mostly to polyomavirus-associated hemorrhagic cystitis in allogeneic hematopoietic stem cell transplant recipients and to polyomavirus-associated nephropathy in kidney transplant patients. BKPyV is a circular double-stranded DNA virus that encodes seven proteins, of which Viral Protein 1 (VP1), the major structural protein, has been extensively used for genotyping. BKPyV also contains the noncoding control region (NCCR), made up of five blocks (OPQRS) known to be highly repetitive and diverse, and linked to viral infectivity and replication. BKPyV genetic diversity has been mainly studied based on the NCCR and VP1, due to the high occurrence of BKPyV-associated diseases in transplant patients and their clinical implications. Here BKTyper is presented, a free online genotyper for BKPyV, based on VP1 genotyping and a novel algorithm for NCCR block identification. VP1 genotyping is based on a modified implementation of the BK typing and grouping regions (BKTGR) algorithm, providing a maximum-likelihood phylogenetic tree using a custom internal BKPyV database. Novel NCCR block identification relies on recognition of motifs of at least 12 bp and a novel sorting algorithm. A graphical representation of the OPQRS block organization is provided.
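The general idea of identifying NCCR blocks by motif recognition followed by sorting can be sketched as below. This is an illustrative assumption, not BKTyper's actual algorithm: the 12-bp motifs are invented placeholders rather than real BKPyV sequences, matching is exact only, and the function names `find_blocks` and `block_order` are hypothetical.

```python
# Hypothetical 12-bp marker motifs, one per NCCR block. These are invented
# placeholders for illustration, NOT real BKPyV sequences.
MOTIFS = {
    "O": "ACGTACGTTGCA",
    "P": "GGATCCTTAGCA",
    "Q": "TTGACCGGTAAC",
    "R": "CAGTTGCAAGGT",
    "S": "ATGCCGTAGCTA",
}


def find_blocks(sequence, motifs=MOTIFS):
    """Return (position, block) for every exact motif match, sorted by position."""
    seq = sequence.upper()
    hits = []
    for block, motif in motifs.items():
        start = seq.find(motif)
        while start != -1:
            hits.append((start, block))
            start = seq.find(motif, start + 1)
    return sorted(hits)


def block_order(sequence, motifs=MOTIFS):
    """Summarize the block organization as a string such as 'OPQRS'."""
    return "".join(block for _, block in find_blocks(sequence, motifs))


# Toy rearranged region: O, then P duplicated, then S; Q and R are absent.
toy = "NN" + MOTIFS["O"] + "NNN" + MOTIFS["P"] + "N" + MOTIFS["P"] + "NNNN" + MOTIFS["S"]
print(block_order(toy))  # OPPS
```

Sorting hits by position is what turns a set of independent motif matches into a block organization, so duplications and deletions show up directly in the resulting string.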
Subject(s)
BK Virus/classification , Capsid Proteins/genetics , Genotyping Techniques , Untranslated RNA/genetics , Software , Algorithms , Genetic Variation , Phylogeny , Virus Replication/genetics
ABSTRACT
Almost all eukaryotes have transposable elements (TEs) against which they have developed defense mechanisms. In the Drosophila germline, the main TE regulation pathway is mediated by specific Piwi-interacting small RNAs (piRNAs). Nonetheless, for unknown reasons, TEs sometimes escape cellular control during interspecific hybridization processes. Because the piRNA pathway genes are involved in piRNA biogenesis and TE control, we sequenced and characterized nine key genes from this pathway in Drosophila buzzatii and Drosophila koepferae and studied their expression pattern in ovaries of both species and their F1 hybrids. We found that gene structure is, in general, maintained between both species and that two genes, armitage and aubergine, are under positive selection. Three genes (krimper, methyltransferase 2, and zucchini) displayed higher expression values in hybrids than both parental species, while others had RNA levels similar to the parental species with the highest expression. This suggests that the overexpression of some piRNA pathway genes can be a primary response to hybrid stress. Therefore, these results reinforce the hypothesis that TE deregulation may be due to the protein incompatibility caused by the rapid evolution of these genes, leading to a TE silencing failure, rather than to an underexpression of piRNA pathway genes.
Subject(s)
Drosophila Proteins/genetics , Drosophila/growth & development , Ovary/chemistry , Small Interfering RNA/genetics , Animals , Breeding , DNA Transposable Elements , Drosophila/genetics , Molecular Evolution , Female , Gene Expression Regulation , Genetic Hybridization , DNA Sequence Analysis , Signal Transduction
ABSTRACT
Ebolaviruses pose a substantial threat to wildlife populations and to public health in Africa. Evolutionary analyses of virus genome sequences can contribute significantly to elucidating the origin of new outbreaks, which can help guide surveillance efforts. The reconstructed between-outbreak evolutionary history of Zaire ebolavirus so far has been highly consistent. By removing the confounding impact of population growth bursts during local outbreaks on the free mixing assumption that underlies coalescent-based demographic reconstructions, we find, contrary to what previous results indicated, that the circulation dynamics of Ebola virus in its animal reservoir are highly uncertain. Our findings also accentuate the need for a more fine-grained picture of the Ebola virus diversity in its reservoir to reliably infer the reservoir origin of outbreak lineages. In addition, the recent appearance of slower-evolving variants is in line with latency as a survival mechanism and with bats as the natural reservoir host.
Subject(s)
Animal Diseases/epidemiology , Chiroptera/virology , Disease Reservoirs/virology , Ebolavirus/isolation & purification , Ebola Hemorrhagic Fever/veterinary , Africa , Animal Diseases/virology , Animals , Ebolavirus/classification , Ebolavirus/genetics , Genotype , Ebola Hemorrhagic Fever/epidemiology , Ebola Hemorrhagic Fever/virology , Humans , Phylogeny
ABSTRACT
Viruses represent important test cases for data federation due to their genome size and the rapid increase in sequence data in publicly available databases. However, previously decentralized (unfederated) data have led to a lack of consensus on, and of comparisons between, feature annotations. Unifying or displaying alternative annotations should be a priority both for communities with robust entry representation and for nascent communities with burgeoning data sources. To this end, during this three-day continuation of the Virus Hunting Toolkit codeathon series (VHT-2), a new integrated and federated viral index was elaborated. This Federated Index of Viral Experiments (FIVE) integrates pre-existing and novel functional and taxonomy annotations and virus-host pairings. Variability in the context of viral genomic diversity is often overlooked in virus databases. As a proof of concept, FIVE was the first attempt to include viral genome variation for HIV, the most well-studied human pathogen, through viral genome diversity graphs. As of the publication of this manuscript, FIVE is the first implementation of a virus-specific federated index of such scope. FIVE is coded in BigQuery for optimal access to large quantities of data and is publicly accessible. Many projects of database or index federation fail to provide easier alternatives to access or query information. To this end, a Python API query system was developed to enhance the accessibility of FIVE.
Subject(s)
Computational Biology , Genetic Databases , Metagenomics/methods , Viruses/genetics , Computational Biology/methods , Genetic Variation , Viral Genome , Host-Pathogen Interactions , Humans , User-Computer Interface , Viral Proteins/genetics , Viral Proteins/metabolism , Viruses/metabolism , Web Browser
ABSTRACT
The emergence of SARS-CoV-2, which causes COVID-19, has resulted in hundreds of thousands of deaths. In the search for key targets of effective therapeutics, robust animal models mimicking COVID-19 in humans are urgently needed. Here, we show that Syrian hamsters, in contrast to mice, are highly permissive to SARS-CoV-2 and develop bronchopneumonia and strong inflammatory responses in the lungs with neutrophil infiltration and edema, further confirmed as consolidations visualized by micro-CT, as in clinical practice. Moreover, we identify an exuberant innate immune response as a key player in pathogenesis, in which STAT2 signaling plays a dual role, driving severe lung injury on the one hand, yet restricting systemic virus dissemination on the other. Our results reveal the importance of STAT2-dependent interferon responses in the pathogenesis and virus control during SARS-CoV-2 infection and may help rationalize new strategies for the treatment of COVID-19 patients.
Subject(s)
Betacoronavirus/physiology , Coronavirus Infections/pathology , Coronavirus Infections/virology , Animal Disease Models , Viral Pneumonia/pathology , Viral Pneumonia/virology , STAT2 Transcription Factor/metabolism , Signal Transduction , Animals , Betacoronavirus/pathogenicity , COVID-19 , Coronavirus Infections/immunology , Coronavirus Infections/metabolism , Cricetinae , Innate Immunity , Type I Interferon/genetics , Type I Interferon/metabolism , Lung/pathology , Lung/virology , Mice , Pandemics , Viral Pneumonia/immunology , Viral Pneumonia/metabolism , SARS-CoV-2 , STAT2 Transcription Factor/genetics , Virus Replication
ABSTRACT
Genomic sequencing for early identification of Ebola virus remains a big challenge in low-income countries. Here, we report the complete genome sequence of an Ebola virus strain obtained during the 2017 Likati outbreak in the Democratic Republic of the Congo (DRC) by using the Oxford Nanopore Technologies (ONT) MinION sequencer.
ABSTRACT
A wealth of viral data sits untapped in publicly available metagenomic data sets when it might be extracted to create a usable index for the virological research community. We hypothesized that work of this complexity and scale could be done in a hackathon setting. Ten teams comprising over 40 participants from six countries assembled to create a crowd-sourced set of analysis and processing pipelines for a complex biological data set in a three-day event on the San Diego State University campus starting 9 January 2019. Prior to the hackathon, 141,676 metagenomic data sets from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) were pre-assembled into contiguous assemblies (contigs) by NCBI staff. During the hackathon, a subset consisting of 2953 SRA data sets (approximately 55 million contigs) was selected, and these were further filtered for a minimal length of 1 kb. This resulted in 4.2 million (Mio) contigs, which were aligned using BLAST against all known virus genomes, phylogenetically clustered and assigned metadata. Out of the 4.2 Mio contigs, 360,000 contigs were labeled with domains and an additional subset containing 4400 contigs was screened for virus or virus-like genes. The work yielded valuable insights into both SRA data and the cloud infrastructure required to support such efforts, revealing analysis bottlenecks and possible workarounds. Mainly: (i) conservative assemblies of SRA data improve initial analysis steps; (ii) existing bioinformatic software with weak multithreading/multicore support can be elevated by wrapper scripts to use all cores within a computing node; (iii) redesigning existing bioinformatic algorithms for a cloud infrastructure facilitates their use by a wider audience; and (iv) a cloud infrastructure allows a diverse group of researchers to collaborate effectively. The scientific findings will be extended during a follow-up event. Here, we present the applied workflows, initial results, and lessons learned from the hackathon.
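The length-filtering step described in this abstract (keeping only contigs of at least 1 kb) can be illustrated with a short sketch. This is not the hackathon's pipeline: the FASTA parser, the function names `read_fasta` and `filter_contigs`, and the toy contigs are assumptions made for the example, and only the 1 kb threshold comes from the abstract.

```python
from io import StringIO

MIN_LENGTH = 1000  # 1 kb threshold, as stated in the abstract


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_contigs(handle, min_length=MIN_LENGTH):
    """Keep only contigs whose sequence is at least min_length bases long."""
    return [(h, s) for h, s in read_fasta(handle) if len(s) >= min_length]


# Toy input: one 400-bp contig and one 1200-bp contig (invented sequences).
toy_fasta = StringIO(
    ">contig_short\n" + "ACGT" * 100 + "\n>contig_long\n" + "ACGT" * 300 + "\n"
)
kept = filter_contigs(toy_fasta)
print([header for header, _ in kept])  # ['contig_long']
```

For tens of millions of contigs, a streaming variant that writes each kept record as it is read would avoid holding the whole result in memory.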