ABSTRACT
Dockstore (https://dockstore.org/) is an open source platform for publishing, sharing, and finding bioinformatics tools and workflows. The platform has facilitated large-scale biomedical research collaborations by using cloud technologies to increase the Findability, Accessibility, Interoperability and Reusability (FAIR) of computational resources, thereby promoting the reproducibility of complex bioinformatics analyses. Dockstore supports a variety of source repositories, analysis frameworks, and language technologies to provide a seamless publishing platform for authors to create a centralized catalogue of scientific software. The ready-to-use packaging of hundreds of tools and workflows, combined with the implementation of interoperability standards, enables users to launch analyses across multiple environments. Dockstore is widely used: more than twenty-five high-profile organizations share analysis collections through the platform in a variety of workflow languages, including the Broad Institute's GATK best practice and COVID-19 workflows (WDL), nf-core workflows (Nextflow), the Intergalactic Workflow Commission tools (Galaxy), and workflows from Seven Bridges (CWL), to highlight just a few. Here we describe the improvements made over the last four years, including the expansion of system integrations supporting authors, the addition of collaboration features and analysis platform integrations supporting users, and other enhancements that improve the overall scientific reproducibility of Dockstore content.
Subject(s)
Computational Biology/methods , Information Dissemination , Internet , Software , Workflow , Cloud Computing , Computational Biology/education , Data Visualization , Humans , National Heart, Lung, and Blood Institute (U.S.) , National Human Genome Research Institute (U.S.) , Reproducibility of Results , United States
ABSTRACT
[This corrects the article DOI: 10.1371/journal.pgen.1000832.].
ABSTRACT
BACKGROUND: The Pan-African bioinformatics network, H3ABioNet, comprises 27 research institutions in 17 African countries. H3ABioNet is part of the Human Health and Heredity in Africa program (H3Africa), an African-led research consortium funded by the US National Institutes of Health and the UK Wellcome Trust, aimed at using genomics to study and improve the health of Africans. A key role of H3ABioNet is to support H3Africa projects by building bioinformatics infrastructure such as portable and reproducible bioinformatics workflows for use on heterogeneous African computing environments. Processing and analysis of genomic data is an example of a big data application requiring complex interdependent data analysis workflows. Such bioinformatics workflows take the primary and secondary input data through several computationally-intensive processing steps using different software packages, where some of the outputs form inputs for other steps. Implementing scalable, reproducible, portable and easy-to-use workflows is particularly challenging. RESULTS: H3ABioNet has built four workflows to support (1) the calling of variants from high-throughput sequencing data; (2) the analysis of microbial populations from 16S rDNA sequence data; (3) genotyping and genome-wide association studies; and (4) single nucleotide polymorphism imputation. A week-long hackathon was organized in August 2016 with participants from six African bioinformatics groups, and US and European collaborators. Two of the workflows are built using the Common Workflow Language framework (CWL) and two using Nextflow. All the workflows are containerized for improved portability and reproducibility using Docker, and are publicly available for use by members of the H3Africa consortium and the international research community. 
CONCLUSION: The H3ABioNet workflows have been implemented in view of offering ease of use for the end user and high levels of reproducibility and portability, all while following modern state of the art bioinformatics data processing protocols. The H3ABioNet workflows will service the H3Africa consortium projects and are currently in use. All four workflows are also publicly available for research scientists worldwide to use and adapt for their respective needs. The H3ABioNet workflows will help develop bioinformatics capacity and assist genomics research within Africa and serve to increase the scientific output of H3Africa and its Pan-African Bioinformatics Network.
Subject(s)
Computational Biology/methods , Genomics/methods , Africa , Humans , Reproducibility of Results
ABSTRACT
U87MG is a commonly studied grade IV glioma cell line that has been analyzed in at least 1,700 publications over four decades. In order to comprehensively characterize the genome of this cell line and to serve as a model of broad cancer genome sequencing, we have generated greater than 30x genomic sequence coverage using a novel 50-base mate paired strategy with a 1.4kb mean insert library. A total of 1,014,984,286 mate-end and 120,691,623 single-end two-base encoded reads were generated from five slides. All data were aligned using a custom designed tool called BFAST, allowing optimal color space read alignment and accurate identification of DNA variants. The aligned sequence reads and mate-pair information identified 35 interchromosomal translocation events, 1,315 structural variations (>100 bp), 191,743 small (<21 bp) insertions and deletions (indels), and 2,384,470 single nucleotide variations (SNVs). Among these observations, the known homozygous mutation in PTEN was robustly identified, and genes involved in cell adhesion were overrepresented in the mutated gene list. Data were compared to 219,187 heterozygous single nucleotide polymorphisms assayed by Illumina 1M Duo genotyping array to assess accuracy: 93.83% of all SNPs were reliably detected at filtering thresholds that yield greater than 99.99% sequence accuracy. Protein coding sequences were disrupted predominantly in this cancer cell line due to small indels, large deletions, and translocations. In total, 512 genes were homozygously mutated, including 154 by SNVs, 178 by small indels, 145 by large microdeletions, and 35 by interchromosomal translocations to reveal a highly mutated cell line genome. Of the small homozygously mutated variants, 8 SNVs and 99 indels were novel events not present in dbSNP. These data demonstrate that routine generation of broad cancer genome sequence is possible outside of genome centers. 
The sequence analysis of U87MG provides an unparalleled level of mutational resolution compared to any cell line to date.
Subject(s)
Cell Line, Tumor/chemistry , Genome, Human , Glioma/genetics , Cell Line, Tumor/cytology , Genotype , Humans , Molecular Sequence Data , Mutation , Polymorphism, Single Nucleotide , Proteins/genetics , Sequence Analysis, DNA
ABSTRACT
The NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL; https://anvilproject.org) was developed to address a widespread community need for a unified computing environment for genomics data storage, management, and analysis. In this perspective, we present AnVIL, describe its ecosystem and interoperability with other platforms, and highlight how this platform and associated initiatives contribute to improved genomic data sharing efforts. The AnVIL is a federated cloud platform designed to manage and store genomics and related data, enable population-scale analysis, and facilitate collaboration through the sharing of data, code, and analysis results. By inverting the traditional model of data sharing, the AnVIL eliminates the need for data movement, adds security measures for active threat detection and monitoring, and provides scalable, shared computing resources for any researcher. We describe the core data management and analysis components of the AnVIL, which currently consist of Terra, Gen3, Galaxy, RStudio/Bioconductor, Dockstore, and Jupyter, and describe several flagship genomics datasets available within the AnVIL. We continue to extend and innovate the AnVIL ecosystem by implementing new capabilities, including mechanisms for interoperability and responsible data sharing, while streamlining access management. The AnVIL opens many new opportunities for analysis, collaboration, and data sharing that are needed to drive research and to make discoveries through the joint analysis of hundreds of thousands to millions of genomes along with associated clinical and molecular data types.
ABSTRACT
BACKGROUND: Since the introduction of next-generation DNA sequencers the rapid increase in sequencer throughput, and associated drop in costs, has resulted in more than a dozen human genomes being resequenced over the last few years. These efforts are merely a prelude for a future in which genome resequencing will be commonplace for both biomedical research and clinical applications. The dramatic increase in sequencer output strains all facets of computational infrastructure, especially databases and query interfaces. The advent of cloud computing, and a variety of powerful tools designed to process petascale datasets, provide a compelling solution to these ever increasing demands. RESULTS: In this work, we present the SeqWare Query Engine which has been created using modern cloud computing technologies and designed to support databasing information from thousands of genomes. Our backend implementation was built using the highly scalable, NoSQL HBase database from the Hadoop project. We also created a web-based frontend that provides both a programmatic and interactive query interface and integrates with widely used genome browsers and tools. Using the query engine, users can load and query variants (SNVs, indels, translocations, etc) with a rich level of annotations including coverage and functional consequences. As a proof of concept we loaded several whole genome datasets including the U87MG cell line. We also used a glioblastoma multiforme tumor/normal pair to both profile performance and provide an example of using the Hadoop MapReduce framework within the query engine. This software is open source and freely available from the SeqWare project (http://seqware.sourceforge.net). CONCLUSIONS: The SeqWare Query Engine provided an easy way to make the U87MG genome accessible to programmers and non-programmers alike. 
This enabled a faster and more open exploration of results, quicker tuning of parameters for heuristic variant calling filters, and a common data interface to simplify development of analytical tools. The range of data types supported, the ease of querying and integrating with existing tools, and the robust scalability of the underlying cloud-based technologies make SeqWare Query Engine a natural fit for storing and searching ever-growing genome sequence datasets.
Subject(s)
Genomics/methods , Software , Databases, Nucleic Acid , Genome, Human , High-Throughput Nucleotide Sequencing , Humans , Sequence Analysis, DNA/methods
ABSTRACT
The Pan-Cancer Analysis of Whole Genomes (PCAWG) project generated a vast amount of whole-genome cancer sequencing resource data. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumor types, we provide a user's guide to the five publicly available online data exploration and visualization tools introduced in the PCAWG marker paper. These tools are ICGC Data Portal, UCSC Xena, Chromothripsis Explorer, Expression Atlas, and PCAWG-Scout. We detail use cases and analyses for each tool, show how they incorporate outside resources from the larger genomics ecosystem, and demonstrate how the tools can be used together to understand the biology of cancers more deeply. Together, the tools enable researchers to query the complex genomic PCAWG data dynamically and integrate external information, enabling and enhancing interpretation.
Subject(s)
Computational Biology/methods , Genome, Human , Neoplasms/genetics , Chromothripsis , Data Analysis , Databases, Genetic , Genomics , Humans , Internet , Mutation , Software , User-Computer Interface , Whole Genome Sequencing
ABSTRACT
BACKGROUND: The emergence of next-generation sequencing technology presents tremendous opportunities to accelerate the discovery of rare variants or mutations that underlie human genetic disorders. Although the complete sequencing of the affected individuals' genomes would be the most powerful approach to finding such variants, the cost of such efforts makes it impractical for routine use in disease gene research. In cases where candidate genes or loci can be defined by linkage, association, or phenotypic studies, the practical sequencing target can be made much smaller than the whole genome, and it becomes critical to have capture methods that can be used to purify the desired portion of the genome for shotgun short-read sequencing without biasing allelic representation or coverage. One major approach is array-based capture, which relies on the ability to create a custom in-situ synthesized oligonucleotide microarray for use as a collection of hybridization capture probes. This approach is being used routinely by our group and others, and we are continuing to improve its performance. RESULTS: Here, we provide a complete protocol optimized for large aggregate sequence intervals and demonstrate its utility with the capture of all predicted amino acid coding sequence from 3,038 human genes using 241,700 60-mer oligonucleotides. Further, we demonstrate two techniques by which the efficiency of the capture can be increased: by introducing a step to block cross hybridization mediated by common adapter sequences used in sequencing library construction, and by repeating the hybridization capture step. These improvements can boost the targeting efficiency to the point where over 85% of the mapped sequence reads fall within 100 bases of the targeted regions. CONCLUSIONS: The complete protocol introduced in this paper enables researchers to perform practical capture experiments, and includes two novel methods for increasing the targeting efficiency.
Coupled with the new massively parallel sequencing technologies, this provides a powerful approach to identifying disease-causing genetic variants that can be localized within the genome by traditional methods.
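The targeting-efficiency figure quoted above (the share of mapped reads that fall within 100 bases of a targeted region) can be illustrated with a minimal Python sketch. The function name, the coordinate tuples, and the example reads are hypothetical and are not taken from the paper; the linear scan is chosen for readability, not speed.

```python
def fraction_near_targets(read_positions, targets, window=100):
    """Fraction of mapped read positions within `window` bases of any target.

    read_positions: list of (chrom, pos) tuples (hypothetical input format).
    targets: list of (chrom, start, end) tuples with inclusive coordinates.
    """
    if not read_positions:
        return 0.0
    on_target = 0
    for chrom, pos in read_positions:
        # A read counts if it lies inside a target padded by `window` bases.
        if any(c == chrom and s - window <= pos <= e + window
               for c, s, e in targets):
            on_target += 1
    return on_target / len(read_positions)


# Made-up example data, for illustration only.
targets = [("chr9", 1000, 1200), ("chr9", 5000, 5300)]
reads = [("chr9", 950), ("chr9", 1100), ("chr9", 1301),
         ("chr9", 5400), ("chr17", 1100)]
efficiency = fraction_near_targets(reads, targets)
```

For real data sets an interval index would replace the nested scan; the definition of the metric stays the same.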
Subject(s)
Genetic Loci , Genome, Human , Oligonucleotide Array Sequence Analysis/methods , Sequence Analysis, DNA/methods , DNA, Neoplasm/genetics , Genes, Neoplasm , Genomic Library , Humans , Sequence Alignment
ABSTRACT
Thermophilic organisms flourish in varied high-temperature environmental niches that are deadly to other organisms. Recently, genomic evidence has implicated a critical role for disulfide bonds in the structural stabilization of intracellular proteins from certain of these organisms, contrary to the conventional view that structural disulfide bonds are exclusively extracellular. Here both computational and structural data are presented to explore the occurrence of disulfide bonds as a protein-stabilization method across many thermophilic prokaryotes. Based on computational studies, disulfide-bond richness is found to be widespread, with thermophiles containing the highest levels. Interestingly, only a distinct subset of thermophiles exhibit this property. A computational search for proteins matching this target phylogenetic profile singles out a specific protein, known as protein disulfide oxidoreductase, as a potential key player in thermophilic intracellular disulfide-bond formation. Finally, biochemical support in the form of a new crystal structure of a thermophilic protein with three disulfide bonds is presented together with a survey of known structures from the literature. Together, the results provide insight into biochemical specialization and the diversity of methods employed by organisms to stabilize their proteins in exotic environments. The findings also motivate continued efforts to sequence genomes from divergent organisms.
Subject(s)
Archaeal Proteins/genetics , Bacterial Proteins/genetics , Disulfides/chemistry , Amino Acid Sequence , Archaeal Proteins/chemistry , Bacteria , Bacterial Proteins/chemistry , Computational Biology , Disulfides/analysis , Genome, Archaeal , Genome, Bacterial , Molecular Sequence Data , Protein Conformation , Temperature
ABSTRACT
The need for portable and reproducible genomics analysis pipelines is growing globally as well as in Africa, especially with the growth of collaborative projects like the Human Health and Heredity in Africa Consortium (H3Africa). The Pan-African H3Africa Bioinformatics Network (H3ABioNet) recognized the need for portable, reproducible pipelines adapted to heterogeneous compute environments, and for the nurturing of technical expertise in workflow languages and containerization technologies. To address this need, in 2016 H3ABioNet arranged its first Cloud Computing and Reproducible Workflows Hackathon, with the purpose of building key genomics analysis pipelines able to run on heterogeneous computing environments and meeting the needs of H3Africa research projects. This paper describes the preparations for this hackathon and reflects upon the lessons learned about its impact on building the technical and scientific expertise of African researchers. The workflows developed were made publicly available in GitHub repositories and deposited as container images on quay.io.
ABSTRACT
As genomic datasets continue to grow, the feasibility of downloading data to a local organization and running analysis on a traditional compute environment is becoming increasingly problematic. Current large-scale projects, such as the ICGC PanCancer Analysis of Whole Genomes (PCAWG), the Data Platform for the U.S. Precision Medicine Initiative, and the NIH Big Data to Knowledge Center for Translational Genomics, are using cloud-based infrastructure to both host and perform analysis across large data sets. In PCAWG, over 5,800 whole human genomes were aligned and variant called across 14 cloud and HPC environments; the processed data was then made available on the cloud for further analysis and sharing. If run locally, an operation at this scale would have monopolized a typical academic data centre for many months, and would have presented major challenges for data storage and distribution. However, this scale is increasingly typical for genomics projects and necessitates a rethink of how analytical tools are packaged and moved to the data. For PCAWG, we embraced the use of highly portable Docker images for encapsulating and sharing complex alignment and variant calling workflows across highly variable environments. While successful, this endeavor revealed a limitation in Docker containers, namely the lack of a standardized way to describe and execute the tools encapsulated inside the container. As a result, we created the Dockstore (https://dockstore.org), a project that brings together Docker images with standardized, machine-readable ways of describing and running the tools contained within. This service greatly improves the sharing and reuse of genomics tools and promotes interoperability with similar projects through emerging web service standards developed by the Global Alliance for Genomics and Health (GA4GH).
ABSTRACT
The Genomic Disulfide Analysis Program (GDAP) provides web access to computationally predicted protein disulfide bonds for over one hundred microbial genomes, including both bacterial and archaeal species. In the GDAP process, sequences of unknown structure are mapped, when possible, to known homologous Protein Data Bank (PDB) structures, after which specific distance criteria are applied to predict disulfide bonds. GDAP also accepts user-supplied protein sequences and subsequently queries the PDB sequence database for the best matches, scans for possible disulfide bonds and returns the results to the client. These predictions are useful for a variety of applications and have previously been used to show a dramatic preference in certain thermophilic archaea and bacteria for disulfide bonds within intracellular proteins. Given the central role these stabilizing, covalent bonds play in such organisms, the predictions available from GDAP provide a rich data source for designing site-directed mutants with more stable thermal profiles. The GDAP web application is a gateway to this information and can be used to understand the role disulfide bonds play in protein stability both in these unusual organisms and in sequences of interest to the individual researcher. The prediction server can be accessed at http://www.doe-mbi.ucla.edu/Services/GDAP.
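The distance-based step described above can be sketched in Python, assuming the cysteines of a query sequence have already been mapped onto 3D coordinates from a homologous structure. This is not GDAP's code: the function name, the input format, and the 8.0 angstrom cutoff are placeholders, since the abstract does not state the actual distance criteria.

```python
from itertools import combinations
from math import dist


def predict_disulfides(cys_coords, max_distance=8.0):
    """Return residue-number pairs whose mapped coordinates are close in space.

    cys_coords: dict of residue number -> (x, y, z) in angstroms, taken from
    the homologous structure (hypothetical input format).
    max_distance: placeholder cutoff, NOT the criterion used by GDAP.
    """
    pairs = []
    for (res_i, xyz_i), (res_j, xyz_j) in combinations(sorted(cys_coords.items()), 2):
        if dist(xyz_i, xyz_j) <= max_distance:
            pairs.append((res_i, res_j))
    return pairs


# Made-up coordinates, for illustration only.
cys_coords = {
    12: (0.0, 0.0, 0.0),
    47: (3.0, 4.0, 0.0),   # 5.0 angstroms from residue 12
    90: (30.0, 0.0, 0.0),  # far from both other residues
}
candidate_bonds = predict_disulfides(cys_coords)
```

A real predictor would also have to resolve cases where one cysteine is close to several partners; this sketch simply reports every pair under the cutoff.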
Subject(s)
Archaeal Proteins/chemistry , Bacterial Proteins/chemistry , Cysteine/analysis , Disulfides/analysis , Software , Data Interpretation, Statistical , Genome, Archaeal , Genome, Bacterial , Internet , Sequence Analysis, Protein
ABSTRACT
The wealth of available genomic data has spawned a corresponding interest in computational methods that can impart biological meaning and context to these experiments. Traditional computational methods have drawn relationships between pairs of proteins or genes based on notions of equality or similarity between their patterns of occurrence or behavior. For example, two genes displaying similar variation in expression, over a number of experiments, may be predicted to be functionally related. We have introduced a natural extension of these approaches, instead identifying logical relationships involving triplets of proteins. Triplets provide for various discrete kinds of logic relationships, leading to detailed inferences about biological associations. For instance, a protein C might be encoded within an organism if, and only if, two other proteins A and B are also both encoded within the organism, thus suggesting that gene C is functionally related to genes A and B. The method has been applied fruitfully to both phylogenetic and microarray expression data, and has been used to associate logical combinations of protein activity with disease state phenotypes, revealing previously unknown ternary relationships among proteins, and illustrating the inherent complexities that arise in biological data.
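The "C if, and only if, A and B" relationship described above can be illustrated with a small Python sketch over presence/absence (phylogenetic) profiles. The profiles and function names are hypothetical, and the strict exact-match test is a simplification: the abstract does not describe how relationships are scored, and real data would need a tolerance for noise.

```python
from itertools import permutations

# Hypothetical presence/absence profiles: one entry per genome (1 = encoded).
profiles = {
    "A": [1, 1, 0, 1, 0, 0],
    "B": [1, 0, 1, 1, 0, 1],
    "C": [1, 0, 0, 1, 0, 0],
}


def and_relation_holds(a, b, c):
    """True if c is present in exactly those genomes where a and b both are."""
    return all(ci == (ai & bi) for ai, bi, ci in zip(a, b, c))


def find_and_triplets(profiles):
    """Return (x, y, z) name triples where z == x AND y across all genomes.

    Requiring x < y reports each unordered input pair only once.
    """
    hits = []
    for x, y, z in permutations(profiles, 3):
        if x < y and and_relation_holds(profiles[x], profiles[y], profiles[z]):
            hits.append((x, y, z))
    return hits


triplets = find_and_triplets(profiles)
```

Other logic types (for example OR or XOR) could be tested the same way by swapping the operator inside `and_relation_holds`.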
Subject(s)
Cell Physiological Phenomena , Computational Biology/methods , Databases, Genetic , Algorithms , Animals , Gene Expression Profiling , Glioma/genetics , Humans , Models, Biological , Oligonucleotide Array Sequence Analysis , Phylogeny , Proteins/genetics , Proteins/physiology
ABSTRACT
The Tyrolean Iceman, a 5,300-year-old Copper age individual, was discovered in 1991 on the Tisenjoch Pass in the Italian part of the Ötztal Alps. Here we report the complete genome sequence of the Iceman and show 100% concordance between the previously reported mitochondrial genome sequence and the consensus sequence generated from our genomic data. We present indications for recent common ancestry between the Iceman and present-day inhabitants of the Tyrrhenian Sea, that the Iceman probably had brown eyes, belonged to blood group O and was lactose intolerant. His genetic predisposition shows an increased risk for coronary heart disease and may have contributed to the development of previously reported vascular calcifications. Sequences corresponding to ~60% of the genome of Borrelia burgdorferi are indicative of the earliest human case of infection with the pathogen for Lyme borreliosis.
Subject(s)
Genome, Human , Genome, Mitochondrial , Mummies , Base Sequence , Borrelia burgdorferi/genetics , Chromosome Mapping , DNA, Mitochondrial/genetics , Genetic Predisposition to Disease , History, Ancient , Humans , Lyme Disease/history , Mitochondria/genetics , Mummies/microbiology , Paleontology , Phenotype , Sequence Analysis, DNA , Vascular Calcification
ABSTRACT
Multiple self-healing squamous epithelioma (MSSE), also known as Ferguson-Smith disease (FSD), is an autosomal-dominant skin cancer condition characterized by multiple squamous-carcinoma-like locally invasive skin tumors that grow rapidly for a few weeks before spontaneously regressing, leaving scars. High-throughput genomic sequencing of a conservative estimate (24.2 Mb) of the disease locus on chromosome 9 using exon array capture identified independent mutations in TGFBR1 in three unrelated families. Subsequent dideoxy sequencing of TGFBR1 identified 11 distinct monoallelic mutations in 18 affected families, firmly establishing TGFBR1 as the causative gene. The nature of the sequence variants, which include mutations in the extracellular ligand-binding domain and a series of truncating mutations in the kinase domain, indicates a clear genotype-phenotype correlation between loss-of-function TGFBR1 mutations and MSSE. This distinguishes MSSE from the Marfan syndrome-related disorders in which missense mutations in TGFBR1 lead to developmental defects with vascular involvement but no reported predisposition to cancer.
Subject(s)
Mutation , Protein Serine-Threonine Kinases/genetics , Receptors, Transforming Growth Factor beta/genetics , Skin Neoplasms/genetics , Amino Acid Sequence , Base Sequence , Carcinoma/genetics , Carcinoma/metabolism , Codon, Nonsense , Conserved Sequence , DNA Primers/genetics , Female , Frameshift Mutation , Genetic Association Studies , Haplotypes , Humans , Keratoacanthoma/genetics , Keratoacanthoma/metabolism , Male , Marfan Syndrome/genetics , Models, Molecular , Molecular Sequence Data , Mutant Proteins/chemistry , Mutant Proteins/genetics , Mutant Proteins/metabolism , Mutation, Missense , Protein Serine-Threonine Kinases/chemistry , Protein Serine-Threonine Kinases/metabolism , Protein Structure, Tertiary , Receptor, Transforming Growth Factor-beta Type I , Receptors, Transforming Growth Factor beta/chemistry , Receptors, Transforming Growth Factor beta/metabolism , Sequence Homology, Amino Acid , Skin Neoplasms/metabolism
ABSTRACT
Polymorphisms in the interleukin-4 receptor alpha chain (IL-4R alpha) have been linked to asthma incidence and severity, but a causal relationship has remained uncertain. In particular, a glutamine to arginine substitution at position 576 (Q576R) of IL-4R alpha has been associated with severe asthma, especially in African Americans. We show that mice carrying the Q576R polymorphism exhibited intense allergen-induced airway inflammation and remodeling. The Q576R polymorphism did not affect proximal signal transducer and activator of transcription (STAT) 6 activation, but synergized with STAT6 in a gene target- and tissue-specific manner to mediate heightened expression of a subset of IL-4- and IL-13-responsive genes involved in allergic inflammation. Our findings indicate that the Q576R polymorphism directly promotes asthma in carrier populations by selectively augmenting IL-4R alpha-dependent signaling.
Subject(s)
Asthma/genetics , Receptors, Cell Surface/genetics , Alleles , Animals , Asthma/etiology , Humans , Immunoglobulin E/biosynthesis , Interleukin-13/physiology , Interleukin-4/biosynthesis , Mice , Mice, Transgenic , Mutation , Ovalbumin/immunology , Polymorphism, Genetic , STAT6 Transcription Factor/metabolism , Signal Transduction , Th2 Cells/immunology
ABSTRACT
Autosomal recessive cutis laxa (ARCL) describes a group of syndromal disorders that are often associated with a progeroid appearance, lax and wrinkled skin, osteopenia and mental retardation. Homozygosity mapping in several kindreds with ARCL identified a candidate region on chromosome 17q25. By high-throughput sequencing of the entire candidate region, we detected disease-causing mutations in the gene PYCR1. We found that the gene product, an enzyme involved in proline metabolism, localizes to mitochondria. Altered mitochondrial morphology, membrane potential and increased apoptosis rate upon oxidative stress were evident in fibroblasts from affected individuals. Knockdown of the orthologous genes in Xenopus and zebrafish led to epidermal hypoplasia and blistering that was accompanied by a massive increase of apoptosis. Our findings link mutations in PYCR1 to altered mitochondrial function and progeroid changes in connective tissues.
Subject(s)
Cutis Laxa/etiology , Cutis Laxa/genetics , Mutation , Pyrroline Carboxylate Reductases/genetics , Skin/metabolism , Agenesis of Corpus Callosum , Base Sequence , Case-Control Studies , Child, Preschool , Chromosomes, Human, Pair 17 , Consanguinity , Cutis Laxa/metabolism , Female , Fibroblasts/metabolism , Frameshift Mutation , Gene Deletion , Genes, Recessive , Genetic Markers , Homozygote , Humans , Infant , Infant, Newborn , Intellectual Disability/genetics , Male , Molecular Sequence Data , Mutation, Missense , Pedigree , Physical Chromosome Mapping , Polymorphism, Single Nucleotide , Pyrroline Carboxylate Reductases/metabolism , Skin/cytology , Skin/ultrastructure , delta-1-Pyrroline-5-Carboxylate Reductase
ABSTRACT
The Generic Model Organism Database (GMOD) initiative provides species-agnostic data models and software tools for representing curated model organism data. Here we describe GMODWeb, a GMOD project designed to speed the development of model organism database (MOD) websites. Sites created with GMODWeb provide integration with other GMOD tools and allow users to browse and search through a variety of data types. GMODWeb was built using the open source Turnkey web framework and is available from http://turnkey.sourceforge.net.