ABSTRACT
Whole genome sequencing (WGS) is a widely available, inexpensive means of providing a wealth of information about an organism's diversity and evolution. However, WGS for many pathogenic bacteria remains limited because they are difficult, slow and/or dangerous to culture. To avoid culturing, metagenomic sequencing can be performed directly on samples, but the sequencing effort required to characterize low frequency organisms can be expensive. Recently developed methods for selective whole genome amplification (SWGA) can enrich target DNA to provide efficient sequencing. We amplified Coxiella burnetii (a bacterial select agent and human/livestock pathogen) from three environmental samples that were overwhelmed with host DNA. The 68- to 147-fold enrichment of the bacterial sequences provided enough genome coverage for SNP analyses and phylogenetic placement. SWGA is a valuable tool for the study of difficult-to-culture organisms and has the potential to facilitate high-throughput population characterizations as well as targeted epidemiological or forensic investigations.
Subjects
Coxiella burnetii/genetics, Bacterial Genome, Metagenome, Animals, Coxiella burnetii/classification, Coxiella burnetii/isolation & purification, Female, Goats/microbiology, Metagenomics/methods, Milk/microbiology, Phylogeny, Single Nucleotide Polymorphism, Whole Genome Sequencing/methods
ABSTRACT
BACKGROUND: Each year, 9 million individuals cycle in and out of jails. The under-characterization of incarceration as an exposure poses substantial challenges to understanding how varying levels of exposure to jail may affect health. Thus, we characterized levels of jail incarceration including recidivism, number of incarcerations, total and average number of days incarcerated, and time to reincarceration. METHODS: We created a cohort of 75,203 individuals incarcerated at the Coconino County Detention Facility in Flagstaff, Arizona, from 2001 to 2018 from jail intake and release records. RESULTS: The median number of incarcerations during the study period was one (interquartile range [IQR] = 1-2). Forty percent of individuals had >1 incarceration. The median length of stay for first observed incarcerations was 1 day (IQR = 0-5). The median total days incarcerated was 3 (IQR = 1-23). Average length of stay increased by number of incarcerations. By 18 months, 27% of our sample had been reincarcerated. CONCLUSION: Characteristics of jail incarceration have been largely left out of public health research. A better understanding of jail incarcerations can help design analyses to assess health outcomes of individuals incarcerated in jail. Our study is an early step in shaping an understanding of jail incarceration as an exposure for future epidemiologic research. See video abstract at http://links.lww.com/EDE/B536.
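The exposure measures named in the abstract above (number of incarcerations, total and average days incarcerated, time to reincarceration) could be derived from intake and release records roughly as follows. This is a minimal Python sketch, not the study's code: the record layout, the function name `summarize_stays`, and the toy records are all assumptions made for illustration, and time to reincarceration is taken here as the gap between the first release and the second intake.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def summarize_stays(records):
    """Summarize jail exposure per person.

    records: iterable of (person_id, intake_date, release_date) tuples
    (a hypothetical layout, not the study's schema).
    """
    stays = defaultdict(list)
    for person_id, intake, release in records:
        stays[person_id].append((intake, release))

    summary = {}
    for person_id, person_stays in stays.items():
        person_stays.sort()  # chronological order by intake date
        lengths = [(release - intake).days for intake, release in person_stays]
        gap = None  # stays None when only one stay was observed
        if len(person_stays) > 1:
            # Days from the first release to the second intake.
            gap = (person_stays[1][0] - person_stays[0][1]).days
        summary[person_id] = {
            "n_incarcerations": len(person_stays),
            "total_days": sum(lengths),
            "mean_days": sum(lengths) / len(lengths),
            "days_to_reincarceration": gap,
        }
    return summary


# Hypothetical toy records, not data from the study.
records = [
    ("A", date(2010, 1, 1), date(2010, 1, 2)),
    ("A", date(2010, 3, 1), date(2010, 3, 11)),
    ("B", date(2011, 5, 5), date(2011, 5, 5)),
]
summary = summarize_stays(records)
median_incarcerations = median(s["n_incarcerations"] for s in summary.values())
```

A same-day intake and release counts as a stay of 0 days in this sketch, which is consistent with the lower IQR bound of 0 reported for first observed stays.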
Subjects
Health Status Disparities, Prisoners/statistics & numerical data, Public Health, Adolescent, Adult, Aged, Aged 80 and over, Arizona, Epidemiologic Research Design, Female, Follow-Up Studies, Humans, Kaplan-Meier Estimate, Male, Middle Aged, Retrospective Studies, Young Adult
ABSTRACT
BACKGROUND: Targeted PCR amplicon sequencing (TAS) techniques provide a sensitive, scalable, and cost-effective way to query and identify closely related bacterial species and strains. Typically, this is accomplished by targeting housekeeping genes that provide resolution down to the family, genus, and sometimes species level. Unfortunately, this level of resolution is not sufficient in many applications where strain-level identification of bacteria is required (biodefense, forensics, clinical diagnostics, and outbreak investigations). Adding more genomic targets will increase the resolution, but the challenge is identifying the appropriate targets. VaST was developed to address this challenge by finding the minimum number of targets that, in combination, achieve maximum strain-level resolution for any strain complex. The final combination of target regions identified by the algorithm produces a unique haplotype for each strain, which can be used as a fingerprint for identifying unknown samples in a TAS assay. VaST ensures that the targets have conserved primer regions so that the targets can be amplified in all of the known strains, and it also favors the inclusion of targets with basal variants, which makes the set more robust when identifying previously unseen strains. RESULTS: We analyzed VaST's performance using a number of different pathogenic species that are relevant to human disease outbreaks and biodefense. Full resolution required 20 to 88% fewer targets than would be needed in the worst case, and most of the resolution was achieved within the first 20 targets. We computationally and experimentally validated one of the VaST panels and found that the targets led to accurate phylogenetic placement of strains, even when the strains were not a part of the original panel design.
CONCLUSIONS: VaST is open-source software that, when provided a set of variant sites, can find the minimum number of sites that will provide maximum resolution of a strain complex, and it has many different run-time options that can accommodate a wide range of applications. VaST can be an effective tool in the design of strain identification panels that, when combined with TAS technologies, offer an efficient and inexpensive strain typing protocol.
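To make the target-selection problem concrete, the sketch below picks variant sites greedily until every strain has a unique haplotype. It is an illustration only and should not be read as VaST's algorithm: the abstract does not describe the search procedure, a greedy pass is not guaranteed to find the true minimum, and primer conservation and the preference for basal variants are ignored. The function name `greedy_targets`, the input layout, and the toy alleles are assumptions.

```python
def greedy_targets(alleles):
    """Greedy sketch of choosing variant sites that separate strains.

    alleles: {site_name: {strain_name: allele}}, with every strain
    present at every site (a hypothetical input layout).
    Returns the chosen site names in the order they were picked.
    """
    strains = sorted(next(iter(alleles.values())))
    groups = [strains]  # strains not yet told apart share a group
    chosen = []
    remaining = set(alleles)

    def split(current_groups, site):
        # Subdivide each group by the allele its strains carry at this site.
        new_groups = []
        for group in current_groups:
            by_allele = {}
            for strain in group:
                by_allele.setdefault(alleles[site][strain], []).append(strain)
            new_groups.extend(by_allele.values())
        return new_groups

    while len(groups) < len(strains) and remaining:
        # Take the site that yields the most groups; ties go to the
        # alphabetically first site name.
        best = max(sorted(remaining), key=lambda s: len(split(groups, s)))
        refined = split(groups, best)
        if len(refined) == len(groups):
            break  # no remaining site adds resolution
        groups = refined
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical toy data: three sites genotyped in four strains.
alleles = {
    "site1": {"s1": "A", "s2": "A", "s3": "G", "s4": "G"},
    "site2": {"s1": "C", "s2": "T", "s3": "C", "s4": "T"},
    "site3": {"s1": "A", "s2": "A", "s3": "A", "s4": "G"},
}
panel = greedy_targets(alleles)
```

In this toy case two sites are enough to give all four strains distinct haplotypes, so the third site is never selected.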
Subjects
Bacteria/classification, Bacteria/genetics, Bacterial Typing Techniques/methods, Bacterial Genes, Bacterial Genome, Genomics/methods, Multilocus Sequence Typing/methods, Single Nucleotide Polymorphism, Bacteria/isolation & purification, Genotype, Humans, Phylogeny
ABSTRACT
The transcontinental spread of multidrug-resistant (MDR) tuberculosis is poorly characterized in molecular epidemiologic studies. We used genomic sequencing to understand the establishment and dispersion of MDR Mycobacterium tuberculosis within a group of immigrants to the United States. We used a genomic epidemiology approach to study a genotypically matched (by spoligotype, IS6110 restriction fragment length polymorphism, and mycobacterial interspersed repetitive units-variable number of tandem repeat signature) lineage 2/Beijing MDR strain implicated in an outbreak of tuberculosis among refugees in Thailand and consecutive cases within California. All 46 MDR M. tuberculosis genomes from both Thailand and California were highly related, with a median difference of 10 single-nucleotide polymorphisms (SNPs). The Wat Tham Krabok (WTK) strain is a new sequence type distinguished from all known Beijing strains by 55 SNPs and a genomic deletion (Rv1267c) associated with increased fitness. Sequence data revealed a highly prevalent MDR strain that included several closely related but distinct allelic variants within Thailand, rather than the occurrence of a single outbreak. In California, sequencing data supported multiple independent introductions of WTK with subsequent transmission and reactivation within the state, as well as a potential super spreader with a prolonged infectious period. Twenty-seven drug resistance-conferring mutations and 4 putative compensatory mutations were found within WTK strains. Genomic sequencing has substantial epidemiologic value in both low- and high-burden settings in understanding transmission chains of highly prevalent MDR strains.
Subjects
Disease Outbreaks, Mycobacterium tuberculosis/genetics, Multidrug-Resistant Tuberculosis/microbiology, California, Bacterial Genome, Genotype, Humans, Molecular Epidemiology, Molecular Typing, Phylogeny, Restriction Fragment Length Polymorphism, Single Nucleotide Polymorphism, Prevalence, Thailand, Multidrug-Resistant Tuberculosis/epidemiology
ABSTRACT
The organisms in aerosol microenvironments, especially densely populated urban areas, are relevant to maintenance of public health and detection of potential epidemic or biothreat agents. To examine aerosolized microorganisms in this environment, we performed sequencing on the material from an urban aerosol surveillance program. Whole metagenome sequencing was applied to DNA extracted from air filters obtained during periods from each of the four seasons. The composition of bacteria, plants, fungi, invertebrates, and viruses demonstrated distinct temporal shifts. Bacillus thuringiensis serovar kurstaki was detected in samples known to be exposed to aerosolized spores, illustrating the potential utility of this approach for identification of intentionally introduced microbial agents. Together, these data demonstrate the temporally dependent metagenomic complexity of urban aerosols and the potential of genomic analytical techniques for biosurveillance and monitoring of threats to public health.
Subjects
Air Microbiology, Bacterial DNA/isolation & purification, Metagenomics/methods, Bacillus thuringiensis/isolation & purification, Bacteria/classification, Bacteria/isolation & purification, Biomass, Cities, DNA Copy Number Variations, Bacterial DNA/genetics, District of Columbia, Environmental Monitoring, Fungi/classification, Fungi/isolation & purification, Metagenome, Seasons, Sequence Alignment, DNA Sequence Analysis
ABSTRACT
Escherichia coli is a diverse pathogen, causing a range of disease in humans, from self-limiting diarrhea to urinary tract infections (UTIs). Uropathogenic E. coli (UPEC) is the most frequently observed uropathogen in UTIs, a common disease in high-income countries, incurring billions of dollars yearly in treatment costs. Although E. coli is easily grown and identified in the clinical laboratory, genotyping the pathogen is more complicated, yet critical for reducing the incidence of disease. These goals can be achieved through whole-genome sequencing of E. coli isolates, but this approach is relatively slow and typically requires culturing the pathogen in the laboratory. To genotype E. coli rapidly and inexpensively directly from clinical samples, including but not limited to urine, we developed and validated a multiplex amplicon sequencing assay, called ColiSeq. The assay consists of targets designed for E. coli species confirmation, high resolution genotyping, and mixture deconvolution. To demonstrate its utility, we screened the ColiSeq assay against 230 clinical urine samples collected from a hospital system in Flagstaff, Arizona, USA. A limit of detection analysis demonstrated the ability of ColiSeq to identify E. coli at a concentration of ~2 genomic equivalents (GEs)/mL and to generate high-resolution genotyping at a concentration of 1 × 10^5 GEs/mL. The results of this study suggest that ColiSeq could be a valuable method to understand the source of UPEC strains and guide infection mitigation efforts. As sequence-based diagnostics become accepted in the clinical laboratory, workflows such as ColiSeq will provide actionable information to improve patient outcomes.
IMPORTANCE: Urinary tract infections (UTIs), caused primarily by Escherichia coli, create an enormous health care burden in the United States and other high-income countries. The early detection of E. coli from clinical samples, including urine, is important to target therapy and prevent further patient complications. Additionally, understanding the source of E. coli exposure will help with future mitigation efforts. In this study, we developed, tested, and validated an amplicon sequencing assay focused on direct detection of E. coli from urine. The resulting sequence data were demonstrated to provide strain level resolution of the pathogen, not only confirming the presence of E. coli, which can focus treatment efforts, but also providing data needed for source attribution and contact tracing. This assay will generate inexpensive, rapid, and reproducible data that can be deployed by public health agencies to track, diagnose, and potentially mitigate future UTIs caused by E. coli.
Subjects
Escherichia coli Infections, Escherichia coli, Urinary Tract Infections, Humans, Escherichia coli Infections/microbiology, Escherichia coli Infections/diagnosis, Urinary Tract Infections/microbiology, Urinary Tract Infections/diagnosis, Escherichia coli/genetics, Escherichia coli/isolation & purification, Uropathogenic Escherichia coli/genetics, Uropathogenic Escherichia coli/isolation & purification, Uropathogenic Escherichia coli/classification, Genotype, Whole Genome Sequencing/methods, Genotyping Techniques/methods, Multiplex Polymerase Chain Reaction/methods
ABSTRACT
Leptospirosis (caused by pathogenic bacteria in the genus Leptospira) is prevalent worldwide but more common in tropical and subtropical regions. Transmission can occur following direct exposure to infected urine from reservoir hosts, such as rats, or a urine-contaminated environment, which then can serve as an infection source for additional rats and other mammals, including humans. The brown rat, Rattus norvegicus, is an important reservoir of leptospirosis in urban settings. We investigated leptospirosis among brown rats in Boston, Massachusetts and hypothesized that rat dispersal in this urban setting influences the movement, persistence, and diversity of Leptospira. We analyzed DNA from 328 rat kidney samples collected from 17 sites in Boston over a seven-year period (2016-2022); 59 rats representing 12 of 17 sites were positive for Leptospira. We used 21 neutral microsatellite loci to genotype 311 rats and utilized the resulting data to investigate genetic connectivity among sampling sites. We generated whole genome sequences for 28 Leptospira isolates obtained from frozen and fresh tissue from some of the 59 Leptospira-positive rat kidneys. When isolates were not obtained, we attempted Leptospira genomic DNA capture and enrichment, which yielded 14 additional Leptospira genomes from rats. We also generated an enriched Leptospira genome from a 2018 human case in Boston. We found evidence of high genetic structure and limited dispersal among rat populations that is likely influenced by major roads and/or other unknown dispersal barriers, resulting in distinct rat population groups within the city; at certain sites these groups persisted for multiple years. We identified multiple distinct phylogenetic clades of L. interrogans among rats, with specific clades tightly linked to distinct rat populations. This pattern suggests L. interrogans persists in local rat populations and movement of leptospirosis in this urban rat community is driven by rat dispersal.
Finally, our genomic analyses of the 2018 human leptospirosis case in Boston suggest a link to rats as the source. These findings will be useful for guiding rat control and human leptospirosis mitigation efforts in this and other urban settings.
ABSTRACT
Background: Most seasonally circulating enteroviruses result in asymptomatic or mildly symptomatic infections. In rare cases, however, infection with some subtypes can result in paralysis or death. Of the 300 subtypes known, only poliovirus is reportable, limiting our understanding of the distribution of other enteroviruses that can cause clinical disease. Objective: The overarching objectives of this study were to: 1) describe the distribution of enteroviruses in Arizona during the late summer and fall of 2022, the time of year when they are thought to be most abundant, and 2) demonstrate the utility of viral pan-assay approaches for semi-agnostic discovery that can be followed up by more targeted assays and phylogenomics. Methods: This study utilizes pooled nasal samples collected from school-aged children and long-term care facility residents, and wastewater from multiple locations in Arizona during July-October of 2022. We used PCR to amplify and sequence a region common to all enteroviruses, followed by species-level bioinformatic characterization using the QIIME 2 platform. For Enterovirus-D68 (EV-D68), detection was carried out using RT-qPCR, followed by confirmation using near-complete whole EV-D68 genome sequencing using a newly designed tiled amplicon approach. Results: In the late summer and early fall of 2022, multiple enterovirus species were identified in Arizona wastewater, with Coxsackievirus A6, EV-D68, and Coxsackievirus A19 composing 86% of the characterized reads sequenced. While EV-D68 was not identified in pooled human nasal samples, and the only reported acute flaccid myelitis case in Arizona did not test positive for the virus, an in-depth analysis of EV-D68 in wastewater revealed that the virus was circulating from August through mid-October. A phylogenetic analysis on this relatively limited dataset revealed just a few importations into the state, with a single clade indicating local circulation. 
Significance: This study further supports the utility of wastewater-based epidemiology to identify potential public health threats. Our further investigations into EV-D68 show how these data might help inform healthcare diagnoses for children presenting with concerning neurological symptoms.
ABSTRACT
BACKGROUND: Structural variations caused by a wide range of physico-chemical and biological sources directly influence the function of a protein. For enzymatic proteins, the structure and chemistry of the catalytic binding site residues can be loosely defined as a substructure of the protein. Comparative analysis of drug-receptor substructures across and within species has been used for lead evaluation. Substructure-level similarity between the binding sites of functionally similar proteins has also been used to identify instances of convergent evolution among proteins. In functionally homologous protein families, shared chemistry and geometry at catalytic sites provide a common, local point of comparison among proteins that may differ significantly at the sequence, fold, or domain topology levels. RESULTS: This paper describes two key results that can be used separately or in combination for protein function analysis. The Family-wise Analysis of SubStructural Templates (FASST) method uses all-against-all substructure comparison to determine Substructural Clusters (SCs). SCs characterize the binding site substructural variation within a protein family. In this paper we focus on examples of automatically determined SCs that can be linked to phylogenetic distance between family members, segregation by conformation, and organization by homology among convergent protein lineages. The Motif Ensemble Statistical Hypothesis (MESH) framework constructs a representative motif for each protein cluster among the SCs determined by FASST to build motif ensembles that are shown through a series of function prediction experiments to improve the function prediction power of existing motifs. CONCLUSIONS: FASST contributes a critical feedback and assessment step to existing binding site substructure identification methods and can be used for the thorough investigation of structure-function relationships. 
The application of MESH allows for an automated, statistically rigorous procedure for incorporating structural variation data into protein function prediction pipelines. Our work provides an unbiased, automated assessment of the structural variability of identified binding site substructures among protein structure families and a technique for exploring the relation of substructural variation to protein function. As available proteomic data continues to expand, the techniques proposed will be indispensable for the large-scale analysis and interpretation of structural data.
Subjects
Enzymes/chemistry, Proteins/chemistry, Proteomics/methods, Amino Acid Motifs, Binding Sites, Protein Databases, Enzymes/metabolism, Molecular Models, Protein Conformation, Protein Folding, Proteins/metabolism, Protein Sequence Analysis
ABSTRACT
Due to the large number of negative tests, individually screening large populations for rare pathogens can be wasteful and expensive. Sample pooling methods improve the efficiency of large-scale pathogen screening campaigns by reducing the number of tests and reagents required to accurately categorize positive and negative individuals. Such methods rely on group testing theory, which mainly focuses on minimizing the total number of tests; however, many other practical concerns and tradeoffs must be considered when choosing an appropriate method for a given set of circumstances. Here we use computational simulations to determine how several theoretical approaches compare in terms of (a) the number of tests, to minimize costs and save reagents, (b) the number of sequential steps, to reduce the time it takes to complete the assay, (c) the number of samples per pool, to avoid the limits of detection, (d) simplicity, to reduce the risk of human error, and (e) robustness to poor estimates of the number of positive samples. We found that established methods often perform very well in one area but very poorly in others. Therefore, we introduce and validate a new method that performs fairly well across each of the above criteria, making it a good general-use approach.
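The tradeoff between number of tests and pool size can be illustrated with classic two-stage (Dorfman) pooling, a well-known group testing scheme. The abstract does not say which schemes were compared, and this is not the new method it introduces; the sketch below only assumes a perfectly accurate test, independent samples, and a known prevalence. The function names are hypothetical.

```python
def dorfman_tests_per_sample(prevalence, pool_size):
    """Expected tests per sample under two-stage pooling.

    Stage 1 tests each pool once; stage 2 retests every sample in a
    positive pool individually. Assumes a perfect test and independent
    samples (simplifying assumptions, not claims from the abstract).
    """
    if pool_size == 1:
        return 1.0  # no pooling: one test per sample
    p_pool_positive = 1.0 - (1.0 - prevalence) ** pool_size
    return 1.0 / pool_size + p_pool_positive


def best_pool_size(prevalence, max_pool_size=50):
    """Pool size that minimizes the expected number of tests per sample."""
    return min(
        range(1, max_pool_size + 1),
        key=lambda k: dorfman_tests_per_sample(prevalence, k),
    )


# With 1% prevalence, pooling needs roughly a fifth of the tests that
# individual screening would.
k = best_pool_size(0.01)
cost = dorfman_tests_per_sample(0.01, k)
```

Minimizing tests alone pushes toward larger pools at low prevalence, which is exactly where criterion (c) in the abstract, staying above the limit of detection, starts to matter.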
Subjects
Coxiella/isolation & purification, Routine Diagnostic Tests/methods, Gram-Negative Bacterial Infections/diagnosis, Mass Screening/methods, Specimen Handling/methods, Computer Simulation, Gram-Negative Bacterial Infections/microbiology, Humans
ABSTRACT
Prostate cancer is the most commonly diagnosed male cancer and the second leading cause of cancer deaths among men in the United States, with approximately 220,000 new diagnoses and approximately 27,000 deaths each year. Men with clinical low-risk disease can receive active surveillance to safely preserve quality of life, provided that the risk of an undetected aggressive cancer can be managed. Thus, prediction of a tumor's metastatic potential, ideally using only a biopsy sample, is critical to choosing appropriate treatment. We previously proposed and verified a metastasis potential score (MPS) based on regions prone to copy number alterations in metastatic prostate cancer; MPS is highly predictive of metastatic potential in primary tumors. We developed a novel, targeted postligation amplification sequencing approach, which we call the next-generation copy number alteration assay, to efficiently interrogate 902 genomic sites that belong to 194 genomic regions used in the MPS calculation. The assay is designed to work with the latest generation of sequencing platforms to produce estimates of copy number alteration events. The assay's technical reproducibility, robustness to low starting genomic material, and accuracy have been verified. The assay performed very well on cell lines, a cohort of prostate cancer surgical research samples, and matched punched biopsy samples, making it a significant step toward incorporating sequencing techniques for prostate cancer evaluation.
Subjects
High-Throughput Nucleotide Sequencing/methods, Prostatic Neoplasms/genetics, DNA Copy Number Variations, Genetic Testing/economics, Genetic Testing/methods, High-Throughput Nucleotide Sequencing/economics, Humans, Male, Prognosis, Prostatic Neoplasms/diagnosis, Prostatic Neoplasms/etiology, Quality of Life, Risk Factors, Time Factors
ABSTRACT
DNA metabarcoding assays are powerful tools for delving into the DNA in wildlife feces, giving unprecedented ability to detect species, understand natural history, and identify pathogens for a range of applications in management, conservation, and research. Next-generation sequencing technology is developing rapidly, which makes it especially important that predictability and reproducibility of DNA metabarcoding assays are explored together with the post-depositional ecology of the target taxon's fecal DNA. Here, we defined the constraints of an assay called 'Species from Feces' used by government agencies, research groups, and non-governmental organizations to identify bat species from guano. We tested assay sensitivity by examining how time and humidity affect the ability to recover and successfully sequence DNA in guano, assessing whether a fecal pellet from a rare bat species could be detected in a background of feces from other bat species, and evaluating the efficacy of Species from Feces as a survey tool for bat roosts in temperate and tropical areas. We found that the assay performs well with feces over two years old in dry, cool environments, and fails by 12 months at 100% relative humidity. We also found that it reliably identifies rare DNA, has great utility for surveying roosts in temperate and tropical regions, and detects more bat species than do visual surveys. We attribute the success of Species from Feces to characteristics of the assay paired with application in taxa that are particularly well-suited for fecal DNA survival. In a time of rapid evolution of DNA metabarcoding approaches and their use with feces, this study illustrates the strengths and limitations of applied assays.
Subjects
Chiroptera/genetics, Feces/chemistry, Genetic Testing/methods, Animals, Archaeology, Humidity, Mining, Reproducibility of Results, Southwestern United States, Species Specificity, Time Factors
ABSTRACT
This special issue of Practicing Anthropology presents multidisciplinary and multisectoral views of a community-engaged health disparities project titled "Health Disparities in Jail Populations: Converging Epidemics of Infectious Disease, Chronic Illness, Behavioral Health, and Substance Abuse." The overall project incorporated traditional anthropological mixed-methods approaches with theory and methods from informatics, epidemiology, genomics, evolutionary and computational biology, community engagement, and applied/translational science.
ABSTRACT
Enteroviruses are a common cause of respiratory and gastrointestinal illness, and multiple subtypes, including poliovirus, can cause neurologic disease. In recent years, enterovirus D68 (EV-D68) has been associated with serious neurologic illnesses, including acute flaccid myelitis (AFM), frequently preceded by respiratory disease. A cluster of 11 suspect cases of pediatric AFM was identified in September 2016 in Phoenix, AZ. To determine if these cases were associated with EV-D68, we performed multiple genomic analyses of nasopharyngeal (NP) swabs and cerebrospinal fluid (CSF) material from the patients, including real-time PCR and amplicon sequencing targeting the EV-D68 VP1 gene and unbiased microbiome and metagenomic sequencing. Four of the 11 patients were classified as confirmed cases of AFM, and an additional case was classified as probable AFM. Real-time PCR and amplicon sequencing detected EV-D68 virus RNA in the three AFM patients from whom NP swabs were collected, as well as in a fourth patient diagnosed with acute disseminated encephalomyelitis, a disease that commonly follows bacterial or viral infections, including enterovirus. No other obvious etiological causes for AFM were identified by 16S or RNA and DNA metagenomic sequencing in these cases, strengthening the likelihood that EV-D68 is an etiological factor. Herpes simplex viral DNA was detected in the CSF of the fourth case of AFM and in one additional suspect case from the cluster. Multiple genomic techniques, such as those described here, can be used to diagnose patients with suspected EV-D68 respiratory illness, to aid in AFM diagnosis, and for future EV-D68 surveillance and epidemiology. IMPORTANCE: Enteroviruses frequently result in respiratory and gastrointestinal illness; however, multiple subtypes, including poliovirus, can cause severe neurologic disease.
Recent biennial increases (i.e., 2014, 2016, and 2018) in cases of non-polio acute flaccid paralysis have led to speculations that other enteroviruses, specifically enterovirus D68 (EV-D68), are emerging to fill the niche that was left from poliovirus eradication. A cluster of 11 suspect cases of pediatric acute flaccid myelitis (AFM) was identified in 2016 in Phoenix, AZ. Multiple genomic analyses identified the presence of EV-D68 in the majority of clinical AFM cases. Beyond limited detection of herpesvirus, no other likely etiologies were found in the cluster. These findings strengthen the likelihood that EV-D68 is a cause of AFM and show that the rapid molecular assays developed for this study are useful for investigations of AFM and EV-D68.
Subjects
Central Nervous System Viral Diseases/epidemiology, Central Nervous System Viral Diseases/virology, Cluster Analysis, Human Enterovirus D/classification, Human Enterovirus D/isolation & purification, Myelitis/epidemiology, Myelitis/virology, Neuromuscular Diseases/epidemiology, Neuromuscular Diseases/virology, Phylogeny, Arizona/epidemiology, Cerebrospinal Fluid/virology, Human Enterovirus D/genetics, Humans, Molecular Epidemiology, Nasopharynx/virology, Viral RNA/genetics, Viral RNA/isolation & purification, Real-Time Polymerase Chain Reaction, DNA Sequence Analysis
ABSTRACT
BACKGROUND: Structural genomics projects such as the Protein Structure Initiative (PSI) yield many new structures, but often these have no known molecular functions. One approach to recover this information is to use 3D templates - structure-function motifs that consist of a few functionally critical amino acids and may suggest functional similarity when geometrically matched to other structures. Since experimentally determined functional sites are not common enough to define 3D templates on a large scale, this work tests a computational strategy to select relevant residues for 3D templates. RESULTS: Based on evolutionary information and heuristics, an Evolutionary Trace Annotation (ETA) pipeline built templates for 98 enzymes, half taken from the PSI, and sought matches in a non-redundant structure database. On average each template matched 2.7 distinct proteins, of which 2.0 shared the first three Enzyme Commission digits with the template's enzyme of origin. In many cases (61%) a single most likely function could be predicted as the annotation with the most matches, and in these cases such a plurality vote identified the correct function with 87% accuracy. ETA was also found to be complementary to sequence homology-based annotations. When matches are required both to geometrically match the 3D template and to be sequence homologs found by BLAST or PSI-BLAST, the annotation accuracy is greater than that of either method alone, especially in the region of lower sequence identity where homology-based annotations are least reliable. CONCLUSION: These data suggest that knowledge of evolutionarily important residues improves functional annotation among distant enzyme homologs. Since, unlike other 3D template approaches, the ETA method bypasses the need for experimental knowledge of the catalytic mechanism, it should prove a useful, large scale, and general adjunct to combine with other methods to decipher protein function in the structural proteome.
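The plurality vote mentioned in the results can be sketched as follows: each matched protein contributes its annotation (here the first three Enzyme Commission digits), and a function is predicted only when one annotation has strictly more matches than any other. This is an illustrative reading of the abstract, not the ETA pipeline itself; the function name and the example EC prefixes are hypothetical, and how the original method handles ties is not stated.

```python
from collections import Counter


def plurality_vote(match_annotations):
    """Return the annotation with strictly the most matches.

    Returns None when there are no matches or when the top two
    annotations are tied (an assumption about tie handling).
    """
    top_two = Counter(match_annotations).most_common(2)
    if not top_two:
        return None
    if len(top_two) == 2 and top_two[0][1] == top_two[1][1]:
        return None
    return top_two[0][0]


# Hypothetical three-digit EC prefixes for proteins matched by one template.
prediction = plurality_vote(["3.4.21", "3.4.21", "1.1.1"])
no_call = plurality_vote(["3.4.21", "1.1.1"])
```

Declining to predict on a tie is one way a vote like this would yield a single most likely function in only a fraction of cases, as the 61% figure in the abstract suggests.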
Subjects
Amino Acid Motifs/genetics, Enzymes, Molecular Evolution, Artificial Intelligence, Protein Databases, Enzymes/chemistry, Enzymes/genetics, Enzymes/metabolism, Likelihood Functions, Biological Models, Automated Pattern Recognition, Protein Conformation, Proteome, Sequence Alignment, Amino Acid Sequence Homology, Protein Structural Homology, Structure-Activity Relationship
ABSTRACT
The environmental health status of jail populations in the United States constitutes a significant public health threat for prisoners and the general population. The ecology of jails creates a dynamic condition in relation to general population health due to the concentrated potential exposure to infectious diseases, difficult access to treatment for chronic health conditions, interruption in continuity of care for serious behavioral health conditions, as well as on-going issues for the prevention and treatment of substance abuse disorders. This paper reports on elements of a cross-sectional survey embedded in a parent project, "Health Disparities in Jail Populations." The overall project includes a comprehensive secondary data analysis of the health status of county jail populations, along with primary data collection that includes a cross-sectional health and health care services survey of incarcerated individuals, coupled with collection of biological samples to investigate infectious disease characteristics of a county jail population. This paper reports on the primary results of the survey data collection that indicate that this is a population with complex and interacting co-morbidities, as well as significant health disparities compared to the general population.
Subjects
Health Status Disparities , Prisoners , Social Determinants of Health , Adult , Female , Health Surveys , Humans , Male , Morbidity , Prisons , United States
ABSTRACT
BACKGROUND: Incarcerated populations have increased in the last 20 years, and >12 million individuals cycle in and out of jails each year. Previous research has predominantly focused on the prison population. However, a substantial gap exists in understanding the health, well-being, and health care utilization patterns in jail populations. OBJECTIVE: This pilot study has 5 main objectives: (1) define recidivists of the jail system, individuals characterized by high incarceration rates; (2) describe and compare the demographic and clinical characteristics of incarcerated individuals; (3) identify jail-associated health disparities; (4) estimate associations between incarceration and health; and (5) describe model patterns in health care and jail utilization. METHODS: The project has two processes: a secondary data analysis and primary data collection, the latter of which includes a cross-sectional health survey and biological sample collection to investigate infectious disease characteristics of the jail population. This protocol contains pilot elements in four areas: (1) instrument validity and reliability; (2) individual item assessment; (3) proof of concept of content and database accessibility; and (4) pilot test of the "honest broker" system. Secondary data analysis includes the analysis of 6 distinct databases, each covered by a formal memorandum of agreement between Northern Arizona University and the designated institution: (1) the Superior Court of Arizona Public Case Finder database; (2) North Country Health Care; (3) Health Choice Integrated Care; (4) Criminal Justice Information Services; (5) Correctional Electronic Medical Records; and (6) iLEADS. We will perform data integration processes using an automated honest broker design.
We will administer a cross-sectional health survey, which includes questions about health status, health history, health care utilization, substance use practices, physical activity, adverse childhood events, and behavioral health, among 200 Coconino County Detention Facility inmates. Concurrent with the survey administration, we will collect nasal samples for methicillin-resistant and methicillin-sensitive Staphylococcus aureus and oral samples for the dental microbiome (Streptococcus sobrinus and Streptococcus mutans) from consenting participants. RESULTS: To date, we have permission to link data across acquired databases. We have initiated data transfer, protection, and initial assessment of the 6 secondary databases. Of 199 inmates who consented and enrolled, we have permission from 97.0% (193/199) to access and link electronic medical and incarceration records to their survey responses, and 95.0% (189/199) of interviewed inmates have given nasal and buccal swabs for analysis of S. aureus and the dental microbiome. CONCLUSIONS: This study is designed to increase the understanding of health needs and health care utilization patterns among jail populations, with a special emphasis on frequently incarcerated individuals. Our findings will help identify intervention points throughout the criminal justice and health care systems to improve health and reduce health disparities among jail inmates. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): RR1-10.2196/10337.
ABSTRACT
Bats and their associated guano microbiota provide important terrestrial and subterranean ecosystem services and serve as a reservoir for a wide range of epizootic and zoonotic diseases. Unfortunately, large-scale studies of bats and their guano microbiotas are limited by the time and cost of sample collection, which requires specially trained individuals to work at night to capture bats when they are most active. Indirectly surveying bat gut microbiota through guano deposits could be a more cost-effective alternative, but it must first be established whether the postdefecation exposure to an aerobic environment has a large impact on the guano microbial community. A number of recent studies on mammalian feces have shown that the impact of aerobic exposure is highly species specific; therefore, it is difficult to predict how exposure will affect the bat guano microbiota without empirical data. In our study, we collected fresh guano samples from 24 individuals of 10 bat species that are common throughout the arid environments of the American southwest and subjected the samples to 0, 1, and 12 hr of exposure. The biodiversity decreased rapidly after the shift from an anaerobic to an aerobic environment, much faster than previously reported in mammalian species. However, the relative composition of the core guano microbiota remained stable and, using highly sensitive targeted PCR methods, we found that pathogens present in the original, non-exposed samples could still be recovered after 12 hr of exposure. These results suggest that with careful sample analysis protocols, a more efficient passive collection strategy is feasible; for example, guano could be collected on tarps placed near the roost entrance. Such passive collection methods would greatly reduce the cost of sample collection by allowing more sites or roosts to be surveyed with a fraction of the trained personnel, time, and effort otherwise needed.
ABSTRACT
West Nile Virus (WNV) has been detected annually in Maricopa County, Arizona, since 2003. With this in mind, we sought to determine if contemporary strains are endemic to the county or are annually imported. As part of this effort, we developed a new protocol for tiled amplicon sequencing of WNV to efficiently attain greater than 99% coverage of 14 WNV genomes collected directly from positive mosquito pools distributed throughout Maricopa County between 2014 and 2017. Bayesian phylogenetic analyses revealed that contemporary genomes fall within two major lineages: NA/WN02 and SW/WN03. We found that all of the Arizona strains possessed an amino acid substitution known to be under positive selection, which has arisen independently at least four times in Arizona. The SW/WN03 strains exhibited transient behavior, with at least 10 separate introductions into Arizona when considering both historical and contemporary strains. However, NA/WN02 strains are geographically differentiated and appear to be endemic in Arizona, with two clades that have been circulating for four and seven years. This establishment in Maricopa County provides the first evidence of local overwintering by a WNV strain over the course of several years in Arizona. Within a national context, the placement of eleven contemporary Arizona strains in the NA/WN02 lineage indicates that, while WNV first entered the northeastern United States in 1999, the most ancestral extant strains of WNV are now circulating in the American southwest.
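As a rough illustration of how a figure such as "greater than 99% coverage" can be computed for a tiled amplicon design, the Python sketch below merges overlapping amplicon intervals and reports the covered fraction of the genome. The function name, the coordinate convention, and the toy coordinates are assumptions; the abstract does not describe the authors' actual coverage calculation.

```python
def breadth_of_coverage(amplicons, genome_length):
    """Fraction of genome positions covered by at least one amplicon.

    amplicons: iterable of (start, end) pairs, 0-based and end-exclusive
    (an assumed convention). Overlapping tiles are merged so that shared
    positions are counted only once.
    """
    covered = 0
    current_start, current_end = None, None
    for start, end in sorted(amplicons):
        # Clip each tile to the genome boundaries and skip empty tiles.
        start, end = max(start, 0), min(end, genome_length)
        if start >= end:
            continue
        if current_end is None or start > current_end:
            # A gap precedes this tile: close the previous merged block.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # The tile overlaps or abuts the current block: extend it.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / genome_length


# Toy design: three overlapping tiles on a hypothetical 1,000-base genome.
tiles = [(0, 400), (350, 750), (700, 995)]
print(f"{breadth_of_coverage(tiles, 1000):.1%}")
```

In practice, coverage is usually computed from read depth per position rather than from the design coordinates alone, so this sketch gives only the upper bound that a tiling scheme allows.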
Subjects
Phylogeny , West Nile Fever/genetics , West Nile virus/genetics , Amino Acid Substitution/genetics , Animals , Culicidae/virology , Disease Outbreaks , Genetic Variation , Genotype , Humans , New England , West Nile Fever/virology , West Nile virus/classification , West Nile virus/pathogenicity
ABSTRACT
The development of new and effective drugs is strongly affected by the need to identify drug targets and to reduce side effects. Resolving these issues depends partially on a thorough understanding of the biological function of proteins. Unfortunately, the experimental determination of protein function is expensive and time-consuming. To support and accelerate the determination of protein functions, algorithms for function prediction are designed to gather evidence indicating functional similarity with well-studied proteins. One such approach is the MASH pipeline, described in the first half of this paper. MASH identifies matches of geometric and chemical similarity between motifs, representing known functional sites, and substructures of functionally uncharacterized proteins (targets). Observations from several research groups concur that statistically significant matches can indicate functionally related active sites. One major subproblem is the design of effective motifs, which have many matches to functionally related targets (sensitive motifs), and few matches to functionally unrelated targets (specific motifs). Current techniques select and combine structural, physical, and evolutionary properties to generate motifs that mirror functional characteristics in active sites. This approach ignores incidental similarities that may occur with functionally unrelated proteins. To address this problem, we have developed Geometric Sieving (GS), a parallel distributed algorithm that efficiently refines motifs, designed by existing methods, into optimized motifs with maximal geometric and chemical dissimilarity from all known protein structures. In an exhaustive comparison of all possible motifs based on the active sites of 10 well-studied proteins, we observed that optimized motifs were among the most sensitive and specific.
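The notions of sensitive and specific motifs used above can be made concrete with a short Python sketch. It uses the standard definitions of sensitivity and specificity; the function name and the toy target identifiers are hypothetical, and the abstract does not state the exact scoring used to compare motifs.

```python
def motif_sensitivity_specificity(matched, related, all_targets):
    """Return (sensitivity, specificity) of one motif over a target set.

    matched: targets the motif matched.
    related: targets known to be functionally related to the motif.
    all_targets: every target that was searched.
    """
    all_targets = set(all_targets)
    matched = set(matched) & all_targets
    related = set(related) & all_targets
    unrelated = all_targets - related

    true_positives = len(matched & related)      # related and matched
    true_negatives = len(unrelated - matched)    # unrelated and not matched

    # Undefined ratios (no related or no unrelated targets) become NaN.
    sensitivity = true_positives / len(related) if related else float("nan")
    specificity = true_negatives / len(unrelated) if unrelated else float("nan")
    return sensitivity, specificity


# Toy example with hypothetical target identifiers.
targets = {"A", "B", "C", "D", "E"}
sens, spec = motif_sensitivity_specificity(
    matched={"A", "C"}, related={"A", "B"}, all_targets=targets
)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A motif that matches every related target and no unrelated target scores 1.0 on both measures, which is the outcome that motif refinement aims toward.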