ABSTRACT
Biobank projects are generating genomic data for many thousands of individuals. Computational methods are needed to handle these massive data sets, including genetic ancestry (GA) inference tools. Current methods for GA inference do not scale to biobank-size genomic datasets. We present Rye, a new algorithm for GA inference at biobank scale. We compared the accuracy and runtime performance of Rye to the widely used RFMix, ADMIXTURE and iAdmix programs and applied it to a dataset of 488,221 genome-wide variant samples from the UK Biobank. Rye infers GA based on principal component analysis of genomic variant samples from ancestral reference populations and query individuals. The algorithm's accuracy is powered by Metropolis-Hastings optimization and its speed is provided by non-negative least squares regression. Rye produces highly accurate GA estimates for three-way admixed populations (African, European and Native American) compared to RFMix and ADMIXTURE ($R^2 = 0.998$ to $1.00$), and shows a 50× runtime improvement compared to ADMIXTURE on the UK Biobank dataset. Rye analysis of UK Biobank samples demonstrates how it can be used to infer GA at both continental and subcontinental levels. We discuss user considerations and options for the use of Rye; the program and its documentation are distributed on the GitHub repository: https://github.com/healthdisparities/rye.
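The non-negative least squares step named in the abstract can be illustrated with a minimal sketch. This is not Rye's implementation: the reference centroids in principal component space are hypothetical values, the solver is a simple projected gradient descent swapped in for a production NNLS routine, and the Metropolis-Hastings optimization is omitted entirely. The function names are assumptions made for illustration.

```python
def nnls_projected_gradient(A, b, steps=2000):
    """Minimize 0.5 * ||A x - b||^2 subject to x >= 0.

    A is a list of rows (n_features x n_groups); b has length n_features.
    Plain projected gradient descent, chosen for brevity rather than speed.
    """
    n_rows, n_cols = len(A), len(A[0])
    # The squared Frobenius norm bounds the largest eigenvalue of A^T A,
    # so its inverse is a safe (conservative) step size.
    step = 1.0 / sum(v * v for row in A for v in row)
    x = [1.0 / n_cols] * n_cols
    for _ in range(steps):
        residual = [sum(A[i][j] * x[j] for j in range(n_cols)) - b[i]
                    for i in range(n_rows)]
        grad = [sum(A[i][j] * residual[i] for i in range(n_rows))
                for j in range(n_cols)]
        # Gradient step followed by projection onto the non-negative orthant.
        x = [max(0.0, x[j] - step * grad[j]) for j in range(n_cols)]
    return x


def ancestry_fractions(centroids, query, steps=2000):
    """Estimate ancestry fractions for one query individual.

    centroids: dict mapping group name -> PC coordinates of that reference group.
    query: PC coordinates of the query individual (same length).
    """
    groups = list(centroids)
    A = [[centroids[g][i] for g in groups] for i in range(len(query))]
    b = list(query)
    # An extra row of ones encourages the raw estimates to sum to 1.
    A.append([1.0] * len(groups))
    b.append(1.0)
    x = nnls_projected_gradient(A, b, steps)
    total = sum(x) or 1.0
    return {g: v / total for g, v in zip(groups, x)}


# Hypothetical centroids on the first two principal components.
REFERENCE_CENTROIDS = {
    "African": [1.0, 0.0],
    "European": [-0.5, 0.8],
    "Native American": [-0.5, -0.8],
}

# A query built as 50% / 30% / 20% of the centroids above.
fractions = ancestry_fractions(REFERENCE_CENTROIDS, [0.25, 0.08])
print(fractions)
```

The non-negativity constraint keeps each fraction interpretable as a proportion, which an unconstrained least squares fit would not guarantee.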
Subjects
Genetics, Population , Secale , Humans , Secale/genetics , Biological Specimen Banks , Algorithms , Genomics , Polymorphism, Single Nucleotide
ABSTRACT
Genome-enabled approaches to molecular epidemiology have become essential to public health agencies and the microbial research community. We developed the algorithm STing to provide turn-key solutions for molecular typing and gene detection directly from next generation sequence data of microbial pathogens. Our implementation of STing uses an innovative k-mer search strategy that eliminates the computational overhead associated with the time-consuming steps of quality control, assembly, and alignment, required by more traditional methods. We compared STing to six of the most widely used programs for genome-based molecular typing and demonstrate its ease of use, accuracy, speed and efficiency. STing shows superior accuracy and performance for standard multilocus sequence typing schemes, along with larger genome-scale typing schemes, and it enables rapid automated detection of antimicrobial resistance and virulence factor genes. STing determines the sequence type of traditional 7-gene MLST with 100% accuracy in less than 10 seconds per isolate. We hope that the adoption of STing will help to democratize microbial genomics and thereby maximize its benefit for public health.
Subjects
Algorithms , High-Throughput Nucleotide Sequencing , Multilocus Sequence Typing/methods , Drug Resistance, Microbial/genetics , Genes, Microbial , Genomics/methods , Software , Virulence Factors/genetics
ABSTRACT
European and African descendants settled the continental US during the 17th-19th centuries, coming into contact with established Native American populations. The resulting admixture among these groups yielded a significant reservoir of Native American ancestry in the modern US population. We analyzed the patterns of Native American admixture seen for the three largest genetic ancestry groups in the US population: African descendants, Western European descendants, and Spanish descendants. The three groups show distinct Native American ancestry profiles, which are indicative of their historical patterns of migration and settlement across the country. Native American ancestry in the modern African descendant population does not coincide with local geography, instead forming a single group with origins in the southeastern US, consistent with the Great Migration of the early 20th century. Western European descendants show Native American ancestry that tracks their geographic origins across the US, indicative of ongoing contact during westward expansion, and Native American ancestry can resolve Spanish descendant individuals into distinct local groups formed by more recent migration from Mexico and Puerto Rico. We found an anomalous pattern of Native American ancestry from the US southwest, which most likely corresponds to the Nuevomexicano descendants of early Spanish settlers to the region. We addressed a number of controversies surrounding this population, including the extent of Sephardic Jewish ancestry. Nuevomexicanos are less admixed than nearby Mexican-American individuals, with more European and less Native American and African ancestry, and while they do show demonstrable Sephardic Jewish ancestry, the fraction is no greater than seen for other New World Spanish descendant populations.
Subjects
Human Migration/trends , Indians, North American/genetics , Black People/genetics , Genetics, Population/methods , Genome, Human/genetics , Geography , Haplotypes , Hispanic or Latino/genetics , Humans , Mexican Americans/genetics , United States , White People/genetics
ABSTRACT
BACKGROUND: Pharmacogenomic (PGx) variants mediate how individuals respond to medication, and response differences among racial/ethnic groups have been attributed to patterns of PGx diversity. We hypothesized that genetic ancestry (GA) would provide higher resolution for stratifying PGx risk, since it serves as a more reliable surrogate for genetic diversity than self-identified race/ethnicity (SIRE), which includes a substantial social component. We analyzed a cohort of 8628 individuals from the United States (US), for whom we had both SIRE information and whole genome genotypes, with a focus on the three largest SIRE groups in the US: White, Black (African-American), and Hispanic (Latino). Our approach to the question of PGx risk stratification entailed the integration of two distinct methodologies: population genetics and evidence-based medicine. This integrated approach allowed us to consider the clinical implications for the observed patterns of PGx variation found within and between population groups. RESULTS: Whole genome genotypes were used to characterize individuals' continental ancestry fractions (European, African, and Native American), and individuals were grouped according to their GA profiles. SIRE and GA groups were found to be highly concordant. Continental ancestry predicts individuals' SIRE with >96% accuracy, and accordingly, GA provides only a marginal increase in resolution for PGx risk stratification. In light of the concordance between SIRE and GA, taken together with the fact that information on SIRE is readily available to clinicians, we evaluated PGx variation between SIRE groups to explore the potential clinical utility of race and ethnicity. PGx variants are highly diverged compared to the genomic background; 82 variants show significant frequency differences among SIRE groups, and genome-wide patterns of PGx variation are almost entirely concordant with SIRE.
The vast majority of PGx variation is found within rather than between groups, a well-established fact for almost all genetic variants, which is often taken to argue against the clinical utility of population stratification. Nevertheless, analysis of highly differentiated PGx variants illustrates how SIRE partitions PGx variation based on groups' characteristic ancestry patterns. These cases underscore the extent to which SIRE carries clinically valuable information for stratifying PGx risk among populations, albeit with less utility for predicting individual-level PGx alleles (genotypes), supporting the concept of population pharmacogenomics. CONCLUSIONS: Perhaps most interestingly, we show that individuals who identify as Black or Hispanic stand to gain far more from the consideration of race/ethnicity in treatment decisions than individuals from the majority White population.
Subjects
Ethnicity/genetics , Genome, Human , Genotype , Risk Assessment , Genetics, Population , Humans , Pharmacogenetics , United States
ABSTRACT
BACKGROUND: Hispanic/Latino (HL) populations bear a disproportionately high burden of type 2 diabetes (T2D). The ability to predict T2D genetic risk using polygenic risk scores (PRS) offers great promise for improved screening and prevention. However, there are a number of complications related to the accurate inference of genetic risk across HL populations with distinct ancestry profiles. We investigated how ancestry affects the inference of T2D genetic risk using PRS in diverse HL populations from Colombia and the United States (US). In Colombia, we compared T2D genetic risk for the Mestizo population of Antioquia to the Afro-Colombian population of Chocó, and in the US, we compared European-American versus Mexican-American populations. METHODS: Whole genome sequences and genotypes from the 1000 Genomes Project and the ChocoGen Research Project were used for genetic ancestry inference and for T2D polygenic risk score (PRS) calculation. Continental ancestry fractions for HL genomes were inferred via comparison with African, European, and Native American reference genomes, and PRS were calculated using T2D risk variants taken from multiple genome-wide association studies (GWAS) conducted on cohorts with diverse ancestries. A correction for ancestry bias in T2D risk inference based on the frequencies of ancestral versus derived alleles was developed and applied to PRS calculations in the HL populations studied here. RESULTS: T2D genetic risk in Colombian and US HL populations is positively correlated with African and Native American ancestry and negatively correlated with European ancestry. The Afro-Colombian population of Chocó has higher predicted T2D risk than Antioquia, and the Mexican-American population has higher predicted risk than the European-American population. The inferred relative risk of T2D is robust to differences in the ancestry of the GWAS cohorts used for variant discovery. 
For trans-ethnic GWAS, population-specific variants and variants with same direction effects across populations yield consistent results. Nevertheless, the control for bias in T2D risk prediction confirms that explicit consideration of genetic ancestry can yield more reliable cross-population genetic risk inferences. CONCLUSIONS: T2D associations that replicate across populations provide for more reliable risk inference, and modeling population-specific frequencies of ancestral and derived risk alleles can help control for biases in PRS estimation.
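A polygenic risk score of the kind discussed above is commonly computed as a sum of risk-allele dosages weighted by GWAS effect sizes. The sketch below assumes that weighted-sum form; the abstract does not state the exact scoring formula, and the ancestral versus derived allele correction it describes is not implemented here. The variant identifiers and effect sizes are hypothetical placeholders.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Weighted sum of risk-allele dosages over variants present in both inputs.

    dosages: dict variant_id -> number of risk alleles carried (0, 1 or 2).
    effect_sizes: dict variant_id -> per-allele effect size (e.g. a log odds ratio).
    """
    shared = dosages.keys() & effect_sizes.keys()
    return sum(dosages[v] * effect_sizes[v] for v in shared)


# Hypothetical effect sizes and one individual's genotype dosages.
EFFECTS = {"variant_A": 0.10, "variant_B": 0.25, "variant_C": 0.05}
individual = {"variant_A": 2, "variant_B": 1, "variant_C": 0}

score = polygenic_risk_score(individual, EFFECTS)
print(score)
```

Because risk-allele frequencies differ among ancestry groups, raw scores like this one are only comparable across populations after some control for ancestry, which is the motivation for the correction described in the abstract.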
Subjects
Diabetes Mellitus, Type 2/genetics , Genetic Predisposition to Disease , Hispanic or Latino/genetics , White People/genetics , Colombia , Diabetes Mellitus, Type 2/epidemiology , Genome-Wide Association Study , Humans , Polymorphism, Single Nucleotide/genetics , Prevalence , Risk Factors , United States
ABSTRACT
Human populations from around the world show striking phenotypic variation across a wide variety of traits. Genome-wide association studies (GWAS) are used to uncover genetic variants that influence the expression of heritable human traits; accordingly, population-specific distributions of GWAS-implicated variants may shed light on the genetic basis of human phenotypic diversity. With this in mind, we developed the GlobAl Distribution of GEnetic Traits web server (GADGET; http://gadget.biosci.gatech.edu). The GADGET web server provides users with a dynamic visual platform for exploring the relationship between worldwide genetic diversity and the genetic architecture underlying numerous human phenotypes. GADGET integrates trait-implicated single nucleotide polymorphisms (SNPs) from GWAS, with population genetic data from the 1000 Genomes Project, to calculate genome-wide polygenic trait scores (PTS) for 818 phenotypes in 2504 individual genomes. Population-specific distributions of PTS are shown for 26 human populations across 5 continental population groups, with traits ordered based on the extent of variation observed among populations. Users of GADGET can also upload custom trait SNP sets to visualize global PTS distributions for their own traits of interest.
Subjects
Multifactorial Inheritance , Software , Genome-Wide Association Study , Humans , Internet , Polymorphism, Single Nucleotide
ABSTRACT
The convergence of hypervirulence and multidrug resistance in Klebsiella pneumoniae is a significant concern. Here, we report the first screen for hypermucoviscosity, a trait associated with increased virulence, using a U.S. surveillance collection of carbapenem-resistant (CR) K. pneumoniae isolates. We identified one hypermucoviscous isolate, which carried a gene encoding the KPC-3 carbapenemase, among numerous resistance genes. The strain further exhibited colistin heteroresistance undetected by diagnostics. This convergence of diverse resistance mechanisms and increased virulence underscores the need for enhanced K. pneumoniae surveillance.
Subjects
Carbapenem-Resistant Enterobacteriaceae/genetics , Carbapenems/pharmacology , Colistin/pharmacology , Carbapenem-Resistant Enterobacteriaceae/drug effects , Drug Resistance, Bacterial/genetics , Genotype , Klebsiella pneumoniae/drug effects , Klebsiella pneumoniae/genetics , Microbial Sensitivity Tests , Virulence
ABSTRACT
Transposable elements (TEs) are an important source of human genetic variation with demonstrable effects on phenotype. Recently, a number of computational methods for the detection of polymorphic TE (polyTE) insertion sites from next-generation sequence data have been developed. The use of such tools will become increasingly important as the pace of human genome sequencing accelerates. For this report, we performed a comparative benchmarking and validation analysis of polyTE detection tools in an effort to inform their selection and use by the TE research community. We analyzed a core set of seven tools with respect to ease of use and accessibility, polyTE detection performance and runtime parameters. An experimentally validated set of 893 human polyTE insertions was used for this purpose, along with a series of simulated data sets that allowed us to assess the impact of sequence coverage on tool performance. The recently developed tool MELT showed the best overall performance followed by Mobster and then RetroSeq. PolyTE detection tools can best detect Alu insertion events in the human genome with reduced reliability for L1 insertions and substantially lowered performance for SVA insertions. We also show evidence that different polyTE detection tools are complementary with respect to their ability to detect a complete set of insertion events. Accordingly, a combined approach, coupled with manual inspection of individual results, may yield the best overall performance. In addition to the benchmarking results, we also provide notes on tool installation and usage as well as suggestions for future polyTE detection algorithm development.
Subjects
Benchmarking/methods , DNA Transposable Elements , Polymorphism, Single Nucleotide , Sequence Analysis, DNA/methods , Software , Algorithms , Genome, Human , Humans
ABSTRACT
Transposable element (TE) derived sequences are known to contribute to the regulation of the human genome. The majority of known TE-derived regulatory sequences correspond to relatively ancient insertions, which are fixed across human populations. The extent to which human genetic variation caused by recent TE activity leads to regulatory polymorphisms among populations has yet to be thoroughly explored. In this study, we searched for associations between polymorphic TE (polyTE) loci and human gene expression levels using an expression quantitative trait loci (eQTL) approach. We compared locus-specific polyTE insertion genotypes to B cell gene expression levels among 445 individuals from 5 human populations. Numerous human polyTE loci correspond to both cis and trans eQTL, and their regulatory effects are directly related to cell type-specific function in the immune system. PolyTE loci are associated with differences in expression between European and African population groups, and a single polyTE locus is indirectly associated with the expression of numerous genes via the regulation of the B cell-specific transcription factor PAX5. The polyTE-gene expression associations we found indicate that human TE genetic variation can have important phenotypic consequences. Our results reveal that TE-eQTL are involved in population-specific gene regulation as well as transcriptional network modification.
Subjects
B-Lymphocytes/metabolism , DNA Transposable Elements/immunology , Gene Regulatory Networks , Genome, Human , Quantitative Trait Loci , B-Lymphocytes/immunology , Black People , Cytokines/genetics , Cytokines/immunology , Gene Expression Regulation , Genetic Loci , Humans , Immunity, Innate , PAX5 Transcription Factor/genetics , PAX5 Transcription Factor/immunology , Polymorphism, Single Nucleotide , Receptors, Antigen, T-Cell/genetics , Receptors, Antigen, T-Cell/immunology , White People
ABSTRACT
BACKGROUND: Modern Latin American populations were formed via genetic admixture among ancestral source populations from Africa, the Americas and Europe. We are interested in studying how combinations of genetic ancestry in admixed Latin American populations may impact genomic determinants of health and disease. For this study, we characterized the impact of ancestry and admixture on genetic variants that underlie health- and disease-related phenotypes in population genomic samples from Colombia, Mexico, Peru, and Puerto Rico. RESULTS: We analyzed a total of 347 admixed Latin American genomes along with 1102 putative ancestral source genomes from Africans, Europeans, and Native Americans. We characterized the genetic ancestry, relatedness, and admixture patterns for each of the admixed Latin American genomes, finding a spectrum of ancestry proportions within and between populations. We then identified single nucleotide polymorphisms (SNPs) with anomalous ancestry-enrichment patterns, i.e. SNPs that exist in any given Latin American population at a higher frequency than expected based on the population's genetic ancestry profile. For this set of ancestry-enriched SNPs, we inspected their phenotypic impact on disease, metabolism, and the immune system. All four of the Latin American populations show ancestry-enrichment for a number of shared pathways, yielding evidence of similar selection pressures on these populations during their evolution. For example, all four populations show ancestry-enriched SNPs in multiple genes from immune system pathways, such as the cytokine receptor interaction, T cell receptor signaling, and antigen presentation pathways. We also found SNPs with excess African or European ancestry that are associated with ancestry-specific gene expression patterns and play crucial roles in the immune system and infectious disease responses. 
Genes from both the innate and adaptive immune system were found to be regulated by ancestry-enriched SNPs with population-specific regulatory effects. CONCLUSIONS: Ancestry-enriched SNPs in Latin American populations have a substantial effect on health- and disease-related phenotypes. The concordant impact observed for the same phenotypes across populations points to a process of adaptive introgression, whereby ancestry-enriched SNPs with specific functional utility appear to have been retained in modern populations by virtue of their effects on health and fitness.
Subjects
Disease/ethnology , Disease/genetics , Genetics, Population , Genome, Human , Genomics/methods , Polymorphism, Single Nucleotide , Black People , Ethnicity/genetics , Health Status , Humans , Latin America , White People
ABSTRACT
Rapid and accurate identification of the sequence type (ST) of bacterial pathogens is critical for epidemiological surveillance and outbreak control. Cheaper and faster next-generation sequencing (NGS) technologies have come to be preferred over the traditional method of amplicon sequencing for multilocus sequence typing (MLST). But data generated by NGS platforms necessitate quality control, genome assembly and sequence similarity searching before an isolate's ST can be determined. These are computationally intensive and time-consuming steps, which are not ideally suited for real-time molecular epidemiology. Here, we present stringMLST, an assembly- and alignment-free, lightweight, platform-independent program capable of rapidly typing bacterial isolates directly from raw sequence reads. The program implements a simple hash table data structure to find exact matches between short sequence strings (k-mers) and an MLST allele library. We show that stringMLST is more accurate, and an order of magnitude faster, than contemporary genome-based ST detection tools. AVAILABILITY AND IMPLEMENTATION: The source code and documentation are available at http://jordan.biology.gatech.edu/page/software/stringMLST. CONTACT: lavanya.rishishwar@gatech.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
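The hash table lookup described above can be sketched in a few lines. This is an illustration of the general idea only, not stringMLST's code: the allele sequences, locus names, profile table and the "ST-7" label are hypothetical toy data, and ties between alleles are resolved by whichever allele was counted first.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield every overlapping substring of length k."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def build_kmer_index(alleles, k):
    """Hash table mapping each k-mer to the (locus, allele_id) pairs containing it."""
    index = defaultdict(set)
    for key, seq in alleles.items():
        for kmer in kmers(seq, k):
            index[kmer].add(key)
    return index


def call_alleles(reads, index, k):
    """Count exact k-mer hits per allele; keep the best-supported allele per locus."""
    hits = Counter()
    for read in reads:
        for kmer in kmers(read, k):
            for key in index.get(kmer, ()):
                hits[key] += 1
    best = {}
    for (locus, allele_id), count in hits.items():
        if locus not in best or count > best[locus][1]:
            best[locus] = (allele_id, count)
    return {locus: allele_id for locus, (allele_id, _) in best.items()}


def sequence_type(calls, loci, profiles):
    """Look up the ST for an allelic profile; None if the profile is unknown."""
    return profiles.get(tuple(calls.get(locus) for locus in loci))


# Hypothetical two-locus scheme; alleles at each locus differ by one base.
ALLELES = {
    ("geneA", 1): "ACGTACGTTAGC",
    ("geneA", 2): "ACGTACCTTAGC",
    ("geneB", 1): "TTGACCGATAGG",
    ("geneB", 2): "TTGACCGTTAGG",
}
PROFILES = {(2, 1): "ST-7"}
K = 5

index = build_kmer_index(ALLELES, K)
reads = ["GTACCTTAG", "GACCGATAG"]  # raw reads, no assembly or alignment
calls = call_alleles(reads, index, K)
print(calls, sequence_type(calls, ["geneA", "geneB"], PROFILES))
```

Because each k-mer lookup is a constant-time dictionary access, the cost grows with the number of k-mers in the reads rather than with the size of an alignment search.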
Subjects
Bacteria/classification , Bacterial Typing Techniques/methods , Multilocus Sequence Typing/methods , Software , Bacteria/genetics , Campylobacter jejuni/classification , Campylobacter jejuni/genetics , Chlamydia trachomatis/classification , Chlamydia trachomatis/genetics , High-Throughput Nucleotide Sequencing/methods , Neisseria meningitidis/classification , Neisseria meningitidis/genetics , Sequence Analysis, DNA/methods , Streptococcus pneumoniae/classification , Streptococcus pneumoniae/genetics
ABSTRACT
The dinitrogenase reductase gene (nifH) is the most widely established molecular marker for the study of nitrogen-fixing prokaryotes in nature. A large number of PCR primer sets have been developed for nifH amplification, and the effective deployment of these approaches should be guided by a rapid, easy-to-use analysis protocol. Bioinformatic analysis of marker gene sequences also requires considerable expertise. In this study, we advance the state of the art for nifH analysis by evaluating nifH primer set performance, developing an improved amplicon sequencing workflow, and implementing a user-friendly bioinformatics pipeline. The developed amplicon sequencing workflow is a three-stage PCR-based approach that uses established technologies for incorporating sample-specific barcode sequences and sequencing adapters. Based on our primer evaluation, we recommend the Ando primer set be used with a modified annealing temperature of 58°C, as this approach captured the largest diversity of nifH templates, including paralog cluster IV/V sequences. To improve nifH sequence analysis, we developed a computational pipeline which infers taxonomy and optionally filters out paralog sequences. In addition, we employed an empirical model to derive optimal operational taxonomic unit (OTU) cutoffs for the nifH gene at the species, genus, and family levels. A comprehensive workflow script named TaxADivA (TAXonomy Assignment and DIVersity Assessment) is provided to ease processing and analysis of nifH amplicons. Our approach is then validated through characterization of diazotroph communities across environmental gradients in beach sands impacted by the Deepwater Horizon oil spill in the Gulf of Mexico, in a peat moss-dominated wetland, and in various plant compartments of a sugarcane field. IMPORTANCE: Nitrogen availability often limits ecosystem productivity, and nitrogen fixation, exclusive to prokaryotes, comprises a major source of nitrogen input that sustains food webs.
The nifH gene, which codes for the iron protein of the nitrogenase enzyme, is the most widely established molecular marker for the study of nitrogen-fixing microorganisms (diazotrophs) in nature. In this study, a flexible sequencing/analysis pipeline, named TaxADivA, was developed for nifH amplicons produced by Illumina paired-end sequencing, and it enables an inference of taxonomy, performs clustering, and produces output in formats that may be used by programs that facilitate data exploration and analysis. Diazotroph diversity and community composition are linked to ecosystem functioning, and our results advance the phylogenetic characterization of diazotroph communities by providing empirically derived nifH similarity cutoffs for species, genus, and family levels. The utility of our pipeline is validated for diazotroph communities in a variety of ecosystems, including contaminated beach sands, peatland ecosystems, living plant tissues, and rhizosphere soil.
Subjects
Bacteria/genetics , Microbiota/genetics , Nitrogen Fixation , Oxidoreductases/genetics , Soil Microbiology , Bacteria/classification , Bacteria/metabolism , Bacterial Physiological Phenomena , Computational Biology , DNA, Bacterial/genetics , Ecosystem , High-Throughput Nucleotide Sequencing/methods , Metagenomics , Microbiota/physiology , Nitrogen/metabolism , Phylogeny , Polymerase Chain Reaction , Rhizosphere , Sequence Analysis, DNA
ABSTRACT
Insulators are regulatory elements that help to organize eukaryotic chromatin via enhancer-blocking and chromatin barrier activity. Although there are several examples of transposable element (TE)-derived insulators, the contribution of TEs to human insulators has not been systematically explored. Mammalian-wide interspersed repeats (MIRs) are a conserved family of TEs that have substantial regulatory capacity and share sequence characteristics with tRNA-related insulators. We sought to evaluate whether MIRs can serve as insulators in the human genome. We applied a bioinformatic screen using genome sequence and functional genomic data from CD4(+) T cells to identify a set of 1,178 predicted MIR insulators genome-wide. These predicted MIR insulators were computationally tested to serve as chromatin barriers and regulators of gene expression in CD4(+) T cells. The activity of predicted MIR insulators was experimentally validated using in vitro and in vivo enhancer-blocking assays. MIR insulators are enriched around genes of the T-cell receptor pathway and reside at T-cell-specific boundaries of repressive and active chromatin. A total of 58% of the MIR insulators predicted here show evidence of T-cell-specific chromatin barrier and gene regulatory activity. MIR insulators appear to be CCCTC-binding factor (CTCF) independent and show a distinct local chromatin environment with marked peaks for RNA Pol III and a number of histone modifications, suggesting that MIR insulators recruit transcriptional complexes and chromatin modifying enzymes in situ to help establish chromatin and regulatory domains in the human genome. The provisioning of insulators by MIRs across the human genome suggests a specific mechanism by which TE sequences can be used to modulate gene regulatory networks.
Subjects
Genome, Human , Insulator Elements/genetics , Mammals/genetics , Retroelements/genetics , Animals , Base Sequence , Chromatin/metabolism , Computational Biology , Enhancer Elements, Genetic/genetics , Gene Expression Regulation , Humans , Organ Specificity/genetics , Reproducibility of Results , T-Lymphocytes/metabolism
ABSTRACT
BACKGROUND: Mitochondrial replacement (MR) therapy is a new assisted reproductive technology that allows women with mitochondrial disorders to give birth to healthy children by combining their nuclei with mitochondria from unaffected egg donors. Evolutionary biologists have raised concerns about the safety of MR therapy based on the extent to which nuclear and mitochondrial genomes are observed to co-evolve within natural populations, i.e. the nuclear-mitochondrial mismatch hypothesis. In support of this hypothesis, a number of previous studies on model organisms have provided evidence for incompatibility between nuclear and mitochondrial genomes from divergent populations of the same species. RESULTS: We tested the nuclear-mitochondrial mismatch hypothesis for humans by observing the extent of naturally occurring nuclear-mitochondrial mismatch seen for 2,504 individuals across 26 populations, from 5 continental population groups, characterized as part of the 1000 Genomes Project (1KGP). We also performed a replication analysis on mitochondrial DNA (mtDNA) haplotypes for 1,043 individuals from 58 populations, characterized as part of the Human Genome Diversity Project (HGDP). Nuclear DNA (nDNA) and mtDNA sequences from the 1KGP were directly compared within and between populations, and the population distributions of mtDNA haplotypes derived from both sequence (1KGP) and genotype (HGDP) data were evaluated. Levels of nDNA and mtDNA pairwise sequence divergence are highly correlated, consistent with their co-evolution among human populations. However, there are numerous cases of co-occurrence of nuclear and mitochondrial genomes from divergent populations within individual humans. Furthermore, pairs of individuals with closely related nuclear genomes can have highly divergent mtDNA haplotypes.
Supposedly mismatched nuclear-mitochondrial genome combinations are found not only within individuals from populations known to be admixed, where they may be expected, but also from populations with low overall levels of observed admixture. CONCLUSIONS: These results show that mitochondrial and nuclear genomes from divergent human populations can co-exist within healthy individuals, indicating that mismatched nDNA-mtDNA combinations are not deleterious or subject to purifying selection. Accordingly, human nuclear-mitochondrial mismatches are not likely to jeopardize the safety of MR therapy.
Subjects
DNA, Mitochondrial/metabolism , DNA/metabolism , Evolution, Molecular , DNA/chemistry , DNA, Mitochondrial/chemistry , DNA, Mitochondrial/classification , Gene Frequency , Genetic Variation , Genome, Human , Genotype , Haplotypes , Human Genome Project , Humans , Mitochondrial Diseases/genetics , Mitochondrial Diseases/therapy , Mitochondrial Replacement Therapy , Phylogeny
ABSTRACT
The DNA in cells is continuously exposed to reactive oxygen species, resulting in toxic and mutagenic DNA damage. Although the repair of oxidative DNA damage occurs primarily through the base excision repair (BER) pathway, the nucleotide excision repair (NER) pathway processes some of the same lesions. In addition, damage tolerance mechanisms, such as recombination and translesion synthesis, enable cells to tolerate oxidative DNA damage, especially when BER and NER capacities are exceeded. Thus, disruption of BER alone or disruption of BER and NER in Saccharomyces cerevisiae leads to increased mutations as well as large-scale genomic rearrangements. Previous studies demonstrated that a particular region of chromosome II is susceptible to chronic oxidative stress-induced chromosomal rearrangements, suggesting the existence of DNA damage and/or DNA repair hotspots. Here we investigated the relationship between oxidative damage and genomic instability utilizing chromatin immunoprecipitation combined with DNA microarray technology to profile DNA repair sites along yeast chromosomes under different oxidative stress conditions. We targeted the major yeast AP endonuclease Apn1 as a representative BER protein. Our results indicate that Apn1 target sequences are enriched for cytosine and guanine nucleotides. We predict that BER protects these sites in the genome because guanines and cytosines are thought to be especially susceptible to oxidative attack, thereby preventing large-scale genome destabilization from chronic accumulation of DNA damage. Information from our studies should provide insight into how regional deployment of oxidative DNA damage management systems along chromosomes protects against large-scale rearrangements. Copyright © 2017 John Wiley & Sons, Ltd.
Subjects
Chromosome Mapping, DNA Repair Enzymes/metabolism, Endodeoxyribonucleases/metabolism, Oxidative Stress, Saccharomyces cerevisiae Proteins/metabolism, Saccharomyces cerevisiae/genetics, Binding Sites/genetics, DNA Damage, DNA Repair, DNA Repair Enzymes/chemistry, Endodeoxyribonucleases/chemistry, Genomic Instability, Reactive Oxygen Species/metabolism, Saccharomyces cerevisiae/metabolism, Saccharomyces cerevisiae Proteins/chemistry
ABSTRACT
This study assesses racial and ethnic differences in overall burden of firearm-related mortality and in change in firearm-related mortality among youths from 1999 to 2020.
Subjects
Firearms, Gunshot Wounds, Adolescent, Child, Humans, Ethnicity/statistics & numerical data, Firearms/statistics & numerical data, Homicide/ethnology, Homicide/statistics & numerical data, Suicide/ethnology, Suicide/statistics & numerical data, United States/epidemiology, Gunshot Wounds/epidemiology, Gunshot Wounds/ethnology, Gunshot Wounds/mortality, Racial Groups/statistics & numerical data
ABSTRACT
Haemophilus haemolyticus has recently been discovered to have the potential to cause invasive disease. It is closely related to nontypeable Haemophilus influenzae (NT H. influenzae). NT H. influenzae and H. haemolyticus are often misidentified because none of the existing tests targeting the known phenotypes of H. haemolyticus can specifically identify H. haemolyticus. Through comparative genomic analysis of H. haemolyticus and NT H. influenzae, we identified genes unique to H. haemolyticus that can be used as targets for the identification of H. haemolyticus. A real-time PCR targeting purT (encoding phosphoribosylglycinamide formyltransferase 2 in the purine synthesis pathway) was developed and evaluated. The lower limit of detection was 40 genomes/PCR; the sensitivity and specificity in detecting H. haemolyticus were 98.9% and 97%, respectively. To improve the discrimination of H. haemolyticus and NT H. influenzae, a testing scheme combining two targets (H. haemolyticus purT and H. influenzae hpd, encoding protein D lipoprotein) was also evaluated and showed 96.7% sensitivity and 98.2% specificity for the identification of H. haemolyticus and 92.8% sensitivity and 100% specificity for the identification of H. influenzae. The dual-target testing scheme can be used for the diagnosis and surveillance of infection and disease caused by H. haemolyticus and NT H. influenzae.
Subjects
Comparative Genomic Hybridization/methods, Haemophilus Infections/diagnosis, Haemophilus influenzae/classification, Haemophilus influenzae/genetics, Lipoproteins/genetics, Phosphoribosylglycinamide Formyltransferase/genetics, Base Sequence, Bacterial DNA/genetics, Bacterial Genome/genetics, Haemophilus Infections/microbiology, Haemophilus influenzae/isolation & purification, Humans, Limit of Detection, Real-Time Polymerase Chain Reaction/methods, Sensitivity and Specificity, DNA Sequence Analysis
ABSTRACT
Four Vibrio spp. isolates from the historical culture collection at the Centers for Disease Control and Prevention, obtained from human blood specimens (n=3) and river water (n=1), show characteristics distinct from those of isolates of the most closely related species, Vibrio navarrensis and Vibrio vulnificus, based on phenotypic and genotypic tests. They are specifically adapted to survival in both freshwater and seawater, being able to grow in rich media without added salts as well as at salinities above that of seawater. Phenotypically, these isolates resemble V. navarrensis, their closest known relative with a validly published name, but the group of isolates is distinguished from V. navarrensis by the ability to utilize L-rhamnose. Average nucleotide identity and percent DNA-DNA hybridization values obtained from the pairwise comparisons of whole-genome sequences of these isolates to V. navarrensis range from 95.4-95.8 % and 61.9-64.3 %, respectively, suggesting that the group represents a different species. Phylogenetic analysis of the core genome, including four protein-coding housekeeping genes (pyrH, recA, rpoA and rpoB), places these four isolates into their own monophyletic clade, distinct from V. navarrensis and V. vulnificus. Based on these differences, we conclude that these isolates represent a novel species of the genus Vibrio, for which the name Vibrio cidicii sp. nov. is proposed; strain LMG 29267T (=CIP 111013T=2756-81T), isolated from river water, is the type strain.
Subjects
Phylogeny, Rivers/microbiology, Vibrio/classification, Bacterial Typing Techniques, Base Composition, Bacterial DNA/genetics, Bacterial Genes, Nucleic Acid Hybridization, 16S Ribosomal RNA/genetics, DNA Sequence Analysis, Vibrio/genetics, Vibrio/isolation & purification
ABSTRACT
Vancomycin is the mainstay of treatment for patients with Staphylococcus aureus infections, and reduced susceptibility to vancomycin is becoming increasingly common. Accordingly, the development of rapid and accurate assays for the diagnosis of vancomycin-intermediate S. aureus (VISA) will be critical. We developed and applied a genome-based machine-learning approach for discrimination between VISA and vancomycin-susceptible S. aureus (VSSA) using 25 whole-genome sequences. The resulting machine-learning model, based on 14 gene parameters, including 3 molecular typing markers and 11 genes implicated in reduced vancomycin susceptibility, is able to unambiguously distinguish between the VISA and VSSA isolates analyzed here despite the fact that they do not form evolutionarily distinct groups. As such, the model is able to discriminate based on specific genomic markers of antibiotic susceptibility rather than overall sequence relatedness. Subsequent evaluation of the model using leave-one-out validation yielded a classification accuracy of 84%. The machine-learning approach described here provides a generalized framework for the application of genome sequence analysis to the classification of bacteria that differ with respect to clinically relevant phenotypes and should be particularly useful in defining the genomic features that underlie antibiotic resistance.
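The leave-one-out validation mentioned above can be illustrated with a minimal sketch. The abstract does not specify the classifier or the encoding of the 14 gene parameters, so everything here is an assumption made for illustration: the features are hypothetical binary marker vectors, the isolates are invented toy data, and a 1-nearest-neighbour rule under Hamming distance stands in for the actual model.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[int], str]  # (marker vector, phenotype label)


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of marker positions at which two isolates differ."""
    return sum(x != y for x, y in zip(a, b))


def predict_1nn(train: List[Sample], query: Sequence[int]) -> str:
    """Label of the closest training isolate (first one wins on ties)."""
    return min(train, key=lambda item: hamming(item[0], query))[1]


def leave_one_out_accuracy(samples: List[Sample]) -> float:
    """Hold out each isolate once, train on the rest, and score the prediction."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]  # every isolate except the i-th
        if predict_1nn(train, features) == label:
            correct += 1
    return correct / len(samples)


# Hypothetical toy data: four made-up binary markers per isolate.
# These are NOT the genes or isolates analyzed in the study.
TOY_ISOLATES: List[Sample] = [
    ((1, 1, 0, 1), "VISA"),
    ((1, 1, 1, 1), "VISA"),
    ((1, 0, 0, 1), "VISA"),
    ((0, 0, 1, 0), "VSSA"),
    ((0, 0, 0, 0), "VSSA"),
    ((0, 1, 1, 0), "VSSA"),
]

if __name__ == "__main__":
    print(f"LOO accuracy on toy data: {leave_one_out_accuracy(TOY_ISOLATES):.2f}")
```

Because each isolate is scored by a model that never saw it, the leave-one-out estimate can fall below the accuracy observed on the full training set, which is consistent with a model that separates all analyzed isolates yet scores 84% under validation.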