ABSTRACT
Understanding the biochemistry and metabolic pathways of cyanide degradation is necessary to improve the efficacy of cyanide bioremediation processes and to meet industrial requirements. We isolated a cyanide-degrading Bacillus strain from water in contact with mine tailings from Lima, Peru, and sequenced its genome. This strain was classified as Bacillus safensis based on 16S rRNA gene sequencing and core genome analyses and named B. safensis PER-URP-08. We searched for possible cyanide-degradation enzymes in the genome of this strain and identified a putative cyanide dihydratase (CynD) gene similar to a previously characterized CynD from Bacillus pumilus C1. Sequence analysis of CynD from B. safensis and B. pumilus allowed us to identify C-terminal residues that differentiate the two CynDs. We then cloned the gene, expressed it in Escherichia coli, and purified recombinant CynD from B. safensis PER-URP-08 (CynDPER-URP-08), and showed that, in contrast to CynD from B. pumilus C1, this recombinant CynD remains active at up to pH 9. We also showed that oligomerization of CynDPER-URP-08 decreases as pH increases. Finally, we demonstrated that transcripts of CynDPER-URP-08 in B. safensis PER-URP-08 are strongly induced in the presence of cyanide. Our results suggest that the use of B. safensis PER-URP-08 and CynDPER-URP-08 as potential tools for cyanide bioremediation warrants further investigation.
IMPORTANCE Despite being of environmental concern around the world due to its toxicity, cyanide continues to be used in many important industrial processes. Thus, searching for cyanide bioremediation methods is a matter of societal concern and must be present on the political agenda of all governments. Here, we report the isolation, genome sequencing, and characterization of the cyanide degradation capacity of a bacterial strain isolated from an industrial mining site in Peru. We characterize a cyanide dihydratase (CynD) homolog from this strain, Bacillus safensis PER-URP-08.
Subject(s)
Bacillus, Escherichia coli Proteins, Bacteria/genetics, Cell Cycle Proteins/metabolism, Cyanides/metabolism, Escherichia coli/genetics, Escherichia coli Proteins/metabolism, Genomics, Hydrolases, Peru, 16S Ribosomal RNA/genetics, 16S Ribosomal RNA/metabolism
ABSTRACT
Hepatitis B virus (HBV) spreads efficiently among all human populations worldwide. HBV is classified into ten genotypes (A to J) with distinct geographic distributions and clinical features. In Mexico, HBV genotype H is the leading cause of hepatitis B and has been detected in indigenous populations, suggesting that HBV genotype H may be native to Mexico. However, little is known about the evolutionary history of HBV genotype H. Thus, we aimed to determine the age of HBV genotype H in Mexico using molecular dating techniques. Ninety-two HBV sequences of the reverse transcriptase (RT) domain of the polymerase gene (~1,251 bp) were analyzed; 48 were genotype H, 43 were genotype F, and the oldest HBV sequence from America was included as the root. All sequences were aligned, and the time to the most recent common ancestor (TMRCA) was calculated using the Bayesian Skyline Evolutionary Analysis. Our results estimate a TMRCA for genotype H in Mexico of 2070.9 (667.5-4489.2) years before the present (YBP). We identified four major diversification events in genotype H, named H1, H2, H3, and H4. The TMRCA of H1 was 1213.0 (253.3-2638.3) YBP, that of H2 was 1175.5 (557.5-2424.2) YBP, that of H3 was 949.6 (279.3-2105.0) YBP, and that of H4 was 1230.5 (336.3-2756.7) YBP. We estimated that genotype H diverged from its sister genotype F around 8140.8 (1867.5-18012.8) YBP. In conclusion, this study found that genotype H in Mexico has an estimated age of 2070.9 (667.5-4489.2) YBP and has experienced at least four major diversification events since then.
ABSTRACT
At over 0.6% of the population, Peru has one of the highest SARS-CoV-2 mortality rates in the world. Considerable genome sequencing effort has been made in this country since mid-2020. However, an adequate analysis of the dynamics of the variants of concern and interest (VOCIs) is missing. We investigated the dynamics of the COVID-19 pandemic in Peru with a focus on the second wave, which had the greatest case fatality rate. The second wave in Peru was dominated by Lambda and Gamma. Analysis of the origin of Lambda shows that it most likely emerged in Peru before the second wave (June-November, 2020). After its emergence, it spread from Peru to Argentina and Chile, where it was locally transmitted. During the second wave in Peru, we identify the coexistence of two Lambda and three Gamma sublineages. The Lambda sublineages emerged in the center of Peru, whereas the Gamma sublineages more likely originated in the north-east and mid-east. Importantly, we observe that the center of Peru played a prominent role in transmitting SARS-CoV-2 to other regions within Peru.
Subject(s)
COVID-19, SARS-CoV-2, Humans, SARS-CoV-2/genetics, COVID-19/epidemiology, Pandemics, Peru/epidemiology, Argentina
ABSTRACT
The COVID-19 pandemic has highlighted the importance of understanding the biology of SARS-CoV-2. More than two years after the first report of COVID-19, it remains crucial to continue studying how SARS-CoV-2 proteins interact with the host metabolism to cause COVID-19. In this review, we summarize the findings regarding the functions of the 16 non-structural, 6 accessory, and 4 structural SARS-CoV-2 proteins. We place less emphasis on the spike protein, which has been the subject of several recent reviews. Comprehensive reviews of COVID-19 therapeutics have also been published. Therefore, we do not delve into details on these topics; instead, we direct readers to those other reviews. To avoid confusion with what is known about proteins from other coronaviruses, we exclusively report findings that have been experimentally confirmed in SARS-CoV-2. We have identified host mechanisms that appear to be the primary targets of SARS-CoV-2 proteins, including gene expression and immune response pathways such as ribosomal translation and the JAK/STAT, RIG-I/MDA5, and NF-κB pathways. Additionally, we emphasize the multiple functions exhibited by SARS-CoV-2 proteins, along with the limited information available for some of these proteins. Our aim with this review is to assist researchers and contribute to the ongoing effort to understand the pathogenesis of SARS-CoV-2.
Subject(s)
COVID-19, SARS-CoV-2, Humans, SARS-CoV-2/metabolism, Pandemics, Coronavirus Spike Glycoprotein/genetics
ABSTRACT
The recent outbreak of yellow fever (YF) in São Paulo during 2016-2019 has been one of the most severe in recent decades, spreading to areas with low vaccine coverage. The aim of this study was to assess the genetic diversity of the yellow fever virus (YFV) from the 2016-2019 São Paulo outbreak, integrating the available genomic data with new genomes from patients from the Hospital das Clínicas da Faculdade de Medicina da Universidade de São Paulo (HCFMUSP). Using phylodynamics, we proposed the existence of new IE subclades, described their sequence signatures, and determined their locations and time of origin. Plasma or urine samples from acute severe YF cases (n = 56) that tested positive for YFV by polymerase chain reaction (PCR) were subjected to viral genome amplification using 12 sets of primers. Thirty-nine amplified genomes were subsequently sequenced using next-generation sequencing (NGS). These 39 sequences, together with all the publicly available complete genomes, were aligned and used to determine nucleotide and amino acid substitutions and to perform phylogenetic and phylodynamic analyses. All YFV genomes generated in this study belonged to the genotype South American I subgroup E. Twenty-one non-synonymous substitutions were identified among the newly generated genomes. We analyzed the two major clades of genotype IE, IE1 and IE2, and proposed the existence of subclades based on their sequence signatures. We also described the location and time of origin of these subclades. Overall, our findings provide an overview of the genomic characterization and phylodynamics of YFV in the 2016-2019 outbreak, contributing to future virological and epidemiological studies.
ABSTRACT
Coronavirus disease 2019 (COVID-19) is a contagious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). This disease has spread globally, causing more than 161.5 million cases and 3.3 million deaths to date. Surveillance and monitoring of new mutations in the viral genome are crucial to our understanding of the adaptation of SARS-CoV-2. Moreover, how the temporal dynamics of these mutations are influenced by control measures and non-pharmaceutical interventions (NPIs) is poorly understood. Using 1,058,020 SARS-CoV-2 genomes from sequenced COVID-19 cases from 98 countries (totaling 714 country-month combinations), we perform a normalization by COVID-19 cases to calculate the relative frequency of SARS-CoV-2 mutations and explore their dynamics over time. We found 115 mutations estimated to be present in more than 3% of global COVID-19 cases and determined three types of mutation dynamics: high-frequency, medium-frequency, and low-frequency. Classification of mutations based on temporal dynamics enables us to examine viral adaptation and evaluate the effects of implemented control measures on virus evolution during the pandemic. We showed that medium-frequency mutations are characterized by high prevalence in specific regions and/or by constant competition with other mutations in several regions. Finally, taking the N501Y mutation as representative of high-frequency mutations, we showed that the level of control measure stringency negatively correlates with the effective reproduction number of SARS-CoV-2 both with and without high-frequency mutations, and that both follow similar trends at different levels of stringency.
Subject(s)
COVID-19/epidemiology, Communicable Disease Control/standards, Pandemics/prevention & control, SARS-CoV-2/genetics, COVID-19/prevention & control, COVID-19/transmission, COVID-19/virology, Viral Genome, Global Burden of Disease, Humans, Mutation Rate, Prevalence, SARS-CoV-2/pathogenicity
ABSTRACT
Since the identification of SARS-CoV-2, a large number of genomes have been sequenced with unprecedented speed around the world. This marks a unique opportunity to analyze virus spread and evolution in a worldwide context. Currently, there is no useful haplotype description to help track important and globally scattered mutations. Also, differences in the number of sequenced genomes between countries and/or months make it difficult to identify the emergence of haplotypes in regions where few genomes are sequenced but a large number of cases are reported. We propose an approach based on the normalization by COVID-19 cases of relative frequencies of mutations, using all the available data, to identify major haplotypes. Furthermore, we can use a similar normalization approach to track the temporal and geographic distribution of haplotypes in the world. Using 171,461 genomes, we identify five major haplotypes or operational taxonomic units (OTUs) based on nine high-frequency mutations. OTU_3, characterized by mutations R203K and G204R, is currently the most frequent haplotype circulating in four of the six continents analyzed (South America, North America, Europe, Asia, Africa, and Oceania). On the other hand, during almost all months analyzed, OTU_5, characterized by the mutation T85I in nsp2, is the most frequent in North America. Recently (since September), OTU_2 has been established as the most frequent in Europe. OTU_1, the ancestral haplotype, is near extinction, as shown by its low number of isolates since May. We also analyzed whether age, gender, or patient status is related to a specific OTU. We did not find a preference of any OTU for any age group, gender, or patient status. Finally, we discuss structural and functional hypotheses for the most frequently identified mutations; none of these mutations shows a clear effect on transmissibility or pathogenicity.
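The normalization by COVID-19 cases mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the weighting below is an assumption: the relative frequency of a mutation among the genomes sequenced in each country-month is weighted by the number of cases reported for that country-month. The function name, the input layout, and the example numbers are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def case_normalized_frequency(genomes, cases):
    """Estimate mutation frequencies weighted by reported cases.

    ASSUMPTION: this weighting scheme is illustrative only; it is not
    the authors' published procedure.

    genomes: iterable of (country, month, set_of_mutations) tuples.
    cases:   dict mapping (country, month) -> reported case count.
    """
    sequenced = defaultdict(int)                       # genomes per stratum
    carrying = defaultdict(lambda: defaultdict(int))   # mutation -> stratum -> count
    for country, month, mutations in genomes:
        key = (country, month)
        sequenced[key] += 1
        for mutation in mutations:
            carrying[mutation][key] += 1

    # Keep only strata that have both sequenced genomes and reported cases.
    usable = [key for key in sequenced if cases.get(key, 0) > 0]
    total_cases = sum(cases[key] for key in usable)
    if total_cases == 0:
        return {}

    result = {}
    for mutation, per_stratum in carrying.items():
        # Relative frequency in each stratum, scaled by that stratum's cases.
        estimated = sum(
            per_stratum.get(key, 0) / sequenced[key] * cases[key]
            for key in usable
        )
        result[mutation] = estimated / total_cases
    return result


# Hypothetical data: country B sequences few genomes but reports many cases.
example_genomes = (
    [("A", "2020-04", {"X"})] * 5
    + [("A", "2020-04", set())] * 5
    + [("B", "2020-04", {"X"})] * 2
)
example_cases = {("A", "2020-04"): 100, ("B", "2020-04"): 900}
frequencies = case_normalized_frequency(example_genomes, example_cases)
```

In this toy input, mutation X is carried by 7 of 12 genomes (about 58%), but the case-weighted estimate is (0.5 × 100 + 1.0 × 900) / 1000 = 0.95, because the sparsely sequenced country accounts for most of the reported cases.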
ABSTRACT
In the first chapter, studies on substrate recognition and enzymatic activity of GGDEF domains are presented. Many proteins containing GGDEF domains are diguanylate cyclases (DGCs, EC 2.7.7.65), enzymes that catalyze the conversion of 2 GTP molecules into the second messenger c-di-GMP in prokaryotes. This molecule is primarily implicated in the transition between motile and sessile lifestyles, as well as several other phenotypes. The redundancy and diversity of GGDEF domain sequences in many bacterial genomes raise the possibility that other enzymatic functions may yet be discovered. To test this hypothesis, i) the effect of point mutations on the structure and enzymatic activity of GGDEF domains is analyzed, ii) the enzymatic specificity of wild-type GGDEF domains from different proteins is also tested, and iii) when non-canonical products are detected, enzymatic models are studied to understand their preferential production. The principal results obtained from these studies are as follows. Seven mutants of the DGC PleD (a GGDEF-containing protein from Caulobacter crescentus) were constructed and the crystallographic structures of two of them were solved, showing that they are unlikely to bind the guanine moiety in their active sites. Additionally, five mutants of XAC0610, another DGC, from Xanthomonas citri, were constructed and their substrate specificities were evaluated. None of those mutants was able to use ATP as a substrate. Finally, seven different GGDEF domain-containing DGCs from different sources were expressed and purified and their enzymatic specificities were tested with several nucleotide triphosphates. One enzyme, GSU1658 from Geobacter sulfurreducens, was particularly promiscuous and was shown to produce c-di-GMP, c-di-AMP, c-di-IMP, c-di-2´dGMP, cGAMP, c-GIMP, and c-AIMP. Interestingly, XAC0610 was able to recognize 2´dGTP as a substrate.
Analysis of the enzyme kinetics of XAC0610 in the presence of 2´dGTP and/or GTP showed the preferential formation of the hybrid linear product pppGp2´dG. The second chapter presents studies on cyanide metabolism in Bacillus, with a focus on the cyanide dihydratase of Bacillus safensis. Cyanide is widely used in industry due to its high affinity for metals. This same property confers potent toxicity on the compound. Thus, industries must reduce the cyanide concentration of wastewater before its final disposal. Physical, chemical, and biological methods have been developed to achieve this goal, but knowledge about the metabolic pathways and the biology of the enzymes involved in cyanide degradation is still scarce. Here, the isolation of a Bacillus safensis strain from mine tailings in Peru is described. This strain was classified through a comparative analysis of 132 core genomes of strains from the Bacillus pumilus group. Sequence analysis determined that a cyanide dihydratase (CynD, EC 3.5.5.1) encoded in the genome of the isolated strain was likely the enzyme responsible for cyanide degradation. The cyanide-degrading activity of CynD from this strain was confirmed by cloning, expression, and purification of the enzyme and its enzymatic characterization. CynD from this strain was active up to pH 9, and oligomerization patterns analyzed by SEC-MALS and electron microscopy showed that the enzyme forms large helical structures at pH 8 and smaller structures at higher pH values. Finally, we show that CynD expression is strongly induced in the presence of cyanide. The last two years of graduate studies were carried out in the context of the COVID-19 pandemic. Thanks to the large amount of publicly available genomic data, we were able to carry out studies on the worldwide dynamics of the spread of SARS-CoV-2 mutant forms.
In the first year of the pandemic, genomic classification of 171,461 genomes showed the presence of five major haplotypes based on nine mutations. The worldwide distribution and the temporal evolution of the frequency of these haplotypes were carefully analyzed. All the haplotypes were identified in the six regions analyzed (South America, North America, Europe, Asia, Africa, and Oceania); however, the frequency of each of them was different in each of these regions. As of September 30, 2020, haplotype 3 (or operational taxonomic unit 3, OTU_3) was the most prevalent in four regions (South America, Asia, Africa, and Oceania). OTU_5 was the most prevalent in North America and OTU_2 in Europe. The temporal dynamics of the haplotypes showed that OTU_1 became nearly extinct after 8 months of the pandemic (November 2020). Other OTUs are still present at different frequencies around the world and continue to generate new variants. Based on their temporal dynamics, a classification scheme of 115 SARS-CoV-2 mutations identified from 1,058,020 SARS-CoV-2 genomes was also developed. Three types of temporal dynamics of mutations were identified: i) high-frequency mutations are characterized by a rapid increase in frequency upon their appearance, while ii) medium-frequency and iii) low-frequency mutations maintain intermediate or low frequencies for several months and can be region-specific. Finally, we performed a correlation analysis of the effective reproduction number (Rt) of SARS-CoV-2 harboring the high-frequency mutation N501Y with the level of control measures adopted in specific jurisdictions. We show that Rt is negatively correlated with the level of control measures in eight of the nine countries analyzed. This negative correlation was similar when we analyzed the Rt of SARS-CoV-2 not harboring N501Y. Thus, the control measures likely diminish the Rt of both wild-type and N501Y SARS-CoV-2.
The present work is divided into three chapters covering different lines of research pursued by the author during the doctoral period. In the first chapter, studies on the structural recognition of substrates and the enzymatic analysis of GGDEF domains with diguanylate cyclase activity (EC 2.7.7.65) are presented. Proteins containing GGDEF domains are associated with the enzymatic production of the second messenger c-di-GMP from two GTP molecules in prokaryotes. This molecule is primarily involved in the transition between motile and sessile lifestyles, as well as several other phenotypes. The redundancy and diversity of GGDEF domain sequences raise the possibility that other enzymatic functions may yet be discovered. To test this hypothesis, i) the effect of point mutations on the structure and enzymatic activity of GGDEF domains is analyzed, ii) the enzymatic specificity of GGDEF domains from different enzymes is also tested, and iii) when non-canonical products are detected, enzymatic models are studied to understand their preferential production. As the most important results, seven mutants of PleD (a GGDEF-containing protein) were constructed and the crystallographic structures of two of them were solved, showing that they are unlikely to bind the guanine moiety in their active sites. In addition, five mutants of the protein XAC0610 from Xanthomonas citri were constructed and their ability to use ATP or GTP as a substrate was evaluated. None of these mutants was able to use ATP as a substrate. Finally, seven other GGDEF-containing proteins were purified and their enzymatic specificity was evaluated with several nucleotide triphosphates. A promiscuous enzyme named GSU1658 was shown to produce c-di-GMP, c-di-AMP, c-di-IMP, c-di-2´dGMP, c-GAMP, cGIMP, and c-AIMP. Interestingly, XAC0610 was able to recognize 2´dGTP as a substrate.
Analysis of the enzyme kinetics of XAC0610 in the presence of 2´dGTP and GTP showed the preferential formation of the hybrid linear product pppGp2´dG. The second chapter addresses studies on cyanide metabolism in Bacillus, with a focus on the cyanide dihydratase of Bacillus safensis. Cyanide is widely used in industry because of its high affinity for metals. This same property confers potent toxicity on the compound. Thus, industries have to reduce the cyanide concentration of wastewater before its final disposal. Physical, chemical, and biological methods have been developed to achieve this goal, but knowledge of the metabolic pathways and the biology of the enzymes involved in cyanide degradation is still scarce. Here, the isolation of a Bacillus safensis strain from mine tailings in Peru is described. This strain was classified through a comparative analysis of 132 core genomes of strains from the Bacillus pumilus group. We then determined that a cyanide dihydratase (CynD, EC 3.5.5.1) encoded in the genome of the isolated strain was likely the enzyme responsible for cyanide degradation. The cyanide-degrading activity of CynD from this strain was confirmed by cloning, expressing, and purifying the enzyme and characterizing it enzymatically. CynD from this strain is active up to pH 9, and oligomerization patterns analyzed by SEC-MALS showed that the enzyme forms long helical structures at pH 8 and smaller structures as the pH increases. Finally, it was shown that CynD expression is strongly induced in the presence of cyanide. The last two years of the doctorate were carried out in the context of the COVID-19 pandemic. Several laboratories devoted themselves to generating knowledge to help fight the pandemic. In this situation, and thanks to the large amount of publicly available genomic data, studies on the dynamics of SARS-CoV-2 mutations were carried out.
In the first year of the pandemic, genomic classification of 171,461 genomes showed the presence of five major haplotypes based on nine mutations. The worldwide distribution and the change in frequency of these haplotypes were carefully analyzed. All the haplotypes were identified in the six regions analyzed (South America, North America, Europe, Asia, Africa, and Oceania); however, the frequency of each of them was different in each of these regions. As of September 30, 2020, haplotype 3 (or operational taxonomic unit 3, OTU_3) was the most prevalent in four regions (South America, Asia, Africa, and Oceania). OTU_5 was the most prevalent in North America and OTU_2 in Europe. The temporal dynamics of the haplotypes showed that OTU_1 appeared close to extinction after 8 months of the pandemic (November 2020). Other OTUs are still present at different frequencies around the world and continue to generate new variants. Based on their temporal dynamics, a classification scheme of 115 SARS-CoV-2 mutations identified from 1,058,020 SARS-CoV-2 genomes was also developed. Three types of temporal dynamics of mutations were identified: i) high-frequency mutations, ii) medium-frequency mutations, and iii) low-frequency mutations. Finally, the correlation between the effective reproduction number (Rt) of SARS-CoV-2 carrying the high-frequency mutation N501Y and the level of control measures was analyzed, showing that its Rt is negatively correlated with the level of control measures in eight of the nine countries analyzed. This negative correlation was similar when the Rt of SARS-CoV-2 without the N501Y mutation was analyzed. Thus, control measures will likely decrease the Rt of both wild-type and N501Y SARS-CoV-2.