Results 1 - 20 of 21
1.
Ann Clin Microbiol Antimicrob ; 23(1): 40, 2024 May 03.
Article in English | MEDLINE | ID: mdl-38702782

ABSTRACT

BACKGROUND: Pretomanid is a key component of new regimens for the treatment of drug-resistant tuberculosis (TB) which are being rolled out globally. However, there is limited information on the prevalence of pre-existing resistance to the drug. METHODS: To investigate pretomanid resistance rates in China and its underlying genetic basis, as well as to generate additional minimum inhibitory concentration (MIC) data for epidemiological cutoff (ECOFF)/breakpoint setting, we performed MIC determinations in the Mycobacterial Growth Indicator Tube™ (MGIT) system, followed by WGS analysis, on 475 Mycobacterium tuberculosis (MTB) isolated from Chinese TB patients between 2013 and 2020. RESULTS: We observed a pretomanid MIC distribution with a 99% ECOFF equal to 0.5 mg/L. Of the 15 isolates with MIC values > 0.5 mg/L, one (MIC = 1 mg/L) was identified as MTB lineage 1 (L1), a genotype previously reported to be intrinsically less susceptible to pretomanid, two were borderline resistant (MIC = 2-4 mg/L) and the remaining 12 isolates were highly resistant (MIC ≥ 16 mg/L) to the drug. Five resistant isolates did not harbor mutations in the known pretomanid resistant genes. CONCLUSIONS: Our results further support a breakpoint of 0.5 mg/L for a non-L1 MTB population, which is characteristic of China. Further, our data point to an unexpected high (14/475, 3%) pre-existing pretomanid resistance rate in the country, as well as to the existence of yet-to-be-discovered pretomanid resistance genes.


Subject(s)
Antitubercular Agents , Microbial Sensitivity Tests , Mycobacterium tuberculosis , Tuberculosis, Multidrug-Resistant , Mycobacterium tuberculosis/drug effects , Mycobacterium tuberculosis/genetics , Mycobacterium tuberculosis/isolation & purification , China/epidemiology , Humans , Antitubercular Agents/pharmacology , Tuberculosis, Multidrug-Resistant/microbiology , Tuberculosis, Multidrug-Resistant/epidemiology , Prevalence , Nitroimidazoles/pharmacology , Genotype , Mutation , Whole Genome Sequencing
2.
Microorganisms ; 12(4)2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38674714

ABSTRACT

Mycobacterial membrane proteins play a pivotal role in the bacterial invasion of host cells; however, the precise mechanisms underlying certain membrane proteins remain elusive. Mycolicibacterium smegmatis (Ms) msmeg5257 is a hemolysin III family protein that is homologous to Mycobacterium tuberculosis (Mtb) Rv1085c, but it has an unclear function in growth. To address this issue, we utilized the CRISPR/Cas9 gene editor to construct Δmsmeg5257 strains and combined RNA transcription and LC-MS/MS protein profiling to determine the functional role of msmeg5257 in Ms growth. The correlative analysis showed that the deletion of msmeg5257 inhibits ABC transporters in the cytomembrane and inhibits the biosynthesis of amino acids in the cell wall. Corresponding to these results, we confirmed that MSMEG5257 localizes in the cytomembrane via subcellular fractionation and also plays a role in facilitating the transport of iron ions in environments with low iron levels. Our data provide insights that msmeg5257 plays a role in maintaining Ms metabolic homeostasis, and the deletion of msmeg5257 significantly impacts the growth rate of Ms. Furthermore, msmeg5257, a promising drug target, offers a direction for the development of novel therapeutic strategies against mycobacterial diseases.

3.
Sci Total Environ ; 917: 170302, 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38272089

ABSTRACT

BACKGROUND: Rift Valley fever (RVF) is listed by the WHO as a priority disease. This study aims to describe the global distribution of RVF virus and to gain insight into how its evolution, prevalence, and outbreaks changed as it crossed geographical barriers. METHODS: A systematic literature review and meta-analysis were conducted to estimate RVF prevalence by host using a random-effects model. Molecular clock-based phylogenetic analyses were performed to estimate RVF virus nucleotide substitution rates using nucleotide sequences from the NCBI database. RVF virus prevalence, nucleotide substitution rates, and outbreaks were compared before and after each of the two occasions on which the virus crossed geographical barriers. RESULTS: RVF virus was reported from 26 kinds of hosts across 48 countries from 1930 to 2022. After RVF crossed geographical barriers, (1) nucleotide substitution rates increased significantly after the virus first spread out of Africa in 2000; (2) prevalence in humans increased significantly from 1.92% (95% CI: 0.86-3.25%) to 3.03% (95% CI: 2.09-4.12%) after it crossed the Sahara Desert barrier in 1977, and to 5.24% (95% CI: 3.81-6.82%) after 2000; (3) RVF outbreaks in humans and the number of wildlife hosts showed increasing trends. RVF virus spillover may exist between bats and humans, and accelerate viral substitution rates in humans. During outbreaks, RVF virus substitution rates accelerated in humans. Of RVF outbreaks, 60.00% occurred 0-2 months after floods and/or heavy rainfall. CONCLUSION: RVF poses an increasing risk of causing pandemics, and global collaboration on "One Health" is needed to prevent potential pandemics.


Subject(s)
Rift Valley Fever , Rift Valley fever virus , Animals , Humans , Prevalence , Phylogeny , Rift Valley Fever/epidemiology , Rift Valley Fever/prevention & control , Disease Outbreaks , Nucleotides
4.
Eur J Clin Microbiol Infect Dis ; 43(1): 105-114, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37980301

ABSTRACT

PURPOSE: We aimed at evaluating the diagnostic efficacy of a nucleotide matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF-MS) assay to detect drug resistance of Mycobacterium tuberculosis. METHODS: Overall, 263 M. tuberculosis clinical isolates were selected to evaluate the performance of nucleic MALDI-TOF-MS for rifampin (RIF), isoniazid (INH), ethambutol (EMB), moxifloxacin (MXF), streptomycin (SM), and pyrazinamide (PZA) resistance detection. The results for RIF, INH, EMB, and MXF were compared with phenotypic microbroth dilution drug susceptibility testing (DST) and whole-genome sequencing (WGS), and the results for SM and PZA were compared with those obtained by WGS. RESULTS: Using DST as the gold standard, the sensitivity, specificity, and kappa values of the MALDI-TOF-MS assay for the detection of resistance were 98.2%, 98.7%, and 0.97 for RIF; 92.8%, 99%, and 0.90 for INH; 82.4%, 98.0%, and 0.82 for EMB; and 92.6%, 99.5%, and 0.94 for MXF, respectively. Compared with WGS as the reference standard, the sensitivity, specificity, and kappa values of the MALDI-TOF-MS assay for the detection of resistance were 97.4%, 100.0%, and 0.98 for RIF; 98.7%, 92.9%, and 0.92 for INH; 96.3%, 100.0%, and 0.98 for EMB; 98.1%, 100.0%, and 0.99 for MXF; 98.0%, 100.0%, and 0.98 for SM; and 50.0%, 100.0%, and 0.65 for PZA. CONCLUSION: The nucleotide MALDI-TOF-MS assay yielded highly consistent results compared to DST and WGS, suggesting that it is a promising tool for the rapid detection of sensitivity to RIF, INH, EMB, and MXF.
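
The sensitivity, specificity, and kappa values reported above are all functions of a 2x2 table that cross-tabulates the assay result against the reference method. The abstract does not give the underlying counts, so the following is only a minimal Python sketch with hypothetical counts; the function name `diagnostic_metrics` is an illustration, not something taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return (sensitivity, specificity, Cohen's kappa) for a 2x2 table.

    tp/fn: reference-resistant isolates called resistant/susceptible by the assay.
    tn/fp: reference-susceptible isolates called susceptible/resistant by the assay.
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Observed agreement between assay and reference.
    observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return sensitivity, specificity, kappa


# Hypothetical counts for illustration only (not from the study).
sens, spec, kappa = diagnostic_metrics(tp=45, fp=5, fn=5, tn=45)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} kappa={kappa:.3f}")
```

Kappa corrects raw agreement for chance, which is why an assay can show high specificity and still have a modest kappa when sensitivity is low, as with the PZA figures above.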


Subject(s)
Mycobacterium tuberculosis , Tuberculosis, Multidrug-Resistant , Humans , Antitubercular Agents/pharmacology , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization , Microbial Sensitivity Tests , Streptomycin , Ethambutol , Isoniazid , Rifampin , Tuberculosis, Multidrug-Resistant/diagnosis , Tuberculosis, Multidrug-Resistant/drug therapy , Tuberculosis, Multidrug-Resistant/microbiology
5.
Microbiol Spectr ; 11(6): e0184223, 2023 Dec 12.
Article in English | MEDLINE | ID: mdl-37947405

ABSTRACT

IMPORTANCE: To date, rapid diagnostic methods based on the MPT64 antigen assay are increasingly utilized to differentiate between non-tuberculous mycobacteria and TB disease in clinical settings. Furthermore, numerous novel techniques based on the MPT64 release assay are continuously being developed and applied for the identification of both pulmonary and extrapulmonary TB. However, the diagnostic accuracy of the MPT64 antigen assay is influenced by the presence of 63 bp deletion variants within the mpt64 gene. To our knowledge, this is the first report on the association between the 63 bp deletion variant in mpt64 and Mycobacterium tuberculosis L4.2.2 globally, which highlights the need for the cautious utilization of MPT64-based testing in regions where L4.2.2 isolates are prevalent, such as China and Vietnam, and MPT64 negative results should be confirmed with another assay. In addition, further studies on vaccine development and immunology based on MPT64 should consider these isolates with 63 bp deletion variant.


Subject(s)
Mycobacterium tuberculosis , Tuberculosis , Humans , Tuberculosis/diagnosis , Tuberculosis/microbiology , Antigens, Bacterial/genetics , Sensitivity and Specificity , China
6.
Microbiol Spectr ; : e0132423, 2023 Sep 21.
Article in English | MEDLINE | ID: mdl-37732780

ABSTRACT

Multidrug-resistant tuberculosis (MDR-TB) has a severe impact on public health. To investigate the drug-resistant profile, compensatory mutations and genetic variations among MDR-TB isolates, a total of 546 MDR-TB isolates from China underwent drug-susceptibility testing and whole genome sequencing for further analysis. The results showed that our isolates have a high rate of fluoroquinolone resistance (45.60%, 249/546) and a low proportion of conferring resistance to bedaquiline, clofazimine, linezolid, and delamanid. The majority of MDR-TB isolates (77.66%, 424/546) belong to Lineage 2.2.1, followed by Lineage 4.5 (6.41%, 35/546), and the Lineage 2 isolates have a strong association with pre-XDR/XDR-TB (P < 0.05) in our study. Epidemic success analysis using time-scaled haplotypic density (THD) showed that clustered isolates outperformed non-clustered isolates. Compensatory mutations happened in rpoA, rpoC, and non-RRDR of rpoB genes, which were found more frequently in clusters and were associated with the increase of THD index, suggesting that increased bacterial fitness was associated with MDR-TB transmission. In addition, the variants in resistance associated genes in MDR isolates are mainly focused on single nucleotide polymorphism mutations, and only a few genes have indel variants, such as katG, ethA. We also found some genes underwent indel variation correlated with the lineage and sub-lineage of isolates, suggesting the selective evolution of different lineage isolates. Thus, this analysis of the characterization and genetic diversity of MDR isolates would be helpful in developing effective strategies for treatment regimens and tailoring public interventions.

IMPORTANCE: Multidrug-resistant tuberculosis (MDR-TB) is a serious obstacle to tuberculosis prevention and control in China. This study provides insight into the drug-resistant characteristics of MDR combined with phenotypic drug-susceptibility testing and whole genome sequencing. The compensatory mutations and epidemic success analysis were analyzed by time-scaled haplotypic density (THD) method, suggesting clustered isolates and compensatory mutations are associated with MDR-TB transmission. In addition, the insertion and deletion variants happened in some genes, which are associated with the lineage and sub-lineage of isolates, such as the mpt64 gene. This study offered a valuable reference and increased understanding of MDR-TB in China, which could be crucial for achieving the objective of precision medicine in the prevention and treatment of MDR-TB.

7.
BMJ Open ; 13(8): e067294, 2023 08 03.
Article in English | MEDLINE | ID: mdl-37536961

ABSTRACT

OBJECTIVE: To explore the feasible and cost-effective intervention strategies to achieve the goal of dynamic COVID-Zero in China. DESIGN: A Susceptible-Exposed-Infectious-Recovered model combined economic evaluation was used to generate the number of infections, the time for dynamic COVID-Zero and calculate cost-effectiveness under different intervention strategies. The model simulated the 1 year spread of COVID-19 in mainland China after 100 initial infections were imported. INTERVENTIONS: According to close contact tracing degree from 80% to 100%, close contact tracing time from 2 days to 1 day, isolation time from 14 days to 7 days, scope of nucleic acid testing (NAT) from 10% to 100% and NAT frequency from weekly to every day, 720 scenarios were simulated. OUTCOME MEASURE: Cumulative number of infections (CI), social COVID-Zero duration (SCD), total cost (TC) and incremental cost-effectiveness ratio. RESULTS: 205 of 720 scenarios could achieve the total COVID-Zero since the first case was reported. The fastest and most cost-effective strategy was Scenario 680, in which all close contacts were traced within 1 day, the isolation time was 14 days and 10% of the national population was randomly checked for NAT every day. In Scenario 680, the CI was 280 (100 initial infections) and the SCD was 13 days. The TC was ¥4126 hundred million and the cost of reducing one infection was ¥47 470. However, when the close contact tracing time was 2 days and the degree of close contact tracing was 80%-90%, the SCD would double to 24-101 days and the TCs increased by ¥16 505 to 37 134 hundred million compared with Scenario 680. CONCLUSIONS: If all close contact was controlled within 1 day, the rapid social COVID-Zero can be achieved effectively and cost-effectively. Therefore, the future prevention and control of emerging respiratory infectious diseases can focus on enhancing the ability of close contact tracing.
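
The Susceptible-Exposed-Infectious-Recovered (SEIR) structure named in the DESIGN section can be sketched in a few lines of Python. This is only the basic compartmental core: the study's contact tracing, isolation, nucleic acid testing, and cost components are not modelled here, and the population size and the rates `beta`, `sigma`, and `gamma` are hypothetical values chosen for illustration. Only the 100 initial infections come from the abstract.

```python
def simulate_seir(population, initial_infected, beta, sigma, gamma, days, dt=0.1):
    """Integrate a basic SEIR model with forward-Euler steps.

    beta:  transmission rate per day (hypothetical value in the example)
    sigma: rate of leaving the exposed state, 1 / incubation period
    gamma: rate of leaving the infectious state, 1 / infectious period
    Returns one (S, E, I, R) tuple per day, including day 0.
    """
    s = float(population - initial_infected)
    e = 0.0
    i = float(initial_infected)
    r = 0.0
    history = [(s, e, i, r)]
    steps_per_day = round(1 / dt)
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((s, e, i, r))
    return history


# Hypothetical parameters; only the 100 imported infections follow the abstract.
trajectory = simulate_seir(
    population=1_000_000, initial_infected=100,
    beta=0.5, sigma=0.2, gamma=0.1, days=30,
)
cumulative_infections = trajectory[0][0] - trajectory[-1][0] + 100
print(f"cumulative infections after 30 days: {cumulative_infections:.0f}")
```

In a scenario analysis like the one described, interventions would typically enter as changes to these rates or as extra compartments, and each of the 720 scenarios would be one run of such a model.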


Subject(s)
COVID-19 , Humans , COVID-19/epidemiology , COVID-19/prevention & control , Cost-Benefit Analysis , SARS-CoV-2 , Contact Tracing , China/epidemiology
8.
Cancer Chemother Pharmacol ; 92(5): 341-355, 2023 11.
Article in English | MEDLINE | ID: mdl-37507485

ABSTRACT

BACKGROUND: The anti-HER2 antibody trastuzumab is a standard treatment for gastric carcinoma with HER2 overexpression, but not all patients benefit from treatment with HER2-targeted therapies due to intrinsic and acquired resistance. Thus, more precise predictors for selecting patients to receive trastuzumab therapy are urgently needed. METHODS: We applied mass spectrometry-based proteomic analysis to 38 HER2-positive gastric tumor biopsies from 19 patients pretreated with trastuzumab (responders n = 10; nonresponders, n = 9) to identify factors that may influence innate sensitivity or resistance to trastuzumab therapy and validated the results in tumor cells and patient samples. RESULTS: Statistical analyses revealed significantly lower phosphorylated ribosomal S6 (p-RPS6) levels in responders than nonresponders, and this downregulation was associated with a durable response and better overall survival after anti-HER2 therapy. High p-RPS6 levels could trigger AKT/mTOR/RPS6 signaling and inhibit trastuzumab antitumor efficacy in nonresponders. We demonstrated that RPS6 phosphorylation inhibitors in combination with trastuzumab effectively suppressed HER2-positive GC cell survival through the inhibition of the AKT/mTOR/RPS6 axis. CONCLUSIONS: Our findings provide for the first time a detailed proteomics profile of current protein alterations in patients before anti-HER2 therapy and present a novel and optimal predictor for the response to trastuzumab treatment. HER2-positive GC patients with low expression of p-RPS6 are more likely to benefit from trastuzumab therapy than those with high expression. However, those with high expression of p-RPS6 may benefit from trastuzumab in combination with RPS6 phosphorylation inhibitors.


Subject(s)
Carcinoma , Stomach Neoplasms , Humans , Trastuzumab/pharmacology , Trastuzumab/therapeutic use , Stomach Neoplasms/pathology , Proto-Oncogene Proteins c-akt , Proteomics/methods , Cell Line, Tumor , TOR Serine-Threonine Kinases/metabolism , Receptor, ErbB-2/metabolism , Drug Resistance, Neoplasm
9.
Eur J Pharmacol ; 949: 175719, 2023 Jun 15.
Article in English | MEDLINE | ID: mdl-37054942

ABSTRACT

GPR35, a class A G-protein-coupled receptor, is considered an orphan receptor; the endogenous ligand and precise physiological function of GPR35 remain obscure. GPR35 is expressed relatively highly in the gastrointestinal tract and immune cells. It plays a role in colorectal diseases like inflammatory bowel diseases (IBDs) and colon cancer. More recently, GPR35-targeting anti-IBD drugs have been in strong demand. Nevertheless, development has stagnated due to the lack of a highly potent GPR35 agonist that is comparably active at both the human and mouse orthologs. Therefore, we set out to find compounds for GPR35 agonist development, especially for the human ortholog of GPR35. As an efficient way to identify a safe and effective GPR35-targeting anti-IBD drug, we screened 1850 Food and Drug Administration (FDA)-approved drugs using a two-step DMR assay. Interestingly, we found that aminosalicylates, first-line medicines for IBDs whose precise target remains unknown, exhibited activity on both human and mouse GPR35. Among these, the pro-drug olsalazine showed the greatest potency in GPR35 agonism, inducing ERK phosphorylation and β-arrestin2 translocation. In dextran sodium sulfate (DSS)-induced colitis, the protective effect of olsalazine on disease progression and its inhibitory effect on TNFα mRNA expression and the NF-κB and JAK-STAT3 pathways are compromised in GPR35 knock-out mice. The present study identified a target for the first-line aminosalicylate medicines, highlighted that the uncleaved pro-drug olsalazine is effective, and provided a new concept for the design of aminosalicylic GPR35-targeting anti-IBD drugs.


Subject(s)
Aminosalicylic Acid , Colitis , Inflammatory Bowel Diseases , Prodrugs , Mice , Humans , Animals , Prodrugs/metabolism , Colitis/chemically induced , Colitis/drug therapy , Colitis/prevention & control , Aminosalicylic Acids/adverse effects , Inflammatory Bowel Diseases/drug therapy , Aminosalicylic Acid/adverse effects , NF-kappa B/metabolism , Dextran Sulfate/toxicity , Mice, Inbred C57BL , Colon , Disease Models, Animal , Receptors, G-Protein-Coupled/metabolism
10.
Front Microbiol ; 14: 1115295, 2023.
Article in English | MEDLINE | ID: mdl-36876077

ABSTRACT

Background: Tuberculosis may reoccur due to reinfection or relapse after initially successful treatment. Distinguishing the cause of TB recurrence is crucial to guide TB control and treatment. This study aimed to investigate the source of TB recurrence and risk factors related to relapse in Hunan province, a high TB burden region in southern China. Methods: A population-based retrospective study was conducted on all culture-positive TB cases in Hunan province, China from 2013 to 2020. Phenotypic drug susceptibility testing and whole-genome sequencing were used to detect drug resistance and distinguish between relapse and reinfection. Pearson chi-square test and Fisher exact test were applied to compare differences in categorical variables between relapse and reinfection. The Kaplan-Meier curve was generated in R studio (4.0.4) to describe and compare the time to recurrence between different groups. p < 0.05 was considered statistically significant. Results: Of 36 recurrent events, 27 (75.0%, 27/36) paired isolates were caused by relapse, and reinfection accounted for 25.0% (9/36) of recurrent cases. No significant difference in characteristics was observed between relapse and reinfection (all p > 0.05). In addition, TB relapse occurs earlier in patients of Tu ethnicity compared to patients of Han ethnicity (p < 0.0001), whereas no significant differences in the time interval to relapse were noted in other groups. Moreover, 83.3% (30/36) of TB recurrence occurred within 3 years. Overall, these recurrent TB isolates were predominantly pan-susceptible strains (71.0%, 49/69), followed by DR-TB (17.4%, 12/69) and MDR-TB (11.6%, 8/69), with mutations mainly in codon 450 of the rpoB gene and codon 315 of the katG gene. 11.1% (3/27) of relapse cases had acquired new resistance during treatment, with fluoroquinolone resistance occurring most frequently (7.4%, 2/27), both with mutations in codon 94 of gyrA. 
Conclusion: Endogenous relapse is the main mechanism leading to TB recurrences in Hunan province. Given that TB recurrences can occur more than 4 years after treatment completion, it is necessary to extend the post-treatment follow-up period to achieve better management of TB patients. Moreover, the relatively high frequency of fluoroquinolone resistance in the second episode of relapse suggests that fluoroquinolones should be used with caution when treating TB cases with relapse, preferably guided by DST results.
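
The time-to-recurrence comparison above rests on the Kaplan-Meier estimator, which multiplies, at each observed event time, the fraction of patients still at risk who did not have the event. The study generated its curves in R; the following is a plain Python sketch of the same estimator, not the authors' code, and the follow-up times in the example are hypothetical.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each time with an observed event.

    durations: follow-up time for each patient
    events:    True if recurrence was observed, False if the patient was censored
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                observed += 1
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after time t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up times in years (not from the study).
curve = kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False])
for time, prob in curve:
    print(f"t={time}: recurrence-free probability {prob:.2f}")
```

Censored patients lower the number at risk without lowering the curve, which is what lets the method use patients whose follow-up ended before any recurrence.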

11.
Article in English | MEDLINE | ID: mdl-36554951

ABSTRACT

Early diagnosis of drug susceptibility for tuberculosis (TB) patients could guide the timely initiation of effective treatment. We evaluated a novel multiplex xMAP TIER (Tuberculosis-Isoniazid-Ethambutol-Rifampicin) assay based on the Luminex xMAP system to detect first-line anti-tuberculous drug resistance. Deoxyribonucleic acid samples from 353 Mycobacterium tuberculosis clinical isolates were amplified by multiplex polymerase chain reaction, followed by hybridization and analysis through the xMAP system. Compared with the broth microdilution method, the sensitivity and specificity of the xMAP TIER assay for detecting resistance was 94.9% (95%CI, 90.0-99.8%) and 98.9% (95%CI, 97.7-100.0%) for rifampicin; 89.1% (95%CI, 83.9-94.3%) and 100.0% (95%CI, 100.0-100.0%) for isoniazid; 82.1% (95% CI, 68.0-96.3%) and 99.7% (95% CI, 99.0-100.0%) for ethambutol. With DNA sequencing as the reference standard, the sensitivity and specificity of xMAP TIER for detecting resistance were 95.0% (95% CI, 90.2-99.8%) and 99.6% (95% CI, 98.9-100.0%) for rifampicin; 96.9% (95% CI, 93.8-99.9%) and 100.0% (95% CI, 100.0-100.0%) for isoniazid; 86.1% (95% CI, 74.8-97.4%) and 100.0% (95% CI, 100.0-100.0%) for ethambutol. The results achieved showed that the xMAP TIER assay had good performance for detecting first-line anti-tuberculosis drug resistance, and it has the potential to diagnose drug-resistant tuberculosis more accurately due to the addition of more optimal design primers and probes on open architecture xMAP system.
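
The abstract does not say how the 95% confidence intervals were computed. The symmetric bounds and the degenerate 100.0-100.0% intervals are what a normal-approximation (Wald) interval would give, so the sketch below assumes that method. The counts in the example are hypothetical and the function name `wald_interval` is illustrative.

```python
import math


def wald_interval(successes, total, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    Returns (estimate, lower, upper), with bounds clipped to [0, 1].
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical counts for illustration only (not from the study).
estimate, lower, upper = wald_interval(74, 78)
print(f"{estimate:.1%} (95% CI, {lower:.1%}-{upper:.1%})")
```

When every isolate is classified correctly the Wald half-width is zero, so the interval collapses to a single point; a Wilson or exact interval would be wider and is often preferred for proportions near 0 or 1.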


Subject(s)
Mycobacterium tuberculosis , Tuberculosis , Humans , Antitubercular Agents/pharmacology , Antitubercular Agents/therapeutic use , Isoniazid/pharmacology , Isoniazid/therapeutic use , Ethambutol/pharmacology , Rifampin/pharmacology , Rifampin/therapeutic use , Microspheres , Microbial Sensitivity Tests , Mycobacterium tuberculosis/genetics , Tuberculosis/drug therapy
12.
IEEE/ACM Trans Comput Biol Bioinform ; 19(3): 1782-1793, 2022.
Article in English | MEDLINE | ID: mdl-33237867

ABSTRACT

Finding existing but undiscovered genome sequence mutations, or predicting potential ones, from real sequence data remains challenging. Motivated by this, we develop approaches to detect new, undiscovered genome sequences. Because discovering new genome sequences through biological experiments is resource-intensive, we want to achieve the new genome sequence detection task mathematically. However, little literature tells us how to detect new, undiscovered genome sequence mutations mathematically. We form a new framework based on the natural vector convex hull method that conducts alignment-free sequence analysis. Our two newly developed approaches, Random-permutation Algorithm with Penalty (RAP) and Random-permutation Algorithm with Penalty and COnstrained Search (RAPCOS), use the geometric properties captured by natural vectors. In our experiment, we discover a mathematically new human immunodeficiency virus (HIV) genome sequence using some real HIV genome sequences. Significantly, the proposed methods are applicable to the new genome sequence detection challenge and have many good properties, such as robustness, rapid convergence, and fast computation.


Subject(s)
Algorithms , Genome , Genome/genetics , Humans
13.
Comput Struct Biotechnol J ; 19: 4226-4234, 2021.
Article in English | MEDLINE | ID: mdl-34429843

ABSTRACT

Understanding the relationships between genomic sequences is essential to the classification and characterization of living beings. The classes and characteristics of an organism can be identified in the corresponding genome space. In the genome space, the natural metric is important to describe the distribution of genomes. Therefore, the similarity of two biological sequences can be measured. Here, we report that all of the viral genomes are in 32-dimensional Euclidean space, in which the natural metric is the weighted summation of Euclidean distance of k-mer natural vectors. The classification of viral genomes in the constructed genome space further proves the convex hull principle of taxonomy, which states that convex hulls of different families are mutually disjoint. This study provides a novel geometric perspective to describe the genome sequences.
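
The abstract does not define the natural vector, so the sketch below uses the commonly published single-nucleotide form: for each of A, C, G, T it records the count, the mean position, and a normalized second central moment of the positions, giving 12 numbers per sequence. The 32-dimensional space and the weighted k-mer metric described above are not reproduced here; the plain Euclidean distance is a simplifying assumption, and the function names are illustrative.

```python
import math


def natural_vector(seq, alphabet="ACGT"):
    """Return the 12-dimensional natural vector of a DNA sequence.

    Layout: counts for each letter, then mean positions, then normalized
    second central moments. Positions are 1-based.
    """
    n = len(seq)
    counts, means, moments = [], [], []
    for letter in alphabet:
        positions = [i for i, ch in enumerate(seq, start=1) if ch == letter]
        n_k = len(positions)
        if n_k == 0:
            # Absent letters contribute zeros by convention.
            counts.append(0)
            means.append(0.0)
            moments.append(0.0)
            continue
        mu = sum(positions) / n_k
        d2 = sum((p - mu) ** 2 for p in positions) / (n_k * n)
        counts.append(n_k)
        means.append(mu)
        moments.append(d2)
    return counts + means + moments


def natural_distance(seq_a, seq_b):
    """Euclidean distance between two natural vectors (unweighted sketch)."""
    return math.dist(natural_vector(seq_a), natural_vector(seq_b))


print(natural_vector("ACGTA"))
print(natural_distance("ACGTA", "ACGTT"))
```

Because every sequence maps to a fixed-length vector, distances between genomes of different lengths can be computed without any alignment step.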

14.
Acta Math Sci ; 41(3): 1017-1022, 2021.
Article in English | MEDLINE | ID: mdl-33897081

ABSTRACT

The severe acute respiratory syndrome COVID-19 was discovered on December 31, 2019 in China. Subsequently, many COVID-19 cases were reported in many other countries. However, some positive COVID-19 samples had been reported earlier than those officially accepted by health authorities in other countries, such as France and Italy. Thus, it is of great importance to determine the place where SARS-CoV-2 was first transmitted to humans. To this end, we analyze genomes of SARS-CoV-2 using the k-mer natural vector method and compare the similarities of global SARS-CoV-2 genomes by a new natural metric. Because it is commonly accepted that SARS-CoV-2 originated from bat coronavirus RaTG13, we only need to determine which SARS-CoV-2 genome sequence has the closest distance to bat coronavirus RaTG13 under our natural metric. From our analysis, SARS-CoV-2 most likely already existed in other countries such as France, India, the Netherlands, England and the United States before the outbreak in Wuhan, China.

15.
Front Genet ; 12: 828805, 2021.
Article in English | MEDLINE | ID: mdl-35186019

ABSTRACT

A comprehensive description of human genomes is essential for understanding human evolution and relationships between modern populations. However, most published literature focuses on local alignment comparison of several genes rather than the complete evolutionary record of individual genomes. Combining with data from the 1,000 Genomes Project, we successfully reconstructed 2,504 individual genomes and propose Divided Natural Vector method to analyze the distribution of nucleotides in the genomes. Comparisons based on autosomes, sex chromosomes and mitochondrial genomes reveal the genetic relationships between populations, and different inheritance pattern leads to different phylogenetic results. Results based on mitochondrial genomes confirm the "out-of-Africa" hypothesis and assert that humans, at least females, most likely originated in eastern Africa. The reconstructed genomes are stored on our server and can be further used for any genome-scale analysis of humans (http://yaulab.math.tsinghua.edu.cn/2022_1000genomesprojectdata/). This project provides the complete genomes of thousands of individuals and lays the groundwork for genome-level analyses of the genetic relationships between populations and the origin of humans.

16.
Comput Struct Biotechnol J ; 18: 1904-1913, 2020.
Article in English | MEDLINE | ID: mdl-32774785

ABSTRACT

Chaos Game Representation (CGR) was first proposed as an image representation method for DNA and has been extended to other biological macromolecules. Compared with the CGR images of DNA, where DNA sequences are converted into a series of points in the unit square, the existing CGR images of proteins are less elegant geometrically, and the implications of the distribution of points in the CGR image are less obvious. In this study, by naturally distributing the twenty amino acids on the vertices of a regular dodecahedron, we introduce a novel three-dimensional image representation of protein sequences with the CGR method. We also associate each CGR image with a vector in high-dimensional Euclidean space, called the extended natural vector (ENV), in order to analyze the information contained in the CGR images. Based on the results of protein classification and phylogenetic analysis, our method could serve as a precise method to discover biological relationships between proteins.

17.
PeerJ ; 8: e9625, 2020.
Article in English | MEDLINE | ID: mdl-32832270

ABSTRACT

BACKGROUND: Begomoviruses are widely distributed and causing devastating diseases in many crops. According to the number of genomic components, a begomovirus is known as either monopartite or bipartite begomovirus. Both the monopartite and bipartite begomoviruses have the DNA-A component which encodes all essential proteins for virus functions, while the bipartite begomoviruses still contain the DNA-B component. The satellite molecules, known as betasatellites, alphasatellites or deltasatellites, sometimes exist in the begomoviruses. So, the genomic components of begomoviruses are complex and varied. Different genomic components have different gene structures and functions. Classifying the components of begomoviruses is important for studying the virus origin and pathogenic mechanism. METHODS: We propose a model combining Subsequence Natural Vector (SNV) method with Support Vector Machine (SVM) algorithm, to classify the genomic components of begomoviruses and predict the genes of begomoviruses. First, the genome sequence is represented as a vector numerically by the SNV method. Then SVM is applied on the datasets to build the classification model. At last, recursive feature elimination (RFE) is used to select essential features of the subsequence natural vectors based on the importance of features. RESULTS: In the investigation, DNA-A, DNA-B, and different satellite DNAs are selected to build the model. To evaluate our model, the homology-based method BLAST and two machine learning algorithms Random Forest and Naive Bayes method are used to compare with our model. According to the results, our classification model can classify DNA-A, DNA-B, and different satellites with high accuracy. Especially, we can distinguish whether a DNA-A component is from a monopartite or a bipartite begomovirus. Then, based on the results of classification, we can also predict the genes of different genomic components. 
According to the selected features, we find that the content of four nucleotides in the second and tenth segments (approximately 150-350 bp and 1,450-1,650 bp) are the most different between DNA-A components of monopartite and bipartite begomoviruses, which may be related to the pre-coat protein (AV2) and the transcriptional activator protein (AC2) genes. Our results advance the understanding of the unique structures of the genomic components of begomoviruses.

18.
Genes (Basel) ; 11(6)2020 06 09.
Article in English | MEDLINE | ID: mdl-32526937

ABSTRACT

The severe respiratory disease COVID-19 was initially reported in Wuhan, China, in December 2019, and spread into many provinces from Wuhan. The corresponding pathogen was soon identified as a novel coronavirus named SARS-CoV-2 (formerly, 2019-nCoV). As of 2 May, 2020, over 3 million COVID-19 cases had been confirmed, and 235,290 deaths had been reported globally, and the numbers are still increasing. It is important to understand the phylogenetic relationship between SARS-CoV-2 and known coronaviruses, and to identify its hosts for preventing the next round of emergency outbreak. In this study, we employ an effective alignment-free approach, the Natural Vector method, to analyze the phylogeny and classify the coronaviruses based on genomic and protein data. Our results show that SARS-CoV-2 is closely related to, but distinct from the SARS-CoV branch. By analyzing the genetic distances from the SARS-CoV-2 strain to the coronaviruses residing in animal hosts, we establish that the most possible transmission path originates from bats to pangolins to humans.


Subject(s)
Betacoronavirus/genetics , Coronavirus Infections/transmission , Coronavirus/genetics , Models, Biological , Pneumonia, Viral/transmission , Animals , Betacoronavirus/classification , COVID-19 , Chiroptera/virology , Coronavirus/classification , Coronavirus 3C Proteases , Coronavirus Infections/virology , Cysteine Endopeptidases/chemistry , Cysteine Endopeptidases/genetics , Disease Outbreaks , Disease Reservoirs , Humans , Mammals/classification , Mammals/virology , Pandemics , Phylogeny , Pneumonia, Viral/virology , SARS-CoV-2 , Spike Glycoprotein, Coronavirus/chemistry , Spike Glycoprotein, Coronavirus/genetics , Viral Nonstructural Proteins/chemistry , Viral Nonstructural Proteins/genetics
19.
J Comput Biol ; 27(12): 1688-1698, 2020 12.
Article in English | MEDLINE | ID: mdl-32392428

ABSTRACT

Bacterial evolution is an important field of study, and biological sequences are often used to construct phylogenetic relationships. Multiple sequence alignment is very time-consuming and cannot deal with large numbers of bacterial genome sequences in a reasonable time. Hence, a new mathematical method, the joining density vector method, is proposed to cluster bacteria; it characterizes the features of the coding sequences (CDS) in a DNA sequence. Coding sequences carry the genetic information used to synthesize proteins. The correspondence between a genomic sequence and its joining density vector (JDV) is one-to-one. The JDV reflects the statistical characteristics of a genomic sequence, and large amounts of data can be analyzed using this new approach. We apply the novel method to phylogenetic analysis of four bacterial data sets at the genus and species levels. The phylogenetic trees show that our new method accurately describes the evolutionary relationships of bacterial coding sequences, and is faster than ClustalW and the existing alignment-free methods.


Subject(s)
Bacteria/genetics , Genome, Bacterial , Phylogeny , Enterobacteriaceae/genetics , Pseudomonas/genetics , Streptococcus/genetics
20.
Mol Phylogenet Evol ; 141: 106633, 2019 12.
Article in English | MEDLINE | ID: mdl-31563612

ABSTRACT

Using numerical methods for genome comparison has always been of importance in bioinformatics. The Chaos Game Representation (CGR) is an effective genome sequence mapping technology, which converts genome sequences to CGR images. To each CGR image, we associate a vector called an Extended Natural Vector (ENV). The ENV is based on the distribution of intensity values. This mapping produces a one-to-one correspondence between CGR images and their ENVs. We define the distance between two DNA sequences as the distance between their associated ENVs. We cluster and classify several datasets including Influenza A viruses, Bacillus genomes, and Conoidea mitochondrial genomes to build their phylogenetic trees. Results show that our ENV combining CGR method (CGR-ENV) compares favorably in classification accuracy and efficiency against the multiple sequence alignment (MSA) method and other alignment-free methods. The research provides significant insights into the study of phylogeny, evolution, and efficient DNA comparison algorithms for large genomes.
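
The mapping from a DNA sequence to a CGR image can be sketched as follows. Each nucleotide moves the current point halfway toward that nucleotide's corner of the unit square, and the image is a grid counting how many points fall in each cell. The corner assignment used here (A bottom-left, C top-left, G top-right, T bottom-right) is a common convention and an assumption, since the abstract does not state it; the Extended Natural Vector itself is not defined in the abstract and is not reproduced.

```python
# Assumed corner assignment; the abstract does not specify one.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return the CGR point for each nucleotide, starting from the square's centre."""
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def cgr_image(seq, resolution=4):
    """Count CGR points per cell of a resolution x resolution grid."""
    grid = [[0] * resolution for _ in range(resolution)]
    for x, y in cgr_points(seq):
        col = min(int(x * resolution), resolution - 1)
        row = min(int(y * resolution), resolution - 1)
        grid[row][col] += 1
    return grid


for row in cgr_image("ACGTTGCAAC"):
    print(row)
```

With a grid of side 2^k, each cell corresponds to one k-mer suffix, so the cell counts are the kind of intensity values from which a vector summary of the image can be computed.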


Subject(s)
Algorithms , Genome , Genomics , Base Sequence , DNA/genetics , Genome, Mitochondrial , Markov Chains , Phylogeny