Results 1 - 20 of 278
1.
Int J Parasitol ; 2024 Aug 19.
Article in English | MEDLINE | ID: mdl-39168434

ABSTRACT

Millions of livestock animals worldwide are infected with the haematophagous barber's pole worm, Haemonchus contortus, the aetiological agent of haemonchosis. Despite the major significance of this parasite worldwide and its widespread resistance to current treatments, the lack of a high-quality genome for the well-defined strain of this parasite from Australia, called Haecon-5, has constrained research in a number of areas including host-parasite interactions, drug discovery and population genetics. To enable research in these areas, we report here a chromosome-contiguous genome (∼280 Mb) for Haecon-5 with high-quality models for 19,234 protein-coding genes. Comparative genomic analyses show significant genomic similarity (synteny) with a UK strain of H. contortus, called MHco3(ISE).N1 (abbreviated as "ISE"), but we also discover marked differences in genomic structure/gene arrangements, distribution of nucleotide variability (single nucleotide polymorphisms (SNPs) and indels) and orthology between Haecon-5 and ISE. We used the genome and extensive transcriptomic resources for Haecon-5 to predict a subset of essential single-copy genes employing a "cross-species" machine learning (ML) approach using a range of features from nucleotide/protein sequences, protein orthology, subcellular localisation, single-cell RNA-seq and/or histone methylation data available for the model organisms Caenorhabditis elegans and Drosophila melanogaster. From a set of 1,464 conserved single copy genes, transcribed in key life-cycle stages of H. contortus, we identified 232 genes whose homologs have critical functions in C. elegans and/or D. melanogaster, and prioritised 10 of them for further characterisation; nine of the 10 genes likely play roles in neurophysiological processes, germline, hypodermis and/or respiration, and one is an unknown (orphan) gene for which no detailed functional information exists. 
Future studies of these genes/gene products are warranted to elucidate their roles in parasite biology, host-parasite interplay and/or disease. Clearly, the present Haecon-5 reference genome and associated resources now underpin a broad range of fundamental investigations of H. contortus and could assist in accelerating the discovery of novel intervention targets and drug candidates to combat haemonchosis.

2.
Comput Struct Biotechnol J ; 23: 3081-3089, 2024 Dec.
Article in English | MEDLINE | ID: mdl-39185442

ABSTRACT

Detailed explorations of the model organisms Caenorhabditis elegans (elegant worm) and Drosophila melanogaster (vinegar fly) have substantially improved our knowledge and understanding of biological processes and pathways in metazoan organisms. Extensive functional genomic and multi-omic data sets have enabled the discovery and characterisation of 'essential' genes that are critical for the survival of these organisms. Recently, we showed that a machine learning (ML)-based pipeline could be utilised to predict essential genes in both C. elegans and D. melanogaster using features from DNA, RNA, protein and/or cellular data or associated information. As these distantly-related species are within the Ecdysozoa, we hypothesised that this approach could be suited for non-model organisms within the same group (phylum) of protostome animals. In the present investigation, we cross-predicted essential genes within the phylum Nematoda - between C. elegans and the parasitic filarial nematodes Brugia malayi and Onchocerca volvulus, and then ranked and prioritised these genes. Highly ranked genes were linked to key biological pathways or processes, such as ribosome biogenesis, translation and RNA processing, and were expressed at relatively high levels in the germline, gonad, hypodermis and/or nerves. The present in silico workflow is hoped to expedite the identification of drug targets in parasitic organisms for subsequent experimental validation in the laboratory.

3.
Int J Mol Sci ; 25(13)2024 Jun 27.
Article in English | MEDLINE | ID: mdl-39000124

ABSTRACT

Over the years, comprehensive explorations of the model organisms Caenorhabditis elegans (elegant worm) and Drosophila melanogaster (vinegar fly) have contributed substantially to our understanding of complex biological processes and pathways in multicellular organisms generally. Extensive functional genomic-phenomic, genomic, transcriptomic, and proteomic data sets have enabled the discovery and characterisation of genes that are crucial for life, called 'essential genes'. Recently, we investigated the feasibility of inferring essential genes from such data sets using advanced bioinformatics and showed that a machine learning (ML)-based workflow could be used to extract or engineer features from DNA, RNA, protein, and/or cellular data/information to underpin the reliable prediction of essential genes both within and between C. elegans and D. melanogaster. As these are two distantly related species within the Ecdysozoa, we proposed that this ML approach would be particularly well suited for species that are within the same phylum or evolutionary clade. In the present study, we cross-predicted essential genes within the phylum Nematoda (evolutionary clade V), between C. elegans and the pathogenic parasitic nematode H. contortus, and then ranked and prioritised H. contortus proteins encoded by these genes as intervention (e.g., drug) target candidates. Using strong, validated predictors, we inferred essential genes of H. contortus that are involved predominantly in crucial biological processes/pathways including ribosome biogenesis, translation, RNA binding/processing, and signalling and which are highly transcribed in the germline, somatic gonad precursors, sex myoblasts, vulva cell precursors, various nerve cells, glia, or hypodermis. The findings indicate that this in silico workflow provides a promising avenue to identify and prioritise panels/groups of drug target candidates in parasitic nematodes for experimental validation in vitro and/or in vivo.


Subject(s)
Caenorhabditis elegans , Genes, Essential , Haemonchus , Machine Learning , Animals , Haemonchus/genetics , Caenorhabditis elegans/genetics , Helminth Proteins/genetics , Helminth Proteins/metabolism , Computational Biology/methods , Drosophila melanogaster/genetics
4.
Mol Genet Genomics ; 299(1): 72, 2024 Jul 27.
Article in English | MEDLINE | ID: mdl-39060647

ABSTRACT

Codon usage bias (CUB), the uneven usage of synonymous codons encoding the same amino acid, differs among genes within and across bacterial genomes. CUB is known to be influenced by gene expression and, accordingly, differs between high-expression and low-expression genes in several bacteria. In this article, we extend the study of codon usage by considering gene essentiality as a feature. Using machine learning (ML)-based approaches, we analysed Relative Synonymous Codon Usage (RSCU) values of essential and non-essential genes in Escherichia coli and thirty-four other bacterial genomes whose gene essentiality features were available in public databases. We observed significant differences in codon usage patterns between essential and non-essential genes for the majority of the bacterial genomes and, accordingly, ML-based classifiers achieved high area under curve (AUC) scores, with a minimum score of 70.0 across twenty-eight organisms. Further, the importance of individual codons for classifying genes was found to differ among the codons in each genome. The Arg codon CGT and the Gly codon GGT were observed to be the most preferred codons among essential genes in Escherichia coli. Interestingly, some codons, such as CGT, ATA, GGT and GGG, were observed to contribute consistently towards classifying essential genes across the thirty-five bacterial genomes studied. On the other hand, codons TGY and CAY, encoding the amino acids Cys and His respectively, were among the least contributing codons towards classification in all these bacteria. This study demonstrates gene essentiality-based differences in synonymous codon usage in bacterial genomes and presents a common codon usage pattern across bacteria.
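
The RSCU values named in this abstract follow a standard definition: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The Python sketch below is illustrative only and is not the authors' pipeline; the function name `rscu` and the example sequence are assumptions made for this note. It uses the standard genetic code, whose codon-to-amino-acid assignments match the bacterial translation table.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(sequence):
    """Return {codon: RSCU} for one in-frame coding sequence (hypothetical helper)."""
    seq = sequence.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))

    # Group sense codons into synonymous families, one family per amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the sequence: RSCU is undefined
        expected = total / len(codons)  # count per codon under equal usage
        for c in codons:
            values[c] = counts[c] / expected
    return values


# Toy sequence of four glycine codons: GGT used three times, GGG once.
example = rscu("GGTGGTGGTGGG")
print(example["GGT"], example["GGG"], example["GGC"])
```

An RSCU above 1 indicates a codon used more often than expected under equal usage, and a value below 1 indicates one used less often. Per-gene RSCU vectors of this kind are the sort of feature a classifier could take as input, although the abstract does not specify the feature encoding actually used.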


Subject(s)
Codon Usage , Escherichia coli , Genes, Essential , Machine Learning , Genes, Essential/genetics , Escherichia coli/genetics , Genome, Bacterial/genetics , Genes, Bacterial , Codon/genetics , Bacteria/genetics , Bacteria/classification
5.
Cell Rep ; 43(7): 114417, 2024 Jul 23.
Article in English | MEDLINE | ID: mdl-38980795

ABSTRACT

The ability to sense and respond to osmotic fluctuations is critical for the maintenance of cellular integrity. We used gene co-essentiality analysis to identify an unappreciated relationship between TSC22D2, WNK1, and NRBP1 in regulating cell volume homeostasis. All of these genes have paralogs and are functionally buffered for osmo-sensing and cell volume control. Within seconds of hyperosmotic stress, TSC22D, WNK, and NRBP family members physically associate into biomolecular condensates, a process that is dependent on intrinsically disordered regions (IDRs). A close examination of these protein families across metazoans revealed that TSC22D genes evolved alongside a domain in NRBPs that specifically binds to TSC22D proteins, which we have termed NbrT (NRBP binding region with TSC22D), and this co-evolution is accompanied by rapid IDR length expansion in WNK-family kinases. Our study reveals that TSC22D, WNK, and NRBP genes evolved in metazoans to co-regulate rapid cell volume changes in response to osmolarity.


Subject(s)
Cell Size , WNK Lysine-Deficient Protein Kinase 1 , Humans , Animals , WNK Lysine-Deficient Protein Kinase 1/metabolism , WNK Lysine-Deficient Protein Kinase 1/genetics , Evolution, Molecular , HEK293 Cells , Protein Binding , Multigene Family , Osmotic Pressure
6.
Methods Mol Biol ; 2829: 127-156, 2024.
Article in English | MEDLINE | ID: mdl-38951331

ABSTRACT

The baculovirus expression vector system (BEVS) has now found acceptance in both research laboratories and industry, which can be attributed to many of its key features including the limited host range of the vectors, their non-pathogenicity to humans, and the mammalian-like post-translational modifications (PTMs) that can be achieved in insect cells. In fact, this system acts as a middle ground between prokaryotes and higher eukaryotes to produce complex biologics. Still, industrial use of the BEVS lags compared to other platforms. We have postulated that one reason for this has been a lack of genetic tools that can complement the study of baculovirus vectors, while a second reason is the co-production of the baculovirus vector with the desired product. While some genetic enhancements have been made to improve the BEVS as a production platform, the genome remains under-scrutinized. This chapter outlines the methodology for a CRISPR-Cas9-based transfection-infection assay to probe the baculovirus genome for essential/nonessential genes that can potentially maximize foreign gene expression under a promoter of choice.


Subject(s)
Baculoviridae , CRISPR-Cas Systems , Genetic Vectors , Baculoviridae/genetics , Genetic Vectors/genetics , Animals , Genes, Essential , Gene Expression , Transfection/methods , Gene Editing/methods , Sf9 Cells , Humans
7.
Heliyon ; 10(11): e31713, 2024 Jun 15.
Article in English | MEDLINE | ID: mdl-38832264

ABSTRACT

Humans benefit from a vast community of microorganisms in their gastrointestinal tract, known as the gut microbiota, numbering in the tens of trillions. An imbalance in the gut microbiota, known as dysbiosis, can lead to changes in the metabolite profile, elevating the levels of toxins like Bacteroides fragilis toxin (BFT), colibactin, and cytolethal distending toxin. These toxins are implicated in the process of oncogenesis. However, a significant portion of the Bacteroides fragilis genome consists of functionally uncharacterized and hypothetical proteins. This study delves into the functional characterization of hypothetical proteins (HPs) encoded by the Bacteroides fragilis genome, employing a systematic in silico approach. A total of 379 HPs were subjected to a BlastP homology search against the NCBI non-redundant protein sequence database, resulting in 162 HPs devoid of identity to known proteins. CDD-Blast identified 106 HPs with functional domains, which were then annotated using Pfam, InterPro, SUPERFAMILY, SCANPROSITE, SMART, and CATH. Physicochemical properties, such as molecular weight, isoelectric point, and stability indices, were assessed for 60 HPs whose functional domains were identified by at least three of the aforementioned bioinformatic tools. Subsequently, subcellular localization was examined, and gene ontology analysis revealed diverse biological processes, cellular components, and molecular functions. Remarkably, E1WPR3 was identified as a virulent and essential gene among the HPs. This study presents a comprehensive exploration of B. fragilis HPs, shedding light on their potential roles and contributing to a deeper understanding of this organism's functional landscape.

8.
Appl Environ Microbiol ; 90(7): e0068724, 2024 Jul 24.
Article in English | MEDLINE | ID: mdl-38864628

ABSTRACT

Mycoplasma bovis is an important emerging pathogen of cattle and bison, but our understanding of the genetic basis of its interactions with its host is limited. The aim of this study was to identify genes of M. bovis required for interaction and survival in association with host cells. One hundred transposon-induced mutants of the type strain PG45 were assessed for their capacity to survive and proliferate in Madin-Darby bovine kidney cell cultures. The growth of 19 mutants was completely abrogated, and 47 mutants had a prolonged doubling time compared to the parent strain. All these mutants had a similar growth pattern to the parent strain PG45 in the axenic media. Thirteen genes previously classified as dispensable for the axenic growth of M. bovis were found to be essential for the growth of M. bovis in association with host cells. In most of the mutants with a growth-deficient phenotype, the transposon was inserted into a gene involved in transportation or metabolism. This included genes coding for ABC transporters, proteins related to carbohydrate, nucleotide and protein metabolism, and membrane proteins essential for attachment. It is likely that these genes are essential not only in vitro but also for the survival of M. bovis in infected animals. IMPORTANCE: Mycoplasma bovis causes chronic bronchopneumonia, mastitis, arthritis, keratoconjunctivitis, and reproductive tract disease in cattle around the globe and is an emerging pathogen in bison. Control of mycoplasma infections is difficult in the absence of appropriate antimicrobial treatment or effective vaccines. A comprehensive understanding of host-pathogen interactions and virulence factors is important to implement more effective control methods against M. bovis. Recent studies of other mycoplasmas with in vitro cell culture models have identified essential virulence genes of mycoplasmas. Our study has identified genes of M. bovis required for survival in association with host cells, which will pave the way to a better understanding of host-pathogen interactions and the role of specific genes in the pathogenesis of disease caused by M. bovis.


Subject(s)
Mycoplasma bovis , Mycoplasma bovis/genetics , Animals , Cattle , Mycoplasma Infections/microbiology , Mycoplasma Infections/veterinary , Cell Line , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Cattle Diseases/microbiology , Genes, Bacterial/genetics , DNA Transposable Elements , Host-Pathogen Interactions , Bison/microbiology , Microbial Viability
9.
Imeta ; 3(1): e157, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38868518

ABSTRACT

Over the past few decades, there has been a significant interest in the study of essential genes, which are crucial for the survival of an organism under specific environmental conditions and thus have practical applications in the fields of synthetic biology and medicine. An increasing amount of experimental data on essential genes has been obtained with the continuous development of technological methods. Meanwhile, various computational prediction methods, related databases and web servers have emerged accordingly. To facilitate the study of essential genes, we have established a database of essential genes (DEG), which has become popular with continuous updates to facilitate essential gene feature analysis and prediction, drug and vaccine development, as well as artificial genome design and construction. In this article, we summarized the studies of essential genes, overviewed the relevant databases, and discussed their practical applications. Furthermore, we provided an overview of the main applications of DEG and conducted comprehensive analyses based on its latest version. However, it should be noted that the essential gene is a dynamic concept instead of a binary one, which presents both opportunities and challenges for their future development.

10.
Genet Med ; 26(7): 101141, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38629401

ABSTRACT

PURPOSE: Existing resources that characterize the essentiality status of genes are based on either proliferation assessment in human cell lines, viability evaluation in mouse knockouts, or constraint metrics derived from human population sequencing studies. Several repositories document phenotypic annotations for rare disorders; however, there is a lack of comprehensive reporting on lethal phenotypes. METHODS: We queried Online Mendelian Inheritance in Man for terms related to lethality and classified all Mendelian genes according to the earliest age of death recorded for the associated disorders, from prenatal death to no reports of premature death. We characterized the genes across these lethality categories, examined the evidence on viability from mouse models and explored how this information could be used for novel gene discovery. RESULTS: We developed the Lethal Phenotypes Portal to showcase this curated catalog of human essential genes. Differences in the mode of inheritance, physiological systems affected, and disease class were found for genes in different lethality categories, as well as discrepancies between the lethal phenotypes observed in mouse and human. CONCLUSION: We anticipate that this resource will aid clinicians in the diagnosis of early lethal conditions and assist researchers in investigating the properties that make these genes essential for human development.
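
The classification step described in METHODS, assigning each gene the earliest age of death recorded across its associated disorders, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and not the portal's implementation: the abstract names only the two endpoints of the scale (prenatal death and no reports of premature death), so the intermediate category names, the function `classify_genes` and the example gene labels are all hypothetical.

```python
# Hypothetical category scale, earliest first. Only the first and last entries
# are named in the abstract; the intermediate ones are placeholders.
LETHALITY_ORDER = [
    "prenatal",
    "neonatal",
    "infant",
    "childhood",
    "adolescent",
    "adult",
    "no_premature_death",
]
RANK = {name: i for i, name in enumerate(LETHALITY_ORDER)}


def classify_genes(disorder_records):
    """Map each gene to the earliest lethality category among its disorders.

    disorder_records: iterable of (gene, category) pairs, one pair per
    disorder associated with the gene.
    """
    earliest = {}
    for gene, category in disorder_records:
        if category not in RANK:
            raise ValueError(f"unknown lethality category: {category!r}")
        current = earliest.get(gene)
        if current is None or RANK[category] < RANK[current]:
            earliest[gene] = category
    return earliest


# Placeholder gene labels, not real annotations.
records = [
    ("GENE_A", "childhood"),
    ("GENE_A", "prenatal"),
    ("GENE_B", "no_premature_death"),
]
print(classify_genes(records))
```

Taking the minimum over an explicit ordering keeps the rule transparent: a gene linked to several disorders is placed in the earliest category observed for any of them.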


Subject(s)
Genes, Lethal , Genetic Diseases, Inborn , Phenotype , Humans , Animals , Mice , Genetic Diseases, Inborn/genetics , Databases, Genetic , Disease Models, Animal , Genes, Essential/genetics
11.
Front Bioeng Biotechnol ; 12: 1377334, 2024.
Article in English | MEDLINE | ID: mdl-38590605

ABSTRACT

Sinorhizobium fredii CCBAU45436 is an excellent rhizobium that plays an important role in agricultural production. However, a more comprehensive understanding of the metabolic system of S. fredii CCBAU45436 is still needed, and this gap hinders its application in agriculture. Therefore, based on the first-generation metabolic model iCC541, we developed a new genome-scale metabolic model, iAQY970, which contains 970 genes, 1,052 reactions and 942 metabolites and scores 89% in the MEMOTE test. The cell growth phenotypes predicted by iAQY970 are 81.7% consistent with the experimental data. The results of mapping the proteome data under free-living and symbiosis conditions to the model showed that the biomass production rate in the logarithmic phase was faster than that in the stable phase, and the nitrogen fixation efficiency of rhizobia parasitized in cultivated soybean was higher than that in wild-type soybean, which was consistent with the actual situation. In the symbiotic condition, 184 genes affect growth, of which 94 are essential; in the free-living condition, 143 genes influence growth, of which 78 are essential. Among them, 86 of the 94 essential genes in the symbiotic condition were consistent with the prediction of iCC541, and 44 essential genes were confirmed by literature information; meanwhile, 30 genes were identified by DEG and 33 genes were identified by Geptop. In addition, we extracted four key nitrogen fixation modules from the model and, using MOMA, predicted sulfite reductase (EC 1.8.7.1) and nitrogenase (EC 1.18.6.1) as the target enzymes for enhancing nitrogen fixation, which provides a potential focus for strain optimization. Through the comprehensive metabolic model, we can better understand the metabolic capabilities of S. fredii CCBAU45436 and make full use of it in the future.

12.
mSphere ; 9(4): e0064223, 2024 Apr 23.
Article in English | MEDLINE | ID: mdl-38511958

ABSTRACT

The spread of multi-drug-resistant (MDR) pathogens has rapidly outpaced the development of effective treatments. Diverse resistance mechanisms further limit the effectiveness of our best treatments, including multi-drug regimens and last line-of-defense antimicrobials. Biofilm formation is a powerful component of microbial pathogenesis, providing a scaffold for efficient colonization and shielding against anti-microbials, which further complicates drug resistance studies. Early genetic knockout tools did not allow the study of essential genes, but clustered regularly interspaced short palindromic repeats interference (CRISPRi) technologies have overcome this challenge via genetic silencing. These tools rapidly evolved to meet new demands and exploit native CRISPR systems. Modern tools range from the creation of massive CRISPRi libraries to tunable modulation of gene expression with CRISPR activation (CRISPRa). This review discusses the rapid expansion of CRISPRi/a-based technologies, their use in investigating MDR and biofilm formation, and how this drives further development of a potent tool to comprehensively examine multi-drug resistance.

13.
Cancer ; 130(S8): 1435-1448, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38358781

ABSTRACT

BACKGROUND: Patients with triple-positive breast cancer (TPBC) have a higher risk of recurrence and lower survival rates than patients with other luminal breast cancers. However, there are few studies on the predictive biomarkers of prognosis and treatment responses in TPBC. METHODS: Proliferation essential genes (PEGs) were acquired from clustered regularly interspaced short palindromic repeats-associated protein 9 (CRISPR-Cas9) technology, and cohorts of patients with TPBC were obtained from public databases and our cohort. To develop a TPBC-PEG signature, Cox regression and least absolute shrinkage and selection operator regression analyses were applied. Functional analyses were performed with gene set enrichment analysis. The relationship between candidate genes and neoadjuvant chemotherapy (NACT) sensitivity was explored via real-time quantitative polymerase chain reaction (RT-qPCR) and immunohistochemistry (IHC) on the basis of clinical samples. RESULTS: Among 900 TPBC-PEGs, 437 showed significant differential expression between TPBC and normal tissues. Three prognostic PEGs (actin-like 6A [ACTL6A], chaperonin containing TCP1 subunit 2 [CCT2], and threonyl-tRNA synthetase [TARS]) were identified and used to construct the PEG signature. Patients with high PEG signature scores exhibited a worse overall survival and lower sensitivity to NACT than patients with low PEG signature scores. RT-qPCR results indicated that ACTL6A and CCT2 expression were significantly upregulated in patients who lacked sensitivity to NACT. IHC results showed that the ACTL6A protein was highly expressed in patients with NACT resistance and nonpathological complete responses. CONCLUSIONS: This efficient PEG signature prognostic model can predict the outcomes of TPBC. Furthermore, ACTL6A expression level was associated with the response to NACT, and could serve as an important factor in predicting prognosis and drug sensitivity of patients with TPBC.


Subject(s)
Breast Neoplasms , Humans , Female , Breast Neoplasms/drug therapy , Breast Neoplasms/genetics , Breast Neoplasms/pathology , Actins/genetics , Genes, Essential , Neoadjuvant Therapy/methods , Prognosis , Biomarkers, Tumor/genetics , Biomarkers, Tumor/metabolism , Cell Proliferation , Chromosomal Proteins, Non-Histone/genetics , Chromosomal Proteins, Non-Histone/therapeutic use , DNA-Binding Proteins/genetics
14.
Cell Syst ; 15(2): 134-148.e7, 2024 Feb 21.
Article in English | MEDLINE | ID: mdl-38340730

ABSTRACT

Quantifying and predicting growth rate phenotype given variation in gene expression and environment is complicated by epistatic interactions and the vast combinatorial space of possible perturbations. We developed an approach for mapping expression-growth rate landscapes that integrates sparsely sampled experimental measurements with an interpretable machine learning model. We used mismatch CRISPRi across pairs and triples of genes to create over 8,000 titrated changes in E. coli gene expression under varied environmental contexts, exploring epistasis in up to 22 distinct environments. Our results show that a pairwise model previously used to describe drug interactions well-described these data. The model yielded interpretable parameters related to pathway architecture and generalized to predict the combined effect of up to four perturbations when trained solely on pairwise perturbation data. We anticipate this approach will be broadly applicable in optimizing bacterial growth conditions, generating pharmacogenomic models, and understanding the fundamental constraints on bacterial gene expression. A record of this paper's transparent peer review process is included in the supplemental information.


Subject(s)
Epistasis, Genetic , Escherichia coli , Epistasis, Genetic/genetics , Escherichia coli/genetics , Bacteria/genetics , Gene Expression
15.
mSphere ; 9(2): e0070323, 2024 Feb 28.
Article in English | MEDLINE | ID: mdl-38251906

ABSTRACT

Promoter shutoff of essential genes in the diploid Candida albicans has often been insufficient to create tight, conditional null alleles due to leaky expression and has been a stumbling block in pathogenesis research. Moreover, homozygous deletion of non-essential genes has often been problematic due to the frequent aneuploidy in the mutant strains. Rapid, conditional depletion of essential genes by the anchor-away strategy has been successfully employed in Saccharomyces cerevisiae and other model organisms. Here, rapamycin mediates the dimerization of human FK506-binding protein (FKBP12) and FKBP12-rapamycin-binding (FRB) domain-containing target protein, resulting in relocalization to altered sub-cellular locations. In this work, we used the ribosomal protein Rpl13 as the anchor and took two nuclear proteins as targets to construct a set of mutants in a proof-of-principle approach. We first constructed a rapamycin-resistant C. albicans strain by introducing a dominant mutation in the CaTOR1 gene and a homozygous deletion of RBP1, the ortholog of FKBP12, a primary target of rapamycin. The FKBP12 and the FRB coding sequences were then CUG codon-adapted for C. albicans by site-directed mutagenesis. Anchor-away strains expressing the essential TBP1 gene or the non-essential SPT8 gene as FRB fusions were constructed. We found that rapamycin caused rapid cessation of growth of the TBP-AA strain within 15 minutes and the SPT8-AA strain phenocopied the constitutive filamentous phenotype of the spt8Δ/spt8Δ mutant. Thus, the anchor-away toolbox for C. albicans developed here can be employed for genome-wide analysis to identify gene function in a rapid and reliable manner, further accelerating anti-fungal drug development in C. albicans. IMPORTANCE: Molecular genetic studies thus far have identified ~27% of open reading frames as being essential for the vegetative growth of Candida albicans in rich medium, out of a total haploid set of 6,198 open reading frames. However, a major limitation has been to construct rapid conditional alleles of essential C. albicans genes with near quantitative depletion of encoded proteins. Here, we have developed a toolbox for rapid and conditional depletion of genes that would aid studies of gene function of both essential and non-essential genes.


Subject(s)
Candida albicans , Tacrolimus Binding Protein 1A , Humans , Candida albicans/genetics , Tacrolimus Binding Protein 1A/genetics , Homozygote , Sequence Deletion , Sirolimus , Saccharomyces cerevisiae/genetics , Codon
16.
BMC Genomics ; 25(1): 47, 2024 Jan 10.
Article in English | MEDLINE | ID: mdl-38200437

ABSTRACT

BACKGROUND: Essential genes encode functions that play a vital role in the life activities of organisms, encompassing growth, development, immune system functioning, and cell structure maintenance. Conventional experimental techniques for identifying essential genes are resource-intensive and time-consuming, and the accuracy of current machine learning models needs further enhancement. Therefore, it is crucial to develop a robust computational model to accurately predict essential genes. RESULTS: In this study, we introduce GCNN-SFM, a computational model for identifying essential genes in organisms, based on graph convolutional neural networks (GCNN). GCNN-SFM integrates a graph convolutional layer, a convolutional layer, and a fully connected layer to model and extract features from gene sequences of essential genes. Initially, the gene sequence is transformed into a feature map using coding techniques. Subsequently, a multi-layer GCN is employed to perform graph convolution operations, effectively capturing both local and global features of the gene sequence. Further feature extraction is performed, followed by integrating convolution and fully-connected layers to generate prediction results for essential genes. The gradient descent algorithm is utilized to iteratively update the cross-entropy loss function, thereby enhancing the accuracy of the prediction results. Meanwhile, model parameters are tuned to determine the optimal parameter combination that yields the best prediction performance during training. CONCLUSIONS: Experimental evaluation demonstrates that GCNN-SFM surpasses various advanced essential gene prediction models and achieves an average accuracy of 94.53%. This study presents a novel and effective approach for identifying essential genes, which has significant implications for biology and genomics research.


Subject(s)
Genes, Essential , Neural Networks, Computer , Algorithms , Entropy , Genomics
17.
Biomolecules ; 14(1)2024 Jan 10.
Article in English | MEDLINE | ID: mdl-38254687

ABSTRACT

Prostate cancer (PCa) is characterised by androgen dependency. Unfortunately, under anti-androgen treatment pressure, castration-resistant prostate cancer (CRPC) emerges, characterised by heterogeneous cell populations that, over time, lead to the development of different androgen-dependent or -independent phenotypes. Despite important advances in therapeutic strategies, CRPC remains incurable. Context-specific essential genes represent valuable candidates for targeted anti-cancer therapies. Through the investigation of gene and protein annotations and the integration of published transcriptomic data, we identified two consensus lists to stratify PCa patients' risk and discriminate CRPC phenotypes based on androgen receptor activity. ROC and Kaplan-Meier survival analyses were used for gene set validation in independent datasets. We further evaluated these genes for their association with cancer dependency. The deregulated expression of the PCa-related genes was associated with overall and disease-specific survival, metastasis and/or high recurrence risk, while the CRPC-related genes clearly discriminated between adeno and neuroendocrine phenotypes. Some of the genes showed context-specific essentiality. We further identified candidate drugs through a computational repositioning approach for targeting these genes and treating lethal variants of PCa. This work provides a proof-of-concept for the use of an integrative approach to identify candidate biomarkers involved in PCa progression and CRPC pathogenesis within the goal of precision medicine.


Subject(s)
Androgens , Prostatic Neoplasms, Castration-Resistant , Male , Humans , Prostatic Neoplasms, Castration-Resistant/genetics , Biomarkers , Phenotype , Computational Biology
18.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38279653

ABSTRACT

Cluster analysis is one of the most widely used exploratory methods for visualization and grouping of gene expression patterns across multiple samples or treatment groups. Although several existing online tools can annotate clusters with functional terms, there is no all-in-one webserver to effectively prioritize genes/clusters using gene essentiality as well as congruency of mRNA-protein expression. Hence, we developed CAP-RNAseq that makes possible (1) upload and clustering of bulk RNA-seq data followed by identification, annotation and network visualization of all or selected clusters; and (2) prioritization using DepMap gene essentiality and/or dependency scores as well as the degree of correlation between mRNA and protein levels of genes within an expression cluster. In addition, CAP-RNAseq has an integrated primer design tool for the prioritized genes. Herein, we showed using comparisons with the existing tools and multiple case studies that CAP-RNAseq can uniquely aid in the discovery of co-expression clusters enriched with essential genes and prioritization of novel biomarker genes that exhibit high correlations between their mRNA and protein expression levels. CAP-RNAseq is applicable to RNA-seq data from different contexts including cancer and available at http://konulabapps.bilkent.edu.tr:3838/CAPRNAseq/ and the docker image is downloadable from https://hub.docker.com/r/konulab/caprnaseq.


Subject(s)
Proteomics , Sequence Analysis, RNA/methods , RNA-Seq , RNA, Messenger/genetics
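Note on the prioritization idea in the abstract above: the following is a minimal sketch of ranking genes by an essentiality value and by mRNA-protein correlation. It assumes a Pearson correlation and a made-up essentiality score in which higher means more essential; the gene names, the values, the 0.5 threshold and the function names are hypothetical and do not reproduce CAP-RNAseq or DepMap data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0.0 or sy == 0.0:
        return 0.0  # constant profile: treat as uncorrelated
    return cov / (sx * sy)


def prioritize(genes, min_corr=0.5):
    """Keep genes whose mRNA and protein levels correlate, then rank them.

    Ranking is by the hypothetical essentiality value (descending),
    with the correlation as a tie-breaker.
    """
    kept = []
    for name, record in genes.items():
        r = pearson(record["mrna"], record["protein"])
        if r >= min_corr:
            kept.append((name, record["essentiality"], r))
    kept.sort(key=lambda item: (item[1], item[2]), reverse=True)
    return kept


# Made-up example cluster: expression across four samples per gene.
cluster = {
    "GENE_A": {"mrna": [1, 2, 3, 4], "protein": [2, 4, 6, 8], "essentiality": 0.9},
    "GENE_B": {"mrna": [1, 2, 3, 4], "protein": [4, 3, 2, 1], "essentiality": 0.8},
    "GENE_C": {"mrna": [1, 2, 3, 4], "protein": [1, 3, 2, 4], "essentiality": 0.4},
}
ranked = prioritize(cluster)
```

In this toy cluster GENE_B is dropped because its protein level moves opposite to its mRNA level, even though its essentiality value is high.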
19.
mBio ; 15(2): e0309223, 2024 Feb 14.
Article in English | MEDLINE | ID: mdl-38189270

ABSTRACT

The identification of microbial genes essential for survival as those with lethal knockout phenotype (LKP) is a common strategy for functional interrogation of genomes. However, interpretation of the LKP is complicated because a substantial fraction of the genes with this phenotype remains poorly functionally characterized. Furthermore, many genes can exhibit LKP not because their products perform essential cellular functions but because their knockout activates the toxicity of other genes (conditionally essential genes). We analyzed the sets of LKP genes for two archaea, Methanococcus maripaludis and Sulfolobus islandicus, using a variety of computational approaches aiming to differentiate between essential and conditionally essential genes and to predict at least a general function for as many of the proteins encoded by these genes as possible. This analysis allowed us to predict the functions of several LKP genes including previously uncharacterized subunit of the GINS protein complex with an essential function in genome replication and of the KEOPS complex that is responsible for an essential tRNA modification as well as GRP protease implicated in protein quality control. Additionally, several novel antitoxins (conditionally essential genes) were predicted, and this prediction was experimentally validated by showing that the deletion of these genes together with the adjacent genes apparently encoding the cognate toxins caused no growth defect. We applied principal component analysis based on sequence and comparative genomic features showing that this approach can separate essential genes from conditionally essential ones and used it to predict essential genes in other archaeal genomes. IMPORTANCE: Only a relatively small fraction of the genes in any bacterium or archaeon is essential for survival as demonstrated by the lethal effect of their disruption. The identification of essential genes and their functions is crucial for understanding fundamental cell biology. However, many of the genes with a lethal knockout phenotype remain poorly functionally characterized, and furthermore, many genes can exhibit this phenotype not because their products perform essential cellular functions but because their knockout activates the toxicity of other genes. We applied state-of-the-art computational methods to predict the functions of a number of uncharacterized genes with the lethal knockout phenotype in two archaeal species and developed a computational approach to predict genes involved in essential functions. These findings advance the current understanding of key functionalities of archaeal cells.


Subject(s)
Archaea , Archaeal Proteins , Archaea/genetics , Archaea/metabolism , Genes, Essential , Genome, Archaeal , Genomics , Phenotype , Archaeal Proteins/genetics , Archaeal Proteins/metabolism
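Note on the principal component analysis mentioned in the abstract above: the following is a minimal sketch of how a first principal component can separate two groups of genes. It assumes two made-up numeric features per gene and uses power iteration on the covariance matrix; the feature values, the grouping and the function name are hypothetical and are not taken from the study.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (unit vector, per-row scores) for the leading component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered features.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeated multiplication converges to the top eigenvector.
    vec = [1.0] * d
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    scores = [sum(c[j] * vec[j] for j in range(d)) for c in centered]
    return vec, scores


# Made-up features (e.g. a conservation score and a genomic-context score).
# The first three rows stand in for "essential-like" genes, the last three
# for "conditionally essential-like" genes.
gene_features = [
    [0.90, 0.80], [0.80, 0.90], [0.85, 0.85],
    [0.20, 0.10], [0.10, 0.20], [0.15, 0.15],
]
component, scores = first_principal_component(gene_features)
```

On this toy input the two groups fall on opposite sides of zero along the first component; real feature sets would need scaling and more than one component.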
20.
Microbiol Spectr ; 12(1): e0314923, 2024 Jan 11.
Article in English | MEDLINE | ID: mdl-38054713

ABSTRACT

IMPORTANCE: The construction of arrayed mutant libraries has advanced the field of bacterial genetics by allowing researchers to more efficiently study the exact function and importance of encoded genes. In this study, we constructed an arrayed clustered regularly interspaced short palindromic repeats interference (CRISPRi) library, known as Streptococcus mutans arrayed CRISPRi (SNAP), as a resource to study >250 essential and growth-supporting genes in Streptococcus mutans. SNAP will be made available to the research community, and we anticipate that its distribution will lead to high-quality, high-throughput, and reproducible studies of essential genes.


Subject(s)
Genes, Essential , Streptococcus mutans , Streptococcus mutans/genetics , Clustered Regularly Interspaced Short Palindromic Repeats , Gene Library , CRISPR-Cas Systems