Results 1 - 20 of 276
1.
Cell ; 185(24): 4634-4653.e22, 2022 11 23.
Article in English | MEDLINE | ID: mdl-36347254

ABSTRACT

Understanding the basis for cellular growth, proliferation, and function requires determining the roles of essential genes in diverse cellular processes, including visualizing their contributions to cellular organization and morphology. Here, we combined pooled CRISPR-Cas9-based functional screening of 5,072 fitness-conferring genes in human HeLa cells with microscopy-based imaging of DNA, the DNA damage response, actin, and microtubules. Analysis of >31 million individual cells identified measurable phenotypes for >90% of gene knockouts, implicating gene targets in specific cellular processes. Clustering of phenotypic similarities based on hundreds of quantitative parameters further revealed co-functional genes across diverse cellular activities, providing predictions for gene functions and associations. By conducting pooled live-cell screening of ∼450,000 cell division events for 239 genes, we additionally identified diverse genes with functional contributions to chromosome segregation. Our work establishes a resource detailing the consequences of disrupting core cellular processes that represents the functional landscape of essential human genes.


Subjects
CRISPR-Cas Systems; Genes, Essential; Humans; HeLa Cells; Gene Knockout Techniques; Phenotype
2.
Cell ; 184(17): 4579-4592.e24, 2021 08 19.
Article in English | MEDLINE | ID: mdl-34297925

ABSTRACT

Antibacterial agents target the products of essential genes but rarely achieve complete target inhibition. Thus, the all-or-none definition of essentiality afforded by traditional genetic approaches fails to discern the most attractive bacterial targets: those whose incomplete inhibition results in major fitness costs. In contrast, gene "vulnerability" is a continuous, quantifiable trait that relates the magnitude of gene inhibition to the effect on bacterial fitness. We developed a CRISPR interference-based functional genomics method to systematically titrate gene expression in Mycobacterium tuberculosis (Mtb) and monitor fitness outcomes. We identified highly vulnerable genes in various processes, including novel targets unexplored for drug discovery. Equally important, we identified invulnerable essential genes, potentially explaining failed drug discovery efforts. Comparison of vulnerability between the reference and a hypervirulent Mtb isolate revealed incomplete conservation of vulnerability and that differential vulnerability can predict differential antibacterial susceptibility. Our results quantitatively redefine essential bacterial processes and identify high-value targets for drug development.


Subjects
Gene Expression Regulation, Bacterial; Genome, Bacterial; Mycobacterium tuberculosis/genetics; Amino Acyl-tRNA Synthetases/metabolism; Antitubercular Agents/pharmacology; Bayes Theorem; Biological Evolution; Clustered Regularly Interspaced Short Palindromic Repeats/genetics; Gene Expression Regulation, Bacterial/drug effects; Gene Silencing/drug effects; Microbial Sensitivity Tests; Mycobacterium tuberculosis/drug effects; RNA, Guide, Kinetoplastida/genetics
3.
Am J Hum Genet ; 110(4): 575-591, 2023 04 06.
Article in English | MEDLINE | ID: mdl-37028392

ABSTRACT

Leveraging linkage disequilibrium (LD) patterns as representative of population substructure enables the discovery of additive association signals in genome-wide association studies (GWASs). Standard GWASs are well-powered to interrogate additive models; however, new approaches are required for investigating other modes of inheritance such as dominance and epistasis. Epistasis, or non-additive interaction between genes, exists across the genome but often goes undetected because of a lack of statistical power. Furthermore, the adoption of LD pruning as customary in standard GWASs excludes detection of sites that are in LD but might underlie the genetic architecture of complex traits. We hypothesize that uncovering long-range interactions between loci with strong LD due to epistatic selection can elucidate genetic mechanisms underlying common diseases. To investigate this hypothesis, we tested for associations between 23 common diseases and 5,625,845 epistatic SNP-SNP pairs (determined by Ohta's D statistics) in long-range LD (>0.25 cM). Across five disease phenotypes, we identified one significant and four near-significant associations that replicated in two large genotype-phenotype datasets (UK Biobank and eMERGE). The genes that were most likely involved in the replicated associations were (1) members of highly conserved gene families with complex roles in multiple pathways, (2) essential genes, and/or (3) genes that were associated in the literature with complex traits that display variable expressivity. These results support the highly pleiotropic and conserved nature of variants in long-range LD under epistatic selection. Our work supports the hypothesis that epistatic interactions regulate diverse clinical mechanisms and might especially be driving factors in conditions with a wide range of phenotypic outcomes.
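
For readers unfamiliar with LD, the sketch below computes the classical pairwise coefficient D and the derived r² for two biallelic loci. It is only an illustration of what "being in LD" means numerically. It is not Ohta's D statistics, which the abstract names as the actual method, and the function name and input frequencies are assumptions made for this example.

```python
def ld_stats(p_ab, p_a, p_b):
    """Classical pairwise LD for two biallelic loci (illustrative only).

    p_ab: frequency of haplotypes carrying allele A at locus 1 and B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    Returns (D, r2).
    """
    # D measures how far the observed haplotype frequency departs from
    # the frequency expected if the two loci were independent.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r2 is undefined when either locus is monomorphic; report 0.0 in that case.
    r2 = d * d / denom if denom > 0 else 0.0
    return d, r2


# Hypothetical frequencies, not taken from the study.
d, r2 = ld_stats(p_ab=0.4, p_a=0.5, p_b=0.5)
```

With these made-up inputs the loci co-occur more often than independence predicts, so D is positive and r² is well above zero.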


Subjects
Epistasis, Genetic; Genome-Wide Association Study; Linkage Disequilibrium/genetics; Genotype; Biological Specimen Banks; United Kingdom; Polymorphism, Single Nucleotide/genetics
4.
Brief Bioinform ; 25(2), 2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38279653

ABSTRACT

Cluster analysis is one of the most widely used exploratory methods for visualization and grouping of gene expression patterns across multiple samples or treatment groups. Although several existing online tools can annotate clusters with functional terms, there is no all-in-one webserver to effectively prioritize genes/clusters using gene essentiality as well as congruency of mRNA-protein expression. Hence, we developed CAP-RNAseq that makes possible (1) upload and clustering of bulk RNA-seq data followed by identification, annotation and network visualization of all or selected clusters; and (2) prioritization using DepMap gene essentiality and/or dependency scores as well as the degree of correlation between mRNA and protein levels of genes within an expression cluster. In addition, CAP-RNAseq has an integrated primer design tool for the prioritized genes. Herein, we showed using comparisons with the existing tools and multiple case studies that CAP-RNAseq can uniquely aid in the discovery of co-expression clusters enriched with essential genes and prioritization of novel biomarker genes that exhibit high correlations between their mRNA and protein expression levels. CAP-RNAseq is applicable to RNA-seq data from different contexts including cancer and available at http://konulabapps.bilkent.edu.tr:3838/CAPRNAseq/ and the docker image is downloadable from https://hub.docker.com/r/konulab/caprnaseq.


Subjects
Proteomics; Sequence Analysis, RNA/methods; RNA-Seq; RNA, Messenger/genetics
5.
J Cell Sci ; 136(8), 2023 04 15.
Article in English | MEDLINE | ID: mdl-36995025

ABSTRACT

Switching genes on and off on cue is a cornerstone for understanding gene functions. One contemporary approach for loss-of-function studies of essential genes involves CRISPR-mediated knockout of the endogenous locus in conjunction with the expression of a rescue construct, which can subsequently be turned off to produce a gene inactivation effect in mammalian cell lines. A broadening of this approach would involve simultaneously switching on a second construct to interrogate the functions of a gene in the pathway. In this study, we developed a pair of switches that were independently controlled by both inducible promoters and degrons, enabling the toggling between two constructs with comparable kinetics and tightness. The gene-OFF switch was based on TRE transcriptional control coupled with auxin-induced degron-mediated proteolysis. A second independently controlled gene-ON switch was based on a modified ecdysone promoter and mutated FKBP12-derived destabilization domain degron, allowing acute and tuneable gene activation. This platform facilitates efficient generation of knockout cell lines containing a two-gene switch that is regulated tightly and can be flipped within a fraction of the time of a cell cycle.


Subjects
Gene Expression Regulation; Indoleacetic Acids; Animals; Cell Line; Indoleacetic Acids/pharmacology; Proteolysis; Promoter Regions, Genetic/genetics; Mammals/metabolism
6.
Brief Bioinform ; 24(3), 2023 05 19.
Article in English | MEDLINE | ID: mdl-36920063

ABSTRACT

Gene essentiality is defined as the extent to which a gene is required for the survival and reproductive success of a living system. It can vary between genetic backgrounds and environments. Essential protein-coding genes have been well studied. However, the essentiality of non-coding regions is rarely reported. Most regions of the human genome do not encode proteins, so methods for determining the essentiality of non-coding genes are needed. We developed iEssLnc models, which can assign essentiality scores to lncRNA genes. As far as we know, this is the first direct quantitative estimation of the essentiality of lncRNA genes. By taking advantage of a graph neural network with meta-path-guided random walks on the lncRNA-protein interaction network, iEssLnc models can perform genome-wide screenings for essential lncRNA genes in a quantitative manner. We carried out validations and whole-genome screening in the context of human cancer cell lines and the mouse genome. In comparison with other methods, which are transferred from protein-coding genes, iEssLnc achieved better performance. Enrichment analysis indicated that iEssLnc essentiality scores clustered essential lncRNA genes with high ranks. With the screening results of iEssLnc models, we estimated the number of essential lncRNA genes in human and mouse. Functional analysis showed that essential lncRNA genes interact significantly with microRNAs and cytoskeletal proteins, which may be of interest in experimental life sciences. All datasets and code for the iEssLnc models have been deposited in GitHub (https://github.com/yyZhang14/iEssLnc).


Subjects
MicroRNAs; Neoplasms; RNA, Long Noncoding; Humans; Animals; Mice; Protein Interaction Maps; RNA, Long Noncoding/genetics; RNA, Long Noncoding/metabolism; MicroRNAs/metabolism; Neural Networks, Computer
7.
Brief Bioinform ; 24(2), 2023 03 19.
Article in English | MEDLINE | ID: mdl-36715269

ABSTRACT

Predicting therapeutic responses in cancer patients is a major challenge in the field of precision medicine due to high inter- and intra-tumor heterogeneity. Most drug response models need to be improved in terms of accuracy, and there is limited research to assess therapeutic responses of particular tumor types. Here, we developed a novel method DROEG (Drug Response based on Omics and Essential Genes) for prediction of drug response in tumor cell lines by integrating genomic, transcriptomic and methylomic data along with CRISPR essential genes, and revealed that the incorporation of tumor proliferation essential genes can improve drug sensitivity prediction. Concisely, DROEG integrates literature-based and statistics-based methods to select features and uses Support Vector Regression for model construction. We demonstrate that DROEG outperforms most state-of-the-art algorithms by both qualitative (prediction accuracy for drug-sensitive/resistant) and quantitative (Pearson correlation coefficient between the predicted and actual IC50) evaluation in Genomics of Drug Sensitivity in Cancer and Cancer Cell Line Encyclopedia datasets. In addition, DROEG is further applied to the pan-gastrointestinal tumor with high prevalence and mortality as a case study at both cell line and clinical levels to evaluate the model efficacy and discover potential prognostic biomarkers in Cisplatin and Epirubicin treatment. Interestingly, the CRISPR essential gene information is found to be the most important contributor to enhance the accuracy of the DROEG model. To our knowledge, this is the first study to integrate essential genes with multi-omics data to improve cancer drug response prediction and provide insights into personalized precision treatment.


Subjects
Antineoplastic Agents; Neoplasms; Humans; Genes, Essential; Antineoplastic Agents/pharmacology; Antineoplastic Agents/therapeutic use; Neoplasms/drug therapy; Neoplasms/genetics; Genomics/methods; Precision Medicine/methods
8.
BMC Genomics ; 25(1): 47, 2024 Jan 10.
Article in English | MEDLINE | ID: mdl-38200437

ABSTRACT

BACKGROUND: Essential genes encode functions that play a vital role in the life activities of organisms, encompassing growth, development, immune system functioning, and cell structure maintenance. Conventional experimental techniques for identifying essential genes are resource-intensive and time-consuming, and the accuracy of current machine learning models needs further enhancement. Therefore, it is crucial to develop a robust computational model to accurately predict essential genes. RESULTS: In this study, we introduce GCNN-SFM, a computational model for identifying essential genes in organisms, based on graph convolutional neural networks (GCNN). GCNN-SFM integrates a graph convolutional layer, a convolutional layer, and a fully connected layer to model and extract features from gene sequences of essential genes. Initially, the gene sequence is transformed into a feature map using coding techniques. Subsequently, a multi-layer GCN is employed to perform graph convolution operations, effectively capturing both local and global features of the gene sequence. Further feature extraction is performed, followed by integrating convolution and fully-connected layers to generate prediction results for essential genes. The gradient descent algorithm is utilized to iteratively update the cross-entropy loss function, thereby enhancing the accuracy of the prediction results. Meanwhile, model parameters are tuned to determine the optimal parameter combination that yields the best prediction performance during training. CONCLUSIONS: Experimental evaluation demonstrates that GCNN-SFM surpasses various advanced essential gene prediction models and achieves an average accuracy of 94.53%. This study presents a novel and effective approach for identifying essential genes, which has significant implications for biology and genomics research.
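
Two generic building blocks mentioned in this abstract, encoding a gene sequence as a numeric feature map and scoring predictions with a cross-entropy loss, can be sketched as follows. The abstract does not say which coding technique GCNN-SFM uses, so one-hot encoding is an assumption here, and both function names are hypothetical. The graph convolutional layers themselves are not reproduced.

```python
import math

NUCLEOTIDES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of four 0/1 values (order A, C, G, T).

    Bases outside ACGT (for example N) become an all-zero row.
    One-hot coding is an assumed stand-in for the paper's coding technique.
    """
    return [[1 if base == n else 0 for n in NUCLEOTIDES] for base in seq.upper()]


def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        # Clip probabilities so that log(0) can never occur.
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


feature_map = one_hot("ACGT")
loss = binary_cross_entropy([1, 0], [0.9, 0.2])
```

In training, gradient descent would adjust the model parameters to lower this loss; the loss function itself stays fixed.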


Subjects
Genes, Essential; Neural Networks, Computer; Algorithms; Entropy; Genomics
9.
Cancer ; 130(S8): 1435-1448, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38358781

ABSTRACT

BACKGROUND: Patients with triple-positive breast cancer (TPBC) have a higher risk of recurrence and lower survival rates than patients with other luminal breast cancers. However, there are few studies on the predictive biomarkers of prognosis and treatment responses in TPBC. METHODS: Proliferation essential genes (PEGs) were acquired from clustered regularly interspaced short palindromic repeats-associated protein 9 (CRISPR-Cas9) technology, and cohorts of patients with TPBC were obtained from public databases and our cohort. To develop a TPBC-PEG signature, Cox regression and least absolute shrinkage and selection operator regression analyses were applied. Functional analyses were performed with gene set enrichment analysis. The relationship between candidate genes and neoadjuvant chemotherapy (NACT) sensitivity was explored via real-time quantitative polymerase chain reaction (RT-qPCR) and immunohistochemistry (IHC) on the basis of clinical samples. RESULTS: Among 900 TPBC-PEGs, 437 showed significant differential expression between TPBC and normal tissues. Three prognostic PEGs (actin-like 6A [ACTL6A], chaperonin containing TCP1 subunit 2 [CCT2], and threonyl-tRNA synthetase [TARS]) were identified and used to construct the PEG signature. Patients with high PEG signature scores exhibited a worse overall survival and lower sensitivity to NACT than patients with low PEG signature scores. RT-qPCR results indicated that ACTL6A and CCT2 expression were significantly upregulated in patients who lacked sensitivity to NACT. IHC results showed that the ACTL6A protein was highly expressed in patients with NACT resistance and nonpathological complete responses. CONCLUSIONS: This efficient PEG signature prognostic model can predict the outcomes of TPBC. Furthermore, ACTL6A expression level was associated with the response to NACT, and could serve as an important factor in predicting prognosis and drug sensitivity of patients with TPBC.
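
A gene signature of this kind is commonly applied as a weighted sum of expression values, with patients then split into high and low score groups. The sketch below shows that pattern under stated assumptions: the coefficients are invented placeholders (the abstract reports none), the median cutoff is an assumed convention, and the function names are hypothetical.

```python
from statistics import median

# Placeholder coefficients for illustration only; the abstract does not report
# the fitted Cox/LASSO coefficients for these three genes.
COEFS = {"ACTL6A": 0.5, "CCT2": 0.3, "TARS": 0.2}


def signature_score(expr):
    """Weighted sum of a patient's expression values for the signature genes."""
    return sum(COEFS[gene] * expr[gene] for gene in COEFS)


def stratify(patients):
    """Label each patient 'high' or 'low' relative to the cohort median score.

    patients: dict mapping patient id -> dict of gene -> expression value.
    Using the median as the cutoff is an assumption, not the paper's stated rule.
    """
    scores = {pid: signature_score(expr) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical expression values, not patient data.
cohort = {
    "p1": {"ACTL6A": 1.0, "CCT2": 1.0, "TARS": 1.0},
    "p2": {"ACTL6A": 2.0, "CCT2": 2.0, "TARS": 2.0},
    "p3": {"ACTL6A": 3.0, "CCT2": 3.0, "TARS": 3.0},
}
groups = stratify(cohort)
```

A patient whose score equals the cutoff falls in the low group in this sketch, which is a design choice rather than anything taken from the study.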


Subjects
Breast Neoplasms; Humans; Female; Breast Neoplasms/drug therapy; Breast Neoplasms/genetics; Breast Neoplasms/pathology; Actins/genetics; Genes, Essential; Neoadjuvant Therapy/methods; Prognosis; Biomarkers, Tumor/genetics; Biomarkers, Tumor/metabolism; Cell Proliferation; Chromosomal Proteins, Non-Histone/genetics; Chromosomal Proteins, Non-Histone/therapeutic use; DNA-Binding Proteins/genetics
10.
Mol Genet Genomics ; 299(1): 72, 2024 Jul 27.
Article in English | MEDLINE | ID: mdl-39060647

ABSTRACT

Codon usage bias (CUB), the uneven usage of synonymous codons encoding the same amino acid, differs among genes within and across bacterial genomes. CUB is known to be influenced by gene expression and, accordingly, differs between high-expression and low-expression genes in several bacteria. In this article, we have extended the study of codon usage by considering gene essentiality as a feature. Using machine learning (ML) based approaches, we have analysed Relative Synonymous Codon Usage (RSCU) values between essential and non-essential genes in Escherichia coli and thirty-four other bacterial genomes whose gene essentiality features were available in public databases. We observed significant differences in codon usage patterns between essential and non-essential genes for the majority of the bacterial genomes and, accordingly, ML based classifiers achieved high area under curve (AUC) scores, with a minimum score of 70.0 across twenty-eight organisms. Further, the importance of individual codons for classifying genes was found to differ among the codons in each genome. Arg codon CGT and Gly codon GGT were observed to be the most preferred codons among essential genes in Escherichia coli. Interestingly, some codons, such as CGT, ATA, GGT and GGG, were observed to contribute consistently towards classifying essential genes across the thirty-five bacterial genomes studied. On the other hand, codons TGY and CAY, encoding the amino acids Cys and His respectively, were among the least contributing codons towards classification in all these bacteria. This study demonstrates gene essentiality based differences in synonymous codon usage in bacterial genomes and presents a common codon usage pattern across bacteria.
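
RSCU has a standard definition: a codon's observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below computes it for the two amino acids highlighted in the abstract. Restricting the table to Arg and Gly, the function name, and the toy sequence are assumptions made to keep the example short; this is not the authors' pipeline.

```python
from collections import Counter

# Synonymous codon families for Arg and Gly only (standard genetic code).
FAMILIES = {
    "Arg": ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"],
    "Gly": ["GGT", "GGC", "GGA", "GGG"],
}


def rscu(cds):
    """Return RSCU values for codons in FAMILIES, given an in-frame coding sequence.

    RSCU = observed count * number of synonymous codons / total count for the amino acid.
    Amino acids that do not occur in the sequence are omitted from the result.
    """
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    values = {}
    for family in FAMILIES.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values


# Toy sequence: three GGT codons and one GGG codon.
values = rscu("GGTGGTGGTGGG")
```

A value above 1 means the codon is used more often than expected under equal usage, and a value below 1 means it is used less often. Per-gene vectors of such values are the kind of features a classifier could be trained on.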


Subjects
Codon Usage; Escherichia coli; Genes, Essential; Machine Learning; Genes, Essential/genetics; Escherichia coli/genetics; Genome, Bacterial/genetics; Genes, Bacterial; Codon/genetics; Bacteria/genetics; Bacteria/classification
11.
Genes Cells ; 28(4): 258-266, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36624042

ABSTRACT

Protein evolution rate is negatively correlated with several effectors, such as expression level, expression distribution, protein-protein interactions (PPIs), and essentiality for survival. These effectors can characterize the signaling pathways mediated by ligand-receptor binding. However, it is unclear whether these effectors are constraining factors on the pathway-specific evolution of ligands and receptors. To clarify the relation between the effectors and protein evolution (dN/dS ratio) in ligands and their receptors considering each signaling pathway, we investigated 377 proteins in 20 peptide/protein ligand groups and their receptor groups using 15 primate sequences. The dN/dS ratios between peptide/protein ligand groups and their receptor groups were positively correlated, suggesting that protein evolution is influenced by the signaling pathway to which the proteins belong. Comparing each signaling pathway, ligands and receptors mainly related to development and growth (FGF/Hedgehog/Notch/WNT groups) showed lower dN/dS ratios, higher PPI numbers, and higher essentiality, whereas those mainly related to immune process (CSF/IFN/IL/TNF groups) showed higher dN/dS ratios, lower PPI numbers, and lower essentiality. Most ligands and receptors were poorly expressed, and expression level was not a constraining factor on the protein evolution. These findings indicate that PPI and essentiality are constraining factors that characterize the pathway-specific evolution of ligands and receptors.


Subjects
Evolution, Molecular; Primates; Animals; Ligands; Proteins/genetics; Signal Transduction
12.
Genet Med ; 26(7): 101141, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38629401

ABSTRACT

PURPOSE: Existing resources that characterize the essentiality status of genes are based on either proliferation assessment in human cell lines, viability evaluation in mouse knockouts, or constraint metrics derived from human population sequencing studies. Several repositories document phenotypic annotations for rare disorders; however, there is a lack of comprehensive reporting on lethal phenotypes. METHODS: We queried Online Mendelian Inheritance in Man for terms related to lethality and classified all Mendelian genes according to the earliest age of death recorded for the associated disorders, from prenatal death to no reports of premature death. We characterized the genes across these lethality categories, examined the evidence on viability from mouse models and explored how this information could be used for novel gene discovery. RESULTS: We developed the Lethal Phenotypes Portal to showcase this curated catalog of human essential genes. Differences in the mode of inheritance, physiological systems affected, and disease class were found for genes in different lethality categories, as well as discrepancies between the lethal phenotypes observed in mouse and human. CONCLUSION: We anticipate that this resource will aid clinicians in the diagnosis of early lethal conditions and assist researchers in investigating the properties that make these genes essential for human development.


Subjects
Genes, Lethal; Genetic Diseases, Inborn; Phenotype; Humans; Animals; Mice; Genetic Diseases, Inborn/genetics; Databases, Genetic; Disease Models, Animal; Genes, Essential/genetics
13.
Appl Environ Microbiol ; 90(7): e0068724, 2024 Jul 24.
Article in English | MEDLINE | ID: mdl-38864628

ABSTRACT

Mycoplasma bovis is an important emerging pathogen of cattle and bison, but our understanding of the genetic basis of its interactions with its host is limited. The aim of this study was to identify genes of M. bovis required for interaction and survival in association with host cells. One hundred transposon-induced mutants of the type strain PG45 were assessed for their capacity to survive and proliferate in Madin-Darby bovine kidney cell cultures. The growth of 19 mutants was completely abrogated, and 47 mutants had a prolonged doubling time compared to the parent strain. All these mutants had a similar growth pattern to the parent strain PG45 in the axenic media. Thirteen genes previously classified as dispensable for the axenic growth of M. bovis were found to be essential for the growth of M. bovis in association with host cells. In most of the mutants with a growth-deficient phenotype, the transposon was inserted into a gene involved in transportation or metabolism. This included genes coding for ABC transporters, proteins related to carbohydrate, nucleotide and protein metabolism, and membrane proteins essential for attachment. It is likely that these genes are essential not only in vitro but also for the survival of M. bovis in infected animals. IMPORTANCE: Mycoplasma bovis causes chronic bronchopneumonia, mastitis, arthritis, keratoconjunctivitis, and reproductive tract disease in cattle around the globe and is an emerging pathogen in bison. Control of mycoplasma infections is difficult in the absence of appropriate antimicrobial treatment or effective vaccines. A comprehensive understanding of host-pathogen interactions and virulence factors is important to implement more effective control methods against M. bovis. Recent studies of other mycoplasmas with in vitro cell culture models have identified essential virulence genes of mycoplasmas. Our study has identified genes of M. bovis required for survival in association with host cells, which will pave the way to a better understanding of host-pathogen interactions and the role of specific genes in the pathogenesis of disease caused by M. bovis.


Subjects
Mycoplasma bovis; Mycoplasma bovis/genetics; Animals; Cattle; Mycoplasma Infections/microbiology; Mycoplasma Infections/veterinary; Cell Line; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Cattle Diseases/microbiology; Genes, Bacterial/genetics; DNA Transposable Elements; Host-Pathogen Interactions; Bison/microbiology; Microbial Viability
14.
BMC Biol ; 21(1): 24, 2023 02 06.
Article in English | MEDLINE | ID: mdl-36747219

ABSTRACT

BACKGROUND: Studying genomic variation in rapidly evolving pathogens potentially enables identification of genes supporting their "core biology", being present, functional and expressed by all strains, or "flexible biology", varying between strains. Genes supporting flexible biology may be considered to be "accessory", whilst the "core" gene set is likely to be important for common features of a pathogen species' biology, including virulence on all host genotypes. The wheat-pathogenic fungus Zymoseptoria tritici represents one of the most rapidly evolving threats to global food security and was the focus of this study. RESULTS: We constructed a pangenome of 18 European field isolates, with 12 also subjected to RNAseq transcription profiling during infection. Combining these data, we predicted a "core" gene set comprising 9807 sequences which were (1) present in all isolates, (2) lacking inactivating polymorphisms and (3) expressed by all isolates. A large accessory genome, consisting of 45% of the total genes, was also defined. We classified genetic and genomic polymorphism at both chromosomal and individual gene scales. Proteins required for essential functions including virulence had lower-than-average sequence variability amongst core genes. Both core and accessory genomes encoded many small, secreted candidate effector proteins that likely interact with plant immunity. Viral vector-mediated transient in planta overexpression of 88 candidates failed to identify any which induced leaf necrosis characteristic of disease. However, functional complementation of a non-pathogenic deletion mutant lacking five core genes demonstrated that full virulence was restored by re-introduction of the single gene exhibiting least sequence polymorphism and highest expression. CONCLUSIONS: These data support the combined use of pangenomics and transcriptomics for defining genes which represent core, and potentially exploitable, weaknesses in rapidly evolving pathogens.
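
The three criteria for the "core" gene set map naturally onto set operations. The sketch below is one way to express them, assuming each criterion has already been reduced to a set of gene identifiers per isolate; the data layout, function name, and toy identifiers are hypothetical. The expression filter is applied only to isolates that have expression data, which is an assumption about how the 12 profiled isolates were handled.

```python
def core_genes(presence, inactivated, expressed):
    """Return genes meeting all three 'core' criteria.

    presence:    dict isolate -> set of genes present in that isolate
    inactivated: dict isolate -> set of genes carrying inactivating polymorphisms
    expressed:   dict isolate -> set of genes detected as expressed
                 (may cover only a subset of isolates)
    """
    # (1) present in every isolate
    core = set.intersection(*(set(genes) for genes in presence.values()))
    # (2) no inactivating polymorphism in any isolate
    for genes in inactivated.values():
        core -= genes
    # (3) expressed in every isolate that was profiled
    for genes in expressed.values():
        core &= genes
    return core


# Toy data, not from the study.
presence = {"iso1": {"g1", "g2", "g3", "g4"}, "iso2": {"g1", "g2", "g3"}}
inactivated = {"iso2": {"g2"}}
expressed = {"iso1": {"g1", "g2", "g3"}, "iso2": {"g1", "g2"}}
core = core_genes(presence, inactivated, expressed)
```

Genes present in at least one isolate but excluded by any criterion would form the accessory set in this simplified view.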


Subjects
Gene Expression Profiling; Transcriptome; Virulence/genetics; Genome, Fungal; Genes, Fungal; Plant Diseases/microbiology
15.
Int J Mol Sci ; 25(13), 2024 Jun 27.
Article in English | MEDLINE | ID: mdl-39000124

ABSTRACT

Over the years, comprehensive explorations of the model organisms Caenorhabditis elegans (elegant worm) and Drosophila melanogaster (vinegar fly) have contributed substantially to our understanding of complex biological processes and pathways in multicellular organisms generally. Extensive functional genomic-phenomic, genomic, transcriptomic, and proteomic data sets have enabled the discovery and characterisation of genes that are crucial for life, called 'essential genes'. Recently, we investigated the feasibility of inferring essential genes from such data sets using advanced bioinformatics and showed that a machine learning (ML)-based workflow could be used to extract or engineer features from DNA, RNA, protein, and/or cellular data/information to underpin the reliable prediction of essential genes both within and between C. elegans and D. melanogaster. As these are two distantly related species within the Ecdysozoa, we proposed that this ML approach would be particularly well suited for species that are within the same phylum or evolutionary clade. In the present study, we cross-predicted essential genes within the phylum Nematoda (evolutionary clade V), between C. elegans and the pathogenic parasitic nematode Haemonchus contortus, and then ranked and prioritised H. contortus proteins encoded by these genes as intervention (e.g., drug) target candidates. Using strong, validated predictors, we inferred essential genes of H. contortus that are involved predominantly in crucial biological processes/pathways including ribosome biogenesis, translation, RNA binding/processing, and signalling and which are highly transcribed in the germline, somatic gonad precursors, sex myoblasts, vulva cell precursors, various nerve cells, glia, or hypodermis. The findings indicate that this in silico workflow provides a promising avenue to identify and prioritise panels/groups of drug target candidates in parasitic nematodes for experimental validation in vitro and/or in vivo.


Subjects
Caenorhabditis elegans; Genes, Essential; Haemonchus; Machine Learning; Animals; Haemonchus/genetics; Caenorhabditis elegans/genetics; Helminth Proteins/genetics; Helminth Proteins/metabolism; Computational Biology/methods; Drosophila melanogaster/genetics
16.
BMC Bioinformatics ; 24(1): 347, 2023 Sep 18.
Article in English | MEDLINE | ID: mdl-37723435

ABSTRACT

BACKGROUND: The ability to accurately predict essential genes intolerant to loss-of-function (LOF) mutations can dramatically improve the identification of disease-associated genes. Recently, there have been numerous computational methods developed to predict human essential genes from population genomic data. While the existing methods are highly predictive of essential genes of long length, they have limited power in pinpointing short essential genes due to the sparsity of polymorphisms in the human genome. RESULTS: Motivated by the premise that population and functional genomic data may provide complementary evidence for gene essentiality, here we present an evolution-based deep learning model, DeepLOF, to predict essential genes in an unsupervised manner. Unlike previous population genetic methods, DeepLOF utilizes a novel deep learning framework to integrate both population and functional genomic data, allowing us to pinpoint short essential genes that can hardly be predicted from population genomic data alone. Compared with previous methods, DeepLOF shows unmatched performance in predicting ClinGen haploinsufficient genes, mouse essential genes, and essential genes in human cell lines. Notably, at a false positive rate of 5%, DeepLOF detects 50% more ClinGen haploinsufficient genes than previous methods. Furthermore, DeepLOF discovers 109 novel essential genes that are too short to be identified by previous methods. CONCLUSION: The predictive power of DeepLOF shows that it is a compelling computational method to aid in the discovery of essential genes.
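
The comparison "at a false positive rate of 5%" refers to a standard evaluation: fix the fraction of non-essential genes that may be wrongly flagged, then count how many true essential genes are recovered. The sketch below computes recall under such a cap for any scoring method. It does not reproduce DeepLOF itself; the function name and the toy scores are assumptions, and the inputs must contain at least one positive and one negative label.

```python
def recall_at_fpr(scores, labels, max_fpr=0.05):
    """Recall at the most permissive score threshold whose FPR stays <= max_fpr.

    scores: predicted essentiality scores (higher = more likely essential)
    labels: 1 for known essential genes, 0 otherwise
    Tied scores are admitted or rejected together.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    tp = fp = 0
    best = 0.0
    i = 0
    while i < len(ranked):
        j = i
        # Lower the threshold by one distinct score value at a time.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1]:
                tp += 1
            else:
                fp += 1
            j += 1
        if fp / neg > max_fpr:
            break
        best = tp / pos
        i = j
    return best


# Toy example: 4 essential genes and 20 non-essential genes.
scores = [0.9, 0.8, 0.4, 0.2] + [0.5, 0.3] + [0.1] * 18
labels = [1, 1, 1, 1] + [0, 0] + [0] * 18
recall = recall_at_fpr(scores, labels)
```

Two methods can then be compared by their recall under the same cap, which is the form of comparison the abstract describes.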


Subjects
Deep Learning; Genes, Essential; Humans; Animals; Mice; Genomics; Metagenomics; Cell Line
17.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33842944

ABSTRACT

Essential genes are critical for the growth and survival of any organism. Machine learning complements experimental methods to minimize the resources required for essentiality assays. Previous studies revealed the need to discover relevant features that significantly classify essential genes, to improve the generalizability of prediction models across organisms, and to construct a robust gold standard as the class label for the training data. Findings also show that a significant limitation of the machine learning approach is predicting conditionally essential genes, because the essentiality status of a gene can change with a specific condition of the organism. This review examines the methods applied to the essential gene prediction task, their strengths and limitations, and the factors responsible for effective computational prediction of essential genes. We discuss categories of features and how they contribute to the classification performance of essentiality prediction models. Five categories of features, namely gene sequence, protein sequence, network topology, homology, and gene ontology-based features, were generated for Caenorhabditis elegans to compare their essentiality prediction capacity. The gene ontology-based feature category outperformed the other categories, mainly because of its high correlation with the genes' biological functions. However, the topology feature category provided the highest discriminatory power, making it more suitable for essentiality prediction. The major factor limiting machine learning prediction of conditional gene essentiality is the unavailability of labeled data for the conditions of interest with which to train a classifier. Therefore, cooperative machine learning could further exploit models that perform well in conditional essentiality prediction.
SHORT ABSTRACT: Identification of essential genes is imperative because it provides an understanding of core structure and function and accelerates the discovery of drug targets, among other uses. Recent studies have applied machine learning to complement the experimental identification of essential genes. However, several factors limit the performance of machine learning approaches. This review presents the standard procedure and resources available for predicting essential genes in organisms and highlights the factors responsible for the current limitations of machine learning for conditional gene essentiality prediction. The choice of features and ML technique was identified as an important factor in predicting essential genes effectively.


Subjects
Algorithms , Computational Biology/methods , Essential Genes/genetics , Machine Learning , Support Vector Machine , Animals , Caenorhabditis elegans/genetics , Gene Ontology , Gene Regulatory Networks , Humans
18.
Appl Environ Microbiol ; 89(9): e0066723, 2023 Sep 28.
Article in English | MEDLINE | ID: mdl-37695289

ABSTRACT

Inducible gene expression systems are important for studying bacterial gene function, yet most exhibit leakage. In this study, we engineered a leakage-free hybrid system for precise control of gene expression in Fusobacterium nucleatum by integrating the xylose-inducible expression system with the theophylline-responsive riboswitch. This innovative method enables concurrent control of target gene expression at both the transcription and translation initiation levels. Using luciferase and the indole-producing enzyme tryptophanase (TnaA) as reporters, we demonstrated that the hybrid system displays virtually no observable signal in the absence of inducers. We employed this system to express FtsX, a protein related to fusobacterial cytokinesis, in an ftsX mutant strain, revealing dose-dependent FtsX production. Without inducers, cells formed long filaments, while raising FtsX levels by increasing inducer concentrations led to a gradual reduction in cell length until normal morphology was restored. Crucially, this system facilitated essential gene investigation, identifying the signal peptidase gene lepB as vital for F. nucleatum. LepB's essentiality stems from depletion, affecting outer membrane biogenesis and cell division. This novel hybrid system holds the potential for advancing research on essential genes and accurate gene regulation in F. nucleatum.
IMPORTANCE: Fusobacterium nucleatum, an anaerobic bacterium prevalent in the human oral cavity, is strongly linked to periodontitis and can colonize areas beyond the oral cavity, such as the placenta and gastrointestinal tract, causing adverse pregnancy outcomes and promoting colorectal cancer growth. Given F. nucleatum's clinical significance, research is underway to develop targeted therapies that inhibit its growth or eradicate the bacterium specifically. Essential genes, crucial for bacterial survival, growth, and reproduction, are promising drug targets. A leak-free inducible gene expression system is needed for studying these genes, enabling conditional gene knockouts and elucidating the importance of those essential genes. Our study identified lepB as an essential gene by first generating a conditional gene mutation in F. nucleatum. Combining a xylose-inducible system with a riboswitch facilitated the analysis of essential genes in F. nucleatum, paving the way for potential drug development targeting this bacterium for various clinical applications.

19.
BMC Microbiol ; 23(1): 97, 2023 04 06.
Article in English | MEDLINE | ID: mdl-37024800

ABSTRACT

Campylobacter species are the major cause of bacterial gastroenteritis. Because there is no effective vaccine and antimicrobial-resistant strains are increasing rapidly, there is a need to identify new targets for intervention. Essential genes are those that are necessary for growth and/or survival, making them attractive targets. In this study, comprehensive transposon mutant libraries were created in six C. jejuni strains, four C. coli strains, and one strain each of C. lari and C. hyointestinalis, allowing genes that cannot tolerate a transposon insertion to be called essential. Comparison of essential gene lists using core genome analysis can highlight those genes that are common across multiple strains and/or species. Comparison of C. jejuni and C. coli, the two species that cause the most disease, identified 316 essential genes. Among the genes of interest, members of the purine pathway were essential for C. jejuni, as was a functional potassium uptake system. Protein-protein interaction networks built from these essential gene lists also highlighted purine pathway proteins as major 'hub' proteins with a large number of interactors across the network. When two more species (C. lari and C. hyointestinalis) are added, the essential gene list reduces to 261. Within these 261 essential genes, there are many genes that have been found to be essential in other bacteria. These include htrB and PEB4, which have previously been found as core virulence genes across Campylobacter species in other studies. There were 21 genes with no known function, eight of which were associated with the membrane. These surface-associated essential genes may provide attractive targets. The essential gene lists presented will help to prioritise targets for the development of novel therapeutic and preventative interventions.
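
The shrinking of the shared list from 316 to 261 genes as species are added follows from how a core set is defined: a gene stays in the core only if it is called essential in every strain considered. The Python sketch below illustrates that intersection on made-up data. The strain labels, the gene identifiers `g1` to `g7`, and the function name `core_essential_genes` are hypothetical, and the sketch assumes genes have already been mapped to shared identifiers across strains, which the study handles through core genome analysis.

```python
def core_essential_genes(essential_by_strain):
    """Return the genes called essential in every strain (set intersection)."""
    gene_sets = [set(genes) for genes in essential_by_strain.values()]
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


# Hypothetical essential-gene calls per strain (placeholder identifiers).
two_species_calls = {
    "jejuni_strain_1": {"g1", "g2", "g3", "g4"},
    "jejuni_strain_2": {"g1", "g2", "g3", "g5"},
    "coli_strain_1": {"g1", "g2", "g3", "g6"},
}
two_species_core = core_essential_genes(two_species_calls)

# Adding strains from further species can only keep or shrink the core.
four_species_calls = dict(
    two_species_calls,
    lari_strain_1={"g1", "g2", "g7"},
    hyointestinalis_strain_1={"g1", "g2", "g3"},
)
four_species_core = core_essential_genes(four_species_calls)
```

Here the two-species core is `g1`, `g2`, `g3`, and it drops to `g1`, `g2` once the additional strains are included, because `g3` is not called essential in the hypothetical C. lari strain.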


Subjects
Campylobacter Infections , Campylobacter coli , Campylobacter jejuni , Campylobacter , Humans , Campylobacter jejuni/genetics , Campylobacter coli/genetics , Campylobacter Infections/microbiology
20.
Crit Rev Biotechnol ; : 1-14, 2023 Jun 28.
Article in English | MEDLINE | ID: mdl-37380345

ABSTRACT

Bacteria with streamlined genomes that harbor fully functional genes for essential metabolic networks can synthesize desired products more effectively and thus have advantages as production platforms in industrial applications. To obtain streamlined chassis genomes, considerable effort has gone into reducing existing bacterial genomes. This work falls into two categories: rational and random reduction. The identification of essential gene sets and the emergence of various genome-deletion techniques have greatly promoted genome reduction in many bacteria over the past few decades. Some of the constructed genomes possess desirable properties for industrial applications, such as increased genome stability, transformation capacity, cell growth, and biomaterial productivity. However, the decreased growth and perturbed physiological phenotypes of some genome-reduced strains may limit their use as optimized cell factories. This review assesses the advances made to date in bacterial genome reduction to construct optimal chassis for synthetic biology, including the identification of essential gene sets, genome-deletion techniques, the properties and industrial applications of artificially streamlined genomes, the obstacles encountered in constructing reduced genomes, and future perspectives.
