Results 1 - 20 of 1,803
1.
Development ; 151(3)2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38230566

ABSTRACT

Research in model organisms is central to the characterization of signaling pathways in multicellular organisms. Here, we present the comprehensive and systematic curation of 17 Drosophila signaling pathways using the Gene Ontology framework to establish a dynamic resource that has been incorporated into FlyBase, providing visualization and data integration tools to aid research projects. By restricting to experimental evidence reported in the research literature and quantifying the amount of such evidence for each gene in a pathway, we captured the landscape of empirical knowledge of signaling pathways in Drosophila.


Subjects
Databases, Genetic; Drosophila; Animals; Drosophila/genetics; Gene Ontology; Signal Transduction; Drosophila melanogaster/genetics
2.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-39038936

ABSTRACT

Sequence database searches followed by homology-based function transfer form one of the oldest and most popular approaches for predicting protein functions, such as Gene Ontology (GO) terms. These searches are also a critical component in most state-of-the-art machine learning and deep learning-based protein function predictors. Although sequence search tools are the basis of homology-based protein function prediction, previous studies have scarcely explored how to select the optimal sequence search tools and configure their parameters to achieve the best function prediction. In this paper, we evaluate the effect of using different options from among popular search tools, as well as the impacts of search parameters, on protein function prediction. When predicting GO terms on a large benchmark dataset, we found that BLASTp and MMseqs2 consistently exceed the performance of other tools under default search parameters, including DIAMOND, one of the most popular tools for function prediction. However, with the correct parameter settings, DIAMOND can perform comparably to BLASTp and MMseqs2 in function prediction. Additionally, we developed a new scoring function to derive GO predictions from homologous hits that consistently outperforms previously proposed scoring functions. These findings enable the improvement of almost all protein function prediction algorithms with a few easily implementable changes in their sequence homolog-based component. This study emphasizes the critical role of search parameter settings in homology-based function transfer and should make an important contribution to the development of future protein function prediction algorithms.


Subjects
Databases, Protein; Proteins; Proteins/chemistry; Proteins/metabolism; Proteins/genetics; Computational Biology/methods; Gene Ontology; Algorithms; Sequence Analysis, Protein/methods; Software; Machine Learning
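
Illustrative sketch (not part of the record): the homology-based function transfer evaluated in entry 2 can be pictured as scoring each GO term by how much of the total hit evidence supports it. The example below uses a simple bit-score-weighted vote, which is a common baseline and is not the new scoring function the authors propose. The function name `transfer_go_terms`, the hit tuples, and the toy annotations are all hypothetical.

```python
from collections import defaultdict


def transfer_go_terms(hits, annotations):
    """Score GO terms for one query from its homologous hits.

    hits: list of (hit_id, bit_score) pairs, e.g. parsed from the tabular
          output of a sequence search tool (assumed input shape).
    annotations: dict mapping hit_id -> set of GO term IDs.
    Returns a dict mapping GO term -> score in [0, 1].
    """
    total = sum(score for _, score in hits)
    if total <= 0:
        return {}
    scores = defaultdict(float)
    for hit_id, bit_score in hits:
        # Each hit contributes its share of the total bit score
        # to every GO term it is annotated with.
        for term in annotations.get(hit_id, ()):
            scores[term] += bit_score / total
    return dict(scores)


# Toy data, invented for illustration only.
hits = [("P1", 200.0), ("P2", 100.0), ("P3", 100.0)]
annotations = {
    "P1": {"GO:0003677", "GO:0005634"},
    "P2": {"GO:0003677"},
    "P3": {"GO:0016301"},
}
predicted = transfer_go_terms(hits, annotations)
for term, score in sorted(predicted.items(), key=lambda kv: -kv[1]):
    print(term, round(score, 2))
```

Changing the search tool or its parameters changes the `hits` list, which is why the abstract stresses search settings: the scoring step can only work with the homologs the search returns.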
3.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38446740

ABSTRACT

Protein annotation has long been a challenging task in computational biology. Gene Ontology (GO) has become one of the most popular frameworks to describe protein functions and their relationships. Prediction of a protein annotation with proper GO terms demands high-quality GO term representation learning, which aims to learn a low-dimensional dense vector representation with accompanying semantic meaning for each functional label, also known as embedding. However, existing GO term embedding methods, which mainly take into account ancestral co-occurrence information, have yet to capture the full topological information in the GO-directed acyclic graph (DAG). In this study, we propose a novel GO term representation learning method, PO2Vec, to utilize the partial order relationships to improve the GO term representations. Extensive evaluations show that PO2Vec achieves better outcomes than existing embedding methods in a variety of downstream biological tasks. Based on PO2Vec, we further developed a new protein function prediction method PO2GO, which demonstrates superior performance measured in multiple metrics and annotation specificity as well as few-shot prediction capability in the benchmarks. These results suggest that the high-quality representation of GO structure is critical for diverse biological tasks including computational protein annotation.


Subjects
Benchmarking; Computational Biology; Gene Ontology; Learning; Molecular Sequence Annotation
4.
Plant J ; 118(2): 304-323, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38265362

ABSTRACT

The model moss species Physcomitrium patens has long been used for studying divergence of land plants spanning from bryophytes to angiosperms. In addition to its phylogenetic relationships, the limited number of differential tissues, and comparable morphology to the earliest embryophytes provide a system to represent basic plant architecture. Based on plant-fungal interactions today, it is hypothesized these kingdoms have a long-standing relationship, predating plant terrestrialization. Mortierellaceae have origins diverging from other land fungi paralleling bryophyte divergence, are related to arbuscular mycorrhizal fungi but are free-living, observed to interact with plants, and can be found in moss microbiomes globally. Due to their parallel origins, we assess here how two Mortierellaceae species, Linnemannia elongata and Benniella erionia, interact with P. patens in coculture. We also assess how Mollicute-related or Burkholderia-related endobacterial symbionts (MRE or BRE) of these fungi impact plant response. Coculture interactions are investigated through high-throughput phenomics, microscopy, RNA-sequencing, differential expression profiling, gene ontology enrichment, and comparisons among 99 other P. patens transcriptomic studies. Here we present new high-throughput approaches for measuring P. patens growth, identify novel expression of over 800 genes that are not expressed on traditional agar media, identify subtle interactions between P. patens and Mortierellaceae, and observe changes to plant-fungal interactions dependent on whether MRE or BRE are present. Our study provides insights into how plants and fungal partners may have interacted based on their communications observed today as well as identifying L. elongata and B. erionia as modern fungal endophytes with P. patens.


Subjects
Bryophyta; Bryopsida; Mycorrhizae; Phylogeny; Endophytes/metabolism; Multilevel Analysis; Plant Proteins/metabolism; Bryopsida/genetics; Bryopsida/metabolism; Bryophyta/genetics; Bryophyta/metabolism; Mycorrhizae/metabolism
5.
Brief Bioinform ; 24(6)2023 09 22.
Article in English | MEDLINE | ID: mdl-37756593

ABSTRACT

Single-cell RNA-sequencing (scRNA-seq) allows for obtaining genomic and transcriptomic profiles of individual cells. These data make it possible to characterize tissues at the cell level. In this context, one of the main analyses exploiting scRNA-seq data is identifying the cell types within a tissue to estimate the quantitative composition of cell populations. Due to the massive amount of available scRNA-seq data, automatic classification approaches for cell typing, based on the most recent deep learning technology, are needed. Here, we present the gene ontology-driven wide and deep learning (GOWDL) model for classifying cell types in several tissues. GOWDL implements a hybrid architecture that considers the functional annotations found in Gene Ontology and the marker genes typical of specific cell types. We performed cross-validation and independent external testing, comparing our algorithm with 12 other state-of-the-art predictors. Classification scores demonstrated that GOWDL reached the best results over five different tissues, except for recall, where we obtained about 92% versus 97% for the best tool. Finally, we presented a case study on classifying immune cell populations in breast cancer using a hierarchical approach based on GOWDL.


Subjects
Deep Learning; Gene Ontology; Single-Cell Gene Expression Analysis; Algorithms; Genomics
6.
Brief Bioinform ; 24(6)2023 09 22.
Article in English | MEDLINE | ID: mdl-37861172

ABSTRACT

Protein function annotation is one of the most important research topics for revealing the essence of life at the molecular level in the post-genome era. Current research shows that integrating multisource data can effectively improve the performance of protein function prediction models. However, the heavy reliance on complex feature engineering and model integration methods limits the development of existing methods. Besides, models based on deep learning only use labeled data in a certain dataset to extract sequence features, thus ignoring a large amount of existing unlabeled sequence data. Here, we propose an end-to-end protein function annotation model named HNetGO, which innovatively uses a heterogeneous network to integrate protein sequence similarity and protein-protein interaction network information and combines a pretrained model to extract the semantic features of the protein sequence. In addition, we design an attention-based graph neural network model, which can effectively extract node-level features from heterogeneous networks and predict protein function by measuring the similarity between protein nodes and gene ontology term nodes. Comparative experiments on the human dataset show that HNetGO achieves state-of-the-art performance on the cellular component and molecular function branches.


Subjects
Neural Networks, Computer; Protein Interaction Maps; Humans; Amino Acid Sequence; Gene Ontology; Molecular Sequence Annotation
7.
Brief Bioinform ; 25(1)2023 11 22.
Article in English | MEDLINE | ID: mdl-37995133

ABSTRACT

Interpreting the function of genes and gene sets identified from omics experiments remains a challenge, as current pathway analysis tools often fail to consider the critical biological context, such as tissue or cell-type specificity. To address this limitation, we introduced CellGO. CellGO tackles this challenge by leveraging the visible neural network (VNN) and single-cell gene expressions to mimic cell-type-specific signaling propagation along the Gene Ontology tree within a cell. This design enables a novel scoring system to calculate the cell-type-specific gene-pathway paired active scores, based on which, CellGO is able to identify cell-type-specific active pathways associated with single genes. In addition, by aggregating the activities of single genes, CellGO extends its capability to identify cell-type-specific active pathways for a given gene set. To enhance biological interpretation, CellGO offers additional features, including the identification of significantly active cell types and driver genes and community analysis of pathways. To validate its performance, CellGO was assessed using a gene set comprising mixed cell-type markers, confirming its ability to discern active pathways across distinct cell types. Subsequent benchmarking analyses demonstrated CellGO's superiority in effectively identifying cell types and their corresponding cell-type-specific pathways affected by gene knockouts, using either single genes or sets of genes differentially expressed between knockout and control samples. Moreover, CellGO demonstrated its ability to infer cell-type-specific pathogenesis for disease risk genes. Accessible as a Python package, CellGO also provides a user-friendly web interface, making it a versatile and accessible tool for researchers in the field.


Subjects
Deep Learning; Software; Humans; Disease Susceptibility
8.
Bioinformatics ; 2024 Jul 10.
Article in English | MEDLINE | ID: mdl-38985218

ABSTRACT

MOTIVATION: Proteins with unknown function are frequently compared to better characterized relatives, either using sequence similarity or, more recently, through similarity in a learned embedding space. Through comparison, protein sequence embeddings allow for interpretable and accurate annotation of proteins, as well as for downstream tasks such as clustering for unsupervised discovery of protein families. However, it is unclear whether embeddings can be deliberately designed to improve their use in these downstream tasks. RESULTS: We find that for functional annotation of proteins, as represented by Gene Ontology (GO) terms, direct fine-tuning of language models on a simple classification loss has an immediate positive impact on protein embedding quality. Fine-tuned embeddings show stronger performance as representations for K-nearest neighbor classifiers, reaching stronger performance for GO annotation than even directly comparable fine-tuned classifiers, while maintaining interpretability through protein similarity comparisons. They also maintain their quality in related tasks, such as rediscovering protein families with clustering. AVAILABILITY: github.com/mofradlab/go_metric. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
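
Illustrative sketch (not part of the record): the K-nearest neighbor annotation described in entry 8 can be pictured as follows, assuming each protein already has an embedding vector. The sketch ranks annotated proteins by cosine similarity to the query and scores each GO term by the fraction of the k nearest neighbours that carry it. It does not reproduce the authors' code; the function names, the two-dimensional toy embeddings, and the annotations are hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_go_scores(query, train_embeddings, train_terms, k=3):
    """Return (scores, neighbours) for one query embedding.

    scores: GO term -> fraction of the k nearest neighbours annotated with it.
    neighbours: indices of those neighbours, most similar first, so that a
                prediction can be traced back to the proteins that support it.
    """
    neighbours = sorted(
        range(len(train_embeddings)),
        key=lambda i: cosine(query, train_embeddings[i]),
        reverse=True,
    )[:k]
    counts = Counter(term for i in neighbours for term in train_terms[i])
    scores = {term: n / len(neighbours) for term, n in counts.items()}
    return scores, neighbours


# Toy embeddings and annotations, invented for illustration only.
train_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
train_terms = [
    {"GO:0003677"},
    {"GO:0003677", "GO:0005634"},
    {"GO:0016301"},
    {"GO:0016301"},
]
scores, neighbours = knn_go_scores([1.0, 0.05], train_embeddings, train_terms, k=2)
print(neighbours, scores)
```

Returning the neighbour indices alongside the scores reflects the interpretability point in the abstract: every predicted term can be explained by pointing at the similar proteins that contributed it.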

9.
J Med Genet ; 61(5): 443-451, 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38458754

ABSTRACT

BACKGROUND: Dystonia is one of the most common movement disorders. To date, the genetic causes of dystonia in populations of European descent have been extensively studied. However, other populations, particularly those from the Middle East, have not been adequately studied. The purpose of this study is to discover the genetic basis of dystonia in a clinically and genetically well-characterised dystonia cohort from Turkey, which harbours poorly studied populations. METHODS: Exome sequencing analysis was performed in 42 Turkish dystonia families. Using co-expression network (CEN) analysis, identified candidate genes were interrogated for the networks including known dystonia-associated genes and genes further associated with the protein-protein interaction, animal model-based characteristics and clinical findings. RESULTS: We identified potentially disease-causing variants in the established dystonia genes (PRKRA, SGCE, KMT2B, SLC2A1, GCH1, THAP1, HPCA, TSPOAP1, AOPEP; n=11 families (26%)), in the uncommon forms of dystonia-associated genes (PCCB, CACNA1A, ALDH5A1, PRKN; n=4 families (10%)) and in the candidate genes prioritised based on the pathogenicity of the variants and CEN-based analyses (n=11 families (21%)). The diagnostic yield was found to be 36%. Several pathways and gene ontologies implicated in immune system, transcription, metabolic pathways, endosomal-lysosomal and neurodevelopmental mechanisms were over-represented in our CEN analysis. CONCLUSIONS: Here, using a structured approach, we have characterised a clinically and genetically well-defined dystonia cohort from Turkey, where dystonia has not been widely studied, and provided an uncovered genetic basis, which will facilitate diagnostic dystonia research.


Subjects
Dystonia; Dystonic Disorders; Animals; Humans; Dystonia/genetics; Dystonia/diagnosis; Dystonic Disorders/genetics; Dystonic Disorders/diagnosis; Genetic Testing; Turkey; Molecular Biology; Mutation; DNA-Binding Proteins/genetics; Apoptosis Regulatory Proteins/genetics
10.
Proteomics ; : e2300471, 2024 Jul 12.
Article in English | MEDLINE | ID: mdl-38996351

ABSTRACT

Predicting protein function from protein sequence, structure, interaction, and other relevant information is important for generating hypotheses for biological experiments and studying biological systems, and therefore has been a major challenge in protein bioinformatics. Numerous computational methods have been developed to advance protein function prediction gradually over the last two decades. In recent years in particular, leveraging the revolutionary advances in artificial intelligence (AI), more and more deep learning methods have been developed to improve protein function prediction at a faster pace. Here, we provide an in-depth review of the recent developments of deep learning methods for protein function prediction. We summarize the significant advances in the field, identify several remaining major challenges to be tackled, and suggest some potential directions to explore. The data sources and evaluation metrics widely used in protein function prediction are also discussed to assist the machine learning, AI, and bioinformatics communities in developing more cutting-edge methods to advance protein function prediction.

11.
BMC Bioinformatics ; 25(1): 174, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38698340

ABSTRACT

BACKGROUND: In the last two decades, the use of high-throughput sequencing technologies has accelerated the pace of discovery of proteins. However, due to the time and resource limitations of rigorous experimental functional characterization, the functions of a vast majority of them remain unknown. As a result, computational methods offering accurate, fast and large-scale assignment of functions to new and previously unannotated proteins are sought after. Leveraging the underlying associations between the multiplicity of features that describe proteins could reveal functional insights into the diverse roles of proteins and improve performance on the automatic function prediction task. RESULTS: We present GO-LTR, a multi-view multi-label prediction model that relies on a high-order tensor approximation of model weights combined with non-linear activation functions. The model is capable of learning high-order relationships between multiple input views representing the proteins and predicting high-dimensional multi-label output consisting of protein functional categories. We demonstrate the competitiveness of our method on various performance measures. Experiments show that GO-LTR learns polynomial combinations between different protein features, resulting in improved performance. Additional investigations establish GO-LTR's practical potential in assigning functions to proteins under diverse challenging scenarios: very low sequence similarity to previously observed sequences, and rarely observed and highly specific terms in the gene ontology. IMPLEMENTATION: The code and data used for training GO-LTR are available at https://github.com/aalto-ics-kepaco/GO-LTR-prediction .


Subjects
Computational Biology; Proteins; Proteins/chemistry; Proteins/metabolism; Computational Biology/methods; Databases, Protein; Algorithms
12.
BMC Bioinformatics ; 25(1): 127, 2024 Mar 25.
Article in English | MEDLINE | ID: mdl-38528499

ABSTRACT

BACKGROUND: N6-methyladenosine (m6A) is the most prevalent post-transcriptional modification in eukaryotic cells and plays a crucial role in regulating various biological processes; dysregulation of m6A status is involved in multiple human diseases, including cancer. A number of prediction frameworks have been proposed for high-accuracy identification of putative m6A sites; however, none has targeted the direct prediction of tissue-conserved m6A-modified residues versus non-conserved ones at base resolution. RESULTS: We report here m6A-TCPred, a computational tool for predicting tissue-conserved m6A residues using m6A profiling data from 23 human tissues. By taking advantage of the traditional sequence-based characteristics and additional genome-derived information, m6A-TCPred successfully captured distinct patterns between potentially tissue-conserved m6A modifications and non-conserved ones, with an average AUROC of 0.871 and 0.879 on cross-validation and independent datasets, respectively. CONCLUSION: Our results have been integrated into an online platform: a database holding 268,115 high-confidence m6A sites with their conservation information across 23 human tissues, and a web server to predict the conservation status of user-provided m6A collections. The web interface of m6A-TCPred is freely accessible at: www.rnamd.org/m6ATCPred .


Subjects
Adenosine; Computers; Humans; Machine Learning; RNA Processing, Post-Transcriptional
13.
J Proteome Res ; 23(5): 1593-1602, 2024 May 03.
Article in English | MEDLINE | ID: mdl-38626392

ABSTRACT

With the rapid expansion of genome sequencing, the functional annotation of proteins has become a bottleneck in understanding proteomes. The Chromosome-centric Human Proteome Project (C-HPP) aims to identify all proteins encoded by the human genome and find functional annotations for them. However, 1137 identified human proteins still lack functional annotation; these are called uPE1 proteins. Sequence alignment was insufficient to predict their functions, and the crystal structures of most of these proteins were unavailable. In this study, we demonstrated a new functional annotation strategy, AlphaFun, based on structural alignment using deep-learning-predicted protein structures. Using this strategy, we functionally annotated 99% of the human proteome, including the uPE1 proteins and missing proteins, which have not been identified yet. The accuracy of the functional annotations was validated using the known-function proteins. The uPE1 proteins shared similar functions with the known-function PE1 proteins and tend to be expressed only in very limited tissues. They are evolutionarily young genes and thus should perform their functions only in specific tissues and conditions, limiting their occurrence in commonly studied biological models. Such functional annotations provide hints for functional investigations of the uPE1 proteins. This proteome-wide functional annotation strategy is also applicable to any other species.


Subjects
Molecular Sequence Annotation; Proteome; Humans; Proteome/genetics; Proteome/metabolism; Proteome/analysis; Proteome/chemistry; Deep Learning; Sequence Alignment; Genome, Human; Proteomics/methods; Databases, Protein
14.
Proteins ; 92(1): 60-75, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37638618

ABSTRACT

Proteins play key roles in different functionalities in our daily life. The functional roles of a protein are enhanced through interaction compared with the protein acting individually. Identifying the essential proteins of an organism in the wet lab is a time-consuming and costly task, although wet-lab observations ensure high reliability and accuracy on biological grounds. Essential protein prediction using computational approaches is an alternative choice in research. It is rapidly proving its significance and effectively reduces the experimental cost of wet-lab work. Existing computational methods have been implemented using protein-protein interaction networks (PPIN), sequence data, gene expression datasets (GED), Gene Ontology (GO), orthologous groups, and subcellular localization datasets. Machine learning offers diverse categories of features that make it possible to model and predict essential macromolecules of understudied organisms. A novel methodology, MEM-FET (membership feature), predicts essential proteins based on the following features: edge clustering coefficient, average clustering coefficient, subcellular localization, and Gene Ontology within a compartment of common neighbors. The accuracy (ACC) values of the predicted true positive (TP) essential proteins are 0.79, 0.74, 0.78, and 0.71 for the YHQ, YMIPS, YDIP, and YMBD datasets, respectively. An enriched set of essential proteins is also predicted using the MEM-FET algorithm. Ensemble ML also validated the proposed model with an accuracy of 60%. The MEM-FET algorithm is predicted to outperform other existing algorithms, with an ACC value of 80% for the yeast dataset.


Subjects
Computational Biology; Proteins; Humans; Reproducibility of Results; Computational Biology/methods; Proteins/genetics; Proteins/metabolism; Algorithms; Machine Learning; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism
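
Illustrative sketch (not part of the record): entry 14 lists the edge clustering coefficient among its features but does not define it. The sketch below assumes the definition commonly used in the essential-protein literature: the number of triangles that contain an edge, divided by the maximum number that could contain it, min(deg(u) - 1, deg(v) - 1). It is not the MEM-FET algorithm itself, and the helper names and toy network are hypothetical.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map (node -> set of neighbours)."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def edge_clustering_coefficient(adj, u, v):
    """ECC(u, v) = triangles through (u, v) / min(deg(u) - 1, deg(v) - 1)."""
    if v not in adj.get(u, set()):
        raise ValueError(f"{u} and {v} are not connected")
    # Every common neighbour of u and v closes one triangle on the edge.
    triangles = len(adj[u] & adj[v])
    possible = min(len(adj[u]) - 1, len(adj[v]) - 1)
    return triangles / possible if possible > 0 else 0.0


# Toy interaction network, invented for illustration only.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"), ("D", "E")]
adj = build_adjacency(edges)
ecc_bc = edge_clustering_coefficient(adj, "B", "C")  # edge inside a dense module
ecc_de = edge_clustering_coefficient(adj, "D", "E")  # edge to a pendant node
print(ecc_bc, ecc_de)
```

Edges inside densely connected modules score high, while edges to peripheral nodes score zero, which is the intuition behind using this quantity as a feature for essentiality.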
15.
BMC Genomics ; 25(1): 223, 2024 Feb 29.
Article in English | MEDLINE | ID: mdl-38424499

ABSTRACT

BACKGROUND: Switchgrass (Panicum virgatum L.) is a warm-season perennial (C4) grass identified as an important biofuel crop in the United States. It is well adapted to marginal environments where heat and moisture stresses predominantly affect crop growth. However, the underlying molecular mechanisms associated with heat and drought stress tolerance still need to be fully understood in switchgrass. The methylation of H3K4 is often associated with transcriptional activation of genes, including stress-responsive genes. Therefore, this study aimed to analyze genome-wide histone H3K4-tri-methylation in switchgrass under heat, drought, and combined stress. RESULTS: In total, ~1.3 million H3K4me3 peaks were identified in this study using SICER. Among them, 7,342; 6,510; and 8,536 peaks responded under drought (DT), drought and heat (DTHT), and heat (HT) stresses, respectively. Most DT and DTHT peaks spanned 0 to +2000 bases from the transcription start site (TSS). By comparing differentially marked peaks with RNA-Seq data, we identified peaks associated with genes: 155 DT-responsive peaks with 118 DT-responsive genes, 121 DTHT-responsive peaks with 110 DTHT-responsive genes, and 175 HT-responsive peaks with 136 HT-responsive genes. We have identified various transcription factors involved in DT, DTHT, and HT stresses. Gene Ontology analysis using AgriGO revealed that most genes belonged to biological processes. Most annotated peaks belonged to metabolite interconversion, RNA metabolism, transporter, protein modifying, defense/immunity, membrane traffic protein, transmembrane signal receptor, and transcriptional regulator protein families. Further, we identified significant peaks associated with TFs, hormones, signaling, fatty acid and carbohydrate metabolism, and secondary metabolites. qRT-PCR analysis revealed the relative expressions of six abiotic stress-responsive genes (transketolase, chromatin remodeling factor-CDH3, fatty-acid desaturase A, transmembrane protein 14C, beta-amylase 1, and integrase-type DNA binding protein genes) that were significantly (P < 0.05) marked during drought, heat, and combined stresses by comparing stress-induced against un-stressed and input controls. CONCLUSION: Our study provides a comprehensive and reproducible epigenomic analysis of drought, heat, and combined stress responses in switchgrass. Significant enrichment of H3K4me3 peaks downstream of the TSS of protein-coding genes was observed. In addition, the cost-effective experimental design, modified ChIP-Seq approach, and analyses presented here can serve as a prototype for other non-model plant species for conducting stress studies.


Subjects
Panicum; Panicum/metabolism; Hot Temperature; Lysine/metabolism; Histones/metabolism; Droughts; Stress, Physiological/genetics; Methylation; Gene Expression Regulation, Plant; Gene Expression Profiling
16.
BMC Genomics ; 25(1): 533, 2024 May 30.
Article in English | MEDLINE | ID: mdl-38816789

ABSTRACT

BACKGROUND: Environmental stress factors, such as biotic and abiotic stress, are becoming more common due to climate variability, significantly affecting global maize yield. Transcriptome profiling studies provide insights into the molecular mechanisms underlying stress response in maize, though the functions of many genes are still unknown. To enhance the functional annotation of maize-specific genes, MaizeGDB has outlined a data-driven approach with an emphasis on identifying genes and traits related to biotic and abiotic stress. RESULTS: We mapped high-quality RNA-Seq expression reads from 24 different publicly available datasets (17 abiotic and seven biotic studies) generated from the B73 cultivar to the recent version of the reference genome B73 (B73v5) and deduced stress-related functional annotation of maize gene models. We conducted a robust meta-analysis of the transcriptome profiles from the datasets to identify maize loci responsive to stress, identifying 3,230 differentially expressed genes (DEGs): 2,555 DEGs regulated in response to abiotic stress, 408 DEGs regulated during biotic stress, and 267 common DEGs (co-DEGs) that overlap between abiotic and biotic stress. We discovered hub genes from network analyses, and among the hub genes of the co-DEGs we identified a putative NAC domain transcription factor superfamily protein (Zm00001eb369060) IDP275, which previously responded to herbivory and drought stress. IDP275 was up-regulated in our analysis in response to eight different abiotic and four different biotic stresses. A gene set enrichment and pathway analysis of hub genes of the co-DEGs revealed hormone-mediated signaling processes and phenylpropanoid biosynthesis pathways, respectively. Using phylostratigraphic analysis, we also demonstrated how abiotic and biotic stress genes differentially evolve to adapt to changing environments. 
CONCLUSIONS: These results will help facilitate the functional annotation of multiple stress response gene models and annotation in maize. Data can be accessed and downloaded at the Maize Genetics and Genomics Database (MaizeGDB).


Subjects
Molecular Sequence Annotation; Stress, Physiological; Transcriptome; Zea mays; Zea mays/genetics; Stress, Physiological/genetics; Gene Expression Regulation, Plant; Gene Expression Profiling; Genes, Plant
17.
BMC Genomics ; 25(1): 168, 2024 Feb 12.
Article in English | MEDLINE | ID: mdl-38347479

ABSTRACT

BACKGROUND: Understanding the molecular underpinnings of phenotypic variations is critical for enhancing poultry breeding programs. The Brazilian broiler (TT) and laying hen (CC) lines exhibit striking differences in body weight, growth potential, and muscle mass. Our work aimed to compare the global transcriptome of wing and pectoral tissues during the early development (days 2.5 to 3.5) of these chicken lines, unveiling disparities in gene expression and regulation. RESULTS: Different and bona-fide transcriptomic profiles were identified for the compared lines. A similar number of up- and downregulated differentially expressed genes (DEGs) were identified, considering the broiler line as a reference. Upregulated DEGs displayed an enrichment of protease-encoding genes, whereas downregulated DEGs exhibited a prevalence of receptors and ligands. Gene Ontology analysis revealed that upregulated DEGs were mainly associated with hormone response, mitotic cell cycle, and different metabolic and biosynthetic processes. In contrast, downregulated DEGs were primarily linked to communication, signal transduction, cell differentiation, and nervous system development. Regulatory networks were constructed for the mitotic cell cycle and cell differentiation biological processes, as their contrasting roles may impact the development of distinct postnatal traits. Within the mitotic cell cycle network, key upregulated DEGs included CCND1 and HSP90, with central regulators being NF-κB subunits (RELA and REL) and NFATC2. The cell differentiation network comprises numerous DEGs encoding transcription factors (e.g., HOX genes), receptors, ligands, and histones, while the main regulatory hubs are CREB, AR and epigenetic modifiers. Clustering analyses highlighted PIK3CD as a central player within the differentiation network. CONCLUSIONS: Our study revealed distinct developmental transcriptomes between Brazilian broiler and layer lines. The gene expression profile of broiler embryos seems to favour increased cell proliferation and delayed differentiation, which may contribute to the subsequent enlargement of pectoral tissues during foetal and postnatal development. Our findings pave the way for future functional studies and improvement of targeted traits of economic interest in poultry.


Subjects
Chickens; Gene Expression Profiling; Animals; Female; Chickens/genetics; Computational Biology; Transcriptome; Cell Differentiation/genetics
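
Illustrative sketch (not part of the record): Gene Ontology analysis of a DEG list, as reported in entry 17, is usually an over-representation test. The abstract does not state which tool or statistic was used, so the example below assumes a plain one-sided hypergeometric test for a single GO term, without multiple-testing correction. The function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, drawn, observed):
    """One-sided enrichment p-value P(X >= observed).

    population: number of genes in the background set
    annotated:  background genes annotated with the GO term
    drawn:      number of genes in the DEG list
    observed:   DEGs annotated with the GO term
    """
    upper = min(annotated, drawn)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(observed, upper + 1)
    )
    return favourable / comb(population, drawn)


# Toy counts, invented for illustration only: 20 background genes,
# 5 of them annotated with the term, 4 DEGs, 2 of which carry the term.
p_value = hypergeom_enrichment_p(population=20, annotated=5, drawn=4, observed=2)
print(round(p_value, 4))
```

In practice this test is run for every GO term and the resulting p-values are corrected for multiple testing; dedicated enrichment tools also account for the hierarchical structure of the ontology.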
18.
BMC Genomics ; 25(1): 267, 2024 Mar 11.
Article in English | MEDLINE | ID: mdl-38468234

ABSTRACT

In every omics experiment, genes or their products are identified for which even state-of-the-art tools are unable to assign a function. In the biotechnology chassis organism Pseudomonas putida, these proteins of unknown function make up 14% of the proteome. This missing information can bias analyses, since these proteins can carry out functions that impact the engineering of organisms. Because they predict protein function across all organisms, function prediction tools generally fail to use all of the types of data available for any specific organism, including protein and transcript expression information. Additionally, the release of AlphaFold predictions for all UniProt proteins provides a novel opportunity for leveraging structural information. We constructed a bespoke machine learning model to predict the function of recalcitrant proteins of unknown function in Pseudomonas putida based on these sources of data, which annotated 1079 terms to 213 proteins. Among the predicted functions supplied by the model, we found evidence for a significant overrepresentation of nitrogen metabolism and macromolecule processing proteins. These findings were corroborated by manual analyses of selected proteins, which identified, among others, a functionally unannotated operon that likely encodes a branch of the shikimate pathway.


Subjects
Pseudomonas putida; Pseudomonas putida/genetics; Proteome/metabolism; Multiomics; Biotechnology; Operon
19.
Curr Issues Mol Biol ; 46(6): 5488-5510, 2024 May 30.
Article in English | MEDLINE | ID: mdl-38921000

ABSTRACT

The PHLDA (pleckstrin homology-like domain family) gene family is widely regarded as a potential biomarker for cancer identification, and its members are now considered potentially viable targets for cancer treatments. The PHLDA gene family consists of PHLDA1, PHLDA2, and PHLDA3. The predictive significance of PHLDA genes in cancer remains unclear. To determine the role of pleckstrin as a prognostic biomarker in human cancers, we conducted a systematic multiomics investigation. Pleckstrin expression was evaluated through various survival analyses, and its predictive significance in human tumors was explored using a variety of online platforms. By analyzing protein-protein interactions, we also selected a set of well-known functional protein partners of pleckstrin. We also investigated the relationship between pleckstrins and other cancers with respect to mutations and copy number alterations. Gene Ontology (GO) and pathway analyses were used to evaluate the cumulative impact of pleckstrins and their associated genes on various cancers. This study may thus reveal the expression profiles of PHLDA family members and their prognostic significance in various cancers. In this multiomics analysis, we found that, among the PHLDA family, PHLDA1 may be a therapeutic target for several cancers, including kidney, colon, and brain cancer, while PHLDA2 may be a therapeutic target for cancers of the colon, esophagus, and pancreas. Additionally, PHLDA3 may be a useful therapeutic target for ovarian, renal, and gastric cancer.

20.
BMC Plant Biol ; 24(1): 468, 2024 May 29.
Article in English | MEDLINE | ID: mdl-38811873

ABSTRACT

BACKGROUND: The cuticular wax serves as a primary barrier that protects plants from environmental stresses. The Eceriferum (CER) gene family is associated with wax production and stress resistance. RESULTS: In a genome-wide identification study, a total of 52 members of the CER family were discovered in four Gossypium species: G. arboreum, G. barbadense, G. raimondii, and G. hirsutum. There were variations in the physicochemical characteristics of the Gossypium CER (GCER) proteins. Evolutionary analysis classified the identified GCERs into five groups, with purifying selection emerging as the primary evolutionary force. Gene structure analysis revealed that the number of conserved motifs ranged from 1 to 15, and the number of exons varied from 3 to 13. Closely related GCERs exhibited similar conserved motifs and gene structures. Analyses of chromosomal positions, selection pressure, and collinearity revealed numerous fragment duplications in the GCER genes. Additionally, nine putative ghr-miRNAs targeting seven G. hirsutum CER (GhCER) genes were identified. Among them, three miRNAs, including ghr-miR394, ghr-miR414d, and ghr-miR414f, targeted GhCER09A, representing the most targeted gene. The prediction of transcription factors (TFs) and the visualization of the regulatory TF network revealed interactions with GhCER genes involving ERF, MYB, Dof, bHLH, and bZIP. Analysis of cis-regulatory elements suggests potential associations between the CER gene family of cotton and responses to abiotic stress, light, and other biological processes. Enrichment analysis demonstrated a robust correlation between GhCER genes and pathways associated with cutin biosynthesis, fatty acid biosynthesis, wax production, and stress response. Localization analysis showed that most GCER proteins are localized in the plasma membrane. 
Transcriptome and quantitative reverse transcription-polymerase chain reaction (qRT-PCR) expression assessments demonstrated that several GhCER genes, including GhCER15D, GhCER04A, GhCER06A, and GhCER12D, exhibited elevated expression levels in response to water deficiency stress compared to control conditions. The functional identification through virus-induced gene silencing (VIGS) highlighted the pivotal role of the GhCER04A gene in enhancing drought resistance by promoting increased tissue water retention. CONCLUSIONS: This investigation not only provides valuable evidence but also offers novel insights that contribute to a deeper understanding of the roles of GhCER genes in cotton, their role in adaptation to drought and other abiotic stress and their potential applications for cotton improvement.


Subjects
Droughts; Gossypium; Multigene Family; Plant Proteins; Gossypium/genetics; Gossypium/physiology; Plant Proteins/genetics; Plant Proteins/metabolism; Gene Expression Regulation, Plant; Stress, Physiological/genetics; Genes, Plant; Phylogeny; Adaptation, Physiological/genetics; Waxes/metabolism; MicroRNAs/genetics