ABSTRACT
In this chapter, we describe a computational pipeline for the in silico detection of plant viruses by high-throughput sequencing (HTS) of total RNA samples. The pipeline is designed to analyze short reads generated on an Illumina platform and relies on freely available software tools. First, we provide advice on high-quality total RNA purification, library preparation, and sequencing. The bioinformatics pipeline starts from the raw reads produced by the sequencer and applies several curation steps to obtain long contigs. Contigs are searched with BLAST against a local database of reference viral nucleotide sequences to identify the viruses in the samples. The search is then refined by applying specific filters. We also provide the code to re-map the short reads against the viruses found, which yields sequencing depth and read coverage for each virus. No previous bioinformatics background is required, but basic knowledge of the Unix command line and the R language is recommended.
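The filtering step described above can be sketched as follows. This is a minimal illustration, not the chapter's own code: it assumes the BLAST results were written in the default 12-column tabular format (`-outfmt 6`), and the thresholds, contig names, and virus names are hypothetical values chosen for the example.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
BLAST6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_viral_hits(handle, min_identity=90.0, min_length=200, max_evalue=1e-10):
    """Keep, for each contig, the highest-bitscore hit that passes the filters.

    The three thresholds are illustrative assumptions, not values from the chapter.
    """
    best = {}
    reader = csv.DictReader(handle, fieldnames=BLAST6_COLUMNS, delimiter="\t")
    for row in reader:
        identity = float(row["pident"])
        length = int(row["length"])
        evalue = float(row["evalue"])
        bitscore = float(row["bitscore"])
        # Discard weak, short, or statistically unconvincing alignments.
        if identity < min_identity or length < min_length or evalue > max_evalue:
            continue
        current = best.get(row["qseqid"])
        if current is None or bitscore > current["bitscore"]:
            best[row["qseqid"]] = {
                "subject": row["sseqid"],
                "identity": identity,
                "length": length,
                "bitscore": bitscore,
            }
    return best


# Hypothetical BLAST rows: contig and virus names are made up for illustration.
EXAMPLE_ROWS = (
    "contig_1\tvirus_A\t98.5\t1200\t18\t0\t1\t1200\t50\t1249\t0.0\t2100\n"
    "contig_1\tvirus_B\t92.0\t800\t64\t0\t1\t800\t10\t809\t1e-150\t900\n"
    "contig_2\tvirus_C\t75.0\t300\t75\t0\t1\t300\t5\t304\t1e-20\t150\n"
    "contig_3\tvirus_A\t95.0\t150\t7\t0\t1\t150\t900\t1049\t1e-30\t250\n"
)

hits = best_viral_hits(io.StringIO(EXAMPLE_ROWS))
for contig, hit in hits.items():
    print(contig, hit["subject"], hit["identity"], hit["length"])
```

In this example only `contig_1` survives: its second hit scores lower, `contig_2` fails the identity threshold, and `contig_3` fails the length threshold. Suitable cut-offs depend on the virus group and the reference database, so they would need tuning for real data.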
Subjects
Plant Viruses , RNA Viruses , RNA, Plant , RNA Viruses/genetics , Plant Viruses/genetics , High-Throughput Nucleotide Sequencing , Computational Biology
ABSTRACT
Gazami crab (Portunus trituberculatus) is prone to spoilage during storage and transportation. More research is needed to determine how to reliably indicate its freshness and to explain the mechanism of quality deterioration. We hypothesized that proteins extracted from crabs can serve as biomarkers of changes in crab muscle quality. This work used physicochemical and proteomic approaches to investigate protein biomarkers and the molecular mechanisms driving changes in gazami crab muscle quality after long-term refrigeration. Sixty-six differentially abundant proteins (DAPs) were closely associated with pH and texture and can be used as biomarkers to assess crab muscle freshness. According to bioinformatics analyses, ribosomes and autophagy were significant mechanisms in crab spoilage. These findings provide new concepts and a theoretical foundation for evaluating the freshness of refrigerated gazami crab and help uncover the molecular mechanism of its quality deterioration.
Subjects
Brachyura , Animals , Proteomics , Refrigeration , Computational Biology , Biomarkers
ABSTRACT
Developing effective anticancer strategies and improving our understanding of cancer both require analytical tools. Applying a variety of analytical approaches when investigating anticancer medicines gives a thorough understanding of the traits and mechanisms of cancer cells, which enables the development of potent treatments against them. Analytical techniques matter to anticancer research because they contribute to the identification of therapeutic targets and the assessment of medication efficacy, both of which are crucial to expanding our understanding of cancer biology. This study looks at methods that are often used in cancer research, including cell viability assays, the clonogenic assay, flow cytometry, 2D electrophoresis, microarrays, immunofluorescence, western blot, the caspase activation assay, and bioinformatics. The fundamentals and applications of each technique, and how each advances our understanding of cancer, are briefly reviewed.
Subjects
Neoplasms , Humans , Neoplasms/drug therapy , Computational Biology , Cell Death
ABSTRACT
ETHNOPHARMACOLOGICAL RELEVANCE: Angong Niuhuang Pill (ANP) is a traditional Chinese medicine formula that has been used clinically for many years in the treatment of cerebral hemorrhage. It is composed of ingredients such as calculus bovis and moschus. Ancient texts document that ANP's multiple components possess heat-clearing, detoxifying, and sedative properties, which can be effective in treating conditions such as coma and stroke. However, the mechanisms underlying ANP's potential actions are still under investigation. AIM OF THE STUDY: ANP is widely used for the treatment of intracerebral hemorrhage (ICH), but the precise mechanism underlying its therapeutic effects remains largely elusive. The present study aims to unravel the effects and pharmacological molecular mechanisms of ANP against ICH using a comprehensive network pharmacology approach and experimental validation. MATERIALS AND METHODS: The molecular targets of ANP and ICH were obtained from various databases, followed by the construction of protein-protein interaction (PPI) networks using the STRING database. Gene Ontology (GO) enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were then conducted using the Metascape database and Cytoscape, respectively. Finally, molecular docking was performed. We performed a series of behavioral tests, immunohistochemical staining, TUNEL staining, and Western blot to verify the effects of ANP. RESULTS: IL-6, JUN, MMP9, IL-1β, and VEGFA were the main candidate targets and were associated with fluid shear stress and atherosclerosis, the TNF signaling pathway, and other pathways. These results suggest that the potential mechanism of ANP against ICH is mainly related to pyroptosis and inflammation. In vivo validation showed that ANP treatment significantly reduced the number of TUNEL-positive cells, inhibited the activation of Iba-1 positive neurons, and suppressed the expression of inflammatory factors and pyroptosis indicators. In addition, ANP improved the cognitive level and motor ability of ICH mice. CONCLUSION: Combining virtual screening with experimental validation, the study showed that ANP contributes substantially to protecting the brain from neuronal damage by regulating inflammation and pyroptosis pathways, laying a foundation and offering new ideas for future studies.
Subjects
Drugs, Chinese Herbal , Network Pharmacology , Animals , Mice , Molecular Docking Simulation , Cerebral Hemorrhage/drug therapy , Computational Biology , Drugs, Chinese Herbal/pharmacology , Drugs, Chinese Herbal/therapeutic use , Inflammation
ABSTRACT
The viral fraction of human and experimental animal fecal matter is increasingly attracting research interest due to its newfound influence on the gut microbiome and host health. During the past decade, high-throughput sequencing techniques have seen massive improvements, and in recent years, bioinformatics pipelines for virome analysis have also vastly improved with respect to both user-friendliness and output quality. Yet, the shape and quality of such data are highly dependent on how the viruses are isolated and how their genomes are extracted and processed to build sequencing libraries. Here we describe a simple protocol for virus isolation from fecal samples suitable for further propagation/characterization or sequencing efforts. It is based on two filtration steps: one for removing large particles such as bacteria, and one for removing free DNA and up-concentrating phages and other viruses in the solution. The method is highly scalable, adaptable to a wide range of sample types including low-input samples, and has a quantifiable output suitable for both plaquing and sequencing.
Subjects
Bacteriophages , Gastrointestinal Microbiome , Animals , Humans , Bacteriophages/genetics , Computational Biology , Feces , Filtration
ABSTRACT
An in-depth analysis of phage genomic sequences is essential for the proposal of a cocktail for therapeutic uses. With the burst of publications on phage isolation and genetic studies during the last decade, several different bioinformatics programs have been used. Here we describe our studies on the genetic organization of phages infecting Staphylococcus aureus, a pathogen of human importance, by using an assembly of tools for gene annotation, identification of expression components, and phylogeny analysis.
Subjects
Staphylococcal Infections , Staphylococcus Phages , Humans , Staphylococcus Phages/genetics , Computational Biology , Genomics , Molecular Sequence Annotation
ABSTRACT
Genes encoding small secreted peptides are widely distributed among plant genomes, but their detection and annotation remain challenging. The bioinformatics protocol described here aims to identify, as exhaustively as possible, secreted peptide precursors belonging to a family of interest. First, homology searches are performed at the protein and genome levels. Next, multiple sequence alignments and predictions of a secretion signal are used to define a set of homologous proteins sharing features of secreted peptide precursors. These protein sequences are then used as input for motif detection and profile-based tools to build representative matrices and profiles, which are used iteratively as guides to rescan the proteome and genome until the family is complete.
Subjects
Computational Biology , Peptides , Peptides/genetics , Amino Acid Sequence , Biological Transport , Genome, Plant
ABSTRACT
Protein-ligand blind docking is a widely used method for studying the binding sites and poses of ligands and receptors in pharmaceutical and biological research. Recently, our new blind docking server named CB-Dock2 has been released and is currently being utilized by researchers worldwide. CB-Dock2 outperforms state-of-the-art methods due to its accuracy in binding site identification and binding pose prediction, which are enabled by its knowledge-based docking engine. This highly automated server offers interactive and intuitive input and output web interfaces, making it an efficient and user-friendly tool for the bioinformatics and cheminformatics communities. This chapter provides a brief overview of the methods, followed by a detailed guide on using the CB-Dock2 server. Additionally, we present a case study that evaluates the performance of protein-ligand blind docking using this tool.
Subjects
Cheminformatics , Computational Biology , Ligands , Binding Sites , Knowledge Bases
ABSTRACT
Mechanisms underlying food allergy are not well understood. Mass cytometry is a technique that allows the simultaneous analysis of multiple cell surface markers and intracellular proteins by using the spectrum of rare metal isotopes of different atomic masses without channel overlap. Bioinformatic approaches are implemented to combine and reduce the information of more than 60 parameters to define immune cell subpopulations. To date, mass cytometry has revealed great heterogeneity in the human response to food antigens and has shown that subpopulations of basophils and mononuclear cells might be mechanistically implicated in food allergy. This chapter reviews some fundamentals of mass cytometry and the contributions of this technique to elucidating the immune basis of food allergy, oral tolerance, food desensitization, phenotypes, and the cellular events occurring upon allergen-specific immunotherapy.
Subjects
Food Hypersensitivity , Humans , Food , Basophils , Computational Biology , Desensitization, Immunologic
ABSTRACT
The identification of T-cell epitopes is a critical step in understanding the immunologic mechanisms of conditions such as food allergy. In silico epitope screening with bioinformatic tools can be used to identify T-cell epitopes, which saves time and resources. In this chapter, a multiparametric approach to predict and assess major histocompatibility complex (MHC) class II binding T-cell epitopes using bioinformatics is introduced for food allergens. Furthermore, the ability of predicted T-cell epitopes to induce interleukin (IL)-4, as well as the allergenicity potential based on sequence analysis and the population coverage of epitopes, were also determined. The molecular docking approach was further used to explore the binding ability between epitopes and human leukocyte antigen (HLA) class II molecules. The amino acids that might be responsible for binding to HLA class II molecules and their binding interactions were analyzed.
Subjects
Epitopes, T-Lymphocyte , Food Hypersensitivity , Humans , Molecular Docking Simulation , Amino Acids , Computational Biology
ABSTRACT
ETHNOPHARMACOLOGICAL RELEVANCE: Mahonia bealei (Fortune) Carrière (M. bealei) is a traditional medicine widely used by the Hmong community in Guizhou. It possesses diverse biological activities and shows promise in cancer treatment; however, contemporary pharmacological research in this area is lacking. AIMS OF THE STUDY: This study aimed to investigate the effects and underlying mechanisms of M. bealei on alcoholic hepatocellular carcinoma (HCC). MATERIALS AND METHODS: We initially employed the LC-MS/MS method to identify the compounds present in M. bealei serum. Subsequently, its potential targets were predicted using public databases. Bioinformatics and network pharmacology approaches, such as univariate Cox regression and random forest (RF) algorithms, were used to identify differentially expressed genes (DEGs) associated with the prognosis of alcoholic HCC. Survival curve and receiver operating characteristic (ROC) analyses were conducted using alcoholic HCC-related data from TCGA and GEO to determine the diagnostic value of the identified DEGs. Molecular docking using the CDOCKER approach based on CHARMm was performed to validate the affinity between the predicted compounds and targets. Additionally, we evaluated the impact of M. bealei on cell proliferation and migration and conducted western blot assays. RESULTS: The LC-MS/MS approach identified 17 therapeutic components and predicted 483 component-related targets, of which 63 overlapped with alcoholic HCC targets and were considered potential therapeutic targets. GO and KEGG pathway analysis revealed significant associations between the 63 overlapping targets and alcoholic HCC progression. Through various approaches in the Cytoscape 3.9.0 software, we confirmed 9 hub genes (CDK1, CXCR4, DNMT1, ESR1, KIT, PDGFRB, SERPINE1, TOP2A, and TYMS) as core targets. The TOP2A and CDK1 genes were identified as advantageous for diagnosing alcoholic HCC using univariate Cox regression, RF, survival curve, and ROC analysis. Molecular docking analysis demonstrated strong binding affinity between TOP2A and CDK1 and six key bioactive components: cyclamic acid, perfluoroalkyl carboxylic acid, perfluorosulfonic acid, alpha-linolenic acid, adenosine receptor antagonist (CGS 15943), and prodigiosin. In vitro experiments confirmed that M. bealei significantly suppressed proliferation and migration of HepG2 cells while downregulating TOP2A and CDK1 expression. CONCLUSION: This study highlights the potential of M. bealei as a natural medicine for the treatment of alcoholic HCC. Six compounds present in M. bealei serum (cyclamic acid, perfluoroalkyl carboxylic acid, perfluorosulfonic acid, alpha-linolenic acid, adenosine receptor antagonist (CGS 15943), and prodigiosin) may exhibit therapeutic effects against alcoholic HCC by downregulating CDK1 and TOP2A expression levels in vitro.
Subjects
Berberis , Carcinoma, Hepatocellular , Liver Neoplasms , Mahonia , Humans , Carcinoma, Hepatocellular/drug therapy , Carcinoma, Hepatocellular/genetics , Carcinoma, Hepatocellular/metabolism , Liver Neoplasms/drug therapy , Liver Neoplasms/genetics , Liver Neoplasms/metabolism , Molecular Docking Simulation , Chromatography, Liquid , Cyclamates , Network Pharmacology , Prodigiosin , alpha-Linolenic Acid , Tandem Mass Spectrometry , Computational Biology/methods
ABSTRACT
The next-generation sequencing (NGS) revolution has had a great impact on the genomics of Pseudomonas aeruginosa. Since the first release of the P. aeruginosa PAO1 genome, more than 5700 genomes have been published. This wealth of information has been accompanied by the development of bioinformatic tools for handling genomic and phenotypic data. Bioinformatics has indeed become a de facto big data science. In this chapter, we give a brief historical overview of the knowledge gained from P. aeruginosa genome sequencing, then describe the wet-lab procedure to extract the DNA and prepare the library for whole-genome sequencing using Illumina MiSeq technology. Finally, we describe three user-friendly bioinformatics procedures to infer the P. aeruginosa genotype from NGS data with the Multi-Locus Sequence Typing method and to visualize it as a minimum spanning tree.
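As a rough illustration of the last step, the sketch below builds a minimum spanning tree from allelic profiles. It is not one of the chapter's three procedures: the isolate names and allele numbers are hypothetical, the distance is simply the number of loci at which two profiles differ, and the tree is built with a plain implementation of Prim's algorithm.

```python
def allelic_distance(profile_a, profile_b):
    """Number of loci at which two allelic profiles differ."""
    return sum(a != b for a, b in zip(profile_a, profile_b))


def minimum_spanning_tree(profiles):
    """Prim's algorithm over all isolates; returns (from, to, distance) edges."""
    names = sorted(profiles)
    if not names:
        return []
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        best = None
        # Find the cheapest edge linking the tree to an isolate outside it.
        for u in sorted(in_tree):
            for v in names:
                if v in in_tree:
                    continue
                d = allelic_distance(profiles[u], profiles[v])
                if best is None or d < best[0]:
                    best = (d, u, v)
        in_tree.add(best[2])
        edges.append((best[1], best[2], best[0]))
    return edges


# Hypothetical seven-locus allelic profiles (allele numbers are made up).
profiles = {
    "isolate_A": (1, 1, 1, 1, 1, 1, 1),
    "isolate_B": (1, 1, 1, 1, 1, 1, 2),
    "isolate_C": (1, 1, 1, 1, 1, 2, 2),
    "isolate_D": (5, 5, 5, 5, 1, 2, 2),
}

tree = minimum_spanning_tree(profiles)
for source, target, distance in tree:
    print(f"{source} -- {target} (differs at {distance} loci)")
```

Ties between equally distant isolates are broken here by alphabetical order, which is an arbitrary choice; dedicated typing tools may apply additional tie-breaking rules.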
Subjects
Computational Biology , Genomics , Genotype , Multilocus Sequence Typing , Genomics/methods , Computational Biology/methods , Genome , High-Throughput Nucleotide Sequencing/methods , Pseudomonas aeruginosa/genetics , Sequence Analysis, DNA/methods
ABSTRACT
To investigate the influence of a modified RAE-based film on shrimp quality, a proteomic approach was used to elucidate the preservation mechanism. Results showed that the modified RAE-based film maintained better shrimp quality than the natural RAE-based film in terms of the measured biochemical parameters and estimated shelf life. In total, 49 differentially abundant proteins (DAPs) were identified relative to shrimp without packaging. Bioinformatics analysis demonstrated that the modified RAE-based film could maintain functional DAPs, which were mainly involved in binding and catalytic activity, among other functions, and that metabolic signaling pathways such as the melanogenesis signaling pathway were remarkably enriched. Meanwhile, 25 DAPs showed a close relationship with quality traits, and some of them, such as myosin chains, troponin I, and heat shock protein, were considered potential biomarkers for evaluating shrimp quality deterioration. In conclusion, this study revealed the preservation mechanism of the modified RAE-based active film on shrimp quality at the protein level.
Subjects
Hibiscus , Penaeidae , Animals , Anthocyanins , Penaeidae/genetics , Proteomics , Computational Biology
ABSTRACT
A gene regulatory network (GRN) is the architecture of transcription factors (TFs) and their gene targets, which helps control gene expression as required by a phenotype under various environmental perturbations. Inferring the regulatory network from high-throughput data requires an algorithmic approach involving statistical analysis. Several interaction databases, such as JASPAR and SwissRegulon, provide information on TF-target pair interactions estimated from experimental and prediction procedures. These repositories are mainly used for predicting the complex structure of GRNs, either with or without gene expression data. Here we describe and discuss the step-wise procedure to extract the interaction data for a desired set of target-TFs from the JASPAR database and use that information to infer the network with the igraph library. We also mention the important parameters for analyzing the different properties of the network. The described procedure will be helpful in discerning the GRN based on a set of TF-gene pairs.
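The idea of turning TF-gene pairs into a directed network and reading off basic properties can be sketched as below. The chapter itself uses the igraph library; this example swaps in plain Python dictionaries so that it runs without extra dependencies, and the TF and gene names are hypothetical placeholders rather than JASPAR entries.

```python
from collections import defaultdict


def build_grn(pairs):
    """Build a directed network from (transcription factor, target gene) pairs."""
    targets_of = defaultdict(set)      # TF -> genes it regulates
    regulators_of = defaultdict(set)   # gene -> TFs regulating it
    for tf, gene in pairs:
        targets_of[tf].add(gene)
        regulators_of[gene].add(tf)
    return targets_of, regulators_of


def degree_table(targets_of, regulators_of):
    """Out-degree (number of targets) and in-degree (number of regulators) per node."""
    nodes = set(targets_of) | set(regulators_of)
    return {
        node: {
            "out": len(targets_of.get(node, ())),
            "in": len(regulators_of.get(node, ())),
        }
        for node in sorted(nodes)
    }


# Hypothetical TF-target pairs; the duplicate pair is collapsed by the sets.
pairs = [
    ("TF1", "geneA"),
    ("TF1", "geneB"),
    ("TF2", "geneB"),
    ("TF1", "TF2"),
    ("TF1", "geneA"),
]

targets_of, regulators_of = build_grn(pairs)
degrees = degree_table(targets_of, regulators_of)
for node, degree in degrees.items():
    print(node, "out-degree:", degree["out"], "in-degree:", degree["in"])
```

A node with a high out-degree, such as `TF1` here, is a candidate hub regulator. Graph libraries provide the same degree counts along with richer measures such as betweenness or clustering.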
Subjects
Gene Regulatory Networks , Transcription Factors , Transcription Factors/genetics , Transcription Factors/metabolism , Genome , Computational Biology/methods
ABSTRACT
Advanced technology innovations allow cost-effective, high-throughput profiling of biological systems. Technologies such as next-generation sequencing, microarrays, and mass spectrometry have made it possible to sequence a genome in days. As these technologies have developed, massive biological data (e.g., genomics, proteomics) have been produced cheaply, and the resulting "big data" era creates new opportunities to solve medical and biological problems in many disciplines: preventive medicine, biology, personalized medicine, gene sequencing, healthcare, and industry. Computational biology and bioinformatics are interdisciplinary fields that develop and apply computational methods (e.g., analytical methods, mathematical modeling, and simulation) to analyze large collections of biological data, such as genetic sequences, cell populations, or protein samples, to make new predictions or discover new biology. Biological data storage, mining, and analysis are challenging because the data are highly heterogeneous. In this study, the big data resources of genomics, proteomics, and metabolomics were explored to solve biological problems using big data analysis approaches. The goal is to build a network of relationship-based gene-disease associations to prioritize phenotypes common to epilepsy and seizure disease. Through network analysis, 10 seed genes, 22 associated genes, 132 microRNAs, and 38 transcription factors were identified that have a direct effect on all forms of epilepsy and seizures. According to a functional analysis of the seed genes, the majority are involved in the acetylcholine-gated channel complex (10%) and the heterotrimeric G-protein complex (10%) pathways related to cellular components, followed by a role in the regulation of action potential (20%) and positive regulation of vascular endothelial growth factor production (20%) in epilepsy and seizure pathways related to biological processes.
This study might provide insight into the workings of the disease and shows the importance of continued research into epilepsy and other conditions that can trigger seizure activity.
Subjects
Epilepsy , Medical Informatics , Humans , Big Data , Vascular Endothelial Growth Factor A , Computational Biology/methods , Epilepsy/genetics , Seizures
ABSTRACT
The discovery of potential disease-causing genes can aid medical progress. The post-genomic era has made this a more difficult task. Modern high-throughput methods have not solved the problem of identifying disease genes, and conventional methods cannot be used to investigate many rare or lethal diseases. Monitoring gene expression values in different samples using microarray technology is one of the best and most accurate ways to identify disease-causing genes. Microarrays, one of the most recent advances in experimental molecular biology, allow researchers to simultaneously monitor the expression levels of thousands of genes. Statistical analysis of microarray data might aid gene discovery by revealing pathways related to the target gene and facilitating the identification of candidate genes. Systems biology, an interdisciplinary approach, has emerged as a crucial analytic tool with the potential to reveal previously unidentified causes and consequences of human illness. Genetic, environmental, immunological, and neurological factors have been implicated in the development of complex disorders such as cancer. Because of this, it is important to approach the study of such diseases from a novel perspective. The systems biology approach allows us to rapidly identify disease-causing genes and assess their viability as therapeutic targets. This chapter demonstrates systems biology approaches to identify candidate genes using public databases. Oral squamous cell carcinoma (OSCC) is used as a model disease to show how systems biology can be used successfully to identify and prioritize disease genes.
Subjects
Carcinoma, Squamous Cell , Head and Neck Neoplasms , Mouth Neoplasms , Humans , Carcinoma, Squamous Cell/pathology , Squamous Cell Carcinoma of Head and Neck , Mouth Neoplasms/pathology , Systems Biology , Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic , Computational Biology/methods , Head and Neck Neoplasms/genetics
ABSTRACT
Advancements in high-throughput technologies, genomics, transcriptomics, and metabolomics play an important role in obtaining biological information about living organisms. The field of computational biology and bioinformatics has experienced significant growth with the advent of high-throughput sequencing technologies and other high-throughput techniques. The resulting large amounts of data present both opportunities and challenges for data analysis. Big data analysis has become essential for extracting meaningful insights from the massive amount of data. In this chapter, we provide an overview of the current status of big data analysis in computational biology and bioinformatics. We discuss the various aspects of big data analysis, including data acquisition, storage, processing, and analysis. We also highlight some of the challenges and opportunities of big data analysis in this area of research. Despite the challenges, big data analysis presents significant opportunities like development of efficient and fast computing algorithms for advancing our understanding of biological processes, identifying novel biomarkers for breeding research and developments, predicting disease, and identifying potential drug targets for drug development programs.
Subjects
Computational Biology , Genomics , Computational Biology/methods , Genomics/methods , Metabolomics , Algorithms , Big Data
ABSTRACT
The human genome was first sequenced in 1994. It took 10 years of cooperation between numerous international research organizations to reveal a preliminary human DNA sequence. Genomics labs can now sequence an entire genome in only a few days. Here, we discuss how the advent of high-performance sequencing platforms has paved the way for big data in biology and contributed to the development of modern bioinformatics, which in turn has helped to expand the scope of biology and allied sciences. New technologies and methodologies for the storage, management, analysis, and visualization of big data have been shown to be necessary. Modern bioinformatics has to deal not only with the challenge of processing massive amounts of heterogeneous data, but also with different ways of interpreting and presenting the results, as well as with the use of different software programs and file formats. This chapter attempts to present solutions to these problems. To store massive amounts of data and answer search queries within a reasonable time, new database management systems other than relational ones will be necessary. Emerging advanced programming approaches, such as machine learning, Hadoop, and MapReduce, aim to make it easy to construct one's own scripts for data processing and to address the diversity of genomic and proteomic data formats in bioinformatics.
Subjects
Big Data , Proteomics , Humans , Computational Biology/methods , Genomics/methods , Software
ABSTRACT
Inference of gene regulatory networks (GRNs) from time-series microarray data remains a fascinating task for computer science researchers seeking to understand the complex biological processes that occur inside a cell. Among the popular models for inferring GRNs, the S-system is considered one of the promising non-linear mathematical tools for modeling the dynamics of gene expression as well as for inferring the GRN. The S-system is based on biochemical system theory and power-law formalism. By observing the values of the kinetic parameters of the S-system model, it is possible to extract the regulatory relationships among genes. In this review, several existing intelligent methods that have been proposed for the inference of S-system-based GRNs are explained. Finding the most suitable and efficient optimization technique for the accurate inference of all kinds of networks (in silico, in vivo, etc.) with low computational complexity is still an open research problem. This paper may help beginners and researchers who want to continue their research in the field of computational biology and bioinformatics.
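To make the power-law formalism concrete, the sketch below evaluates the standard S-system rate equation, dX_i/dt = alpha_i * prod_j X_j^g_ij - beta_i * prod_j X_j^h_ij, and reads regulatory relationships off the production exponents g_ij. The two-gene parameter values are hypothetical and chosen only so the arithmetic is easy to follow. No inference method from the review is implemented here.

```python
import math


def s_system_rates(x, alpha, beta, g, h):
    """Rate of change of each gene under the S-system power-law model."""
    n = len(x)
    rates = []
    for i in range(n):
        production = alpha[i] * math.prod(x[j] ** g[i][j] for j in range(n))
        degradation = beta[i] * math.prod(x[j] ** h[i][j] for j in range(n))
        rates.append(production - degradation)
    return rates


def regulatory_edges(g, tol=1e-9):
    """Interpret production exponents: g[i][j] > 0 means gene j activates gene i,
    g[i][j] < 0 means gene j represses gene i, and values near 0 mean no edge."""
    edges = []
    for i, row in enumerate(g):
        for j, exponent in enumerate(row):
            if exponent > tol:
                edges.append((j, i, "activation"))
            elif exponent < -tol:
                edges.append((j, i, "repression"))
    return edges


# Hypothetical two-gene system (all parameter values are illustrative).
x = [1.0, 2.0]                   # current expression levels
alpha = [2.0, 1.0]               # production rate constants
beta = [1.0, 1.0]                # degradation rate constants
g = [[0.0, -1.0], [0.5, 0.0]]    # production exponents
h = [[1.0, 0.0], [0.0, 1.0]]     # degradation exponents

rates = s_system_rates(x, alpha, beta, g, h)
edges = regulatory_edges(g)
print("rates:", rates)
for regulator, target, kind in edges:
    print(f"gene {regulator} -> gene {target}: {kind}")
```

Inference methods of the kind surveyed in the review search for the alpha, beta, g, and h values whose simulated trajectories best match the observed time series. The edge list is then obtained from the fitted exponents as in `regulatory_edges`.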
Subjects
Gene Regulatory Networks , Models, Genetic , Computational Biology/methods , Algorithms
ABSTRACT
Genetic heterogeneity is a common trait in microbial populations, caused by de novo mutations and changes in variant frequencies over time. Microbes can thus differ genetically within the same species and acquire different phenotypes. For instance, the performance and stability of anaerobic reactors are linked to the composition of the microbiome involved in the digestion process and to the environmental parameters imposing selective pressure on the metagenome, shaping its evolution. Changes at the strain level have the potential to determine variations in microbial functions, and their characterization could provide new insight into the ecological and evolutionary processes driving anaerobic digestion. In this work, single nucleotide variant dynamics were studied in two time-course biogas upgrading experiments, testing alternative carbon sources and the response to exogenous hydrogen addition. A cumulative total of 76,229 and 64,289 high-confidence single nucleotide variants were identified in the experiments related to carbon substrate availability and hydrogen addition, respectively. By combining complementary bioinformatic approaches, the study reconstructed the precise strain count (two for both hydrogenotrophic archaea) and tracked strain abundance over time, while also characterizing tens of genes under strong selection. Results in the dominant archaea revealed the presence of nearly 100 variants within genes encoding enzymes involved in hydrogenotrophic methanogenesis. In the bacterial counterparts, 119 mutations were identified across 23 genes associated with the Wood-Ljungdahl pathway, suggesting a possible impact on the syntrophic acetate-oxidation process. Strain replacement events took place in both experiments, confirming the trends suggested by the variant trajectories and providing a comprehensive understanding of the biogas upgrading microbiome at the strain level.
Overall, this resolution level allowed us to reveal fine-scale evolutionary mechanisms, functional dynamics, and strain-level metabolic variation that could contribute to the selection of key species actively involved in the carbon dioxide fixation process.
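The notion of a variant trajectory in a time-course experiment can be illustrated with a small sketch. It is not the study's workflow: the variant names and read counts are hypothetical, and the frequency is simply the fraction of reads supporting the alternative allele at each time point.

```python
def allele_frequencies(counts):
    """Alternative-allele frequency per time point from (ref_reads, alt_reads) pairs.

    Time points without coverage yield None instead of a frequency.
    """
    frequencies = []
    for ref_reads, alt_reads in counts:
        depth = ref_reads + alt_reads
        frequencies.append(alt_reads / depth if depth else None)
    return frequencies


def frequency_shift(frequencies):
    """Change in frequency between the first and last covered time points."""
    observed = [f for f in frequencies if f is not None]
    if len(observed) < 2:
        return None
    return observed[-1] - observed[0]


# Hypothetical read counts for two variants over three time points.
variant_counts = {
    "variant_1": [(95, 5), (50, 50), (2, 98)],
    "variant_2": [(40, 60), (0, 0), (45, 55)],
}

trajectories = {name: allele_frequencies(c) for name, c in variant_counts.items()}
shifts = {name: frequency_shift(f) for name, f in trajectories.items()}
for name in variant_counts:
    print(name, trajectories[name], shifts[name])
```

A variant whose frequency rises from near 0 to near 1, like `variant_1` here, is the kind of trajectory that would be consistent with one strain replacing another. Distinguishing that from sequencing noise requires the depth and quality filters that this sketch omits.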