Results 1 - 20 of 131,635
1.
Microbiology (Reading) ; 170(7)2024 Jul.
Article in English | MEDLINE | ID: mdl-38967642

ABSTRACT

Artificial intelligence has revolutionized the field of protein structure prediction. However, with more powerful and complex software being developed, it is accessibility and ease of use rather than capability that is quickly becoming a limiting factor to end users. LazyAF is a Google Colaboratory-based pipeline which integrates the existing ColabFold BATCH software to streamline the process of medium-scale protein-protein interaction prediction. LazyAF was used to predict the interactome of the 76 proteins encoded on the broad-host-range multi-drug resistance plasmid RK2, demonstrating the ease and accessibility the pipeline provides.


Subject(s)
Computational Biology , Protein Interaction Mapping , Software , Computational Biology/methods , Computer Simulation , Plasmids/genetics , Bacterial Proteins/metabolism , Bacterial Proteins/genetics , Bacterial Proteins/chemistry , Protein Binding
2.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38961813

ABSTRACT

Computational biological models have proven to be an invaluable tool for understanding and predicting the behaviour of many biological systems. While it may not be too challenging for experienced researchers to construct such models from scratch, it is not a straightforward task for early stage researchers. Design patterns are well-known techniques widely applied in software engineering as they provide a set of typical solutions to common problems in software design. In this paper, we collect and discuss common patterns that are usually used during the construction and execution of computational biological models. We adopt Petri nets as a modelling language to provide a visual illustration of each pattern; however, the ideas presented in this paper can also be implemented using other modelling formalisms. We provide two case studies for illustration purposes and show how these models can be built up from the presented smaller modules. We hope that the ideas discussed in this paper will help many researchers in building their own future models.


Subject(s)
Computational Biology , Computer Simulation , Models, Biological , Software , Computational Biology/methods , Algorithms , Humans
3.
Gen Physiol Biophys ; 43(4): 347-351, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38953576

ABSTRACT

Since the acid growth theory was introduced in plant physiology and mainframe computers became more widely available in the mid-20th century, there has been a growing need to accurately predict plant cell morphological parameters during growth. This article presents a computer program that uses an original numerical method to solve a highly nonlinear growth equation. The program is written in Python and runs in a popular open-source scientific software environment, CoCalc or SAGE. This program can be used to determine the growth of an individual plant cell or multicellular organ, such as a coleoptile or hypocotyl segment, at the non-meristemic limit. This standalone program is designed to be user-friendly and accessible to all readers, without barriers. With only a few key parameters, including pH and temperature, this program provides a practical set of tools for comparing growth-related experimental data across various areas of plant biology. Additionally, it could be useful in predicting plant growth during assisted migration, particularly in the face of climate change.


Subject(s)
Plant Development , Software , Plant Development/physiology , Models, Biological , Computer Simulation , Algorithms
4.
J Contemp Dent Pract ; 25(4): 320-325, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38956845

ABSTRACT

AIM: The aim of the present research was to assess the mesiodistal angulation of the maxillary anterior teeth using ImageJ computer software, a profile projector, and a custom-made jig. MATERIALS AND METHODS: A total of 34 subjects (17 males and 17 females) aged 18-30 years with bilateral Angle Class I molar and canine relationships were chosen. One manual approach (custom-made jig) and two digital methods (ImageJ computer software and a profile projector) were used to record the mesiodistal angulation in incisal view. Alginate impressions were made of the individuals, and a facebow was used to capture the maxilla's spatial relationship with the cranium. The articulated cast was moved, with the help of a mounting ring, to the specially customized jig, and the angulations were measured in the incisal view after the casts were placed in a semi-adjustable articulator. Data were recorded and statistically analyzed. RESULTS: The mesiodistal angulation in the incisal view via the three methods showed a statistically significant difference between the 17 males and 17 females. Although the mesiodistal angulation for the maxillary lateral incisor and canine did not show any statistically significant difference, the maximum and minimum values obtained were always greater in males than in females. This indicates that the positions of the six maxillary anterior teeth in the males resulted in an upward sweep of the incisal edges of the central and lateral incisors, also referred to as the "smiling line", producing a masculine surface anatomy that is more squared and vigorous, while the feminine surface anatomy is more rounded, soft, and pleasant. There was no statistically significant difference between the right and left sides, indicating bilateral arch symmetry and symmetrical placement of the right-side teeth relative to the corresponding left-side teeth.
CONCLUSION: According to the current study's findings, all three approaches can measure the mesiodistal angulations of maxillary anterior teeth in incisal view with clinically acceptable accuracy. The digital methods, which included using the ImageJ computer software and the profile projector, achieved more accurate results than the manual method. CLINICAL SIGNIFICANCE: The outcomes of this study's mesiodistal angulations can be used as a reference for placing teeth in both fully and partially edentulous conditions. This study contributes to a better understanding of the importance of achieving the ideal occlusion in the Indian population by placing the maxillary anterior teeth at the proper mesiodistal angulation. How to cite this article: Shadaksharappa SH, Lahiri B, Kamath AG, et al. Evaluation of Mesiodistal Angulation of Maxillary Anterior Teeth in Incisal View Using Manual and Digital Methods: An In Vivo Study. J Contemp Dent Pract 2024;25(4):320-325.


Subject(s)
Incisor , Maxilla , Humans , Male , Female , Maxilla/anatomy & histology , Adolescent , Incisor/anatomy & histology , Young Adult , Adult , Software , Image Processing, Computer-Assisted/methods , Cuspid/anatomy & histology
5.
Sci Rep ; 14(1): 15000, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38951578

ABSTRACT

The primary objective of analyzing the data obtained in a mass spectrometry-based proteomic experiment is peptide and protein identification, or correct assignment of each tandem mass spectrum to one amino acid sequence. Comparing empirical fragment spectra with theoretically predicted ones and matching them against a library of collected spectra are commonly accepted strategies for identifying proteins and determining their amino acid sequences. Although these approaches are widely used and are appreciably efficient for well-characterized model organisms or measured proteins, they cannot detect novel peptide sequences that have not been previously annotated or are rare. This study presents PowerNovo, a tool for de novo sequencing of proteins using tandem mass spectra acquired with a variety of mass analyzer types and fragmentation techniques. PowerNovo involves an ensemble of models for peptide sequencing: a model for detecting regularities in tandem mass spectra, precursors, and fragment ions, and a natural language processing model that assesses peptide sequence quality and helps reconstruct noisy sequences. Testing showed that the performance of PowerNovo is comparable to or even better than that of the widely used PointNovo, DeepNovo, Casanovo, and Novor packages. PowerNovo also provides a complete processing cycle (pipeline) for mass spectrometry data and, along with predicting the peptide sequence, includes peptide assembly and protein inference blocks.


Subject(s)
Peptides , Sequence Analysis, Protein , Tandem Mass Spectrometry , Tandem Mass Spectrometry/methods , Sequence Analysis, Protein/methods , Peptides/chemistry , Peptides/analysis , Amino Acid Sequence , Software , Proteomics/methods , Algorithms
6.
Genome Biol ; 25(1): 170, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38951884

ABSTRACT

Microbial pangenome analysis identifies present or absent genes in prokaryotic genomes. However, current tools are limited when analyzing species with higher sequence diversity or higher taxonomic orders such as genera or families. The Roary ILP Bacterial core Annotation Pipeline (RIBAP) uses an integer linear programming approach to refine gene clusters predicted by Roary for identifying core genes. RIBAP successfully handles the complexity and diversity of Chlamydia, Klebsiella, Brucella, and Enterococcus genomes, outperforming other established and recent pangenome tools for identifying all-encompassing core genes at the genus level. RIBAP is a freely available Nextflow pipeline at github.com/hoelzer-lab/ribap and zenodo.org/doi/10.5281/zenodo.10890871.


Subject(s)
Genome, Bacterial , Molecular Sequence Annotation , Software , Brucella/genetics , Brucella/classification , Bacteria/genetics , Bacteria/classification , Chlamydia/genetics , Enterococcus/genetics , Klebsiella/genetics
7.
Microbiome ; 12(1): 117, 2024 Jun 29.
Article in English | MEDLINE | ID: mdl-38951915

ABSTRACT

BACKGROUND: Shotgun metagenomics for microbial community survey recovers an enormous amount of information about microbial genomes, including their abundances, taxonomic and phylogenetic information, and their genomic makeup, the latter of which then helps retrieve their function based on annotated gene products, mRNA, protein, and metabolites. Within the context of a specific hypothesis, additional modalities are often included to capture host-microbiome interaction. For example, in human-associated microbiome projects, it has become increasingly common to include host immunology through flow cytometry. Whilst plenty of software approaches are available for downstream statistical analyses, some marker-based and some assembly-based, there is still a dearth of statistical tools that help consolidate all such information in a single platform. Because of stringent computational requirements, the statistical workflow is often passive, with limited visual exploration. RESULTS: In this study, we have developed a Java-based statistical framework ( https://github.com/KociOrges/cviewer ) to explore shotgun metagenomics data, which integrates seamlessly with conventional pipelines and offers exploratory as well as hypothesis-driven analyses. The end product is a highly interactive toolkit with a multiple document interface, which makes it easier for a person without specialized knowledge to perform analysis of multiomics datasets and unravel biologically relevant patterns. We have designed algorithms based on frequently used numerical ecology and machine learning principles, with value-driven from integrated omics tools which not only find correlations amongst different datasets but also provide discrimination based on case-control relationships. CONCLUSIONS: CViewer was used to analyse two distinct metagenomic datasets with varying complexities.
These include a dietary intervention study to understand changes in Crohn's disease during a dietary treatment, including remission, as well as a gut microbiome profile for an obesity dataset comparing subjects who suffer from obesity of different aetiologies against lean controls. Complete analyses of both studies in CViewer provide very powerful mechanistic insights that are corroborated by the published literature and demonstrate its full potential.


Subject(s)
Metagenomics , Software , Metagenomics/methods , Humans , Microbiota/genetics , Gastrointestinal Microbiome/genetics , Computational Biology/methods , Metagenome , Crohn Disease/microbiology , Crohn Disease/genetics
8.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38960405

ABSTRACT

Plasmids are extrachromosomal DNA found in microorganisms. They often carry beneficial genes that help bacteria adapt to harsh conditions. Plasmids are also important tools in genetic engineering, gene therapy, and drug production. However, it can be difficult to identify plasmid sequences from chromosomal sequences in genomic and metagenomic data. Here, we have developed a new tool called PlasmidHunter, which uses machine learning to predict plasmid sequences based on gene content profile. PlasmidHunter can achieve high accuracies (up to 97.6%) and high speeds in benchmark tests including both simulated contigs and real metagenomic plasmidome data, outperforming other existing tools.


Subject(s)
Machine Learning , Plasmids , Plasmids/genetics , Sequence Analysis, DNA/methods , Software , Computational Biology/methods , Algorithms
9.
Bioinformatics ; 40(7)2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38960865

ABSTRACT

MOTIVATION: The data-independent acquisition (DIA) mass spectrometry (MS) method is increasingly popular in the field of proteomics. However, the loss of the correspondence between peptide ions and their spectra in DIA makes identification challenging. One effective approach to reducing false-positive identifications is to calculate the deviation between a peptide's estimated retention time (RT) and its measured RT. During this process, scaling the spectral library RT into the estimated RT, known as RT calibration, is a prerequisite for calculating the deviation. Currently, within the DIA algorithm ecosystem, there is a lack of engine-independent and readily usable RT calibration toolkits. RESULTS: In this work, we introduce Calib-RT, an RT calibration method tailored to the characteristics of RT data. This method can achieve nonlinear calibration across various data scales and tolerate a certain level of noise interference. Calib-RT is expected to enrich the open-source DIA algorithm toolchain and assist in the development of DIA identification algorithms. AVAILABILITY AND IMPLEMENTATION: Calib-RT is released as open-source software under the MIT license and can be installed from PyPI as a Python module. The source code is available on GitHub at https://github.com/chenghui03/Calib_RT.
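
As a rough illustration of what RT calibration involves, the sketch below maps spectral-library RTs onto measured RTs with a monotone piecewise-linear curve through binned medians, then computes the per-peptide deviation. It does not use the Calib-RT API and is not meant to reproduce its algorithm; the function name `fit_rt_calibration`, the bin count, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def fit_rt_calibration(library_rt, measured_rt, n_bins=10):
    """Return a function mapping library RT to estimated RT.

    Hypothetical helper: sorts anchor pairs by library RT, takes the median
    of each bin (tolerant of a few noisy matches), and interpolates linearly
    between the bin medians.
    """
    library_rt = np.asarray(library_rt, dtype=float)
    measured_rt = np.asarray(measured_rt, dtype=float)
    order = np.argsort(library_rt)
    lib_sorted, meas_sorted = library_rt[order], measured_rt[order]

    # Split the sorted anchors into roughly equal-sized bins.
    lib_bins = np.array_split(lib_sorted, n_bins)
    meas_bins = np.array_split(meas_sorted, n_bins)
    knots_x = np.array([np.median(b) for b in lib_bins if b.size])
    knots_y = np.array([np.median(b) for b in meas_bins if b.size])
    # Force a non-decreasing curve (assumes elution order is preserved).
    knots_y = np.maximum.accumulate(knots_y)

    def calibrate(rt):
        # np.interp holds the end-knot values outside the fitted range.
        return np.interp(np.asarray(rt, dtype=float), knots_x, knots_y)

    return calibrate

# Synthetic example: a mildly nonlinear RT relationship with noise and outliers.
rng = np.random.default_rng(0)
lib = np.linspace(0.0, 100.0, 500)
measured = 5.0 + 0.6 * lib + 0.002 * lib**2 + rng.normal(0.0, 0.5, size=lib.size)
measured[::50] += 30.0  # a handful of false matches

calibrate = fit_rt_calibration(lib, measured, n_bins=20)
deviation = measured - calibrate(lib)
print(f"median |deviation| = {np.median(np.abs(deviation)):.2f}")
```

Binned medians are only one way to get a noise-tolerant nonlinear fit; a smoothing spline or LOWESS curve would serve the same purpose in this sketch.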


Subject(s)
Algorithms , Mass Spectrometry , Peptides , Proteomics , Software , Peptides/chemistry , Peptides/analysis , Mass Spectrometry/methods , Proteomics/methods , Calibration
10.
BMC Med Res Methodol ; 24(1): 144, 2024 Jul 04.
Article in English | MEDLINE | ID: mdl-38965539

ABSTRACT

MOTIVATION: Data is increasingly used for improvement and research in public health, especially administrative data such as that collected in electronic health records. Patients enter and exit these typically open-cohort datasets non-uniformly; this can make simple questions about incidence and prevalence time-consuming to answer and can introduce unnecessary variation between analyses. We therefore developed methods to automate analysis of incidence and prevalence in open cohort datasets, to improve transparency, productivity and reproducibility of analyses. IMPLEMENTATION: We provide both a code-free set of rules for incidence and prevalence that can be applied to any open cohort, and a Python command-line interface implementation of these rules requiring Python 3.9 or later. GENERAL FEATURES: The command-line interface is used to calculate incidence and point prevalence time series from open cohort data. The ruleset can be used in developing other implementations or can be rearranged to form other analytical questions such as period prevalence. AVAILABILITY: The command-line interface is freely available from https://github.com/THINKINGGroup/analogy_publication .
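
To make the two quantities concrete, here is a minimal sketch of point prevalence and incidence rate on a toy open cohort. It uses common textbook definitions, not the ruleset or command-line interface from the paper; the `Patient` class, the field names, and the example dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Patient:
    entry_date: date                        # joined the cohort
    exit_date: date                         # left the cohort (or end of data)
    diagnosis_date: Optional[date] = None   # first diagnosis, if any

def point_prevalence(patients, on):
    """Share of patients registered on `on` who were diagnosed on or before it."""
    present = [p for p in patients if p.entry_date <= on <= p.exit_date]
    if not present:
        return 0.0
    cases = sum(1 for p in present
                if p.diagnosis_date is not None and p.diagnosis_date <= on)
    return cases / len(present)

def incidence_rate(patients, start, end):
    """New diagnoses per person-year at risk within [start, end)."""
    events = 0
    days_at_risk = 0
    for p in patients:
        risk_start = max(p.entry_date, start)
        risk_end = min(p.exit_date, end)
        if p.diagnosis_date is not None:
            # Time at risk stops at diagnosis.
            risk_end = min(risk_end, p.diagnosis_date)
        if risk_end <= risk_start:
            continue  # not at risk during the window (e.g. already diagnosed)
        days_at_risk += (risk_end - risk_start).days
        if (p.diagnosis_date is not None
                and start <= p.diagnosis_date < end
                and p.diagnosis_date <= p.exit_date):
            events += 1
    return events / (days_at_risk / 365.25) if days_at_risk else 0.0

cohort = [
    Patient(date(2020, 1, 1), date(2022, 12, 31), date(2021, 7, 1)),
    Patient(date(2020, 6, 1), date(2021, 6, 1)),
    Patient(date(2021, 3, 1), date(2022, 12, 31), date(2020, 5, 1)),  # diagnosed before entry
    Patient(date(2021, 1, 1), date(2021, 12, 31)),
]

prevalence = point_prevalence(cohort, date(2021, 10, 1))
rate = incidence_rate(cohort, date(2021, 1, 1), date(2022, 1, 1))
print(f"point prevalence = {prevalence:.3f}, incidence = {rate:.3f} per person-year")
```

The patient diagnosed before entry counts toward prevalence but contributes no time at risk, which is the kind of rule that has to be fixed explicitly when patients enter and leave a cohort at different times.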


Subject(s)
Electronic Health Records , Humans , Prevalence , Incidence , Cohort Studies , Electronic Health Records/statistics & numerical data , Software , Reproducibility of Results
11.
J Chem Phys ; 161(1)2024 Jul 07.
Article in English | MEDLINE | ID: mdl-38958156

ABSTRACT

Force Field X (FFX) is an open-source software package for atomic resolution modeling of genetic variants and organic crystals that leverages advanced potential energy functions and experimental data. FFX currently consists of nine modular packages with novel algorithms that include global optimization via a many-body expansion, acid-base chemistry using polarizable constant-pH molecular dynamics, estimation of free energy differences, generalized Kirkwood implicit solvent models, and many more. Applications of FFX focus on the use and development of a crystal structure prediction pipeline, biomolecular structure refinement against experimental datasets, and estimation of the thermodynamic effects of genetic variants on both proteins and nucleic acids. Parallel Java and OpenMM combine to offer shared memory, message passing, and graphics processing unit parallelization for high-performance simulations. Overall, the FFX platform serves as a computational microscope to study systems ranging from organic crystals to solvated biomolecular systems.


Subject(s)
Software , Molecular Dynamics Simulation , Genetic Variation , Algorithms , Thermodynamics , Proteins/chemistry , Crystallization , Nucleic Acids/chemistry
12.
PLoS One ; 19(7): e0305809, 2024.
Article in English | MEDLINE | ID: mdl-38954704

ABSTRACT

Chromatin exhibits a non-random distribution within the nucleus, being arranged into discrete domains that are spatially organized throughout the nuclear space. Both the spatial distribution and structural rearrangement of chromatin domains in the nucleus depend on epigenetic modifications of DNA and/or histones and structural elements such as the nuclear envelope. These components collectively contribute to the organization and rearrangement of chromatin domains, thereby influencing genome architecture and functional regulation. This study develops an innovative, user-friendly, ImageJ-based plugin, called IsoConcentraChromJ, aimed at quantitatively delineating the spatial distribution of chromatin regions in concentric patterns. IsoConcentraChromJ can be applied to quantitative chromatin analysis in both two- and three-dimensional spaces. After DNA and histone staining with fluorescent probes, high-resolution images of nuclei were obtained using advanced fluorescence microscopy approaches, including confocal and stimulated emission depletion (STED) microscopy. The IsoConcentraChromJ workflow comprises the following sequential steps: nucleus segmentation, thresholding, masking, normalization, and trisection with specified ratios for either 2D or 3D acquisitions. The effectiveness of IsoConcentraChromJ has been validated and demonstrated using experimental datasets consisting of nuclear images of pre-adipocytes and mature adipocytes, encompassing both 2D and 3D imaging. The outcomes allow the nuclear architecture to be characterized by calculating the ratios between specific concentric nuclear areas/volumes of acetylated chromatin with respect to total acetylated chromatin and/or total DNA. The novel IsoConcentraChromJ plugin could represent a valuable resource for researchers investigating the rearrangement of chromatin architecture driven by epigenetic mechanisms using nuclear images obtained by different fluorescence microscopy methods.
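
The masking, normalization, and trisection steps can be pictured with a small 2D sketch. This is not the plugin's code (IsoConcentraChromJ is an ImageJ plugin) and it simplifies heavily: zones are defined by normalized distance from the mask centroid, which is only reasonable for roughly round nuclei, and a plain intensity threshold stands in for segmentation. The function name `concentric_fractions`, the default ratios, and the synthetic image are assumptions.

```python
import numpy as np

def concentric_fractions(image, mask, ratios=(1 / 3, 1 / 3, 1 / 3)):
    """Fraction of total signal in each concentric zone, from centre outwards.

    `ratios` gives the radial width of each zone as a share of the
    normalized radius (an illustrative convention, not the plugin's).
    """
    ys, xs = np.nonzero(mask)                  # pixels inside the nucleus
    cy, cx = ys.mean(), xs.mean()              # mask centroid
    dist = np.hypot(ys - cy, xs - cx)
    dist = dist / dist.max()                   # normalize radius to [0, 1]
    edges = np.cumsum(ratios) / np.sum(ratios) # outer boundary of each zone
    zone = np.searchsorted(edges, dist, side="left")
    zone = np.minimum(zone, len(ratios) - 1)   # guard against rounding at the rim
    signal = image[ys, xs].astype(float)
    totals = np.bincount(zone, weights=signal, minlength=len(ratios))
    return totals / totals.sum()

# Synthetic nucleus: a uniformly stained disk of radius 30 on a dim background.
yy, xx = np.mgrid[:101, :101]
nucleus = np.hypot(yy - 50, xx - 50) <= 30
image = np.where(nucleus, 100.0, 2.0)
mask = image > 50                              # simple threshold as a stand-in

fractions = concentric_fractions(image, mask)
print(fractions)
```

For a uniformly stained disk the three zones hold roughly 1/9, 3/9, and 5/9 of the signal, simply because outer rings cover more area; departures from that baseline are what would indicate central or peripheral enrichment.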


Subject(s)
Cell Nucleus , Chromatin , Microscopy, Fluorescence , Chromatin/metabolism , Chromatin/genetics , Cell Nucleus/metabolism , Cell Nucleus/genetics , Animals , Mice , Microscopy, Fluorescence/methods , Humans , Histones/metabolism , Histones/genetics , Software , Imaging, Three-Dimensional/methods , Image Processing, Computer-Assisted/methods
13.
Nat Commun ; 15(1): 5573, 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38956036

ABSTRACT

Recent advancements in genome assembly have greatly improved the prospects for comprehensive annotation of Transposable Elements (TEs). However, existing methods for TE annotation using genome assemblies suffer from limited accuracy and robustness, requiring extensive manual editing. In addition, the currently available gold-standard TE databases are not comprehensive, even for extensively studied species, highlighting the critical need for an automated TE detection method to supplement existing repositories. In this study, we introduce HiTE, a fast and accurate dynamic boundary adjustment approach designed to detect full-length TEs. The experimental results demonstrate that HiTE outperforms RepeatModeler2, the state-of-the-art tool, across various species. Furthermore, HiTE has identified numerous novel transposons with well-defined structures containing protein-coding domains, some of which are directly inserted within crucial genes, leading to direct alterations in gene expression. A Nextflow version of HiTE is also available, with enhanced parallelism, reproducibility, and portability.


Subject(s)
DNA Transposable Elements , Molecular Sequence Annotation , DNA Transposable Elements/genetics , Molecular Sequence Annotation/methods , Animals , Software , Humans , Reproducibility of Results , Computational Biology/methods , Databases, Genetic , Algorithms , Genome/genetics
14.
BMC Bioinformatics ; 25(1): 228, 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38956506

ABSTRACT

BACKGROUND: Fungi play a key role in several important ecological functions, ranging from organic matter decomposition to symbiotic associations with plants. Moreover, fungi naturally inhabit the human body and can be beneficial when administered as probiotics. In mycology, the internal transcribed spacer (ITS) region was adopted as the universal marker for classifying fungi. Hence, an accurate and robust method for ITS classification is not only desired for the purpose of better diversity estimation, but it can also help us gain a deeper insight into the dynamics of environmental communities and ultimately comprehend whether the abundance of certain species correlates with health and disease. Although many methods have been proposed for taxonomic classification, to the best of our knowledge, none of them fully explore the taxonomic tree hierarchy when building their models. This, in turn, leads to lower generalization power and a higher risk of committing classification errors. RESULTS: Here we introduce HiTaC, a robust hierarchical machine learning model for accurate ITS classification, which requires a small amount of data for training and can handle imbalanced datasets. HiTaC was thoroughly evaluated with the established TAXXI benchmark and could correctly classify fungal ITS sequences of varying lengths and a range of identity differences between the training and test data. HiTaC outperforms state-of-the-art methods when trained over noisy data, consistently achieving higher F1-score and sensitivity across different taxonomic ranks, improving sensitivity by 6.9 percentage points over top methods in the noisiest dataset available on TAXXI. CONCLUSIONS: HiTaC is publicly available at the Python package index, BIOCONDA and Docker Hub. It is released under the new BSD license, allowing free use in academia and industry. Source code and documentation, which includes installation and usage instructions, are available at https://gitlab.com/dacs-hpi/hitac .


Subject(s)
Fungi , Machine Learning , Fungi/genetics , Fungi/classification , DNA, Ribosomal Spacer/genetics , Software
15.
IUCrJ ; 11(Pt 4): 643-644, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38958017

ABSTRACT

The manuscript "Modeling a unit cell: crystallographic refinement procedure using the biomolecular MD simulation platform Amber" presents a novel protein structure refinement method claimed to offer improvements over traditional techniques like Refmac5 and Phenix. Our re-evaluation suggests that while the new method provides improvements, traditional methods achieve comparable results with less computational effort.


Subject(s)
Molecular Dynamics Simulation , Proteins , Proteins/chemistry , Crystallography, X-Ray , Protein Conformation , Macromolecular Substances/chemistry , Software , Models, Molecular
16.
Genome Biol ; 25(1): 177, 2024 Jul 04.
Article in English | MEDLINE | ID: mdl-38965579

ABSTRACT

Identifying viruses from metagenomes is a common step to explore the virus composition in the human gut. Here, we introduce VirRep, a hybrid language representation learning framework, for identifying viruses from human gut metagenomes. VirRep combines a context-aware encoder and an evolution-aware encoder to improve sequence representation by incorporating k-mer patterns and sequence homologies. Benchmarking on both simulated and real datasets with varying viral proportions demonstrates that VirRep outperforms state-of-the-art methods. When applied to fecal metagenomes from a colorectal cancer cohort, VirRep identifies 39 high-quality viral species associated with the disease, many of which cannot be detected by existing methods.


Subject(s)
Gastrointestinal Microbiome , Metagenome , Humans , Viruses/genetics , Feces/virology , Metagenomics/methods , Software , Colorectal Neoplasms/virology , Colorectal Neoplasms/genetics
17.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38975894

ABSTRACT

Chimeric antigen receptor (CAR) therapy has emerged as a ground-breaking advancement in cancer treatment, harnessing the power of engineered human immune cells to target and eliminate cancer cells. The escalating interest and investment in CAR therapy in recent years emphasize its profound significance in clinical research, positioning it as a rapidly expanding frontier in the field of personalized cancer therapies. A crucial step in CAR therapy design is choosing the right target, as it determines the therapy's effectiveness, safety and specificity against cancer cells while sparing healthy tissues. Herein, we propose a suite of tools for the identification and analysis of potential CAR targets leveraging expression data from The Cancer Genome Atlas and Genotype-Tissue Expression Project, which are implemented in the CARTAR website. These tools focus on pinpointing tumor-associated antigens, ensuring target selectivity and assessing specificity to avoid off-tumor toxicities, and can be used to rationally design dual CARs. In addition, candidate target expression can be explored in cancer cell lines using the expression data from the Cancer Cell Line Encyclopedia. To the best of our knowledge, CARTAR is the first website dedicated to the systematic search of suitable candidate targets for CAR therapy. CARTAR is publicly accessible at https://gmxenomica.github.io/CARTAR/.


Subject(s)
Neoplasms , Receptors, Chimeric Antigen , Humans , Receptors, Chimeric Antigen/genetics , Receptors, Chimeric Antigen/metabolism , Receptors, Chimeric Antigen/immunology , Neoplasms/therapy , Neoplasms/genetics , Immunotherapy, Adoptive/methods , Software , Internet , Computational Biology/methods , Databases, Genetic
18.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38975892

ABSTRACT

Understanding the biological functions and processes of genes, particularly those not yet characterized, is crucial for advancing molecular biology and identifying therapeutic targets. The hypothesis guiding this study is that the 3D proximity of genes correlates with their functional interactions and relevance in prokaryotes. We introduced 3D-GeneNet, an innovative software tool that utilizes high-throughput sequencing data from chromosome conformation capture techniques and integrates topological metrics to construct gene association networks. Through a series of comparative analyses focused on spatial versus linear distances, we explored various dimensions such as topological structure, functional enrichment levels, distribution patterns of linear distances among gene pairs, and the area under the receiver operating characteristic curve, using the model organism Escherichia coli K-12. Furthermore, 3D-GeneNet was shown to maintain good accuracy compared to multiple algorithms (neighbourhood, co-occurrence, coexpression, and fusion) across multiple bacteria, including E. coli, Brucella abortus, and Vibrio cholerae. In addition, the accuracy of 3D-GeneNet's prediction of long-distance gene interactions was assessed by bacterial two-hybrid assays on E. coli K-12 MG1655, where 3D-GeneNet not only tripled the accuracy obtained with linear genomic distance but also achieved 60% accuracy when run alone. Finally, it can be concluded that the applicability of 3D-GeneNet will extend to various bacterial forms, including Gram-negative, Gram-positive, single-, and multi-chromosomal bacteria through Hi-C sequencing and analysis. Such findings highlight the broad applicability and significant promise of this method in the realm of gene association networks. 3D-GeneNet is freely accessible at https://github.com/gaoyuanccc/3D-GeneNet.


Subject(s)
Gene Regulatory Networks , Software , Algorithms , Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods , Escherichia coli K12/genetics , Escherichia coli K12/metabolism , Escherichia coli/genetics , Escherichia coli/metabolism
19.
Protein Sci ; 33(8): e5096, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38979954

ABSTRACT

Nuclear magnetic resonance (NMR) crystallography is one of the main methods in structural biology for analyzing protein stereochemistry and structure. The chemical shift of the resonance frequency reflects the effect of the protons in a molecule producing distinct NMR signals in different chemical environments. Apprehending chemical shifts from NMR signals can be challenging since having an NMR structure does not necessarily provide all the required chemical shift information, making predictive models essential for accurately deducing chemical shifts, either from protein structures or, more ideally, directly from amino acid sequences. Here, we present EFG-CS, a web server that specializes in chemical shift prediction. EFG-CS employs a machine learning-based transfer prediction model for backbone atom chemical shift prediction, using ESMFold-predicted protein structures. Additionally, EFG-CS incorporates a graph neural network-based model to provide comprehensive side-chain atom chemical shift predictions. Our method demonstrated reliable performance in backbone atom prediction, achieving comparable accuracy levels with root mean square errors (RMSE) of 0.30 ppm for H, 0.22 ppm for Hα, 0.89 ppm for C, 0.89 ppm for Cα, 0.84 ppm for Cβ, and 1.69 ppm for N. Moreover, our approach also showed predictive capabilities in side-chain atom chemical shift prediction, achieving RMSE values of 0.71 ppm for Hβ, 0.74-1.15 ppm for Hδ, and 0.58-0.94 ppm for Hγ, solely utilizing amino acid sequences without homology or feature curation. This work shows for the first time that generative AI protein models can predict NMR shifts nearly comparable to experimental models. This web server is freely available at https://biosig.lab.uq.edu.au/efg_cs, and the chemical shift prediction results can be downloaded in tabular format and visualized in 3D format.


Subject(s)
Deep Learning , Machine Learning , Proteins , Proteins/chemistry , Nuclear Magnetic Resonance, Biomolecular , Software , Protein Conformation , Amino Acid Sequence , Models, Molecular
20.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38980375

ABSTRACT

Structural variation (SV) is an important form of genomic variation that influences gene function and expression by altering the structure of the genome. Although long-read data have been proven to better characterize SVs, SVs detected from noisy long-read data still include a considerable portion of false-positive calls. To accurately detect SVs in long-read data, we present SVDF, a method that employs a learning-based noise filtering strategy and an SV signature-adaptive clustering algorithm, for effectively reducing the likelihood of false-positive events. Benchmarking results from multiple orthogonal experiments demonstrate that, across different sequencing platforms and depths, SVDF achieves higher calling accuracy for each sample compared to several existing general SV calling tools. We believe that, with its meticulous and sensitive SV detection capability, SVDF can bring new opportunities and advancements to cutting-edge genomic research.


Subject(s)
Algorithms , Humans , Sequence Analysis, DNA/methods , High-Throughput Nucleotide Sequencing/methods , Genomics/methods , Genomic Structural Variation , Software