Results 1 - 20 of 239
1.
Mol Cell Proteomics ; 23(6): 100777, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38670310

ABSTRACT

Transmembrane (TM) proteins constitute over 30% of the mammalian proteome and play essential roles in mediating cell-cell communication, synaptic transmission, and plasticity in the central nervous system. Many of these proteins, especially the G protein-coupled receptors (GPCRs), are validated or candidate drug targets for therapeutic development for mental diseases, yet their expression profiles are underrepresented in most global proteomic studies. Herein, we establish a brain TM protein-enriched spectral library based on 136 data-dependent acquisition runs acquired from various brain regions of both naïve mice and mental disease models. This spectral library comprises 3043 TM proteins including 171 GPCRs, 231 ion channels, and 598 transporters. Leveraging this library, we analyzed the data-independent acquisition data from different brain regions of two mouse models exhibiting depression- or anxiety-like behaviors. By integrating multiple informatics workflows and library sources, our study significantly expanded the mental stress-perturbed TM proteome landscape, from which a new GPCR regulator of depression was verified by in vivo pharmacological testing. In summary, we provide a high-quality mouse brain TM protein spectral library to largely increase the TM proteome coverage in specific brain regions, which would catalyze the discovery of new potential drug targets for the treatment of mental disorders.


Subjects
Brain; Disease Models, Animal; Mental Disorders; Mice, Inbred C57BL; Proteome; Proteomics; Animals; Proteome/metabolism; Brain/metabolism; Proteomics/methods; Mice; Mental Disorders/metabolism; Membrane Proteins/metabolism; Male; Receptors, G-Protein-Coupled/metabolism
2.
Mol Biol Evol ; 41(4)2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38507648

ABSTRACT

Population genomic analyses such as inference of population structure and identification of signatures of selection usually involve the application of a plethora of tools. The installation of tools and their dependencies, data transformation, or series of data preprocessing steps in a particular order sometimes makes the analyses challenging. While the usage of container-based technologies has significantly resolved the problems associated with the installation of tools and their dependencies, population genomic analyses requiring multistep pipelines or complex data transformation can be greatly facilitated by the application of workflow management systems such as Nextflow and Snakemake. Here, we present scalepopgen, a collection of fully automated workflows that can carry out widely used population genomic analyses on biallelic single nucleotide polymorphism data stored in either variant call format files or plink-generated binary files. scalepopgen is developed in Nextflow and can be run locally or on high-performance computing systems using Conda, Singularity, or Docker. The automated workflow includes procedures such as (i) filtering of individuals and genotypes; (ii) principal component analysis and admixture analysis with identification of optimal K-values; (iii) running TreeMix analysis with or without bootstrapping and migration edges, followed by identification of the optimal number of migration edges; and (iv) implementing single-population and pairwise population-comparison procedures to identify genomic signatures of selection. The pipeline uses various open-source tools; additionally, several Python and R scripts are provided to collect and visualize the results. The tool is freely available at https://github.com/Popgen48/scalepopgen.
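Two of the pipeline stages listed above, genotype filtering and principal component analysis, are simple enough to sketch directly. This is an illustrative re-implementation rather than scalepopgen code; the genotype coding (0/1/2 allele counts, -1 for missing) and the thresholds are assumptions.

```python
import numpy as np

def filter_and_pca(genotypes, max_missing=0.1, min_maf=0.05, n_pc=2):
    """Filter a samples x SNPs matrix (0/1/2 allele counts, -1 = missing),
    then project samples onto principal components via SVD."""
    G = np.asarray(genotypes, dtype=float)
    missing = G < 0
    keep = missing.mean(axis=0) <= max_missing          # drop high-missingness SNPs
    G, missing = G[:, keep], missing[:, keep]
    G = np.where(missing, np.nan, G)
    means = np.nanmean(G, axis=0)
    G = np.where(np.isnan(G), means, G)                 # mean-impute remaining gaps
    p = means / 2.0                                     # alternate-allele frequency
    G = G[:, np.minimum(p, 1 - p) >= min_maf]           # minor-allele-frequency filter
    X = G - G.mean(axis=0)
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :n_pc] * S[:n_pc]

# two diverged groups of two samples each; PC1 should separate them
genotypes = [[0, 0, 0, 2, 2, 2], [0, 1, 0, 2, 2, 2],
             [2, 2, 2, 0, 0, 0], [2, 2, 2, 0, 1, 0]]
pcs = filter_and_pca(genotypes)
```

In a real pipeline, tools such as PLINK perform these steps at scale; the sketch only shows the logic a workflow engine chains together.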


Subjects
Metagenomics; Software; Humans; Workflow; Genomics/methods; Computational Biology/methods
3.
BMC Bioinformatics ; 25(1): 200, 2024 May 27.
Article in English | MEDLINE | ID: mdl-38802733

ABSTRACT

BACKGROUND: The initial version of SEDA assists life science researchers without programming skills in preparing DNA and protein sequence FASTA files for multiple bioinformatics applications. However, it lacks a command-line interface for more advanced users and does not allow the creation of automated analysis pipelines. RESULTS: The present paper describes the updates of the new SEDA release, including a complete command-line interface, new functionalities such as gene annotation, a framework for automated pipelines, and improved integration in Linux environments. CONCLUSION: SEDA is an open-source Java application that can be installed using the different distributions available ( https://www.sing-group.org/seda/download.html ) as well as through a Docker image ( https://hub.docker.com/r/pegi3s/seda ). It is released under a GPL-3.0 license, and its source code is publicly accessible on GitHub ( https://github.com/sing-group/seda ). The software version at the time of submission is archived at Zenodo (version v1.6.0, http://doi.org/10.5281/zenodo.10201605 ).


Assuntos
Biologia Computacional , Software , Biologia Computacional/métodos , Análise de Dados
4.
BMC Bioinformatics ; 25(1): 11, 2024 Jan 04.
Article in English | MEDLINE | ID: mdl-38177985

ABSTRACT

BACKGROUND: Machine learning (ML) has a rich history in structural bioinformatics, and modern approaches, such as deep learning, are revolutionizing our knowledge of the subtle relationships between biomolecular sequence, structure, function, dynamics and evolution. As with any advance that rests upon statistical learning approaches, the recent progress in biomolecular sciences is enabled by the availability of vast volumes of sufficiently-variable data. To be useful, such data must be well-structured, machine-readable, intelligible and manipulable. These and related requirements pose challenges that become especially acute at the computational scales typical in ML. Furthermore, in structural bioinformatics such data generally relate to protein three-dimensional (3D) structures, which are inherently more complex than sequence-based data. A significant and recurring challenge concerns the creation of large, high-quality, openly-accessible datasets that can be used for specific training and benchmarking tasks in ML pipelines for predictive modeling projects, along with reproducible splits for training and testing. RESULTS: Here, we report 'Prop3D', a platform that allows for the creation, sharing and extensible reuse of libraries of protein domains, featurized with biophysical and evolutionary properties that can range from detailed, atomically-resolved physicochemical quantities (e.g., electrostatics) to coarser, residue-level features (e.g., phylogenetic conservation). As a community resource, we also supply a 'Prop3D-20sf' protein dataset, obtained by applying our approach to CATH . We have developed and deployed the Prop3D framework, both in the cloud and on local HPC resources, to systematically and reproducibly create comprehensive datasets via the Highly Scalable Data Service ( HSDS ). Our datasets are freely accessible via a public HSDS instance, or they can be used with accompanying Python wrappers for popular ML frameworks. 
CONCLUSION: Prop3D and its associated Prop3D-20sf dataset can be of broad utility in at least three ways. Firstly, the Prop3D workflow code can be customized and deployed on various cloud-based compute platforms, with scalability achieved largely by saving the results to distributed HDF5 files via HSDS . Secondly, the linked Prop3D-20sf dataset provides a hand-crafted, already-featurized dataset of protein domains for 20 highly-populated CATH families; importantly, provision of this pre-computed resource can aid the more efficient development (and reproducible deployment) of ML pipelines. Thirdly, Prop3D-20sf's construction explicitly takes into account (in creating datasets and data-splits) the enigma of 'data leakage', stemming from the evolutionary relationships between proteins.
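Prop3D guards against the "data leakage" named in the conclusion by building train/test splits that respect evolutionary relatedness: all members of a family land on the same side of the boundary. The idea generalises to any grouped data and can be sketched with a generic group-aware split (a toy illustration, not Prop3D's code; the domain and family names are invented).

```python
import random

def group_split(items, group_of, test_frac=0.2, seed=0):
    """Split items into train/test so that every member of a group
    (e.g. a CATH family) lands on the same side, preventing the
    leakage a naive per-item random split would cause."""
    groups = {}
    for item in items:
        groups.setdefault(group_of(item), []).append(item)
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)
    train, test, target = [], [], test_frac * len(items)
    for key in keys:
        (test if len(test) < target else train).extend(groups[key])
    return train, test

# 20 hypothetical domains in 4 families of 5
domains = [("dom%d" % i, "fam%d" % (i // 5)) for i in range(20)]
train, test = group_split(domains, group_of=lambda d: d[1])
```

Because whole families move as a unit, no test-set domain has a close evolutionary relative in the training set, which is the property the Prop3D-20sf splits are designed to guarantee.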


Subjects
Computational Biology; Proteins; Humans; Phylogeny; Computational Biology/methods; Workflow; Machine Learning
5.
J Microsc ; 295(2): 93-101, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38532662

ABSTRACT

As microscopy diversifies and becomes ever more complex, quantification of microscopy images has emerged as a major roadblock for many researchers. All researchers must face certain challenges in turning microscopy images into answers, independent of their scientific question and the images they have generated. Challenges may arise at many stages of the analysis process, including handling of image files, image pre-processing, object finding or measurement, and statistical analysis. While the exact solution required for each obstacle will be problem-specific, by keeping analysis in mind, optimizing data quality, understanding tools and tradeoffs, breaking workflows and datasets into chunks, talking to experts, and thoroughly documenting what has been done, analysts at any experience level can learn to overcome these challenges and create better and easier image analyses.

6.
Crit Rev Food Sci Nutr ; : 1-22, 2024 Jan 11.
Article in English | MEDLINE | ID: mdl-38206576

ABSTRACT

Over the past decade, a remarkable surge in the development of functional nano-delivery systems loaded with bioactive compounds for healthcare has been witnessed. Notably, the demanding requirements of high solubility, prolonged circulation, high tissue penetration capability, and strong targeting ability of nanocarriers have posed interdisciplinary research challenges to the community. While extensive experimental studies have been conducted to understand the construction of nano-delivery systems and their metabolic behavior in vivo, less is known about the molecular mechanisms and kinetic pathways of their metabolism in vivo, and effective means for high-throughput screening are lacking. Molecular dynamics (MD) simulation techniques provide a reliable tool for investigating the design of nano-delivery carriers encapsulating these functional ingredients, elucidating the synthesis, translocation, and delivery of nanocarriers. This review introduces the basic principles of MD, discusses how to apply MD simulation to the design of nanocarriers, evaluates the ability of nanocarriers to adhere to or cross the gastrointestinal mucosa, and examines their regulation of plasma proteins in vivo. Moreover, we present the critical role of MD simulation in developing delivery systems for precision nutrition and prospects for the future. This review aims to provide insights into the implications of MD simulation techniques for designing and optimizing nano-delivery systems in the healthcare food industry.

7.
J Comput Aided Mol Des ; 38(1): 24, 2024 Jul 17.
Article in English | MEDLINE | ID: mdl-39014286

ABSTRACT

Molecular dynamics (MD) simulation is a powerful tool for characterizing ligand-protein conformational dynamics and offers significant advantages over docking and other rigid structure-based computational methods. However, setting up, running, and analyzing MD simulations continues to be a multi-step process making it cumbersome to assess a library of ligands in a protein binding pocket using MD. We present an automated workflow that streamlines setting up, running, and analyzing Desmond MD simulations for protein-ligand complexes using machine learning (ML) models. The workflow takes a library of pre-docked ligands and a prepared protein structure as input, sets up and runs MD with each protein-ligand complex, and generates simulation fingerprints for each ligand. Simulation fingerprints (SimFP) capture protein-ligand compatibility, including stability of different ligand-pocket interactions and other useful metrics that enable easy rank-ordering of the ligand library for pocket optimization. SimFPs from a ligand library are used to build & deploy ML models that predict binding assay outcomes and automatically infer important interactions. Unlike relative free-energy methods that are constrained to assess ligands with high chemical similarity, ML models based on SimFPs can accommodate diverse ligand sets. We present two case studies on how SimFP helps delineate structure-activity relationship (SAR) trends and explain potency differences across matched-molecular pairs of (1) cyclic peptides targeting PD-L1 and (2) small molecule inhibitors targeting CDK9.
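The idea of a simulation fingerprint, per-interaction stability fractions computed over MD frames and then used to rank-order a ligand library, can be sketched as follows. This is a toy illustration under assumed data structures, not the paper's SimFP implementation; the residue contacts and ligand names are hypothetical.

```python
def simulation_fingerprint(frames, interactions):
    """SimFP-style vector: for each candidate ligand-pocket interaction,
    the fraction of MD frames in which it is observed (its stability)."""
    return [sum(i in frame for frame in frames) / len(frames) for i in interactions]

interactions = ["hbond:Asp86", "pi:Phe103", "salt:Lys48"]  # hypothetical contacts
trajectories = {  # per-ligand, frame-by-frame sets of observed interactions (toy data)
    "ligA": [{"hbond:Asp86", "pi:Phe103"}] * 8 + [{"pi:Phe103"}] * 2,
    "ligB": [{"salt:Lys48"}] * 3 + [set()] * 7,
}
fps = {lig: simulation_fingerprint(fr, interactions) for lig, fr in trajectories.items()}
ranked = sorted(fps, key=lambda lig: -sum(fps[lig]))  # most stable binder first
```

In the paper these vectors additionally feed ML models that predict assay outcomes; the sketch only shows why such fingerprints make diverse ligands directly comparable, unlike relative free-energy methods.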


Subjects
Machine Learning; Molecular Dynamics Simulation; Protein Binding; Proteins; Ligands; Proteins/chemistry; Proteins/metabolism; Binding Sites; Molecular Docking Simulation; Protein Conformation; Workflow; Humans; Drug Design; Software
8.
J Biomed Inform ; 154: 104647, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38692465

ABSTRACT

OBJECTIVE: To use software, datasets, and data formats in the domain of Infectious Disease Epidemiology as a test collection to evaluate a novel M1 use case, which we introduce in this paper. M1 is a machine that upon receipt of a new digital object of research exhaustively finds all valid compositions of it with existing objects. METHOD: We implemented a data-format-matching-only M1 using exhaustive search, which we refer to as M1DFM. We then ran M1DFM on the test collection and used error analysis to identify needed semantic constraints. RESULTS: Precision of M1DFM search was 61.7%. Error analysis identified needed semantic constraints and needed changes in handling of data services. Most semantic constraints were simple, but one data format was sufficiently complex to be practically impossible to represent semantic constraints over, from which we conclude limitatively that software developers will have to meet the machines halfway by engineering software whose inputs are sufficiently simple that their semantic constraints can be represented, akin to the simple APIs of services. We summarize these insights as M1-FAIR guiding principles for composability and suggest a roadmap for progressively capable devices in the service of reuse and accelerated scientific discovery. CONCLUSION: Algorithmic search of digital repositories for valid workflow compositions has potential to accelerate scientific discovery but requires a scalable solution to the problem of knowledge acquisition about semantic constraints on software inputs. Additionally, practical limitations on the logical complexity of semantic constraints must be respected, which has implications for the design of software.
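The core of a data-format-matching-only M1 can be sketched as an exhaustive search over ordered pairs of digital objects whose declared formats align. This is an illustrative reconstruction under a simplified one-input/one-output model, not the authors' M1DFM code; the object names and formats are invented.

```python
from itertools import permutations

def compositions(objects):
    """Exhaustively enumerate valid pairwise compositions: object `a`
    can feed object `b` when a's declared output format equals b's
    declared input format (no semantic constraints are checked,
    which is exactly why precision suffers)."""
    return [
        (a["name"], b["name"])
        for a, b in permutations(objects, 2)
        if a["output"] == b["input"]
    ]

objects = [  # hypothetical digital objects with declared data formats
    {"name": "aligner", "input": "fasta", "output": "sam"},
    {"name": "sorter",  "input": "sam",   "output": "bam"},
    {"name": "caller",  "input": "bam",   "output": "vcf"},
]
valid = compositions(objects)
```

Format matching alone admits any pair with compatible formats, including semantically invalid ones; that gap is what the study's 61.7% precision quantifies and what semantic constraints would close.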


Assuntos
Software , Humanos , Semântica , Aprendizado de Máquina , Algoritmos , Bases de Dados Factuais
9.
Microsc Microanal ; 2024 Jun 21.
Article in English | MEDLINE | ID: mdl-38905154

ABSTRACT

There has been increasing interest in atom probe tomography (APT) for characterizing hydrated and biological materials. A major benefit of APT compared to microscopy techniques more commonly used in biology is its combination of outstanding three-dimensional (3D) spatial resolution and mass sensitivity. APT has already been used successfully to characterize biominerals, revealing key structural information at the atomic scale; however, there are many challenges inherent to the analysis of soft hydrated materials. New preparation protocols, often involving specimen preparation and transfer at cryogenic temperature, enable APT analysis of hydrated materials and have the potential to enable 3D atomic-scale characterization of biological materials in a near-native hydrated state. In this study, samples of pure water at the tips of tungsten needle specimens were prepared at room temperature by graphene encapsulation. A comparative study was conducted in which specimens were transferred at either room temperature or cryogenic temperature and analyzed by APT while varying the flight path and pulsing mode. The differences between the analysis workflows are presented along with recommendations for future studies, and the compatibility between graphene coating and cryogenic workflows is demonstrated.

10.
Sensors (Basel) ; 24(7)2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38610587

ABSTRACT

This paper describes a novel architecture that aims to create a template for the implementation of an IT platform supporting the deployment and integration of the different digital twin subsystems that compose a complex urban intelligence system. In more detail, the proposed Smart City IT architecture has the following main purposes: (i) facilitating the deployment of the subsystems in a cloud environment; (ii) effectively storing, integrating, managing, and sharing the huge amount of heterogeneous data acquired and produced by each subsystem, using a data lake; (iii) supporting data exchange and sharing; (iv) managing and executing workflows to automatically coordinate and run processes; and (v) providing and visualizing the required information. A prototype of the proposed IT solution was implemented leveraging open-source frameworks and technologies to test its functionalities and performance. The results of tests performed in real-world settings confirmed that the proposed architecture can efficiently and easily support the deployment and integration of heterogeneous subsystems, allowing them to share and integrate their data and to select, extract, and visualize the information required by a user, as well as promoting integration with external systems and defining and executing workflows to orchestrate the various subsystems involved in complex analyses and processes.

11.
Int Nurs Rev ; 2024 Jul 08.
Article in English | MEDLINE | ID: mdl-38973347

ABSTRACT

AIM: This research examines the effects of artificial intelligence (AI)-based decision support systems (DSS) on the operational processes of nurses in critical care units (CCU) located in Amman, Jordan. BACKGROUND: The deployment of AI technology within the healthcare sector presents substantial opportunities for transforming patient care, with a particular emphasis on the field of nursing. METHOD: This paper examines how AI-based DSS affect CCU nursing workflows in Amman, Jordan, using a cross-sectional analysis. A study group of 112 registered nurses was enlisted throughout a research period spanning one month. Data were gathered using surveys that specifically examined several facets of nursing workflows, the employment of AI, encountered problems, and the sufficiency of training. RESULT: The findings indicate a varied demographic composition among the participants, with notable instances of AI technology adoption being reported. Nurses have the perception that there are favorable effects on time management, patient monitoring, and clinical decision-making. However, they continue to face persistent hurdles, including insufficient training, concerns regarding data privacy, and technical difficulties. DISCUSSION: The study highlights the significance of thorough training programs and supportive mechanisms to improve nurses' involvement with AI technologies and maximize their use in critical care environments. Although there are differing degrees of contentment with existing AI systems, there is a general agreement on the necessity of ongoing enhancement and fine-tuning to optimize their efficacy in enhancing patient care results. CONCLUSION AND IMPLICATIONS FOR NURSING AND/OR HEALTH POLICY: This research provides essential knowledge about the intricacies of incorporating AI into nursing practice, highlighting the significance of tackling obstacles to guarantee the ethical and efficient use of AI technology in healthcare.

12.
BMC Bioinformatics ; 24(1): 446, 2023 Nov 27.
Article in English | MEDLINE | ID: mdl-38012574

ABSTRACT

BACKGROUND: Galaxy is a web-based open-source platform for scientific analyses. Researchers use thousands of high-quality tools and workflows for their respective analyses in Galaxy. Tool recommender system predicts a collection of tools that can be used to extend an analysis. In this work, a tool recommender system is developed by training a transformer on workflows available on Galaxy Europe and its performance is compared to other neural networks such as recurrent, convolutional and dense neural networks. RESULTS: The transformer neural network achieves two times faster convergence, has significantly lower model usage (model reconstruction and prediction) time and shows a better generalisation that goes beyond training workflows than the older tool recommender system created using RNN in Galaxy. In addition, the transformer also outperforms CNN and DNN on several key indicators. It achieves a faster convergence time, lower model usage time, and higher quality tool recommendations than CNN. Compared to DNN, it converges faster to a higher precision@k metric (approximately 0.98 by transformer compared to approximately 0.9 by DNN) and shows higher quality tool recommendations. CONCLUSION: Our work shows a novel usage of transformers to recommend tools for extending scientific workflows. A more robust tool recommendation model, created using a transformer, having significantly lower usage time than RNN and CNN, higher precision@k than DNN, and higher quality tool recommendations than all three neural networks, will benefit researchers in creating scientifically significant workflows and exploratory data analysis in Galaxy. Additionally, the ability to train faster than all three neural networks imparts more scalability for training on larger datasets consisting of millions of tool sequences. Open-source scripts to create the recommendation model are available under MIT licence at https://github.com/anuprulez/galaxy_tool_recommendation_transformers.
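The precision@k metric cited above (roughly 0.98 for the transformer versus 0.9 for the DNN) measures how many of the top-k recommended tools are actually relevant at a given workflow position. A minimal sketch, with hypothetical tool names:

```python
def precision_at_k(recommended, relevant, k):
    """precision@k: the fraction of the top-k recommended tools that
    appear in the set of tools actually used next in real workflows."""
    return sum(tool in relevant for tool in recommended[:k]) / k

recommended = ["tool_a", "tool_b", "tool_c", "tool_d"]  # hypothetical model ranking
relevant = {"tool_b", "tool_c"}                         # tools truly used next
p_at_3 = precision_at_k(recommended, relevant, 3)       # 2 of the top 3 are relevant
```

In evaluation this is averaged over many held-out workflow positions, so a gap of 0.98 versus 0.9 reflects consistently better top-of-list recommendations.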


Assuntos
Redes Neurais de Computação , Software , Fluxo de Trabalho , Análise de Dados , Europa (Continente)
13.
Expert Rev Proteomics ; 20(11): 251-266, 2023.
Article in English | MEDLINE | ID: mdl-37787106

ABSTRACT

INTRODUCTION: Continuous advances in mass spectrometry (MS) technologies have enabled deeper and more reproducible proteome characterization and a better understanding of biological systems when integrated with other 'omics data. Bioinformatic resources meeting the analysis requirements of increasingly complex MS-based proteomic data and associated multi-omic data are critically needed. These requirements include the availability of software that spans diverse types of analyses, scalability for large-scale, compute-intensive applications, and mechanisms to ease adoption of the software. AREAS COVERED: The Galaxy ecosystem meets these requirements by offering a multitude of open-source tools for MS-based proteomics analyses and applications, all in an adaptable, scalable, and accessible computing environment. A thriving global community maintains these software tools and associated training resources to empower researcher-driven analyses. EXPERT OPINION: The community-supported Galaxy ecosystem remains a crucial contributor to basic biological and clinical studies using MS-based proteomics. In addition to describing the current status of Galaxy-based resources, we cover ongoing developments for meeting emerging challenges in MS-based proteomic informatics. We hope this review will catalyze increased use of Galaxy by researchers employing MS-based proteomics and inspire software developers to join the community and implement new tools, workflows, and associated training content that will add further value to this already rich ecosystem.


Assuntos
Proteômica , Humanos , Biologia Computacional/métodos , Espectrometria de Massas/métodos , Proteômica/métodos , Software
14.
J Biomed Inform ; 139: 104319, 2023 03.
Article in English | MEDLINE | ID: mdl-36791900

ABSTRACT

Despite the creation of thousands of machine learning (ML) models, the promise of improving patient care with ML remains largely unrealized. Adoption into clinical practice is lagging, in large part due to disconnects between how ML practitioners evaluate models and what is required for their successful integration into care delivery. Models are just one component of care delivery workflows whose constraints determine clinicians' abilities to act on models' outputs. However, methods to evaluate the usefulness of models in the context of their corresponding workflows are currently limited. To bridge this gap we developed APLUS, a reusable framework for quantitatively assessing via simulation the utility gained from integrating a model into a clinical workflow. We describe the APLUS simulation engine and workflow specification language, and apply it to evaluate a novel ML-based screening pathway for detecting peripheral artery disease at Stanford Health Care.
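The gap APLUS targets, that a model's value depends on the workflow it lands in, can be illustrated with a deliberately tiny simulation. This toy sketch is not the APLUS workflow specification language; it only shows how a capacity constraint (clinician review bandwidth) caps the utility realized from a model's flags. All numbers are invented.

```python
def simulate_utility(flags, labels, capacity_per_day, days=1):
    """Toy workflow simulation: a model flags patients (ranked by score),
    but clinicians can only review `capacity_per_day` flags per day, so
    realized utility = fraction of true positives actually acted on."""
    reviewed = flags[: capacity_per_day * days]   # queue truncated by capacity
    acted_tp = sum(labels[i] for i in reviewed)
    total_tp = sum(labels)
    return acted_tp / total_tp if total_tp else 0.0

labels = [True, False, True, True, False, False]  # ground-truth disease status
flags = [0, 2, 1, 4, 3]                           # patient indices, ranked by model score
utility = simulate_utility(flags, labels, capacity_per_day=3)
```

A model evaluated only by AUC would look identical under any capacity; simulating the downstream workflow, as APLUS does at much higher fidelity, reveals how much of that accuracy can actually be converted into care.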


Assuntos
Atenção à Saúde , Aprendizado de Máquina , Humanos , Simulação por Computador , Fluxo de Trabalho , Idioma
15.
J Digit Imaging ; 36(6): 2392-2401, 2023 12.
Article in English | MEDLINE | ID: mdl-37580483

ABSTRACT

Thyroid nodules occur in up to 68% of people, 95% of which are benign. Of the 5% of malignant nodules, many would not result in symptoms or death, yet 600,000 FNAs are still performed annually, with a PPV of 5-7% (up to 30%). Artificial intelligence (AI) systems have the capacity to improve diagnostic accuracy and workflow efficiency when integrated into clinical decision pathways. Previous studies have evaluated AI systems against physicians, whereas we aim to assess the benefits of incorporating AI into physicians' final diagnostic decisions. This work analyzed the potential for AI-based decision support systems to improve physician accuracy, variability, and efficiency. The decision support system (DSS) assessed was Koios DS, which provides automated sonographic nodule descriptor predictions and a direct cancer risk assessment aligned to ACR TI-RADS. The study was conducted retrospectively between August 2020 and October 2020. The case set included 650 patients (21% male, 79% female) aged 53 ± 15 years. Fifteen physicians assessed each case in the set, both unassisted and aided by the DSS. The order of the reading conditions was randomized, and reading blocks were separated by a period of 4 weeks. The system's impact on reader accuracy was measured by comparing the area under the ROC curve (AUC), sensitivity, and specificity of readers with and without the DSS, with FNA as ground truth. The impact on reader variability was evaluated using Pearson's correlation coefficient. The impact on efficiency was determined by comparing the average time per read. There was a statistically significant increase in average AUC of 0.083 [0.066, 0.099] and increases in sensitivity and specificity of 8.4% [5.4%, 11.3%] and 14% [12.5%, 15.5%], respectively, when aided by Koios DS. The average time per case decreased by 23.6% (p = 0.00017), and the observed Pearson's correlation coefficient increased from r = 0.622 to r = 0.876 when aided by Koios DS.
These results indicate that providing physicians with automated clinical decision support significantly improved diagnostic accuracy, as measured by AUC, sensitivity, and specificity, and reduced inter-reader variability and interpretation times.
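The accuracy metrics used in this study (sensitivity, specificity, and AUC against FNA ground truth) can be computed directly from reader risk scores. The scores and labels below are invented for illustration; the Mann-Whitney formulation of AUC is standard.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity of risk scores at a decision
    threshold, with `labels` as the FNA-style ground truth."""
    tp = sum(s >= threshold and y for s, y in zip(scores, labels))
    fn = sum(s < threshold and y for s, y in zip(scores, labels))
    tn = sum(s < threshold and not y for s, y in zip(scores, labels))
    fp = sum(s >= threshold and not y for s, y in zip(scores, labels))
    return tp / (tp + fn), tn / (tn + fp)

def auc(scores, labels):
    """AUC via the Mann-Whitney formulation: the probability that a
    malignant case receives a higher score than a benign one."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

scores = [0.9, 0.8, 0.4, 0.3, 0.2]         # invented reader risk scores
labels = [True, True, False, True, False]  # invented malignancy ground truth
sens, spec = sensitivity_specificity(scores, labels, threshold=0.5)
area = auc(scores, labels)
```

Unlike sensitivity and specificity, AUC is threshold-free, which is why studies like this one report all three: the pair shows performance at the operating point readers actually use, while AUC summarizes discrimination overall.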


Assuntos
Aprendizado Profundo , Nódulo da Glândula Tireoide , Humanos , Masculino , Feminino , Adulto , Pessoa de Meia-Idade , Idoso , Estudos Retrospectivos , Inteligência Artificial , Nódulo da Glândula Tireoide/patologia , Ultrassonografia/métodos
16.
Int J Mol Sci ; 24(18)2023 Sep 18.
Article in English | MEDLINE | ID: mdl-37762547

ABSTRACT

Macromolecular assemblies, such as protein complexes, undergo continuous structural dynamics, including global reconfigurations critical for their function. Two fast analytical methods are widely used to study these global dynamics, namely elastic network model normal mode analysis and principal component analysis of ensembles of structures. These approaches have found wide use in various computational studies, driving the development of complex pipelines in several software packages. One common theme has been conformational sampling through hybrid simulations incorporating all-atom molecular dynamics and global modes of motion. However, wide functionality is only available for experienced programmers with limited capabilities for other users. We have, therefore, integrated one popular and extensively developed software for such analyses, the ProDy Python application programming interface, into the Scipion workflow engine. This enables a wider range of users to access a complete range of macromolecular dynamics pipelines beyond the core functionalities available in its command-line applications and the normal mode wizard in VMD. The new protocols and pipelines can be further expanded and integrated into larger workflows, together with other software packages for cryo-electron microscopy image analysis and molecular simulations. We present the resulting plugin, Scipion-EM-ProDy, in detail, highlighting the rich functionality made available by its development.
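An elastic network model analysis of the kind ProDy automates can be sketched at its simplest as a Gaussian network model: build a Kirchhoff matrix from Cα contacts and diagonalise it, the low-frequency eigenvectors approximating global motions. This is a bare-bones illustration, not ProDy's API; the coordinates and cutoff are toy values.

```python
import numpy as np

def gnm_modes(coords, cutoff=7.0):
    """Gaussian network model sketch: build the Kirchhoff (graph
    Laplacian) matrix from pairwise contacts within `cutoff`, then
    diagonalise it; the smallest nonzero eigenvalues correspond to
    the slowest, most global modes of motion."""
    coords = np.asarray(coords, dtype=float)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    kirchhoff = -(dist < cutoff).astype(float)   # -1 for each contact pair
    np.fill_diagonal(kirchhoff, 0.0)
    np.fill_diagonal(kirchhoff, -kirchhoff.sum(axis=1))  # degree on the diagonal
    eigvals, eigvecs = np.linalg.eigh(kirchhoff)
    return eigvals, eigvecs

# toy 6-residue chain with 4 A spacing (only sequential neighbours in contact)
coords = [[4.0 * i, 0.0, 0.0] for i in range(6)]
vals, vecs = gnm_modes(coords)
```

The zero eigenvalue corresponds to rigid-body displacement; Scipion-EM-ProDy pipelines combine such modes with all-atom MD and PCA of structural ensembles for conformational sampling.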


Assuntos
Processamento de Imagem Assistida por Computador , Microscopia Crioeletrônica , Fluxo de Trabalho , Bases de Dados Factuais , Movimento (Física)
17.
Molecules ; 28(13)2023 Jun 25.
Article in English | MEDLINE | ID: mdl-37446648

ABSTRACT

Antioxidants play a significant role in human health, protecting against a variety of diseases. Therefore, the development of products with antioxidant activity is becoming increasingly prominent in the human lifestyle. New antioxidant drinks containing different percentages of pomegranate, blackberries, red grapes, and aronia have been designed, developed, and manufactured by a local industry. The comprehensive characterization of the drinks' constituents has been deemed necessary to evaluate their bioactivity. Thus, LC-qTOFMS has been selected, due to its sensitivity and structure identification capability. Both data-dependent and -independent acquisition modes have been utilized. The data have been treated according to a novel, newly designed workflow based on MS-DIAL and MZmine for suspect, as well as target screening. The classical MS-DIAL workflow has been modified to perform suspect and target screening in an automatic way. Furthermore, a novel methodology based on a compiled bioactivity-driven suspect list was developed and expanded with combinatorial enumeration to include metabolism products of the highlighted metabolites. Compounds belonging to ontologies with possible antioxidant capacity have been identified, such as flavonoids, amino acids, and fatty acids, which could be beneficial to human health, revealing the importance of the produced drinks as well as the efficacy of the new in-house developed workflow.
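At its core, the suspect screening performed here reduces to matching measured m/z values against a suspect list within a mass tolerance. A minimal sketch; the compound names and masses below are illustrative placeholders, not values from the study.

```python
def suspect_screen(peaks, suspects, ppm_tol=10.0):
    """Match measured m/z peaks against a suspect list within a
    parts-per-million mass tolerance, the first step of suspect
    screening in LC-HRMS workflows."""
    hits = []
    for mz, intensity in peaks:
        for name, theoretical in suspects:
            ppm_error = abs(mz - theoretical) / theoretical * 1e6
            if ppm_error <= ppm_tol:
                hits.append((name, mz, intensity))
    return hits

suspects = [("compound_A", 301.0354), ("compound_B", 287.0561)]  # placeholder masses
peaks = [(301.0351, 5.2e5), (450.1200, 1.1e4), (287.0559, 9.8e4)]
hits = suspect_screen(peaks, suspects)
```

Real workflows such as the MS-DIAL/MZmine pipeline described above additionally score retention time, isotope patterns, and MS/MS spectra before a match is accepted; m/z tolerance alone only narrows the candidates.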


Assuntos
Antioxidantes , Punica granatum , Humanos , Antioxidantes/farmacologia , Antioxidantes/química , Cromatografia Líquida/métodos , Fluxo de Trabalho
18.
BMC Bioinformatics ; 23(1): 156, 2022 May 02.
Article in English | MEDLINE | ID: mdl-35501696

ABSTRACT

BACKGROUND: Quantification of gene expression from RNA-seq data is a prerequisite for transcriptome analyses such as differential gene expression analysis and gene co-expression network construction. Individual RNA-seq experiments are growing larger, and combining multiple experiments from sequence repositories can result in datasets with thousands of samples. Processing hundreds to thousands of RNA-seq samples can present challenges related to data management, access to sufficient computational resources, navigation of high-performance computing (HPC) systems, installation of required software dependencies, and reproducibility. Processing of larger and deeper RNA-seq experiments will become more common as sequencing technology matures. RESULTS: GEMmaker is an nf-core-compliant Nextflow workflow that quantifies gene expression from small to massive RNA-seq datasets. GEMmaker ensures results are highly reproducible through the use of versioned, containerized software that can be executed on a single workstation, an institutional compute cluster, a Kubernetes platform, or the cloud. GEMmaker supports popular alignment and quantification tools, providing results in raw and normalized formats. GEMmaker is unique in that it can scale to process thousands of locally or remotely stored samples without exceeding available data storage. CONCLUSIONS: Workflows that quantify gene expression are not new, and many already address issues of portability, reusability, and scale in terms of access to CPUs. GEMmaker provides these benefits and adds the ability to scale despite limited data storage infrastructure. This allows users to process hundreds to thousands of RNA-seq samples even when data storage resources are limited. GEMmaker is freely available and fully documented with step-by-step setup and execution instructions.
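The normalized output formats a quantification workflow like this produces typically include TPM values, and the normalization itself is simple enough to sketch. This is the standard TPM formula, not GEMmaker code; the counts and lengths are invented.

```python
def tpm(counts, lengths_kb):
    """Transcripts Per Million: divide raw counts by transcript length
    (reads per kilobase), then rescale so each sample sums to 1e6,
    making expression values comparable across samples."""
    rpk = [c / l for c, l in zip(counts, lengths_kb)]
    per_million = sum(rpk) / 1e6
    return [r / per_million for r in rpk]

counts = [100, 300, 600]      # raw read counts for three genes in one sample
lengths_kb = [1.0, 3.0, 2.0]  # gene lengths in kilobases
vals = tpm(counts, lengths_kb)
```

Because every sample sums to the same total, TPM values from thousands of independently processed samples can be combined into one gene expression matrix, which is the use case GEMmaker is built for.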


Subjects
High-Throughput Nucleotide Sequencing , Software , High-Throughput Nucleotide Sequencing/methods , RNA-Seq , Reproducibility of Results , Sequence Analysis, RNA/methods
19.
BMC Genomics ; 23(1): 235, 2022 Mar 26.
Article in English | MEDLINE | ID: mdl-35346021

ABSTRACT

BACKGROUND: Whole genome sequencing analyzed by core genome multi-locus sequence typing (cgMLST) is widely used in surveillance of the pathogenic bacterium Listeria monocytogenes. Given the heterogeneity of available bioinformatics tools to define cgMLST alleles, our aim was to identify parameters influencing the precision of cgMLST profiles. METHODS: We used three L. monocytogenes reference genomes from different phylogenetic lineages and assessed the impact of in vitro parameters (i.e. tested genomes, successive platings, replicates of DNA extraction and sequencing) and in silico parameters (i.e. targeted depth of coverage, depth of coverage, breadth of coverage, assembly metrics, cgMLST workflows, cgMLST completeness) on the precision of cgMLST profiles comprising 1748 core loci. Six cgMLST workflows were tested, comprising assembly-based (BIGSdb, INNUENDO, GENPAT, SeqSphere and BioNumerics) and assembly-free (i.e. k-mer-based MentaLiST) allele callers. Principal component analyses and generalized linear models were used to identify the parameters with the greatest impact on cgMLST precision. RESULTS: The isolate's genetic background, cgMLST workflows, cgMLST completeness, as well as depth and breadth of coverage were the parameters with the greatest impact on cgMLST precision (i.e. identical alleles against reference circular genomes). All workflows performed well at a depth of coverage of ≥40X, with high loci detection (> 99.54% for all, except for BioNumerics with 97.78%) and showed consistent cluster definitions using the reference cut-off of ≤7 allele differences. CONCLUSIONS: This highlights that bioinformatics workflows dedicated to cgMLST allele calling are largely robust when paired-end reads are of high quality and when the sequencing depth is ≥40X.
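The cluster definitions in the abstract rest on counting allele differences between cgMLST profiles and grouping isolates within the ≤7-difference cut-off. A minimal sketch of that logic is shown below; it is illustrative only (the tools in the study use full 1748-locus schemes and their own distance and clustering implementations), and the isolate names and profiles are invented:

```python
def allele_distance(p1, p2):
    """Count differing alleles between two cgMLST profiles.

    Profiles map locus -> allele number; loci uncalled (None)
    in either profile are ignored, a common convention.
    """
    shared = [l for l in p1
              if l in p2 and p1[l] is not None and p2[l] is not None]
    return sum(1 for l in shared if p1[l] != p2[l])

def cluster(profiles, cutoff=7):
    """Single-linkage clustering: an isolate joins (and merges) every
    cluster containing a member within `cutoff` allele differences."""
    clusters = []
    for name, prof in profiles.items():
        hits = [c for c in clusters
                if any(allele_distance(prof, profiles[m]) <= cutoff
                       for m in c)]
        merged = {name}
        for c in hits:
            merged |= c
            clusters.remove(c)
        clusters.append(merged)
    return clusters

# Toy profiles over 10 loci (real schemes use 1748 core loci).
profiles = {
    "iso1": {f"L{i}": 1 for i in range(10)},
    "iso2": {f"L{i}": (2 if i < 3 else 1) for i in range(10)},  # 3 diffs vs iso1
    "iso3": {f"L{i}": (9 if i < 9 else 1) for i in range(10)},  # 9 diffs vs iso1
}
groups = cluster(profiles, cutoff=7)
```

With the ≤7 cut-off, iso1 and iso2 (3 differences) fall in one cluster while iso3 (9 differences) stands alone, mirroring how outbreak clusters are delimited in surveillance.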


Subjects
Listeria monocytogenes , Genome, Bacterial , Listeria monocytogenes/genetics , Multilocus Sequence Typing , Phylogeny , Whole Genome Sequencing
20.
Ecol Lett ; 25(6): 1345-1351, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35315961

ABSTRACT

Making predictions from ecological models, and comparing them to data, offers a coherent approach to evaluate model quality, regardless of model complexity or modelling paradigm. To date, our ability to use predictions for developing, validating, updating, integrating and applying models across scientific disciplines while influencing management decisions, policies, and the public has been hampered by disparate perspectives on prediction and inadequately integrated approaches. We present an updated foundation for Predictive Ecology based on seven principles applied to ecological modelling: make frequent Predictions, Evaluate models, make models Reusable, Freely accessible and Interoperable, built within Continuous workflows that are routinely Tested (PERFICT). We outline some benefits of working with these principles: accelerating science; linking with data science; and improving science-policy integration.
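The predict-then-evaluate loop at the heart of the PERFICT principles can be sketched in a few lines: forecast with a model, score the forecast against observations, and keep that check inside a routinely tested workflow. The toy logistic growth model and "observed" values below are invented for illustration and do not come from the paper:

```python
def logistic_step(n, r=0.5, k=100.0):
    """One step of a discrete logistic growth model."""
    return n + r * n * (1 - n / k)

def predict(n0, steps):
    """Forecast population size for `steps` time steps from n0."""
    out = [n0]
    for _ in range(steps):
        out.append(logistic_step(out[-1]))
    return out

def rmse(pred, obs):
    """Root-mean-square error between predictions and observations."""
    return (sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)) ** 0.5

forecast = predict(10.0, 3)
observed = [10.0, 14.0, 20.0, 27.0]   # invented field data
score = rmse(forecast, observed)
```

Running this evaluation automatically whenever the model or data change is what turns a one-off forecast into the "Continuous workflows that are routinely Tested" the principles call for.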


Subjects
Ecology , Models, Theoretical