Results 1 - 20 of 7,149
1.
Cell ; 187(6): 1316-1326, 2024 Mar 14.
Article in English | MEDLINE | ID: mdl-38490173

ABSTRACT

Understanding sex-related variation in health and illness requires rigorous and precise approaches to revealing underlying mechanisms. A first step is to recognize that sex is not in and of itself a causal mechanism; rather, it is a classification system comprising a set of categories, usually assigned according to a range of varying traits. Moving beyond sex as a system of classification to working with concrete and measurable sex-related variables is necessary for precision. Whether and how these sex-related variables matter, and what patterns of difference they contribute to, will vary in context-specific ways. Second, when researchers incorporate these sex-related variables into research designs, rigorous analytical methods are needed to allow strongly supported conclusions. Third, the interpretation and reporting of sex-related variation require care to ensure that basic and preclinical research advance health equity for all.


Subject(s)
Biomedical Research , Health Equity , Sex , Humans
2.
Physiol Rev ; 104(3): 1387-1408, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38451234

ABSTRACT

Effective data management is crucial for scientific integrity and reproducibility, a cornerstone of scientific progress. Well-organized and well-documented data enable validation and building on results. Data management encompasses activities including organization, documentation, storage, sharing, and preservation. Robust data management establishes credibility, fostering trust within the scientific community and benefiting researchers' careers. In experimental biomedicine, comprehensive data management is vital due to the typically intricate protocols, extensive metadata, and large datasets. Low-throughput experiments, in particular, require careful management to address variations and errors in protocols and raw data quality. Transparent and accountable research practices rely on accurate documentation of procedures, data collection, and analysis methods. Proper data management ensures long-term preservation and accessibility of valuable datasets. Well-managed data can be revisited, contributing to cumulative knowledge and potential new discoveries. Publicly funded research has an added responsibility for transparency, resource allocation, and avoiding redundancy. Meeting funding agency expectations increasingly requires rigorous methodologies, adherence to standards, comprehensive documentation, and widespread sharing of data, code, and other auxiliary resources. This review provides critical insights into raw and processed data, metadata, high-throughput versus low-throughput datasets, a common language for documentation, experimental and reporting guidelines, efficient data management systems, sharing practices, and relevant repositories. We systematically present available resources and optimal practices for wide use by experimental biomedical researchers.


Subject(s)
Biomedical Research , Data Management , Information Dissemination , Biomedical Research/standards , Biomedical Research/methods , Information Dissemination/methods , Humans , Animals , Data Management/methods
3.
Mol Cell ; 82(11): 2161-2166.e3, 2022 Jun 02.
Article in English | MEDLINE | ID: mdl-35623354

ABSTRACT

CRISPR systems are prokaryotic adaptive immune systems that use RNA-guided Cas nucleases to recognize and destroy foreign genetic elements. To overcome CRISPR immunity, bacteriophages have evolved diverse families of anti-CRISPR proteins (Acrs). Recently, Lin et al. (2020) described the discovery and characterization of 7 Acr families (AcrVIA1-7) that inhibit type VI-A CRISPR systems. We detail several inconsistencies that question the results reported in the Lin et al. (2020) study. These include inaccurate bioinformatics analyses and bacterial strains that are impossible to construct. Published strains were provided by the authors, but MS2 bacteriophage plaque assays did not support the published results. We also independently tested the Acr sequences described in the original report, in E. coli and mammalian cells, but did not observe anti-Cas13a activity. Taken together, our data and analyses prompt us to question the claim that AcrVIA1-7 reported in Lin et al. are type VI anti-CRISPR proteins.


Subject(s)
Bacteriophages , CRISPR-Associated Proteins , Animals , Bacteriophages/genetics , CRISPR-Associated Proteins/genetics , CRISPR-Associated Proteins/metabolism , CRISPR-Cas Systems , Escherichia coli/genetics , Escherichia coli/metabolism , Leptotrichia/genetics , Mammals/metabolism , Prophages/genetics , Prophages/metabolism , Ribonucleases/metabolism
4.
Proc Natl Acad Sci U S A ; 121(29): e2313851121, 2024 Jul 16.
Article in English | MEDLINE | ID: mdl-38976734

ABSTRACT

Mass spectrometry-based omics technologies are increasingly used in perturbation studies to map drug effects to biological pathways by identifying significant molecular events. Significance is influenced by fold change and variation of each molecular parameter, but also by multiple testing corrections. While the fold change is largely determined by the biological system, the variation is determined by experimental workflows. Here, it is shown that memory effects of prior subculture can influence the variation of perturbation profiles using the two colon carcinoma cell lines SW480 and HCT116. These memory effects are largely driven by differences in growth states that persist into the perturbation experiment. In SW480 cells, memory effects combined with moderate treatment effects amplify the variation in multiple omics levels, including eicosadomics, proteomics, and phosphoproteomics. With stronger treatment effects, the memory effect was less pronounced, as demonstrated in HCT116 cells. Subculture homogeneity was controlled by real-time monitoring of cell growth. Controlled homogeneous subculture resulted in a perturbation network of 321 causal conjectures based on combined proteomic and phosphoproteomic data, compared to only 58 causal conjectures without controlling subculture homogeneity in SW480 cells. Some cellular responses and regulatory events were identified that extend the mode of action of arsenic trioxide (ATO) only when accounting for these memory effects. Controlled prior subculture led to the finding of a synergistic combination treatment of ATO with the thioredoxin reductase 1 inhibitor auranofin, which may prove useful in the management of NRF2-mediated resistance mechanisms.


Subject(s)
Proteomics , Humans , Proteomics/methods , Cell Line, Tumor , HCT116 Cells , Cell Culture Techniques/methods , Colonic Neoplasms/metabolism , Colonic Neoplasms/drug therapy , Colonic Neoplasms/pathology , Arsenic Trioxide/pharmacology , Auranofin/pharmacology , Cell Proliferation/drug effects , Mass Spectrometry/methods
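
The abstract above notes that significance depends on multiple testing corrections as well as on fold change and variation. It does not say which correction was used, so the following is only a minimal sketch of one common choice, the Benjamini-Hochberg procedure. The function name and the p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Illustrative sketch only: the abstract does not state which
    multiple testing correction the authors applied.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for four molecular features.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
print(adj)
```

Because each adjusted value scales with the number of tests, the same raw p-value becomes less significant as more features are measured, which is why lower variation matters for detecting events in omics-scale data.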
5.
Development ; 150(11), 2023 Jun 01.
Article in English | MEDLINE | ID: mdl-37260362

ABSTRACT

Recent years have seen exciting progress across human embryo research, including new methods for culturing embryos, transcriptional profiling of embryogenesis and gastrulation, mapping lineage trajectories, and experimenting on stem cell-based embryo models. These advances are beginning to define the dynamical principles of development across stages, tissues and organs, enabling a better understanding of human development before birth in health and disease, and potentially leading to improved treatments for infertility and developmental disorders. However, there are still significant roadblocks en route to this goal. Here, we highlight technical challenges to studying early human development and propose ways and means to overcome some of these constraints.


Subject(s)
Embryonic Development , Gastrulation , Humans , Embryonic Development/genetics , Embryo, Mammalian , Stem Cells
6.
Brief Bioinform ; 25(3), 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38752856

ABSTRACT

Enhancing the reproducibility and comprehension of adaptive immune receptor repertoire sequencing (AIRR-seq) data analysis is critical for scientific progress. This study presents guidelines for reproducible AIRR-seq data analysis, and a collection of ready-to-use pipelines with comprehensive documentation. To this end, ten common pipelines were implemented using ViaFoundry, a user-friendly interface for pipeline management and automation. This is accompanied by versioned containers, documentation and archiving capabilities. The automation of pre-processing analysis steps and the ability to modify pipeline parameters according to specific research needs are emphasized. AIRR-seq data analysis is highly sensitive to varying parameters and setups; using the guidelines presented here, the ability to reproduce previously published results is demonstrated. This work promotes transparency, reproducibility, and collaboration in AIRR-seq data analysis, serving as a model for handling and documenting bioinformatics pipelines in other research domains.


Subject(s)
Computational Biology , Software , Humans , Computational Biology/methods , Reproducibility of Results , Receptors, Immunologic/genetics , High-Throughput Nucleotide Sequencing/methods , Adaptive Immunity/genetics , Guidelines as Topic
7.
Semin Cancer Biol ; 101: 12-24, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38657746

ABSTRACT

In 2014, the International Society for Extracellular Vesicles (ISEV) introduced the Minimal Information for Studies of Extracellular Vesicles (MISEV) guidelines to establish standards for extracellular vesicle (EV) research. These guidelines aimed to enhance reliability and reproducibility, addressing the expanding field of EV science. EVs, membrane-bound particles released by cells, play crucial roles in intercellular communication and are potential biomarkers for various conditions. Over the years, the EV landscape witnessed a surge in publications, emphasizing their roles in cancer and immune modulation. In response, the MISEV guidelines underwent evolution, leading to the MISEV2018 update. This version, generated through community outreach, provided a comprehensive framework for EV research methodologies, emphasizing separation, characterization, reporting standards, and community engagement. The MISEV2018 guidelines reflected responsiveness to feedback, acknowledging the evolving EV research landscape. The guidelines served as a testament to the commitment of the scientific community to rigorous standards and the collective discernment of experts. The present article compares previous MISEV guidelines with its 2023 counterpart, highlighting advancements, changes, and impacts on EV research standardization. The 2023 guidelines build upon the 2018 principles, offering new recommendations for emerging areas. This comparative exploration contributes to understanding the transformative journey in EV research, emphasizing MISEV's pivotal role and the scientific community's adaptability to challenges.


Subject(s)
Extracellular Vesicles , Extracellular Vesicles/metabolism , Humans , Neoplasms/therapy , Neoplasms/immunology , Guidelines as Topic , Biomedical Research/methods , Cell Communication
8.
Am J Hum Genet ; 109(5): 825-837, 2022 May 05.
Article in English | MEDLINE | ID: mdl-35523146

ABSTRACT

Transcriptome-wide association studies and colocalization analysis are popular computational approaches for integrating genetic-association data from molecular and complex traits. They show the unique ability to go beyond variant-level genetic-association evidence and implicate critical functional units, e.g., genes, in disease etiology. However, in practice, when the two approaches are applied to the same molecular and complex-trait data, the inference results can be markedly different. This paper systematically investigates the inferential reproducibility between the two approaches through theoretical derivation, numerical experiments, and analyses of four complex trait GWAS and GTEx eQTL data. We identify two classes of inconsistent inference results. We find that the first class of inconsistent results (i.e., genes with strong colocalization but weak transcriptome-wide association study [TWAS] signals) might suggest an interesting biological phenomenon, i.e., horizontal pleiotropy; thus, the two approaches are truly complementary. The inconsistency in the second class (i.e., genes with weak colocalization but strong TWAS signals) can be understood and effectively reconciled. To this end, we propose a computational approach for locus-level colocalization analysis. We demonstrate that the joint TWAS and locus-level colocalization analysis improves specificity and sensitivity for implicating biologically relevant genes.


Subject(s)
Genome-Wide Association Study , Transcriptome , Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Humans , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Reproducibility of Results , Transcriptome/genetics
9.
Brief Bioinform ; 24(6), 2023 Sep 22.
Article in English | MEDLINE | ID: mdl-37870287

ABSTRACT

Computational reproducibility is a simple premise in theory, but is difficult to achieve in practice. Building upon past efforts and proposals to maximize reproducibility and rigor in bioinformatics, we present a framework called the five pillars of reproducible computational research. These include (1) literate programming, (2) code version control and sharing, (3) compute environment control, (4) persistent data sharing and (5) documentation. These practices will ensure that computational research work can be reproduced quickly and easily, long into the future. This guide is designed for bioinformatics data analysts and bioinformaticians in training, but should be relevant to other domains of study.


Subject(s)
Computational Biology , Information Dissemination , Reproducibility of Results , Software
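
As a rough illustration of pillar (3), compute environment control, the sketch below records the interpreter version, platform, random seed, and package versions alongside an analysis. It only records the environment and does not control it (containers or lock files would be needed for that). The function name `capture_environment` and the package list are hypothetical and are not part of the framework described in the abstract.

```python
import json
import platform
import random
from importlib import metadata


def capture_environment(packages, seed):
    """Seed the RNG and return a JSON-serializable environment record.

    Hypothetical helper for illustration; not taken from the paper.
    """
    random.seed(seed)  # fix the seed so random draws can be repeated
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed here
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "seed": seed,
        "packages": versions,
    }


# The package names below are placeholders; list whatever the analysis imports.
record = capture_environment(["pip", "not-a-real-package-xyz"], seed=42)
print(json.dumps(record, indent=2, sort_keys=True))
```

Writing this record to a file next to the results, and committing it with the code, is one lightweight way to connect environment control with the version control and documentation pillars.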
10.
Brief Bioinform ; 24(4), 2023 Jul 20.
Article in English | MEDLINE | ID: mdl-37287135

ABSTRACT

Hi-C is a widely applied chromosome conformation capture (3C)-based technique, which has produced a large number of genomic contact maps with high sequencing depths for a wide range of cell types, enabling comprehensive analyses of the relationships between biological functionalities (e.g. gene regulation and expression) and the three-dimensional genome structure. Comparative analyses play significant roles in Hi-C data studies, which are designed to make comparisons between Hi-C contact maps, thus evaluating the consistency of replicate Hi-C experiments (i.e. reproducibility measurement) and detecting statistically differential interacting regions with biological significance (i.e. differential chromatin interaction detection). However, due to the complex and hierarchical nature of Hi-C contact maps, it remains challenging to conduct systematic and reliable comparative analyses of Hi-C data. Here, we proposed sslHiC, a contrastive self-supervised representation learning framework, for precisely modeling the multi-level features of chromosome conformation and automatically producing informative feature embeddings for genomic loci and their interactions to facilitate comparative analyses of Hi-C contact maps. Comprehensive computational experiments on both simulated and real datasets demonstrated that our method consistently outperformed the state-of-the-art baseline methods in providing reliable measurements of reproducibility and detecting differential interactions with biological meanings.


Subject(s)
Chromatin , Chromosomes , Reproducibility of Results , Chromatin/genetics , Chromosomes/genetics , Genomics/methods , Supervised Machine Learning
11.
Proc Natl Acad Sci U S A ; 119(30): e2120377119, 2022 Jul 26.
Article in English | MEDLINE | ID: mdl-35858443

ABSTRACT

This initiative systematically examined the extent to which a large set of archival research findings generalizes across contexts. We repeated the key analyses for 29 original strategic management effects in the same context (direct reproduction) as well as in 52 novel time periods and geographies; 45% of the reproductions returned results matching the original reports, as did 55% of tests in different spans of years and 40% of tests in novel geographies. Some original findings were associated with multiple new tests. Reproducibility was the best predictor of generalizability: for the findings that proved directly reproducible, 84% emerged in other available time periods and 57% emerged in other geographies. Overall, only limited empirical evidence emerged for context sensitivity. In a forecasting survey, independent scientists were able to anticipate which effects would find support in tests in new samples.

12.
Nano Lett ; 24(22): 6553-6559, 2024 Jun 05.
Article in English | MEDLINE | ID: mdl-38775731

ABSTRACT

New approaches such as selective area growth (SAG), where crystal growth is lithographically controlled, allow the integration of bottom-up grown semiconductor nanomaterials in large-scale classical and quantum nanoelectronics. This calls for assessment and optimization of the reproducibility between individual components. We quantify the structural and electronic statistical reproducibility within large arrays of nominally identical selective area growth InAs nanowires. The distribution of structural parameters is acquired through comprehensive atomic force microscopy studies and transmission electron microscopy. These are compared to the statistical distributions of the cryogenic electrical properties of 256 individual SAG nanowire field effect transistors addressed using cryogenic multiplexer circuits. Correlating measurements between successive thermal cycles allows distinguishing between the contributions of surface impurity scattering and fixed structural properties to device reproducibility. The results confirm the potential of SAG nanomaterials, and the methodologies for quantifying statistical metrics are essential for further optimization of reproducibility.

13.
Proteomics ; 24(10): e2300339, 2024 May.
Article in English | MEDLINE | ID: mdl-38299459

ABSTRACT

Detergent-based workflows incorporating sodium dodecyl sulfate (SDS) necessitate additional steps for detergent removal ahead of mass spectrometry (MS). These steps may lead to variable protein recovery, inconsistent enzyme digestion efficiency, and unreliable MS signals. To validate a detergent-based workflow for quantitative proteomics, we herein evaluate the precision of a bottom-up sample preparation strategy incorporating cartridge-based protein precipitation with organic solvent to deplete SDS. The variance of data-independent acquisition (SWATH-MS) data was isolated from sample preparation error by modelling the variance as a function of peptide signal intensity. Our SDS-assisted cartridge workflow yielded a coefficient of variance (CV) of 13%-14%. By comparison, conventional (detergent-free) in-solution digestion increased the CV to 50%; in-gel digestion provided lower CVs between 14% and 20%. By filtering peptides predicted to display lower precision, we further enhance the validity of data in global comparative proteomics. These results demonstrate that the detergent-based precipitation workflow is a reliable approach for in-depth, label-free quantitative proteome analysis.


Subject(s)
Chemical Precipitation , Detergents , Proteomics , Sodium Dodecyl Sulfate , Workflow , Proteomics/methods , Sodium Dodecyl Sulfate/chemistry , Detergents/chemistry , Proteome/analysis , Proteome/chemistry , Humans , Peptides/chemistry , Peptides/analysis
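
The abstract above compares workflows by their CV. As a minimal sketch, assuming the usual definition (sample standard deviation divided by the mean, expressed as a percentage), the calculation for one peptide across replicate preparations could look like the following. The intensities are made-up numbers, and the variance modelling described in the abstract is not reproduced here.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV of replicate measurements as a percentage."""
    if len(values) < 2:
        raise ValueError("need at least two replicate measurements")
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    return 100.0 * stdev(values) / m


# Hypothetical signal intensities for one peptide in five replicates.
replicates = [100.0, 110.0, 90.0, 105.0, 95.0]
cv = coefficient_of_variation(replicates)
print(f"CV = {cv:.1f}%")
```

Because the CV is normalized by the mean, it allows precision to be compared across peptides with very different signal intensities.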
14.
J Neurosci ; 43(46): 7780-7798, 2023 Nov 15.
Article in English | MEDLINE | ID: mdl-37709539

ABSTRACT

Animal studies have established that the mediodorsal nucleus (MD) of the thalamus is heavily and reciprocally connected with all areas of the prefrontal cortex (PFC). In humans, however, these connections are difficult to investigate. High-resolution imaging protocols capable of reliably tracing the axonal tracts linking the human MD with each of the PFC areas may thus be key to advancing our understanding of the variation, development, and plastic changes of these important circuits, in health and disease. Here, we tested in adult female and male humans the reliability of a new reconstruction protocol based on in vivo diffusion MRI to trace, measure, and characterize the fiber tracts interconnecting the MD with 39 human PFC areas per hemisphere. Our protocol comprised the following three components: (1) defining regions of interest; (2) preprocessing diffusion data; and (3) modeling white matter tracts and tractometry. This analysis revealed largely separate PFC territories of reciprocal MD-PFC tracts bearing a striking resemblance to the topographic layout observed in macaque connection-tracing studies. We then examined whether our protocol could reliably reconstruct each of these MD-PFC tracts and their profiles across test and retest sessions. Results revealed that this protocol was able to trace and measure, in both left and right hemispheres, the trajectories of these 39 area-specific axon bundles with good-to-excellent test-retest reproducibility. This protocol, which has been made publicly available, may be relevant for cognitive neuroscience and clinical studies of normal and abnormal PFC function, development, and plasticity.

SIGNIFICANCE STATEMENT: Reciprocal MD-PFC interactions are critical for complex human cognition and learning. Reliably tracing, measuring and characterizing MD-PFC white matter tracts using high-resolution noninvasive methods is key to assessing individual variation of these systems in humans. Here, we propose a high-resolution tractography protocol that reliably reconstructs 39 area-specific MD-PFC white matter tracts per hemisphere and quantifies structural information from diffusion MRI data. This protocol revealed a detailed mapping of thalamocortical and corticothalamic MD-PFC tracts in four different PFC territories (dorsal, medial, orbital/frontal pole, inferior frontal) showing structural connections resembling those observed in tracing studies with macaques. Furthermore, our automated protocol revealed high test-retest reproducibility and is made publicly available, constituting a step forward in mapping human MD-PFC circuits in clinical and academic research.


Subject(s)
Mediodorsal Thalamic Nucleus , Prefrontal Cortex , Adult , Animals , Humans , Male , Female , Reproducibility of Results , Prefrontal Cortex/diagnostic imaging , Thalamus , Cognition , Macaca , Neural Pathways/diagnostic imaging
15.
BMC Bioinformatics ; 25(1): 26, 2024 Jan 15.
Article in English | MEDLINE | ID: mdl-38225565

ABSTRACT

BACKGROUND: In recent years, human microbiome studies have received increasing attention as this field is considered a potential source for clinical applications. With the advancements in omics technologies and AI, research focused on the discovery of potential biomarkers in the human microbiome using machine learning tools has produced positive outcomes. Despite the promising results, several issues can still be found in these studies, such as datasets with a small number of samples, inconsistent results, and a lack of uniform processing and methodologies; these and other factors lead to a lack of reproducibility in biomedical research. In this work, we propose a methodology that combines the DADA2 pipeline for 16S rRNA sequence processing and Recursive Ensemble Feature Selection (REFS) in multiple datasets to increase reproducibility and obtain robust and reliable results in biomedical research. RESULTS: Three experiments were performed analyzing microbiome data from patients/cases in Inflammatory Bowel Disease (IBD), Autism Spectrum Disorder (ASD), and Type 2 Diabetes (T2D). In each experiment, we found a biomarker signature in one dataset and applied it to two others as further validation. The effectiveness of the proposed methodology was compared with other feature selection methods such as K-Best with F-score, and with random selection as a baseline. The Area Under the Curve (AUC) was employed as a measure of diagnostic accuracy and used as a metric for comparing the results of the proposed methodology with other feature selection methods. Additionally, we use the Matthews Correlation Coefficient (MCC) as a metric to evaluate the performance of the methodology as well as for comparison with other feature selection methods. CONCLUSIONS: We developed a methodology for reproducible biomarker discovery for 16S rRNA microbiome sequence analysis, addressing the issues related to data dimensionality, inconsistent results and validation across independent datasets. The findings from the three experiments, across 9 different datasets, show that the proposed methodology achieved higher accuracy compared to other feature selection methods. This methodology is a first approach to increasing reproducibility and providing robust and reliable results.


Subject(s)
Autism Spectrum Disorder , Biomedical Research , Diabetes Mellitus, Type 2 , Microbiota , Humans , RNA, Ribosomal, 16S/genetics , Reproducibility of Results , Diabetes Mellitus, Type 2/genetics , Machine Learning , Biomarkers , Microbiota/genetics
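
The abstract above uses the Matthews Correlation Coefficient (MCC) as one of its evaluation metrics. As a minimal sketch of how MCC is computed from a binary confusion matrix, the following uses the standard formula. The counts are hypothetical and are not results from the study.

```python
from math import sqrt


def matthews_corrcoef(tp, tn, fp, fn):
    """Return the MCC for a binary confusion matrix.

    By convention, 0.0 is returned when any marginal total is zero,
    because the denominator vanishes in that case.
    """
    numerator = tp * tn - fp * fn
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Made-up counts for a case/control classifier on 100 samples.
mcc = matthews_corrcoef(tp=40, tn=45, fp=5, fn=10)
print(f"MCC = {mcc:.3f}")
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, which makes it less sensitive to class imbalance than plain accuracy; that property is useful when validating a signature across datasets with different case/control ratios.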
16.
BMC Bioinformatics ; 25(1): 200, 2024 May 27.
Article in English | MEDLINE | ID: mdl-38802733

ABSTRACT

BACKGROUND: The initial version of SEDA assists life science researchers without programming skills with the preparation of DNA and protein sequence FASTA files for multiple bioinformatics applications. However, the initial version of SEDA lacks a command-line interface for more advanced users and does not allow the creation of automated analysis pipelines. RESULTS: The present paper discusses the updates of the new SEDA release, including the addition of a complete command-line interface, new functionalities like gene annotation, a framework for automated pipelines, and improved integration in Linux environments. CONCLUSION: SEDA is an open-source Java application and can be installed using the different distributions available ( https://www.sing-group.org/seda/download.html ) as well as through a Docker image ( https://hub.docker.com/r/pegi3s/seda ). It is released under a GPL-3.0 license, and its source code is publicly accessible on GitHub ( https://github.com/sing-group/seda ). The software version at the time of submission is archived at Zenodo (version v1.6.0, http://doi.org/10.5281/zenodo.10201605 ).


Subject(s)
Computational Biology , Software , Computational Biology/methods , Data Analysis
17.
BMC Bioinformatics ; 25(1): 110, 2024 Mar 12.
Article in English | MEDLINE | ID: mdl-38475691

ABSTRACT

BACKGROUND: The analysis of large and complex biological datasets in bioinformatics poses a significant challenge to achieving reproducible research outcomes due to inconsistencies and the lack of standardization in the analysis process. These issues can lead to discrepancies in results, undermining the credibility and impact of bioinformatics research and creating mistrust in the scientific process. To address these challenges, open science practices such as sharing data, code, and methods have been encouraged. RESULTS: CREDO, a Customizable, REproducible, DOcker file generator for bioinformatics applications, has been developed as a tool to mitigate reproducibility issues by building and distributing Docker containers with embedded bioinformatics tools. CREDO simplifies the process of generating Docker images, facilitating reproducibility and efficient research in bioinformatics. The crucial step in generating a Docker image is creating the Dockerfile, which requires incorporating heterogeneous packages and environments such as Bioconductor and Conda. CREDO stores all required package information and dependencies in a GitHub-compatible format to enhance Docker image reproducibility, allowing easy image creation from scratch. The user-friendly GUI and CREDO's ability to generate modular Docker images make it an ideal tool for life scientists to efficiently create Docker images. Overall, CREDO is a valuable tool for addressing reproducibility issues in bioinformatics research and promoting open science practices.


Subject(s)
Computational Biology , Software , Reproducibility of Results , Computational Biology/methods
18.
BMC Bioinformatics ; 25(1): 138, 2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38553675

ABSTRACT

Even though high-throughput transcriptome sequencing is routinely performed in many laboratories, computational analysis of such data remains a cumbersome process often executed manually, hence error-prone and lacking reproducibility. For corresponding data processing, we introduce Curare, an easy-to-use yet versatile workflow builder for analyzing high-throughput RNA-Seq data focusing on differential gene expression experiments. Data analysis with Curare is customizable and subdivided into preprocessing, quality control, mapping, and downstream analysis stages, providing multiple options for each step while ensuring the reproducibility of the workflow. For a fast and straightforward exploration and visualization of differential gene expression results, we provide the gene expression visualizer software GenExVis. GenExVis can create various charts and tables from simple gene expression tables and DESeq2 results without the requirement to upload data or install software packages. In combination, Curare and GenExVis provide a comprehensive software environment that supports the entire data analysis process, from the initial handling of raw RNA-Seq data to the final DGE analyses and result visualizations, thereby significantly easing data processing and subsequent interpretation.


Subject(s)
Curare , RNA-Seq , Reproducibility of Results , Sequence Analysis, RNA/methods , Transcriptome , Software , High-Throughput Nucleotide Sequencing/methods , Gene Expression Profiling/methods
19.
BMC Bioinformatics ; 25(1): 8, 2024 Jan 03.
Article in English | MEDLINE | ID: mdl-38172657

ABSTRACT

BACKGROUND: The increasing volume and complexity of genomic data pose significant challenges for effective data management and reuse. Public genomic data often undergo similar preprocessing across projects, leading to redundant or inconsistent datasets and inefficient use of computing resources. This is especially pertinent for bioinformaticians engaged in multiple projects. Tools have been created to address challenges in managing and accessing curated genomic datasets; however, such tools are most useful to users who seek to work with specific types of data or who are technically inclined toward a particular programming language. Currently, there exists a gap in the availability of an R-specific solution for efficient data management and versatile data reuse. RESULTS: Here we present ReUseData, an R software tool that overcomes some of the limitations of existing solutions and provides a versatile and reproducible approach to effective data management within R. ReUseData facilitates the transformation of ad hoc scripts for data preprocessing into Common Workflow Language (CWL)-based data recipes, allowing for the reproducible generation of curated data files in their generic formats. The data recipes are standardized and self-contained, enabling them to be easily portable and reproducible across various computing platforms. ReUseData also streamlines the reuse of curated data files and their integration into downstream analysis tools and workflows with different frameworks. CONCLUSIONS: ReUseData provides a reliable and reproducible approach for genomic data management within the R environment to enhance the accessibility and reusability of genomic data. The package is available at Bioconductor ( https://bioconductor.org/packages/ReUseData/ ) with additional information on the project website ( https://rcwl.org/dataRecipes/ ).


Subject(s)
Data Management , Genomics , Software , Programming Languages , Workflow
20.
BMC Bioinformatics ; 25(1): 23, 2024 Jan 12.
Article in English | MEDLINE | ID: mdl-38216898

ABSTRACT

BACKGROUND: With the exponential growth of high-throughput technologies, multiple pathway analysis methods have been proposed to estimate pathway activities from gene expression profiles. These pathway activity inference methods can be divided into two main categories: non-Topology-Based (non-TB) and Pathway Topology-Based (PTB) methods. Although some review and survey articles discussed the topic from different aspects, there is a lack of systematic assessment and comparisons on the robustness of these approaches. RESULTS: Thus, this study presents comprehensive robustness evaluations of seven widely used pathway activity inference methods using six cancer datasets based on two assessments. The first assessment seeks to investigate the robustness of pathway activity in pathway activity inference methods, while the second assessment aims to assess the robustness of risk-active pathways and genes predicted by these methods. The mean reproducibility power and total number of identified informative pathways and genes were evaluated. Based on the first assessment, the mean reproducibility power of pathway activity inference methods generally decreased as the number of pathway selections increased. Entropy-based Directed Random Walk (e-DRW) distinctly outperformed other methods in exhibiting the greatest reproducibility power across all cancer datasets. On the other hand, the second assessment shows that no methods provide satisfactory results across datasets. CONCLUSION: However, PTB methods generally appear to perform better in producing greater reproducibility power and identifying potential cancer markers compared to non-TB methods.


Subject(s)
Neoplasms , Humans , Reproducibility of Results , Neoplasms/genetics , Entropy , Gene Expression