Pesquisa | Biblioteca Virtual em Saúde

1.

epidecodeR: a functional exploration tool for epigenetic and epitranscriptomic regulation.

Joshi, Kandarp; Wang, Dan O.

Brief Bioinform ; 25(2)2024 Jan 22.

Artigo em Inglês | MEDLINE | ID: mdl-38271482

RESUMO

Recent technological advances in sequencing DNA and RNA modifications using high-throughput platforms have generated vast epigenomic and epitranscriptomic datasets whose power in transforming life science is yet fully unleashed. Currently available in silico methods have facilitated the identification, positioning and quantitative comparisons of individual modification sites. However, the essential challenge to link specific 'epi-marks' to gene expression in the particular context of cellular and biological processes is unmet. To fast-track exploration, we generated epidecodeR implemented in R, which allows biologists to quickly survey whether an epigenomic or epitranscriptomic status of their interest potentially influences gene expression responses. The evaluation is based on the cumulative distribution function and the statistical significance in differential expression of genes grouped by the number of 'epi-marks'. This tool proves useful in predicting the role of H3K9ac and H3K27ac in associated gene expression after knocking down deacetylases FAM60A and SDS3 and N6-methyl-adenosine-associated gene expression after knocking out the reader proteins. We further used epidecodeR to explore the effectiveness of demethylase FTO inhibitors and histone-associated modifications in drug abuse in animals. epidecodeR is available for downloading as an R package at https://bioconductor.riken.jp/packages/3.13/bioc/html/epidecodeR.html.

Assuntos

Epigenômica , Software , Animais , Epigenômica/métodos , Metilação de DNA , DNA/metabolismo , Epigênese Genética

2.

Label-free proteome quantification and evaluation.

Fu, Jianbo; Yang, Qingxia; Luo, Yongchao; Zhang, Song; Tang, Jing; Zhang, Ying; Zhang, Hongning; Xu, Hanxiang; Zhu, Feng.

Brief Bioinform ; 24(1)2023 01 19.

Artigo em Inglês | MEDLINE | ID: mdl-36403090

RESUMO

The label-free quantification (LFQ) has emerged as an exceptional technique in proteomics owing to its broad proteome coverage, great dynamic ranges and enhanced analytical reproducibility. Due to the extreme difficulty lying in an in-depth quantification, the LFQ chains incorporating a variety of transformation, pretreatment and imputation methods are required and constructed. However, it remains challenging to determine the well-performing chain, owing to its strong dependence on the studied data and the diverse possibility of integrated chains. In this study, an R package EVALFQ was therefore constructed to enable a performance evaluation on >3000 LFQ chains. This package is unique in (a) automatically evaluating the performance using multiple criteria, (b) exploring the quantification accuracy based on spiking proteins and (c) discovering the well-performing chains by comprehensive assessment. All in all, because of its superiority in assessing from multiple perspectives and scanning among over 3000 chains, this package is expected to attract broad interests from the fields of proteomic quantification. The package is available at https://github.com/idrblab/EVALFQ.

Assuntos

Proteoma , Proteômica , Proteoma/metabolismo , Proteômica/métodos , Reprodutibilidade dos Testes

3.

kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R.

Aslett, Louis J M; Christ, Ryan R.

BMC Bioinformatics ; 25(1): 86, 2024 Feb 28.

Artigo em Inglês | MEDLINE | ID: mdl-38418970

RESUMO

BACKGROUND: Approximating the recent phylogeny of N phased haplotypes at a set of variants along the genome is a core problem in modern population genomics and central to performing genome-wide screens for association, selection, introgression, and other signals. The Li & Stephens (LS) model provides a simple yet powerful hidden Markov model for inferring the recent ancestry at a given variant, represented as an N × N distance matrix based on posterior decodings. RESULTS: We provide a high-performance engine to make these posterior decodings readily accessible with minimal pre-processing via an easy to use package kalis, in the statistical programming language R. kalis enables investigators to rapidly resolve the ancestry at loci of interest and developers to build a range of variant-specific ancestral inference pipelines on top. kalis exploits both multi-core parallelism and modern CPU vector instruction sets to enable scaling to hundreds of thousands of genomes. CONCLUSIONS: The resulting distance matrices accessible via kalis enable local ancestry, selection, and association studies in modern large scale genomic datasets.

Assuntos

Genoma , Genômica , Humanos , Cadeias de Markov , Haplótipos , Etnicidade , Genética Populacional

4.

Inference of genomic landscapes using ordered Hidden Markov Models with emission densities (oHMMed).

Vogl, Claus; Karapetiants, Mariia; Yildirim, Burçin; Kjartansdóttir, Hrönn; Kosiol, Carolin; Bergman, Juraj; Majka, Michal; Mikula, Lynette Caitlin.

BMC Bioinformatics ; 25(1): 151, 2024 Apr 16.

Artigo em Inglês | MEDLINE | ID: mdl-38627634

RESUMO

BACKGROUND: Genomes are inherently inhomogeneous, with features such as base composition, recombination, gene density, and gene expression varying along chromosomes. Evolutionary, biological, and biomedical analyses aim to quantify this variation, account for it during inference procedures, and ultimately determine the causal processes behind it. Since sequential observations along chromosomes are not independent, it is unsurprising that autocorrelation patterns have been observed e.g., in human base composition. In this article, we develop a class of Hidden Markov Models (HMMs) called oHMMed (ordered HMM with emission densities, the corresponding R package of the same name is available on CRAN): They identify the number of comparably homogeneous regions within autocorrelated observed sequences. These are modelled as discrete hidden states; the observed data points are realisations of continuous probability distributions with state-specific means that enable ordering of these distributions. The observed sequence is labelled according to the hidden states, permitting only neighbouring states that are also neighbours within the ordering of their associated distributions. The parameters that characterise these state-specific distributions are inferred. RESULTS: We apply our oHMMed algorithms to the proportion of G and C bases (modelled as a mixture of normal distributions) and the number of genes (modelled as a mixture of poisson-gamma distributions) in windows along the human, mouse, and fruit fly genomes. This results in a partitioning of the genomes into regions by statistically distinguishable averages of these features, and in a characterisation of their continuous patterns of variation. In regard to the genomic G and C proportion, this latter result distinguishes oHMMed from segmentation algorithms based in isochore or compositional domain theory. We further use oHMMed to conduct a detailed analysis of variation of chromatin accessibility (ATAC-seq) and epigenetic markers H3K27ac and H3K27me3 (modelled as a mixture of poisson-gamma distributions) along the human chromosome 1 and their correlations. CONCLUSIONS: Our algorithms provide a biologically assumption free approach to characterising genomic landscapes shaped by continuous, autocorrelated patterns of variation. Despite this, the resulting genome segmentation enables extraction of compositionally distinct regions for further downstream analyses.

Assuntos

Genoma , Genômica , Animais , Humanos , Camundongos , Cadeias de Markov , Composição de Bases , Probabilidade , Algoritmos

5.

IsoForma: An R Package for Quantifying and Visualizing Positional Isomers in Top-Down LC-MS/MS Data.

Degnan, David J; Lewis, Logan A; Bramer, Lisa M; McCue, Lee Ann; Pesavento, James J; Zhou, Mowei; Bilbao, Aivett.

J Proteome Res ; 23(8): 3318-3321, 2024 Aug 02.

Artigo em Inglês | MEDLINE | ID: mdl-38421884

RESUMO

Proteoforms, the different forms of a protein with sequence variations including post-translational modifications (PTMs), execute vital functions in biological systems, such as cell signaling and epigenetic regulation. Advances in top-down mass spectrometry (MS) technology have permitted the direct characterization of intact proteoforms and their exact number of modification sites, allowing for the relative quantification of positional isomers (PI). Protein positional isomers refer to a set of proteoforms with identical total mass and set of modifications, but varying PTM site combinations. The relative abundance of PI can be estimated by matching proteoform-specific fragment ions to top-down tandem MS (MS2) data to localize and quantify modifications. However, the current approaches heavily rely on manual annotation. Here, we present IsoForma, an open-source R package for the relative quantification of PI within a single tool. Benchmarking IsoForma's performance against two existing workflows produced comparable results and improvements in speed. Overall, IsoForma provides a streamlined process for quantifying PI, reduces the analysis time, and offers an essential framework for developing customized proteoform analysis workflows. The software is open source and available at https://github.com/EMSL-Computing/isoforma-lib.

Assuntos

Espectrometria de Massa com Cromatografia Líquida , Isoformas de Proteínas , Processamento de Proteína Pós-Traducional , Software , Espectrometria de Massas em Tandem , Humanos , Isomerismo , Espectrometria de Massa com Cromatografia Líquida/métodos , Isoformas de Proteínas/análise , Proteômica/métodos , Espectrometria de Massas em Tandem/métodos

6.

SpaceANOVA: Spatial Co-occurrence Analysis of Cell Types in Multiplex Imaging Data Using Point Process and Functional ANOVA.

Seal, Souvik; Neelon, Brian; Angel, Peggi M; O'Quinn, Elizabeth C; Hill, Elizabeth; Vu, Thao; Ghosh, Debashis; Mehta, Anand S; Wallace, Kristin; Alekseyenko, Alexander V.

J Proteome Res ; 23(4): 1131-1143, 2024 Apr 05.

Artigo em Inglês | MEDLINE | ID: mdl-38417823

RESUMO

Multiplex imaging platforms have enabled the identification of the spatial organization of different types of cells in complex tissue or the tumor microenvironment. Exploring the potential variations in the spatial co-occurrence or colocalization of different cell types across distinct tissue or disease classes can provide significant pathological insights, paving the way for intervention strategies. However, the existing methods in this context either rely on stringent statistical assumptions or suffer from a lack of generalizability. We present a highly powerful method to study differential spatial co-occurrence of cell types across multiple tissue or disease groups, based on the theories of the Poisson point process and functional analysis of variance. Notably, the method accommodates multiple images per subject and addresses the problem of missing tissue regions, commonly encountered due to data-collection complexities. We demonstrate the superior statistical power and robustness of the method in comparison with existing approaches through realistic simulation studies. Furthermore, we apply the method to three real data sets on different diseases collected using different imaging platforms. In particular, one of these data sets reveals novel insights into the spatial characteristics of various types of colorectal adenoma.

Assuntos

Simulação por Computador , Análise de Variância

7.

Multicenter Collaborative Study to Optimize Mass Spectrometry Workflows of Clinical Specimens.

Kardell, Oliver; von Toerne, Christine; Merl-Pham, Juliane; König, Ann-Christine; Blindert, Marcel; Barth, Teresa K; Mergner, Julia; Ludwig, Christina; Tüshaus, Johanna; Eckert, Stephan; Müller, Stephan A; Breimann, Stephan; Giesbertz, Pieter; Bernhardt, Alexander M; Schweizer, Lisa; Albrecht, Vincent; Teupser, Daniel; Imhof, Axel; Kuster, Bernhard; Lichtenthaler, Stefan F; Mann, Matthias; Cox, Jürgen; Hauck, Stefanie M.

J Proteome Res ; 23(1): 117-129, 2024 01 05.

Artigo em Inglês | MEDLINE | ID: mdl-38015820

RESUMO

The foundation for integrating mass spectrometry (MS)-based proteomics into systems medicine is the development of standardized start-to-finish and fit-for-purpose workflows for clinical specimens. An essential step in this pursuit is to highlight the common ground in a diverse landscape of different sample preparation techniques and liquid chromatography-mass spectrometry (LC-MS) setups. With the aim to benchmark and improve the current best practices among the proteomics MS laboratories of the CLINSPECT-M consortium, we performed two consecutive round-robin studies with full freedom to operate in terms of sample preparation and MS measurements. The six study partners were provided with two clinically relevant sample matrices: plasma and cerebrospinal fluid (CSF). In the first round, each laboratory applied their current best practice protocol for the respective matrix. Based on the achieved results and following a transparent exchange of all lab-specific protocols within the consortium, each laboratory could advance their methods before measuring the same samples in the second acquisition round. Both time points are compared with respect to identifications (IDs), data completeness, and precision, as well as reproducibility. As a result, the individual performances of participating study centers were improved in the second measurement, emphasizing the effect and importance of the expert-driven exchange of best practices for direct practical improvements.

Assuntos

Plasma , Espectrometria de Massas em Tandem , Espectrometria de Massas em Tandem/métodos , Cromatografia Líquida/métodos , Fluxo de Trabalho , Reprodutibilidade dos Testes , Plasma/química

8.

Ravages: An R package for the simulation and analysis of rare variants in multicategory phenotypes.

Bocher, Ozvan; Marenne, Gaëlle; Génin, Emmanuelle; Perdry, Hervé.

Genet Epidemiol ; 47(6): 450-460, 2023 09.

Artigo em Inglês | MEDLINE | ID: mdl-37158367

RESUMO

Current software packages for the analysis and the simulations of rare variants are only available for binary and continuous traits. Ravages provides solutions in a single R package to perform rare variant association tests for multicategory, binary and continuous phenotypes, to simulate datasets under different scenarios and to compute statistical power. Association tests can be run in the whole genome thanks to C++ implementation of most of the functions, using either RAVA-FIRST, a recently developed strategy to filter and analyse genome-wide rare variants, or user-defined candidate regions. Ravages also includes a simulation module that generates genetic data for cases who can be stratified into several subgroups and for controls. Through comparisons with existing programmes, we show that Ravages complements existing tools and will be useful to study the genetic architecture of complex diseases. Ravages is available on the CRAN at https://cran.r-project.org/web/packages/Ravages/ and maintained on Github at https://github.com/genostats/Ravages.

Assuntos

Variação Genética , Modelos Genéticos , Humanos , Simulação por Computador , Fenótipo , Software

9.

Neutrality in plant-herbivore interactions.

Pan, Vincent S; Wetzel, William C.

Proc Biol Sci ; 291(2017): 20232687, 2024 Feb 28.

Artigo em Inglês | MEDLINE | ID: mdl-38378151

RESUMO

Understanding the distribution of herbivore damage among leaves and individual plants is a central goal of plant-herbivore biology. Commonly observed unequal patterns of herbivore damage have conventionally been attributed to the heterogeneity in plant quality or herbivore behaviour or distribution. Meanwhile, the potential role of stochastic processes in structuring plant-herbivore interactions has been overlooked. Here, we show that based on simple first principle expectations from metabolic theory, random sampling of different sizes of herbivores from a regional pool is sufficient to explain patterns of variation in herbivore damage. This is despite making the neutral assumption that herbivory is caused by randomly feeding herbivores on identical and passive plants. We then compared its predictions against 765 datasets of herbivory on 496 species across 116° of latitude from the Herbivory Variability Network. Using only one free parameter, the estimated attack rate, our neutral model approximates the observed frequency distribution of herbivore damage among plants and especially among leaves very well. Our results suggest that neutral stochastic processes play a large and underappreciated role in natural variation in herbivory and may explain the low predictability of herbivory patterns. We argue that such prominence warrants its consideration as a powerful force in plant-herbivore interactions.

Assuntos

Herbivoria , Folhas de Planta , Plantas

10.

LION: an integrated R package for effective prediction of ncRNA-protein interaction.

Han, Siyu; Yang, Xiao; Sun, Hang; Yang, Hu; Zhang, Qi; Peng, Cheng; Fang, Wensi; Li, Ying.

Brief Bioinform ; 23(6)2022 11 19.

Artigo em Inglês | MEDLINE | ID: mdl-36155620

RESUMO

Understanding ncRNA-protein interaction is of critical importance to unveil ncRNAs' functions. Here, we propose an integrated package LION which comprises a new method for predicting ncRNA/lncRNA-protein interaction as well as a comprehensive strategy to meet the requirement of customisable prediction. Experimental results demonstrate that our method outperforms its competitors on multiple benchmark datasets. LION can also improve the performance of some widely used tools and build adaptable models for species- and tissue-specific prediction. We expect that LION will be a powerful and efficient tool for the prediction and analysis of ncRNA/lncRNA-protein interaction. The R Package LION is available on GitHub at https://github.com/HAN-Siyu/LION/.

Assuntos

RNA Longo não Codificante , RNA não Traduzido/genética

11.

Improve consensus partitioning via a hierarchical procedure.

Gu, Zuguang; Hübschmann, Daniel.

Brief Bioinform ; 23(3)2022 05 13.

Artigo em Inglês | MEDLINE | ID: mdl-35289356

RESUMO

Consensus partitioning is an unsupervised method widely used in high-throughput data analysis for revealing subgroups and assigning stability for the classification. However, standard consensus partitioning procedures are weak for identifying large numbers of stable subgroups. There are two major issues. First, subgroups with small differences are difficult to be separated if they are simultaneously detected with subgroups with large differences. Second, stability of classification generally decreases as the number of subgroups increases. In this work, we proposed a new strategy to solve these two issues by applying consensus partitioning in a hierarchical procedure. We demonstrated hierarchical consensus partitioning can be efficient to reveal more meaningful subgroups. We also tested the performance of hierarchical consensus partitioning on revealing a great number of subgroups with a large deoxyribonucleic acid methylation dataset. The hierarchical consensus partitioning is implemented in the R package cola with comprehensive functionalities for analysis and visualization. It can also automate the analysis only with a minimum of two lines of code, which generates a detailed HTML report containing the complete analysis. The cola package is available at https://bioconductor.org/packages/cola/.

Assuntos

Software , Consenso

12.

A tool for feature extraction from biological sequences.

Amerifar, Sare; Norouzi, Mahammad; Ghandi, Mahmoud.

Brief Bioinform ; 23(3)2022 05 13.

Artigo em Inglês | MEDLINE | ID: mdl-35383372

RESUMO

With the advances in sequencing technologies, a huge amount of biological data is extracted nowadays. Analyzing this amount of data is beyond the ability of human beings, creating a splendid opportunity for machine learning methods to grow. The methods, however, are practical only when the sequences are converted into feature vectors. Many tools target this task including iLearnPlus, a Python-based tool which supports a rich set of features. In this paper, we propose a holistic tool that extracts features from biological sequences (i.e. DNA, RNA and Protein). These features are the inputs to machine learning models that predict properties, structures or functions of the input sequences. Our tool not only supports all features in iLearnPlus but also 30 additional features which exist in the literature. Moreover, our tool is based on R language which makes an alternative for bioinformaticians to transform sequences into feature vectors. We have compared the conversion time of our tool with that of iLearnPlus: we transform the sequences much faster. We convert small nucleotides by a median of 2.8X faster, while we outperform iLearnPlus by a median of 6.3X for large sequences. Finally, in amino acids, our tool achieves a median speedup of 23.9X.

Assuntos

Aprendizado de Máquina , Proteínas , DNA/genética , Humanos , Proteínas/química , RNA/genética , Análise de Sequência/métodos

13.

ggPlantmap: an open-source R package for the creation of informative and quantitative ggplot maps derived from plant images.

Jo, Leonardo; Kajala, Kaisa.

J Exp Bot ; 2024 Feb 08.

Artigo em Inglês | MEDLINE | ID: mdl-38329371

RESUMO

As plant research generates an ever-growing volume of spatial quantitative data, the need for decentralized and user-friendly visualization tools to explore large and complex datasets tools becomes crucial. Existing resources, such as the Plant eFP (electronic Fluorescent Pictograph) viewer, have played a pivotal role on the communication of gene expression data across many plant species. However, although widely used by the plant research community, the Plant eFP viewer lacks open and user-friendly tools for the creation of customized expression maps independently. Plant biologists with less coding experience can often encounter challenges when attempting to explore ways to communicate their own spatial quantitative data. We present 'ggPlantmap' an open-source R package designed to address this challenge by providing an easy and user-friendly method for the creation of ggplot representative maps from plant images. ggPlantmap is built in R, one of the most used languages in biology to empower plant scientists to create and customize eFP-like viewers tailored to their experimental data. Here, we provide an overview of the package and tutorials that are accessible even to users with minimal R programming experience. We hope that ggPlantmap can assist the plant science community, fostering innovation and improving our understanding of plant development and function.

14.

SurvdigitizeR: an algorithm for automated survival curve digitization.

Zhang, Jasper Zhongyuan; Rios, Juan David; Pechlivanoglou, Tilemanchos; Yang, Alan; Zhang, Qiyue; Deris, Dimitrios; Cromwell, Ian; Pechlivanoglou, Petros.

BMC Med Res Methodol ; 24(1): 147, 2024 Jul 13.

Artigo em Inglês | MEDLINE | ID: mdl-39003440

RESUMO

BACKGROUND: Decision analytic models and meta-analyses often rely on survival probabilities that are digitized from published Kaplan-Meier (KM) curves. However, manually extracting these probabilities from KM curves is time-consuming, expensive, and error-prone. We developed an efficient and accurate algorithm that automates extraction of survival probabilities from KM curves. METHODS: The automated digitization algorithm processes images from a JPG or PNG format, converts them in their hue, saturation, and lightness scale and uses optical character recognition to detect axis location and labels. It also uses a k-medoids clustering algorithm to separate multiple overlapping curves on the same figure. To validate performance, we generated survival plots form random time-to-event data from a sample size of 25, 50, 150, and 250, 1000 individuals split into 1,2, or 3 treatment arms. We assumed an exponential distribution and applied random censoring. We compared automated digitization and manual digitization performed by well-trained researchers. We calculated the root mean squared error (RMSE) at 100-time points for both methods. The algorithm's performance was also evaluated by Bland-Altman analysis for the agreement between automated and manual digitization on a real-world set of published KM curves. RESULTS: The automated digitizer accurately identified survival probabilities over time in the simulated KM curves. The average RMSE for automated digitization was 0.012, while manual digitization had an average RMSE of 0.014. Its performance was negatively correlated with the number of curves in a figure and the presence of censoring markers. In real-world scenarios, automated digitization and manual digitization showed very close agreement. CONCLUSIONS: The algorithm streamlines the digitization process and requires minimal user input. It effectively digitized KM curves in simulated and real-world scenarios, demonstrating accuracy comparable to conventional manual digitization. The algorithm has been developed as an open-source R package and as a Shiny application and is available on GitHub: https://github.com/Pechli-Lab/SurvdigitizeR and https://pechlilab.shinyapps.io/SurvdigitizeR/ .

Assuntos

Algoritmos , Humanos , Estimativa de Kaplan-Meier , Análise de Sobrevida , Probabilidade

15.

crossnma: An R package to synthesize cross-design evidence and cross-format data using network meta-analysis and network meta-regression.

Hamza, Tasnim; Schwarzer, Guido; Salanti, Georgia.

BMC Med Res Methodol ; 24(1): 169, 2024 Aug 05.

Artigo em Inglês | MEDLINE | ID: mdl-39103781

RESUMO

BACKGROUND: Although aggregate data (AD) from randomised clinical trials (RCTs) are used in the majority of network meta-analyses (NMAs), other study designs (e.g., cohort studies and other non-randomised studies, NRS) can be informative about relative treatment effects. The individual participant data (IPD) of the study, when available, are preferred to AD for adjusting for important participant characteristics and to better handle heterogeneity and inconsistency in the network. RESULTS: We developed the R package crossnma to perform cross-format (IPD and AD) and cross-design (RCT and NRS) NMA and network meta-regression (NMR). The models are implemented as Bayesian three-level hierarchical models using Just Another Gibbs Sampler (JAGS) software within the R environment. The R package crossnma includes functions to automatically create the JAGS model, reformat the data (based on user input), assess convergence and summarize the results. We demonstrate the workflow within crossnma by using a network of six trials comparing four treatments. CONCLUSIONS: The R package crossnma enables the user to perform NMA and NMR with different data types in a Bayesian framework and facilitates the inclusion of all types of evidence recognising differences in risk of bias.

Assuntos

Teorema de Bayes , Metanálise em Rede , Software , Humanos , Ensaios Clínicos Controlados Aleatórios como Assunto/métodos , Ensaios Clínicos Controlados Aleatórios como Assunto/estatística & dados numéricos , Projetos de Pesquisa , Algoritmos , Metanálise como Assunto

16.

IncidencePrevalence: An R package to calculate population-level incidence rates and prevalence using the OMOP common data model.

Raventós, Berta; Català, Martí; Du, Mike; Guo, Yuchen; Black, Adam; Inberg, Ger; Li, Xintong; López-Güell, Kim; Newby, Danielle; de Ridder, Maria; Barboza, Cesar; Duarte-Salles, Talita; Verhamme, Katia; Rijnbeek, Peter; Prieto Alhambra, Daniel; Burn, Edward.

Pharmacoepidemiol Drug Saf ; 33(1): e5717, 2024 Jan.

Artigo em Inglês | MEDLINE | ID: mdl-37876360

RESUMO

PURPOSE: Real-world data (RWD) offers a valuable resource for generating population-level disease epidemiology metrics. We aimed to develop a well-tested and user-friendly R package to compute incidence rates and prevalence in data mapped to the observational medical outcomes partnership (OMOP) common data model (CDM). MATERIALS AND METHODS: We created IncidencePrevalence, an R package to support the analysis of population-level incidence rates and point- and period-prevalence in OMOP-formatted data. On top of unit testing, we assessed the face validity of the package. To do so, we calculated incidence rates of COVID-19 using RWD from Spain (SIDIAP) and the United Kingdom (CPRD Aurum), and replicated two previously published studies using data from the Netherlands (IPCI) and the United Kingdom (CPRD Gold). We compared the obtained results to those previously published, and measured execution times by running a benchmark analysis across databases. RESULTS: IncidencePrevalence achieved high agreement to previously published data in CPRD Gold and IPCI, and showed good performance across databases. For COVID-19, incidence calculated by the package was similar to public data after the first-wave of the pandemic. CONCLUSION: For data mapped to the OMOP CDM, the IncidencePrevalence R package can support descriptive epidemiological research. It enables reliable estimation of incidence and prevalence from large real-world data sets. It represents a simple, but extendable, analytical framework to generate estimates in a reproducible and timely manner.

Assuntos

COVID-19 , Gerenciamento de Dados , Humanos , Incidência , Prevalência , Bases de Dados Factuais , COVID-19/epidemiologia

17.

Statistical rules for safety monitoring in clinical trials.

Martens, Michael J; Logan, Brent R.

Clin Trials ; 21(2): 152-161, 2024 04.

Artigo em Inglês | MEDLINE | ID: mdl-37877375

RESUMO

BACKGROUND/AIMS: Protecting patient safety is an essential component of the conduct of clinical trials. Rigorous safety monitoring schemes are implemented for these studies to guard against excess toxicity risk from study therapies. They often include protocol-specified stopping rules dictating that an excessive number of safety events will trigger a halt of the study. Statistical methods are useful for constructing rules that protect patients from exposure to excessive toxicity while also maintaining the chance of a false safety signal at a low level. Several statistical techniques have been proposed for this purpose, but the current literature lacks a rigorous comparison to determine which method may be best suitable for a given trial design. The aims of this article are (1) to describe a general framework for repeated monitoring of safety events in clinical trials; (2) to survey common statistical techniques for creating safety stopping criteria; and (3) to provide investigators with a software tool for constructing and assessing these stopping rules. METHODS: The properties and operating characteristics of stopping rules produced by Pocock and O'Brien-Fleming tests, Bayesian Beta-Binomial models, and sequential probability ratio tests (SPRTs) are studied and compared for common scenarios that may arise in phase II and III trials. We developed the R package "stoppingrule" for constructing and evaluating stopping rules from these methods. Its usage is demonstrated through a redesign of a stopping rule for BMT CTN 0601 (registered at Clinicaltrials.gov as NCT00745420), a phase II, single-arm clinical trial that evaluated outcomes in pediatric sickle cell disease patients treated by bone marrow transplant. RESULTS: Methods with aggressive stopping criteria early in the trial, such as the Pocock test and Bayesian Beta-Binomial models with weak priors, have permissive stopping criteria at late stages. This results in a trade-off where rules with aggressive early monitoring generally will have a smaller number of expected toxicities but also lower power than rules with more conservative early stopping, such as the O-Brien-Fleming test and Beta-Binomial models with strong priors. The modified SPRT method is sensitive to the choice of alternative toxicity rate. The maximized SPRT generally has a higher number of expected toxicities and/or worse power than other methods. CONCLUSIONS: Because the goal is to minimize the number of patients exposed to and experiencing toxicities from an unsafe therapy, we recommend using the Pocock or Beta-Binomial, weak prior methods for constructing safety stopping rules. At the design stage, the operating characteristics of candidate rules should be evaluated under various possible toxicity rates in order to guide the choice of rule(s) for a given trial; our R package facilitates this evaluation.

Assuntos

Modelos Estatísticos , Projetos de Pesquisa , Humanos , Criança , Teorema de Bayes , Probabilidade , Avaliação de Resultados em Cuidados de Saúde

18.

PCAS: An Integrated Tool for Multi-Dimensional Cancer Research Utilizing Clinical Proteomic Tumor Analysis Consortium Data.

Wang, Jin; Song, Xiangrong; Wei, Meidan; Qin, Lexin; Zhu, Qingyun; Wang, Shujie; Liang, Tingting; Hu, Wentao; Zhu, Xinyu; Li, Jianxiang.

Int J Mol Sci ; 25(12)2024 Jun 18.

Artigo em Inglês | MEDLINE | ID: mdl-38928396

RESUMO

Proteomics offers a robust method for quantifying proteins and elucidating their roles in cellular functions, surpassing the insights provided by transcriptomics. The Clinical Proteomic Tumor Analysis Consortium database, enriched with comprehensive cancer proteomics data including phosphorylation and ubiquitination profiles, alongside transcriptomics data from the Genomic Data Commons, allow for integrative molecular studies of cancer. The ProteoCancer Analysis Suite (PCAS), our newly developed R package and Shinyapp, leverages these resources to facilitate in-depth analyses of proteomics, phosphoproteomics, and transcriptomics, enhancing our understanding of the tumor microenvironment through features like immune infiltration and drug sensitivity analysis. This tool aids in identifying critical signaling pathways and therapeutic targets, particularly through its detailed phosphoproteomic analysis. To demonstrate the functionality of the PCAS, we conducted an analysis of GAPDH across multiple cancer types, revealing a significant upregulation of protein levels, which is consistent with its important biological and clinical significance in tumors, as indicated in our prior research. Further experiments were used to validate the findings performed using the tool. In conclusion, the PCAS is a powerful and valuable tool for conducting comprehensive proteomic analyses, significantly enhancing our ability to uncover oncogenic mechanisms and identify potential therapeutic targets in cancer research.

Assuntos

Neoplasias , Proteômica , Humanos , Proteômica/métodos , Neoplasias/metabolismo , Neoplasias/genética , Microambiente Tumoral/genética , Software , Biologia Computacional/métodos , Proteoma/metabolismo

19.

Causal moderated mediation analysis: Methods and software.

Qin, Xu; Wang, Lijuan.

Behav Res Methods ; 56(3): 1314-1334, 2024 Mar.

Artigo em Inglês | MEDLINE | ID: mdl-37845423

RESUMO

Research questions regarding how, for whom, and where a treatment achieves its effect on an outcome have become increasingly valued in substantive research. Such questions can be answered by causal moderated mediation analysis, which assesses the heterogeneity of the mediation mechanism underlying the treatment effect across individual and contextual characteristics. Various moderated mediation analysis methods have been developed under the traditional path analysis/structural equation modeling framework. One challenge is that the definitions of moderated mediation effects depend on statistical models of the mediator and the outcome, and no solutions have been provided when either the mediator or the outcome is binary, or when the mediator or outcome model is nonlinear. In addition, it remains unclear to empirical researchers how to make causal arguments of moderated mediation effects due to a lack of clarifications of the underlying assumptions and methods for assessing the sensitivity to violations of the assumptions. This article overcomes the limitations by developing general definition, identification, estimation, and sensitivity analysis for causal moderated mediation effects under the potential outcomes framework. We also developed a user-friendly R package moderate.mediation ( https://cran.r-project.org/web/packages/moderate.mediation/index.html ) that allows applied researchers to easily implement the proposed methods and visualize the initial analysis results and sensitivity analysis results. We illustrated the application of the proposed methods and the package implementation with a re-analysis of the National Evaluation of Welfare-to-Work Strategies (NEWWS) Riverside data.

Assuntos

Modelos Estatísticos , Software , Humanos , Causalidade

20.

materialmodifier: An R package of photo editing effects for material perception research.

Tsuda, Hiroyuki; Kawabata, Hideaki.

Behav Res Methods ; 56(3): 2657-2674, 2024 Mar.

Artigo em Inglês | MEDLINE | ID: mdl-37162649

RESUMO

In this paper, we introduce an R package that performs automated photo editing effects. Specifically, it is an R implementation of an image-processing algorithm proposed by Boyadzhiev et al. (2015). The software allows the user to manipulate the appearance of objects in photographs, such as emphasizing facial blemishes and wrinkles, smoothing the skin, or enhancing the gloss of fruit. It provides a reproducible method to quantitatively control specific surface properties of objects (e.g., gloss and roughness), which is useful for researchers interested in topics related to material perception, from basic mechanisms of perception to the aesthetic evaluation of faces and objects. We describe the functionality, usage, and algorithm of the method, report on the findings of a behavioral evaluation experiment, and discuss its usefulness and limitations for psychological research. The package can be installed via CRAN, and documentation and source code are available at https://github.com/tsuda16k/materialmodifier .

Assuntos

Algoritmos , Software , Humanos , Percepção

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA