ABSTRACT
Recent technological advances in sequencing DNA and RNA modifications using high-throughput platforms have generated vast epigenomic and epitranscriptomic datasets whose power to transform life science has yet to be fully unleashed. Currently available in silico methods have facilitated the identification, positioning and quantitative comparison of individual modification sites. However, the essential challenge of linking specific 'epi-marks' to gene expression in the particular context of cellular and biological processes remains unmet. To fast-track exploration, we developed epidecodeR, an R package that allows biologists to quickly survey whether an epigenomic or epitranscriptomic status of interest potentially influences gene expression responses. The evaluation is based on the cumulative distribution function and the statistical significance of differential expression of genes grouped by their number of 'epi-marks'. The tool proved useful in predicting the role of H3K9ac and H3K27ac in associated gene expression after knocking down deacetylases FAM60A and SDS3, and in predicting N6-methyl-adenosine-associated gene expression after knocking out the reader proteins. We further used epidecodeR to explore the effectiveness of demethylase FTO inhibitors and histone-associated modifications in drug abuse in animals. epidecodeR is available for download as an R package at https://bioconductor.riken.jp/packages/3.13/bioc/html/epidecodeR.html.
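The grouping-and-CDF evaluation described in this abstract can be illustrated with a short Python sketch. It is not epidecodeR's code or API: the function names, the group cut-offs (0, 1, and 2 or more marks) and the toy data are all assumptions made for illustration, and the significance test between groups is left out.

```python
from bisect import bisect_right
from collections import defaultdict


def group_by_mark_count(mark_counts, log2fc):
    """Group log2 fold changes by the number of epi-marks per gene.

    Genes are binned as "0", "1" or "2+" marks (hypothetical cut-offs).
    Genes missing from mark_counts are treated as carrying no marks.
    """
    groups = defaultdict(list)
    for gene, lfc in log2fc.items():
        n = mark_counts.get(gene, 0)
        label = "0" if n == 0 else "1" if n == 1 else "2+"
        groups[label].append(lfc)
    # Sort each group so the empirical CDF can use binary search.
    return {label: sorted(values) for label, values in groups.items()}


def ecdf(sorted_values, x):
    """Empirical CDF: fraction of values less than or equal to x."""
    return bisect_right(sorted_values, x) / len(sorted_values)


# Toy data, invented for illustration only.
mark_counts = {"g1": 0, "g2": 0, "g3": 1, "g4": 1, "g5": 3, "g6": 2}
log2fc = {"g1": -0.1, "g2": 0.2, "g3": 0.4, "g4": 0.9, "g5": 1.5, "g6": 1.1}

groups = group_by_mark_count(mark_counts, log2fc)
for label in sorted(groups):
    # A group whose curve sits to the right has larger fold changes.
    print(label, ecdf(groups[label], 0.5))
```

In this toy example the curves shift to the right as the mark count increases, which is the kind of pattern such a survey is meant to reveal. A real analysis would also test whether the groups differ significantly.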
Subjects
Epigenomics , Software , Animals , Epigenomics/methods , DNA Methylation , DNA/metabolism , Epigenesis, Genetic
ABSTRACT
Label-free quantification (LFQ) has emerged as an exceptional technique in proteomics owing to its broad proteome coverage, wide dynamic range and enhanced analytical reproducibility. Because in-depth quantification is extremely difficult, LFQ chains that combine a variety of transformation, pretreatment and imputation methods are required and have been constructed. However, it remains challenging to determine a well-performing chain, owing to its strong dependence on the studied data and the large number of possible chains. In this study, an R package, EVALFQ, was therefore constructed to enable performance evaluation of >3000 LFQ chains. This package is unique in (a) automatically evaluating performance using multiple criteria, (b) exploring quantification accuracy based on spiked proteins and (c) discovering well-performing chains by comprehensive assessment. All in all, because it assesses from multiple perspectives and scans over 3000 chains, this package is expected to attract broad interest from the field of proteomic quantification. The package is available at https://github.com/idrblab/EVALFQ.
Subjects
Proteome , Proteomics , Proteome/metabolism , Proteomics/methods , Reproducibility of Results
ABSTRACT
Traditional gene set enrichment analyses are typically limited to a few ontologies and do not account for the interdependence of gene sets or terms, resulting in overcorrected p-values. To address these challenges, we introduce mulea, an R package offering comprehensive overrepresentation and functional enrichment analysis. mulea employs a progressive empirical false discovery rate (eFDR) method, specifically designed for interconnected biological data, to accurately identify significant terms within diverse ontologies. mulea expands beyond traditional tools by incorporating a wide range of ontologies, encompassing Gene Ontology, pathways, regulatory elements, genomic locations, and protein domains. This flexibility enables researchers to tailor enrichment analysis to their specific questions, such as identifying enriched transcriptional regulators in gene expression data or overrepresented protein domains in protein sets. To facilitate seamless analysis, mulea provides gene sets (in standardised GMT format) for 27 model organisms, covering 22 ontology types from 16 databases and various identifiers resulting in almost 900 files. Additionally, the muleaData ExperimentData Bioconductor package simplifies access to these pre-defined ontologies. Finally, mulea's architecture allows for easy integration of user-defined ontologies, or GMT files from external sources (e.g., MSigDB or Enrichr), expanding its applicability across diverse research areas. mulea is distributed as a CRAN R package downloadable from https://cran.r-project.org/web/packages/mulea/ and https://github.com/ELTEbioinformatics/mulea . It offers researchers a powerful and flexible toolkit for functional enrichment analysis, addressing limitations of traditional tools with its progressive eFDR and by supporting a variety of ontologies. Overall, mulea fosters the exploration of diverse biological questions across various model organisms.
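For readers unfamiliar with overrepresentation analysis on GMT-formatted gene sets, the following Python sketch shows the classical hypergeometric test that such analyses build on. It does not reproduce mulea's eFDR procedure or its R interface. The function names and toy data are assumptions, and the parser relies only on the standard GMT layout (term identifier, description, then member genes, all tab-separated).

```python
from math import comb


def parse_gmt(text):
    """Parse GMT text into {term_id: set_of_genes}.

    Each line holds a term ID, a description, then the member genes,
    separated by tabs.
    """
    terms = {}
    for line in text.strip().splitlines():
        fields = line.split("\t")
        terms[fields[0]] = set(fields[2:])
    return terms


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K belong to the term."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


# Toy inputs, invented for illustration only.
gmt_text = "termA\tdemo term A\tg1\tg2\tg3\ntermB\tdemo term B\tg4\tg5\n"
background = {f"g{i}" for i in range(1, 11)}  # 10 background genes
selected = {"g1", "g2", "g6"}                 # genes of interest

terms = parse_gmt(gmt_text)
p_values = {
    term: hypergeom_upper_tail(
        len(genes & selected), len(selected), len(genes), len(background)
    )
    for term, genes in terms.items()
}
print(p_values)
```

These are raw, uncorrected p-values. Correcting them while accounting for overlapping terms is the step the abstract's eFDR method addresses, and it is not attempted here.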
Subjects
Gene Ontology , Software , Databases, Genetic , Computational Biology/methods
ABSTRACT
BACKGROUND: Approximating the recent phylogeny of N phased haplotypes at a set of variants along the genome is a core problem in modern population genomics and central to performing genome-wide screens for association, selection, introgression, and other signals. The Li & Stephens (LS) model provides a simple yet powerful hidden Markov model for inferring the recent ancestry at a given variant, represented as an N × N distance matrix based on posterior decodings. RESULTS: We provide a high-performance engine to make these posterior decodings readily accessible with minimal pre-processing via an easy to use package kalis, in the statistical programming language R. kalis enables investigators to rapidly resolve the ancestry at loci of interest and developers to build a range of variant-specific ancestral inference pipelines on top. kalis exploits both multi-core parallelism and modern CPU vector instruction sets to enable scaling to hundreds of thousands of genomes. CONCLUSIONS: The resulting distance matrices accessible via kalis enable local ancestry, selection, and association studies in modern large scale genomic datasets.
Subjects
Genome , Genomics , Humans , Markov Chains , Haplotypes , Ethnicity , Genetics, Population
ABSTRACT
BACKGROUND: Genomes are inherently inhomogeneous, with features such as base composition, recombination, gene density, and gene expression varying along chromosomes. Evolutionary, biological, and biomedical analyses aim to quantify this variation, account for it during inference procedures, and ultimately determine the causal processes behind it. Since sequential observations along chromosomes are not independent, it is unsurprising that autocorrelation patterns have been observed, e.g., in human base composition. In this article, we develop a class of Hidden Markov Models (HMMs) called oHMMed (ordered HMM with emission densities; the corresponding R package of the same name is available on CRAN). These models identify the number of comparably homogeneous regions within autocorrelated observed sequences. The regions are modelled as discrete hidden states; the observed data points are realisations of continuous probability distributions with state-specific means that enable ordering of these distributions. The observed sequence is labelled according to the hidden states, permitting transitions only between states that are also neighbours within the ordering of their associated distributions. The parameters that characterise these state-specific distributions are inferred. RESULTS: We apply our oHMMed algorithms to the proportion of G and C bases (modelled as a mixture of normal distributions) and the number of genes (modelled as a mixture of Poisson-gamma distributions) in windows along the human, mouse, and fruit fly genomes. This results in a partitioning of the genomes into regions by statistically distinguishable averages of these features, and in a characterisation of their continuous patterns of variation. In regard to the genomic G and C proportion, this latter result distinguishes oHMMed from segmentation algorithms based on isochore or compositional domain theory. We further use oHMMed to conduct a detailed analysis of variation in chromatin accessibility (ATAC-seq) and the epigenetic markers H3K27ac and H3K27me3 (modelled as a mixture of Poisson-gamma distributions) along human chromosome 1, and of their correlations. CONCLUSIONS: Our algorithms provide a biologically assumption-free approach to characterising genomic landscapes shaped by continuous, autocorrelated patterns of variation. Despite this, the resulting genome segmentation enables extraction of compositionally distinct regions for further downstream analyses.
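The ordering constraint described in this abstract (hidden states may only be followed by themselves or by a neighbour in the ordering of their means) corresponds to a tridiagonal transition matrix. The Python sketch below illustrates that structure only. The function name and the single shared switching probability are assumptions for illustration and need not match the parameterisation used in the oHMMed package.

```python
def ordered_transition_matrix(n_states, p_switch):
    """Build a tridiagonal transition matrix for states ordered by their means.

    From state i the chain may stay in i or move to i - 1 or i + 1.
    p_switch is an assumed, shared probability of moving to each neighbour.
    """
    if n_states < 2:
        raise ValueError("need at least two states")
    if not 0.0 <= p_switch <= 0.5:
        raise ValueError("p_switch must lie in [0, 0.5]")
    matrix = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < n_states]
        for j in neighbours:
            matrix[i][j] = p_switch
        # The remaining probability mass stays in the current state.
        matrix[i][i] = 1.0 - p_switch * len(neighbours)
    return matrix


T = ordered_transition_matrix(4, 0.1)
for row in T:
    print(row)
```

Every entry more than one step off the diagonal is zero, so a window labelled with the lowest-mean state can never be followed directly by one labelled with the highest-mean state.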
Subjects
Genome , Genomics , Animals , Humans , Mice , Markov Chains , Base Composition , Probability , Algorithms
ABSTRACT
Proteoforms, the different forms of a protein with sequence variations including post-translational modifications (PTMs), execute vital functions in biological systems, such as cell signaling and epigenetic regulation. Advances in top-down mass spectrometry (MS) technology have permitted the direct characterization of intact proteoforms and their exact number of modification sites, allowing for the relative quantification of positional isomers (PI). Protein positional isomers refer to a set of proteoforms with identical total mass and set of modifications, but varying PTM site combinations. The relative abundance of PI can be estimated by matching proteoform-specific fragment ions to top-down tandem MS (MS2) data to localize and quantify modifications. However, the current approaches heavily rely on manual annotation. Here, we present IsoForma, an open-source R package for the relative quantification of PI within a single tool. Benchmarking IsoForma's performance against two existing workflows produced comparable results and improvements in speed. Overall, IsoForma provides a streamlined process for quantifying PI, reduces the analysis time, and offers an essential framework for developing customized proteoform analysis workflows. The software is open source and available at https://github.com/EMSL-Computing/isoforma-lib.
Subjects
Liquid Chromatography-Mass Spectrometry , Protein Isoforms , Protein Processing, Post-Translational , Software , Tandem Mass Spectrometry , Humans , Isomerism , Liquid Chromatography-Mass Spectrometry/methods , Protein Isoforms/analysis , Proteomics/methods , Tandem Mass Spectrometry/methods
ABSTRACT
Multiplex imaging platforms have enabled the identification of the spatial organization of different types of cells in complex tissue or the tumor microenvironment. Exploring the potential variations in the spatial co-occurrence or colocalization of different cell types across distinct tissue or disease classes can provide significant pathological insights, paving the way for intervention strategies. However, the existing methods in this context either rely on stringent statistical assumptions or suffer from a lack of generalizability. We present a highly powerful method to study differential spatial co-occurrence of cell types across multiple tissue or disease groups, based on the theories of the Poisson point process and functional analysis of variance. Notably, the method accommodates multiple images per subject and addresses the problem of missing tissue regions, commonly encountered due to data-collection complexities. We demonstrate the superior statistical power and robustness of the method in comparison with existing approaches through realistic simulation studies. Furthermore, we apply the method to three real data sets on different diseases collected using different imaging platforms. In particular, one of these data sets reveals novel insights into the spatial characteristics of various types of colorectal adenoma.
Subjects
Computer Simulation , Analysis of Variance
ABSTRACT
The foundation for integrating mass spectrometry (MS)-based proteomics into systems medicine is the development of standardized start-to-finish and fit-for-purpose workflows for clinical specimens. An essential step in this pursuit is to highlight the common ground in a diverse landscape of different sample preparation techniques and liquid chromatography-mass spectrometry (LC-MS) setups. With the aim to benchmark and improve the current best practices among the proteomics MS laboratories of the CLINSPECT-M consortium, we performed two consecutive round-robin studies with full freedom to operate in terms of sample preparation and MS measurements. The six study partners were provided with two clinically relevant sample matrices: plasma and cerebrospinal fluid (CSF). In the first round, each laboratory applied their current best practice protocol for the respective matrix. Based on the achieved results and following a transparent exchange of all lab-specific protocols within the consortium, each laboratory could advance their methods before measuring the same samples in the second acquisition round. Both time points are compared with respect to identifications (IDs), data completeness, and precision, as well as reproducibility. As a result, the individual performances of participating study centers were improved in the second measurement, emphasizing the effect and importance of the expert-driven exchange of best practices for direct practical improvements.
Subjects
Plasma , Tandem Mass Spectrometry , Tandem Mass Spectrometry/methods , Chromatography, Liquid/methods , Workflow , Reproducibility of Results , Plasma/chemistry
ABSTRACT
Current software packages for the analysis and simulation of rare variants are only available for binary and continuous traits. Ravages provides solutions in a single R package to perform rare variant association tests for multicategory, binary and continuous phenotypes, to simulate datasets under different scenarios and to compute statistical power. Association tests can be run genome-wide thanks to a C++ implementation of most of the functions, using either RAVA-FIRST, a recently developed strategy to filter and analyse genome-wide rare variants, or user-defined candidate regions. Ravages also includes a simulation module that generates genetic data for cases, who can be stratified into several subgroups, and for controls. Through comparisons with existing programmes, we show that Ravages complements existing tools and will be useful for studying the genetic architecture of complex diseases. Ravages is available on CRAN at https://cran.r-project.org/web/packages/Ravages/ and maintained on GitHub at https://github.com/genostats/Ravages.
Subjects
Genetic Variation , Models, Genetic , Humans , Computer Simulation , Phenotype , Software
ABSTRACT
BACKGROUND: Bio-ontologies are keys in structuring complex biological information for effective data integration and knowledge representation. Semantic similarity analysis on bio-ontologies quantitatively assesses the degree of similarity between biological concepts based on the semantics encoded in ontologies. It plays an important role in structured and meaningful interpretations and integration of complex data from multiple biological domains. RESULTS: We present simona, a novel R package for semantic similarity analysis on general bio-ontologies. Simona implements infrastructures for ontology analysis by offering efficient data structures, fast ontology traversal methods, and elegant visualizations. Moreover, it provides a robust toolbox supporting over 70 methods for semantic similarity analysis. With simona, we conducted a benchmark against current semantic similarity methods. The results demonstrate methods are clustered based on their mathematical methodologies, thus guiding researchers in the selection of appropriate methods. Additionally, we explored annotation-based versus topology-based methods, revealing that semantic similarities solely based on ontology topology can efficiently reveal semantic similarity structures, facilitating analysis on less-studied organisms and other ontologies. CONCLUSIONS: Simona offers a versatile interface and efficient implementation for processing, visualization, and semantic similarity analysis on bio-ontologies. We believe that simona will serve as a robust tool for uncovering relationships and enhancing the interoperability of biological knowledge systems.
Subjects
Biological Ontologies , Semantics , Software , Computational Biology/methods
ABSTRACT
In the realm of biological image analysis, deep learning (DL) has become a core toolkit, for example for segmentation and classification. However, conventional DL methods are challenged by large biodiversity datasets characterized by unbalanced classes and hard-to-distinguish phenotypic differences between them. Here we present BioEncoder, a user-friendly toolkit for metric learning, which overcomes these challenges by focussing on learning relationships between individual data points rather than on the separability of classes. BioEncoder is released as a Python package, created for ease of use and flexibility across diverse datasets. It features taxon-agnostic data loaders, custom augmentation options, and simple hyperparameter adjustments through text-based configuration files. The toolkit's significance lies in its potential to unlock new research avenues in biological image analysis while democratizing access to advanced deep metric learning techniques. BioEncoder focuses on the urgent need for toolkits bridging the gap between complex DL pipelines and practical applications in biological research.
Subjects
Deep Learning , Software , Animals , Image Processing, Computer-Assisted/methods , Biodiversity
ABSTRACT
Understanding the distribution of herbivore damage among leaves and individual plants is a central goal of plant-herbivore biology. Commonly observed unequal patterns of herbivore damage have conventionally been attributed to the heterogeneity in plant quality or herbivore behaviour or distribution. Meanwhile, the potential role of stochastic processes in structuring plant-herbivore interactions has been overlooked. Here, we show that based on simple first principle expectations from metabolic theory, random sampling of different sizes of herbivores from a regional pool is sufficient to explain patterns of variation in herbivore damage. This is despite making the neutral assumption that herbivory is caused by randomly feeding herbivores on identical and passive plants. We then compared its predictions against 765 datasets of herbivory on 496 species across 116° of latitude from the Herbivory Variability Network. Using only one free parameter, the estimated attack rate, our neutral model approximates the observed frequency distribution of herbivore damage among plants and especially among leaves very well. Our results suggest that neutral stochastic processes play a large and underappreciated role in natural variation in herbivory and may explain the low predictability of herbivory patterns. We argue that such prominence warrants its consideration as a powerful force in plant-herbivore interactions.
Subjects
Herbivory , Plant Leaves , Plants
ABSTRACT
Understanding ncRNA-protein interaction is of critical importance to unveil ncRNAs' functions. Here, we propose an integrated package LION which comprises a new method for predicting ncRNA/lncRNA-protein interaction as well as a comprehensive strategy to meet the requirement of customisable prediction. Experimental results demonstrate that our method outperforms its competitors on multiple benchmark datasets. LION can also improve the performance of some widely used tools and build adaptable models for species- and tissue-specific prediction. We expect that LION will be a powerful and efficient tool for the prediction and analysis of ncRNA/lncRNA-protein interaction. The R Package LION is available on GitHub at https://github.com/HAN-Siyu/LION/.
Subjects
RNA, Long Noncoding , RNA, Untranslated/genetics
ABSTRACT
Consensus partitioning is an unsupervised method widely used in high-throughput data analysis for revealing subgroups and assessing the stability of the classification. However, standard consensus partitioning procedures are weak at identifying large numbers of stable subgroups. There are two major issues. First, subgroups with small differences are difficult to separate if they are detected simultaneously with subgroups with large differences. Second, the stability of classification generally decreases as the number of subgroups increases. In this work, we propose a new strategy to solve these two issues by applying consensus partitioning in a hierarchical procedure. We demonstrate that hierarchical consensus partitioning can efficiently reveal more meaningful subgroups. We also tested the performance of hierarchical consensus partitioning in revealing a large number of subgroups in a large deoxyribonucleic acid methylation dataset. Hierarchical consensus partitioning is implemented in the R package cola, which offers comprehensive functionality for analysis and visualization. It can also automate the analysis with as few as two lines of code, generating a detailed HTML report containing the complete analysis. The cola package is available at https://bioconductor.org/packages/cola/.
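The core quantity behind consensus partitioning is a consensus matrix: for each pair of samples, the fraction of repeated partitioning runs in which the two samples land in the same subgroup. The Python sketch below illustrates that idea only. It is not the cola API, the function name and toy partitions are assumptions, and it simplifies by assuming every sample appears in every run (resampling-based procedures usually do not guarantee this).

```python
from itertools import combinations


def consensus_matrix(partitions):
    """Fraction of partitions in which each pair of samples shares a label.

    partitions is a list of label lists, one per run, all of equal length.
    Only co-membership matters, so label names may differ between runs.
    """
    n = len(partitions[0])
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = 1.0
    for i, j in combinations(range(n), 2):
        together = sum(1 for labels in partitions if labels[i] == labels[j])
        matrix[i][j] = matrix[j][i] = together / len(partitions)
    return matrix


# Four toy runs over four samples, invented for illustration only.
partitions = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],  # same grouping as the first run, labels swapped
    [0, 0, 0, 1],
    [0, 1, 1, 1],
]
C = consensus_matrix(partitions)
for row in C:
    print(row)
```

Values near 0 or 1 indicate stable pairwise assignments, while intermediate values indicate ambiguity. A hierarchical procedure of the kind described in the abstract would repeat the partitioning inside each stable subgroup, which this sketch does not attempt.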
Subjects
Software , Consensus
ABSTRACT
One of the main challenges in applying machine learning algorithms to biological sequence data is how to represent a sequence as a numeric input vector. Feature extraction techniques capable of extracting numerical information from biological sequences have been reported in the literature. However, many of these techniques, such as mathematical descriptors, are not available in existing packages. This paper presents a new package, MathFeature, which implements mathematical descriptors able to extract relevant numerical information from biological sequences, i.e. DNA, RNA and proteins (prediction of structural features along the primary sequence of amino acids). MathFeature makes available 20 numerical feature extraction descriptors based on approaches found in the literature, e.g. multiple numeric mappings, genomic signal processing, chaos game theory, entropy and complex networks. MathFeature also allows the extraction of alternative features, complementing the existing packages. To ensure that our descriptors are robust and to assess their relevance, experimental results are presented in nine case studies. According to these results, the features extracted by MathFeature showed high performance (accuracy of 0.6350-0.9897), both when applying only mathematical descriptors and when hybridizing them with well-known descriptors from the literature. Finally, using MathFeature, we outperformed several studies on eight benchmark datasets, exemplifying the robustness and viability of the proposed package. MathFeature advances the area by providing descriptors not available in other packages and by allowing non-experts to use feature extraction techniques.
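As an illustration of what an entropy-based descriptor of the kind listed in this abstract can look like, the Python sketch below turns a nucleotide sequence into a short numeric vector of Shannon entropies over k-mer frequencies. It is a generic example, not MathFeature's implementation or interface: the function names, the choice of k values and the toy sequence are assumptions.

```python
from collections import Counter
from math import log2


def kmer_counts(sequence, k):
    """Count overlapping k-mers in a sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def shannon_entropy(counts):
    """Shannon entropy (in bits) of the distribution given by the counts."""
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


def entropy_features(sequence, ks=(1, 2)):
    """Numeric feature vector: one entropy value per k-mer size."""
    return [shannon_entropy(kmer_counts(sequence, k)) for k in ks]


seq = "ACGTACGT"  # toy sequence, invented for illustration
features = entropy_features(seq)
print(features)
```

The resulting fixed-length vector can be passed to any standard machine learning model regardless of the sequence length, which is the practical point of such descriptors.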
Subjects
Proteins , RNA , Algorithms , Amino Acid Sequence , DNA/genetics , Machine Learning , Proteins/chemistry , RNA/genetics
ABSTRACT
With advances in sequencing technologies, a huge amount of biological data is now being generated. Analyzing this amount of data is beyond human ability, creating an excellent opportunity for machine learning methods to grow. These methods, however, are practical only when the sequences are converted into feature vectors. Many tools target this task, including iLearnPlus, a Python-based tool which supports a rich set of features. In this paper, we propose a holistic tool that extracts features from biological sequences (i.e. DNA, RNA and protein). These features are the inputs to machine learning models that predict properties, structures or functions of the input sequences. Our tool supports not only all features in iLearnPlus but also 30 additional features that exist in the literature. Moreover, our tool is written in R, which offers bioinformaticians an alternative for transforming sequences into feature vectors. We compared the conversion time of our tool with that of iLearnPlus and found that ours transforms sequences much faster. For small nucleotide sequences our tool is a median of 2.8X faster, and for large sequences it outperforms iLearnPlus by a median of 6.3X. Finally, for amino acid sequences, our tool achieves a median speedup of 23.9X.
Subjects
Machine Learning , Proteins , DNA/genetics , Humans , Proteins/chemistry , RNA/genetics , Sequence Analysis/methods
ABSTRACT
As plant research generates an ever-growing volume of spatial quantitative data, the need for decentralized and user-friendly visualization tools to explore large and complex datasets becomes crucial. Existing resources, such as the Plant eFP (electronic Fluorescent Pictograph) viewer, have played a pivotal role in the communication of gene expression data across many plant species. However, although widely used by the plant research community, the Plant eFP viewer lacks open and user-friendly tools for independently creating customized expression maps. Plant biologists with less coding experience often encounter challenges when exploring ways to communicate their own spatial quantitative data. We present ggPlantmap, an open-source R package designed to address this challenge by providing an easy and user-friendly method for creating ggplot representative maps from plant images. ggPlantmap is built in R, one of the most used languages in biology, to empower plant scientists to create and customize eFP-like viewers tailored to their experimental data. Here, we provide an overview of the package and tutorials that are accessible even to users with minimal R programming experience. We hope that ggPlantmap can assist the plant science community, fostering innovation and improving our understanding of plant development and function.
Subjects
Plants , Software , Plants/metabolism , Image Processing, Computer-Assisted/methods
ABSTRACT
Bayesian phylogenetic inference requires a tree prior, which models the underlying diversification process that gives rise to the phylogeny. Existing birth-death diversification models include a wide range of features, for instance, lineage-specific variations in speciation and extinction (SSE) rates. While across-lineage variation in SSE rates is widespread in empirical datasets, few heterogeneous rate models have been implemented as tree priors for Bayesian phylogenetic inference. As a consequence, rate heterogeneity is typically ignored when reconstructing phylogenies, and rate heterogeneity is usually investigated on fixed trees. In this paper, we present a new BEAST2 package implementing the cladogenetic diversification rate shift (ClaDS) model as a tree prior. ClaDS is a birth-death diversification model designed to capture small progressive variations in birth and death rates along a phylogeny. Unlike previous implementations of ClaDS, which were designed to be used with fixed, user-chosen phylogenies, our package is implemented in the BEAST2 framework and thus allows full phylogenetic inference, where the phylogeny and model parameters are co-estimated from a molecular alignment. Our package provides all necessary components of the inference, including a new tree object and operators to propose moves to the Monte-Carlo Markov chain. It also includes a graphical interface through BEAUti. We validate our implementation of the package by comparing the produced distributions to simulated data and show an empirical example of the full inference, using a dataset of cetaceans.
Subjects
Genetic Speciation , Phylogeny , Bayes Theorem , Monte Carlo Method , Markov Chains
ABSTRACT
BACKGROUND: Nutrient content and degree of processing are complementary but distinct concepts, and a growing body of evidence shows that ultra-processed foods (UPFs) can have detrimental health effects independently of nutrient content. More than 10 countries currently mandate front-of-package labels (FOPLs) to inform consumers when products are high in added sugars, saturated fat, and/or sodium. Public health advocates have been calling for the addition of ultra-processed warning labels to these FOPLs, but the extent to which consumers would understand and be influenced by such labels remains unknown. We examined whether the addition of ultra-processed warning labels to existing nutrient warning labels could influence consumers' product perceptions and purchase intentions. METHODS: In 2023, a sample of adults in Brazil (n = 1,004) answered an open-ended question about the meaning of the term "ultra-processed," followed by an online experiment where they saw four ultra-processed products carrying warning labels. Participants were randomly assigned to view either only nutrient warning labels or nutrient plus ultra-processed warning labels. Participants then answered questions about their intentions to purchase the products, product perceptions, and perceived label effectiveness. RESULTS: Most participants (69%) exhibited a moderate understanding of the term "ultra-processed" prior to the experiment. The addition of an ultra-processed warning label led to a higher share of participants who correctly identified the products as UPFs compared to nutrient warning labels alone (Cohen's d = 0.16, p = 0.02). However, the addition of the ultra-processed warning label did not significantly influence purchase intentions, product healthfulness perceptions, or perceived label effectiveness compared to nutrient warning labels alone (all p > 0.05). In exploratory analyses, demographic characteristics and prior understanding of the concept of UPF did not moderate the effect of ultra-processed warning labels. CONCLUSIONS: Ultra-processed warning labels may help consumers better identify UPFs, although they do not seem to influence behavioral intentions and product perceptions beyond the influence already exerted by nutrient warning labels. Future research should examine how ultra-processed warning labels would work for products that do and do not require nutrient warnings, as well as examine the benefits of labeling approaches that signal the health effects of UPFs. TRIAL REGISTRATION: ClinicalTrials.gov, NCT05842460. Prospectively registered March 15th, 2023.
Subjects
Consumer Behavior , Food Labeling , Intention , Humans , Food Labeling/methods , Female , Male , Adult , Young Adult , Brazil , Middle Aged , Fast Foods , Nutritive Value , Perception , Adolescent , Health Knowledge, Attitudes, Practice
ABSTRACT
OBJECTIVE: In 2020, Mexico implemented innovative front-of-package nutrition warning labels (FoPWLs) for packaged foods to increase the salience and understanding of nutrition information. This study evaluated Mexican Americans' self-reported exposure to Mexican FoPWLs and self-reported effects of FoPWLs on purchasing behavior. METHODS: The 2021 International Food Policy Study surveyed online panels of adult Mexican Americans in the US (n = 3361) to self-report on buying food at Mexican-oriented stores, noticing Mexican FoPWLs, and being influenced by FoPWLs to purchase less of eight different unhealthy foods (each assessed separately). After recoding the frequency of buying foods in Mexican stores and noticing FoPWLs (i.e., "often" or "very often" vs. less often), logistic models regressed these outcomes on sociodemographics, adjusting for post-stratification weights. RESULTS: Most participants (88.0%) purchased foods in Mexican stores. Of these, 64.1% reported noticing FoPWLs, among whom many reported that FoPWLs influenced them to buy fewer unhealthy foods (range = 32% [snacks like chips] to 44% [colas]). Participants were more likely to buy foods in Mexican stores and notice FoPWLs if they were younger, had two or more children at home vs. no children (AOR = 1.40, 95%CI = 1.15-1.71; AOR = 1.37, 95%CI = 1.03-1.80, respectively), and more frequently used Spanish (AOR = 1.91, 95%CI = 1.77-2.07; AOR = 1.87, 95%CI = 1.69-2.07). Also, high vs. low education (AOR = 1.51, 95%CI = 1.17-1.94) and higher income adequacy (AOR = 1.37, 95%CI = 1.25-1.51) were positively associated with noticing FoPWLs. Being female and more frequent Spanish use were consistently associated with reporting purchase of fewer unhealthy foods because of FoPWLs. CONCLUSIONS: Many Mexican Americans report both exposure to Mexican FoPWLs and reducing purchases of unhealthy foods because of them.