ABSTRACT
HJ-Biplot analysis is a multivariate graphical representation that captures the covariation structure between variables and individuals and represents both in a low-dimensional space, with the highest quality of representation, in the same reference system. Consequently, it is a promising technique for evaluating dietary exposure to polyphenols and accurately characterizing female nutrition. Herein, we hypothesized that polyphenol intake defines specific clusters with dietary impacts, which can be assessed using HJ-Biplot, based on a cross-sectional study in Argentina. The study included 275 healthy postpartum women who provided information about their food intake frequency and other conditions, which was then used to estimate polyphenol intake using the Phenol-Explorer database. Outcomes were established using HJ-Biplot for clustering and ANOVA to compare the clusters' impact on diet quality indicators. Two HJ-Biplot models were run (for intakes >20 mg/d and 5-20 mg/d, respectively), each identifying three clusters with excellent statistical fit. Thus, specific polyphenolic clusters with potentially bioactive and safe compounds were defined despite significant interindividual variability. In fact, women with the lowest polyphenol intake exhibited worse dietary quality, body fat, and physical activity. As a result, HJ-Biplot proved to be an effective technique for clustering women based on their dietary intake of these compounds. Furthermore, cluster membership improved the intake of antioxidants, water, fiber, and healthy fats. Additionally, women with formal jobs and a higher educational level showed a better diet. Dietary polyphenols are critical during the postpartum period because they exert beneficial effects on women and breastfed infants.
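The joint representation described in the abstract above can be illustrated with a minimal sketch. It assumes the usual construction of the HJ-Biplot from the singular value decomposition of a column-standardized data matrix, with row markers J = UD and column markers H = VD. The function name `hj_biplot`, the standardization choice, and the toy data are illustrative assumptions, not the authors' implementation or data.

```python
import numpy as np

def hj_biplot(X, n_components=2):
    """Return HJ-Biplot row markers, column markers, and explained inertia.

    X has individuals in rows and variables in columns.
    """
    X = np.asarray(X, dtype=float)
    # Standardize each column (assumption: variables are on different scales).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Singular value decomposition: Z = U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    k = n_components
    row_markers = U[:, :k] * s[:k]    # J = U D (individuals)
    col_markers = Vt[:k].T * s[:k]    # H = V D (variables)
    # Share of total inertia carried by each retained axis.
    explained = (s**2 / np.sum(s**2))[:k]
    return row_markers, col_markers, explained

# Hypothetical toy data: 12 individuals, 4 variables (not the study data).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(12, 4))
rows, cols, explained = hj_biplot(X_demo)
print(rows.shape, cols.shape, explained.round(3))
```

Because both sets of markers are scaled by the singular values, individuals and variables share one reference system; clusters can then be sought among the row markers with any standard clustering method.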
ABSTRACT
Classification methods make it possible to explore and analyze large data sets visually, which is very useful for making quick decisions. This work aimed to compare two cluster analysis methods for big data on demographic variables of the provinces of Ecuador. A comparative observational study was carried out using the simultaneous representation of the HJ-Biplot and the Two Step method (two-stage cluster), with the MultBiplot and SPSS software. The data correspond to demographic variables of socio-health interest (general mortality rate, infant mortality rate, birth rate, population density, urban percentage, and life expectancy) measured in the provinces of Ecuador. Data from the Instituto de Estadísticas y Censos (INEC) were used. The association between variables was analyzed, and clusters of provinces were identified according to these demographic variables. According to the simultaneous representation of the HJ-Biplot, 3 clusters were identified: cluster 1 comprises provinces with higher population density and general mortality rates but low birth rates; cluster 2 groups provinces with higher life expectancy and infant mortality rates but low birth rates; and cluster 3 contains the provinces with high birth rates and low population density, life expectancy, general mortality rates, and infant mortality rates. Different results were obtained with the Two Step method. It was concluded that these methods are useful for exploring the similarities between provinces according to demographic variables.
Subjects
Humans, Cluster Analysis, Demography, Statistical Models, Vital Statistics, Ecuador/epidemiology
ABSTRACT
Currently, spatial distribution analyses using cluster techniques for chronic diseases such as breast cancer are relevant for identifying spatial patterns of cancer mortality by geographic area. Objective. To identify spatial clusters of breast cancer mortality in women at the level of the provinces of Ecuador between 2004 and 2018. We used an observational, descriptive, multigroup ecological study that compares, at the spatiotemporal level, breast cancer mortality rates in women across the provinces of Ecuador, using Moran's index for the autocorrelation analysis and the k-means algorithm for cluster analysis in five-year periods, with the ArcGIS version 10.5 software. Results. In Ecuador, 86.5% of breast cancer deaths in women were recorded in urban areas; these deaths follow a non-random pattern according to Moran's index, unlike those in rural areas, which follow a random pattern. The clustering of breast cancer mortality differed between urban and rural provinces: urban areas yielded clusters with high, medium-high, medium-low, and low mortality rates, whereas rural areas yielded only clusters with high, medium, and low mortality rates. Conclusions. The spatial distribution and cluster analysis identified clusters of breast cancer mortality in Ecuador and showed differences between the urban and rural clusters obtained; this information is useful for implementing cancer control strategies in the country.
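The autocorrelation statistic named in the abstract above can be sketched as follows. The study itself used ArcGIS 10.5, so this is not the authors' procedure; it is only an illustration of the standard global Moran's I formula, I = (n / S0) * (z' W z) / (z' z), where z holds the deviations from the mean and S0 is the sum of all spatial weights. The function name `morans_i`, the adjacency matrix, and the rates are hypothetical.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I for one variable and a spatial weights matrix."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()                  # deviations from the mean
    s0 = w.sum()                      # sum of all spatial weights
    return (x.size / s0) * (z @ w @ z) / (z @ z)

# Hypothetical example: four regions arranged in a line, where each region
# neighbors the next one (binary contiguity weights, not real provinces).
W = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
rates = [1.0, 2.0, 3.0, 4.0]          # made-up mortality rates
print(morans_i(rates, W))
```

Values above the expected value under randomness, -1/(n - 1), suggest that similar rates cluster in space, while values below it suggest that neighboring regions tend to differ; a significance test (for example, by permutation) is needed before calling a pattern non-random.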