Results 1 - 20 of 27
1.
An Acad Bras Cienc ; 96(1): e20230064, 2024.
Article in English | MEDLINE | ID: mdl-38656054

ABSTRACT

In this work, we focus on gaining insight into the performance of some well-known machine learning image classification techniques (k-NN, Support Vector Machine, randomized decision tree, and one based on stochastic distances) for PolSAR (Polarimetric Synthetic Aperture Radar) imagery. We test the classifiers on a set of actual PolSAR data and provide some conclusions. The aim of this work is to show that suitably adapted standard machine learning methods offer an excellent trade-off between performance and computational complexity for PolSAR image classification. Specifically, we evaluate well-known machine learning techniques for PolSAR image classification, including K-Nearest Neighbors (KNN), Support Vector Machine (SVM), randomized decision tree, and a method based on the Kullback-Leibler stochastic distance. Our experiments with real PolSAR data show that standard machine learning methods, when adapted appropriately, offer a favourable trade-off between performance and computational complexity. KNN and SVM perform poorly on these data, likely due to their failure to account for the inherent speckle presence and properties of the studied reliefs. Overall, our findings highlight the potential of the Kullback-Leibler stochastic distance method for PolSAR image classification.


Subject(s)
Machine Learning; Support Vector Machine; Algorithms
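
The idea of classifying by minimum stochastic distance can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: it uses the symmetrised Kullback-Leibler divergence between univariate Gaussians as a stand-in, whereas the distribution actually used for PolSAR data in the paper is not given in the abstract. The class names, class parameters, and sample values are hypothetical.

```python
import math


def kl_gauss(mu_p, var_p, mu_q, var_q):
    """KL divergence KL(p || q) between two univariate Gaussians."""
    return 0.5 * (math.log(var_q / var_p)
                  + (var_p + (mu_p - mu_q) ** 2) / var_q
                  - 1.0)


def kl_distance(p, q):
    """Symmetrised KL divergence, used here as a stochastic distance.

    p and q are (mean, variance) pairs.
    """
    return 0.5 * (kl_gauss(*p, *q) + kl_gauss(*q, *p))


def fit_gauss(samples):
    """Estimate (mean, variance) of a region by maximum likelihood."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n
    return mu, var


def classify(region, class_models):
    """Assign the region to the class whose model is closest in KL distance."""
    estimate = fit_gauss(region)
    return min(class_models,
               key=lambda name: kl_distance(estimate, class_models[name]))


# Hypothetical class models (mean, variance) and a hypothetical image region.
class_models = {
    "water": (0.1, 0.01),
    "forest": (0.5, 0.04),
    "urban": (1.2, 0.25),
}
region = [0.45, 0.62, 0.38, 0.55, 0.70, 0.41]
label = classify(region, class_models)
print(label)
```

Because the distance compares whole distributions rather than individual pixel values, it responds to differences in spread as well as in mean, which is one plausible reason such methods cope better with speckle than point-wise classifiers.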
2.
Nonlinear Dyn ; 111(7): 6855-6872, 2023.
Article in English | MEDLINE | ID: mdl-36588986

ABSTRACT

A generalized pathway model, with time-dependent parameters, is applied to describe the mortality curves of the COVID-19 disease for several countries that exhibit multiple waves of infections. The pathway approach adopted here is formulated explicitly in time, in the sense that the model's growth rate for the number of deaths or infections is written as an explicit function of time, rather than in terms of the cumulative quantity itself. This allows for a direct fit of the model to daily data (new deaths or new cases) without the need for any integration. The model is applied to COVID-19 mortality curves for ten selected countries and found to be in very good agreement with the data for all cases considered. From the fitted theoretical curves for a given location, relevant epidemiological information can be extracted, such as the starting and peak dates for each successive wave. It is argued that obtaining reliable estimates for such characteristic points is important for studying the effectiveness of interventions and the possible negative impact of their relaxation, as it allows for a direct comparison of the time of adoption/relaxation of control measures with the peaks and troughs of the epidemic curve.

3.
Appl Soft Comput ; 137: 110159, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36874079

ABSTRACT

We present the software ModInterv as an informatics tool to monitor, in an automated and user-friendly manner, the evolution and trend of COVID-19 epidemic curves, both for cases and deaths. The ModInterv software uses parametric generalized growth models, together with LOWESS regression analysis, to fit epidemic curves with multiple waves of infections for countries around the world as well as for states and cities in Brazil and the USA. The software automatically accesses publicly available COVID-19 databases maintained by the Johns Hopkins University (for countries as well as states and cities in the USA) and the Federal University of Viçosa (for states and cities in Brazil). The richness of the implemented models lies in the possibility of quantitatively and reliably detecting the distinct acceleration regimes of the disease. We describe the backend structure of the software as well as its practical use. The software helps the user not only to understand the current stage of the epidemic in a chosen location but also to make short-term predictions as to how the curves may evolve. The app is freely available on the internet (http://fisica.ufpr.br/modinterv), thus making a sophisticated mathematical analysis of epidemic data readily accessible to any interested user.

4.
Sensors (Basel) ; 22(10)2022 May 14.
Article in English | MEDLINE | ID: mdl-35632152

ABSTRACT

In this paper, we propose a new privatization mechanism based on a naive theory of a perturbation on a probability using wavelets, much as noise perturbs the signal of a digital image sensor. Wavelets are employed to extract information from a wide range of types of data, including audio signals and images often related to sensors, as unstructured data. Specifically, the cumulative wavelet integral function is defined to build the perturbation on a probability with the help of this function. We show that an arbitrary distribution function additively perturbed is still a distribution function, which can be seen as a privatized distribution, with the privatization mechanism being a wavelet function. Thus, we offer a mathematical method for choosing a suitable probability distribution for data by starting from some guessed initial distribution. Examples of the proposed method are discussed. Computational experiments were carried out using a database-sensor and two related algorithms. Several knowledge areas can benefit from the new approach proposed in this investigation. The areas of artificial intelligence, machine learning, and deep learning constantly need techniques for data fitting, and these areas are closely related to sensors. Therefore, we believe that the proposed privatization mechanism is an important contribution to increasing the spectrum of existing techniques.


Subject(s)
Artificial Intelligence; Privatization; Algorithms; Machine Learning; Probability
5.
An Acad Bras Cienc ; 93(4): e20190316, 2021.
Article in English | MEDLINE | ID: mdl-34550162

ABSTRACT

The interpretation of odds ratios (OR) as prevalence ratios (PR) in cross-sectional studies has been criticized, since this equivalence holds only under specific circumstances. The logistic regression model is a very well-known statistical tool for the analysis of binary outcomes and is frequently used to obtain adjusted OR. Here, we introduce the prLogistic package for the R statistical computing environment, which can be obtained from The Comprehensive R Archive Network, https://cran.r-project.org/package=prLogistic. The package prLogistic was built to assist the estimation of PR via logistic regression models adjusted by delta method and bootstrap for analysis of independent and correlated binary data. Two applications are presented to illustrate its use for analysis of independent observations and data from clustered studies.


Subject(s)
Logistic Models; Cross-Sectional Studies; Odds Ratio; Prevalence
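
A small worked example may help show why an odds ratio cannot in general be read as a prevalence ratio. The sketch below computes both from a 2x2 table; it is written in Python for illustration only and does not use or reproduce the prLogistic package, and the two tables are hypothetical.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    return (a * d) / (b * c)


def prevalence_ratio(a, b, c, d):
    """Prevalence among the exposed divided by prevalence among the unexposed."""
    return (a / (a + b)) / (c / (c + d))


# Hypothetical cross-sectional tables.
common = (60, 40, 30, 70)   # outcome prevalence 60% vs 30%
rare = (6, 994, 3, 997)     # outcome prevalence 0.6% vs 0.3%

for name, table in (("common outcome", common), ("rare outcome", rare)):
    print(f"{name}: OR = {odds_ratio(*table):.3f}, PR = {prevalence_ratio(*table):.3f}")
```

In both tables the prevalence is doubled among the exposed, yet the odds ratio is 3.5 for the common outcome and about 2.006 for the rare one. The two measures agree closely only when the outcome is rare.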
6.
Phys Rev E ; 109(4-1): 044313, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38755908

ABSTRACT

We present a multiscale stochastic analysis of foreign exchange rates using the H-theory formalism, which provides a hierarchical intermittency model for the information cascade in the currency market. We examine the distributions of returns and volatilities for the three most traded currency pairs: euro-U.S. dollar, U.S. dollar-Japanese yen, and British pound-U.S. dollar. We find that these markets have a hierarchy of timescales, with larger markets exhibiting more hierarchy levels. We provide a theoretical framework for understanding why the number of levels in the information cascade increases with market size, in analogy with similar behavior for the energy cascade in turbulence as a function of Reynolds number. We briefly argue that using turbulence-like models for financial markets can also provide valuable insights for developing efficient algorithmic trading strategies.

7.
Biology (Basel) ; 12(3)2023 Mar 13.
Article in English | MEDLINE | ID: mdl-36979135

ABSTRACT

In this article, we propose a comparative study between two models that can be used by researchers for the analysis of survival data: (i) the Weibull regression model and (ii) the random survival forest (RSF) model. The models are compared considering the error rate, the performance of the model through the Harrell C-index, and the identification of the relevant variables for survival prediction. A statistical analysis of a data set from the Heart Institute of the University of São Paulo, Brazil, has been carried out. In the study, the length of stay of patients undergoing cardiac surgery, within the operating room, was used as the response variable. The obtained results show that the RSF model has a lower error rate for the training and testing data sets, at 23.55% and 20.31%, respectively, than the Weibull model, which has an error rate of 23.82%. Regarding the Harrell C-index, we obtain the values 0.76, 0.79, and 0.76 for the RSF (training and testing) and Weibull models, respectively. After the selection procedure, the Weibull model contains variables associated with the type of protocol and type of patient, which are statistically significant at 5%. The RSF model chooses age, type of patient, and type of protocol as relevant variables for prediction. We employ the randomForestSRC package of the R software to perform our data analysis and computational experiments. The proposal that we present has many applications in biology and medicine, which are discussed in the conclusions of this work.
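
For readers unfamiliar with the Harrell C-index, a minimal sketch of how it is computed follows. It is a plain Python illustration, not the randomForestSRC implementation used in the study, and the survival times, event indicators, and risk scores are hypothetical. A pair of subjects is comparable when the one with the shorter time had an observed event; the pair is concordant when that subject also has the higher predicted risk.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times:       observed follow-up times
    events:      1 if the event was observed, 0 if censored
    risk_scores: higher score means higher predicted risk (shorter survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Subject i must have an observed event strictly before time j.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: five patients, two of them censored.
times = [2, 4, 6, 8, 10]
events = [1, 1, 0, 1, 0]
risk_scores = [0.9, 0.4, 0.7, 0.3, 0.1]

c_index = harrell_c_index(times, events, risk_scores)
print(c_index)  # 7 concordant pairs out of 8 comparable pairs -> 0.875
```

The error rates and C-index values quoted in the abstract are consistent with the error rate being defined as one minus the C-index (for example, 1 - 0.2355 is approximately 0.76), although the abstract does not state this explicitly.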

8.
Biology (Basel) ; 12(7)2023 Jul 04.
Article in English | MEDLINE | ID: mdl-37508389

ABSTRACT

Predictive models based on empirical similarity are instrumental in biology and data science, where the premise is to measure the likeness of one observation with others in the same dataset. Biological datasets often encompass data that can be categorized. When using empirical similarity-based predictive models, two strategies for handling categorical covariates exist. The first strategy retains categorical covariates in their original form, applying distance measures and allocating weights to each covariate. In contrast, the second strategy creates binary variables, representing each variable level independently, and computes similarity measures solely through the Euclidean distance. This study performs a sensitivity analysis of these two strategies using computational simulations, and applies the results to a biological context. We use a linear regression model as a reference point, and consider two methods for estimating the model parameters, alongside exponential and fractional inverse similarity functions. The sensitivity is evaluated by determining the coefficient of variation of the parameter estimators across the three models as a measure of relative variability. Our results suggest that the first strategy excels over the second one in effectively dealing with categorical variables, and offers greater parsimony due to the use of fewer parameters.
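
The first strategy described above can be sketched as a similarity-weighted average of observed responses. The code below is a minimal illustration under assumptions: the weighted distance (squared difference for numeric covariates, 0/1 mismatch for categorical ones), the weights, and the toy data are hypothetical, and parameter estimation is not shown.

```python
import math


def exponential_similarity(d):
    """Exponential similarity function: s = exp(-d)."""
    return math.exp(-d)


def fractional_inverse_similarity(d):
    """Fractional inverse similarity function: s = 1 / (1 + d)."""
    return 1.0 / (1.0 + d)


def weighted_distance(x, z, weights):
    """Weighted distance that keeps categorical covariates in original form."""
    total = 0.0
    for xi, zi, w in zip(x, z, weights):
        if isinstance(xi, str):
            total += w * (0.0 if xi == zi else 1.0)   # categorical: mismatch
        else:
            total += w * (xi - zi) ** 2               # numeric: squared gap
    return total


def predict(x_new, X, y, weights, similarity):
    """Predict the response as a similarity-weighted mean of past responses."""
    sims = [similarity(weighted_distance(x_new, x, weights)) for x in X]
    return sum(s * yi for s, yi in zip(sims, y)) / sum(sims)


# Hypothetical observations: one numeric and one categorical covariate.
X = [(1.0, "a"), (2.0, "b"), (3.0, "a")]
y = [10.0, 20.0, 30.0]
weights = (1.0, 1.0)

print(predict((2.0, "b"), X, y, weights, exponential_similarity))
print(predict((2.0, "b"), X, y, weights, fractional_inverse_similarity))
```

Under the second strategy, each level of the categorical covariate would instead become its own binary column with its own weight, which is where the extra parameters mentioned in the abstract come from.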

9.
Biomedicines ; 11(10)2023 Sep 22.
Article in English | MEDLINE | ID: mdl-37892978

ABSTRACT

This research aims to enhance the classification and prediction of ischemic heart diseases using machine learning techniques, with a focus on resource efficiency and clinical applicability. Specifically, we introduce novel non-invasive indicators known as Campello de Souza features, which require only a tensiometer and a clock for data collection. These features were evaluated using a comprehensive dataset of heart disease cases from a machine learning data repository. Our findings highlight the ability of machine learning algorithms to not only streamline diagnostic procedures but also reduce diagnostic errors and the dependency on extensive clinical testing. Three key features (mean arterial pressure, pulsatile blood pressure index, and resistance-compliance indicator) were found to significantly improve the accuracy of machine learning algorithms in binary heart disease classification. Logistic regression achieved the highest average accuracy among the examined classifiers when utilizing these features. While such novel indicators contribute substantially to the classification process, they should be integrated into a broader diagnostic framework that includes comprehensive patient evaluations and medical expertise. Therefore, the present study offers valuable insights for leveraging data science techniques in the diagnosis and management of cardiovascular diseases.
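
Of the three features named above, only mean arterial pressure has a widely used textbook approximation that can be stated without access to the paper. The sketch below uses that approximation (diastolic pressure plus one third of the pulse pressure); whether the study uses exactly this form is an assumption, and the other two indicators are not reproduced because the abstract does not define them.

```python
def mean_arterial_pressure(systolic, diastolic):
    """Textbook approximation of mean arterial pressure in mmHg.

    Assumption: MAP = DBP + (SBP - DBP) / 3; the paper's exact definition
    is not given in the abstract.
    """
    if systolic < diastolic:
        raise ValueError("systolic pressure must not be below diastolic")
    return diastolic + (systolic - diastolic) / 3.0


# Hypothetical reading of 120/80 mmHg.
map_value = mean_arterial_pressure(120.0, 80.0)
print(round(map_value, 1))
```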

10.
Article in English | MEDLINE | ID: mdl-37502671

ABSTRACT

Technological developments are making it possible to gather large amounts of data in several research fields. Learning analytics (LA)/educational data mining has access to big observational unstructured data captured from educational settings and relies mostly on unsupervised machine learning (ML) algorithms to make sense of this type of data. Generalized additive models for location, scale, and shape (GAMLSS) are a supervised statistical learning framework that allows modeling all the parameters of the distribution of the response variable with respect to the explanatory variables. This article overviews the power and flexibility of GAMLSS in relation to some ML techniques. Also, GAMLSS' capability to be tailored toward causality via causal regularization is briefly discussed. This overview is illustrated via a data set from the field of LA. This article is categorized under: Application Areas > Education and Learning; Algorithmic Development > Statistics; Technologies > Machine Learning.

11.
J Stat Theory Appl ; 21(4): 175-185, 2022.
Article in English | MEDLINE | ID: mdl-36160758

ABSTRACT

In "The hitchhiker's guide to responsible machine learning", Biecek, Kozak, and Zawada (here BKZ) provide an illustrated and engaging step-by-step guide on how to perform a machine learning (ML) analysis such that the algorithms, the software, and the entire process are interpretable and transparent for both the data scientist and the end user. This review summarises BKZ's book and elaborates on three elements key to ML analyses: inductive inference, causality, and interpretability.

12.
Softw Impacts ; 14: 100409, 2022 Nov.
Article in English | MEDLINE | ID: mdl-35990010

ABSTRACT

The COVID-19 pandemic has proven the importance of mathematical tools to understand the evolution of epidemic outbreaks and provide reliable information to the general public and health authorities. With this in mind, we have developed ModInterv, an online software tool that applies growth models to monitor the evolution of the COVID-19 epidemic in locations chosen by the user among countries worldwide or states and cities in the USA or Brazil. This paper describes the software's capabilities and its use both in recent research works and by technical committees assisting government authorities. Possible applications to other epidemics are also briefly discussed.

13.
PLoS One ; 16(11): e0259266, 2021.
Article in English | MEDLINE | ID: mdl-34767560

ABSTRACT

Many machine learning procedures, including clustering analysis, are often affected by missing values. This work aims to propose and evaluate a Kernel Fuzzy C-means clustering algorithm considering the kernelization of the metric with local adaptive distances (VKFCM-K-LP) under three types of strategies to deal with missing data. The first strategy, called Whole Data Strategy (WDS), performs clustering only on the complete part of the dataset, i.e. it discards all instances with missing data. The second approach uses the Partial Distance Strategy (PDS), in which partial distances are computed among all available resources and then re-scaled by the reciprocal of the proportion of observed values. The third technique, called Optimal Completion Strategy (OCS), computes missing values iteratively as auxiliary variables in the optimization of a suitable objective function. The clustering results were evaluated according to different metrics. The best performance of the clustering algorithm was achieved under the PDS and OCS strategies. Under the OCS approach, new datasets were derived and the missing values were estimated dynamically in the optimization process. The results of clustering under the OCS strategy also presented a superior performance when compared to the resulting clusters obtained by applying the VKFCM-K-LP algorithm on a version where missing values are previously imputed by the mean or the median of the observed values.


Subject(s)
Cluster Analysis; Fuzzy Logic; Algorithms; Data Collection
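
The Partial Distance Strategy described above can be illustrated with a short sketch. It uses a plain squared Euclidean distance rather than the kernelized, locally adaptive metric of VKFCM-K-LP, which the abstract does not specify, and it marks missing values with `None`; both choices are assumptions made for illustration.

```python
def partial_squared_distance(x, center):
    """Squared distance over observed components, re-scaled for missing ones.

    The sum over observed components is multiplied by the reciprocal of the
    proportion of observed values, i.e. by len(x) / number_observed.
    """
    observed = [(xk, vk) for xk, vk in zip(x, center) if xk is not None]
    if not observed:
        raise ValueError("instance has no observed values")
    partial = sum((xk - vk) ** 2 for xk, vk in observed)
    return partial * len(x) / len(observed)


# Hypothetical instance with two of four values missing.
x = [1.0, None, 3.0, None]
center = [0.0, 0.0, 0.0, 0.0]

distance = partial_squared_distance(x, center)
print(distance)  # (1 + 9) * 4 / 2 = 20.0
```

The re-scaling keeps distances comparable between instances with different numbers of missing values, so incomplete instances are not artificially pulled toward every cluster centre.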
14.
Sci Rep ; 11(1): 4619, 2021 02 25.
Article in English | MEDLINE | ID: mdl-33633290

ABSTRACT

We apply a versatile growth model, whose growth rate is given by a generalised beta distribution, to describe the complex behaviour of the fatality curves of the COVID-19 disease for several countries in Europe and North America. We show that the COVID-19 epidemic curves not only may present a subexponential early growth but can also exhibit a similar subexponential (power-law) behaviour in the saturation regime. We argue that the power-law exponent of the latter regime, which measures how quickly the curve approaches the plateau, is directly related to control measures, in the sense that the less strict the control, the smaller the exponent and hence the slower the disease progresses to its end. The power-law saturation uncovered here is an important result, because it signals to policymakers and health authorities that it is important to keep control measures for as long as possible, so as to avoid a slow, power-law ending of the disease. The slower the approach to the plateau, the longer the virus lingers on in the population, and the greater not only the final death toll but also the risk of a resurgence of infections.


Subject(s)
COVID-19/epidemiology; Algorithms; COVID-19/mortality; Europe/epidemiology; Humans; North America/epidemiology; Pandemics; SARS-CoV-2/isolation & purification
15.
Exp Psychol ; 67(1): 14-22, 2020 Jan.
Article in English | MEDLINE | ID: mdl-32394814

ABSTRACT

In this experiment, we replicated the effect of muscle engagement on perception such that the recognition of another's facial expressions was biased by the observer's facial muscular activity (Blaesi & Wilson, 2010). We extended this replication to show that such a modulatory effect is also observed for the recognition of dynamic bodily expressions. Via a multilab and within-subjects approach, we investigated the emotion recognition of point-light biological walkers, along with that of morphed face stimuli, while subjects were or were not holding a pen in their teeth. Under the "pen-in-the-teeth" condition, participants tended to lower their threshold of perception of happy expressions in facial stimuli compared to the "no-pen" condition, thus replicating the experiment by Blaesi and Wilson (2010). A similar effect was found for the biological motion stimuli such that participants lowered their threshold to perceive happy walkers in the pen-in-the-teeth condition compared to the no-pen condition. This pattern of results was also found in a second experiment in which the no-pen condition was replaced by a situation in which participants held a pen in their lips ("pen-in-lips" condition). These results suggested that facial muscular activity alters the recognition of not only facial expressions but also bodily expressions.


Subject(s)
Emotions/physiology; Facial Expression; Facial Recognition/physiology; Recognition, Psychology/physiology; Adult; Female; Humans; Male; Young Adult
16.
PeerJ ; 8: e9421, 2020.
Article in English | MEDLINE | ID: mdl-32612894

ABSTRACT

The main objective of the present article is twofold: first, to model the fatality curves of the COVID-19 disease, as represented by the cumulative number of deaths as a function of time; and second, to use the corresponding mathematical model to study the effectiveness of possible intervention strategies. We applied the Richards growth model (RGM) to the COVID-19 fatality curves from several countries, where we used the data from the Johns Hopkins University database up to May 8, 2020. Countries selected for analysis with the RGM were China, France, Germany, Iran, Italy, South Korea, and Spain. The RGM was shown to describe very well the fatality curves of China, which is in a late stage of the COVID-19 outbreak, as well as of the other above countries, which supposedly are in the middle or towards the end of the outbreak at the time of this writing. We also analysed the case of Brazil, which is in an initial sub-exponential growth regime, and so we used the generalised growth model, which is more appropriate for such cases. An analytic formula for the efficiency of intervention strategies within the context of the RGM is derived. Our findings show that there is only a narrow window of opportunity, after the onset of the epidemic, during which effective countermeasures can be taken. We applied our intervention model to the COVID-19 fatality curve of Italy to illustrate the effect of several possible interventions.
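
As an illustration of the Richards growth model mentioned above, the sketch below evaluates its standard closed-form solution. It assumes the usual form dC/dt = r C [1 - (C/K)^alpha], with K the final plateau, r the growth rate, alpha the asymmetry parameter, and t_c the inflection time; the abstract does not give the parametrisation used in the paper, and the parameter values here are hypothetical rather than fitted.

```python
import math


def richards(t, K, r, alpha, t_c):
    """Cumulative count C(t) for the Richards growth model.

    Closed-form solution of dC/dt = r * C * (1 - (C / K) ** alpha),
    with the inflection point located at t = t_c.
    """
    return K / (1.0 + alpha * math.exp(-r * alpha * (t - t_c))) ** (1.0 / alpha)


def richards_rate(C, K, r, alpha):
    """Right-hand side of the Richards equation: the daily increment."""
    return r * C * (1.0 - (C / K) ** alpha)


# Hypothetical parameters, not fitted to any country.
K, r, alpha, t_c = 1000.0, 0.2, 0.5, 30.0

for day in (0, 15, 30, 45, 60, 120):
    C = richards(day, K, r, alpha, t_c)
    print(f"day {day:3d}: cumulative = {C:8.2f}, daily rate = {richards_rate(C, K, r, alpha):6.2f}")
```

With alpha = 1 the curve reduces to the logistic model. The daily rate peaks at t_c and then declines, which is the qualitative basis for the narrow window for effective intervention noted in the abstract.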

17.
Front Psychol ; 10: 352, 2019.
Article in English | MEDLINE | ID: mdl-30873078

ABSTRACT

Research on the metaphorical mapping of valenced concepts onto space indicates that positive, neutral, and negative concepts are mapped onto upward, midward, and downward locations, respectively. More recently, this type of research has been tested for the very first time in 3D physical space. The findings corroborate the mapping of valenced concepts onto the vertical space as described above but further show that positive and negative concepts are placed close to and away from the body; neutral concepts are placed midway. The current study aimed at investigating whether valenced perceptual stimuli are positioned onto 3D space akin to the way valenced concepts are positioned. By using a unique device known as the cognition cube, participants placed visual, auditory, tactile and olfactory stimuli on 3D space. The results mimicked the placing of valenced concepts onto 3D space; i.e., positive percepts were placed in upward and close-to-the-body locations and negative percepts were placed in downward and away-from-the-body locations; neutral percepts were placed midway. This pattern of results was more pronounced in the case of visual stimuli, followed by auditory, tactile, and olfactory stimuli. Significance Statement: Just recently, a unique device called "the cognition cube" (CC) made it possible to find that positive words are mapped onto upward and close-to-the-body locations and negative words are mapped onto downward and away-from-the-body locations; neutral words are placed midway. This way of placing words in relation to the body is consistent with an approach-avoidance effect such that "good" and "bad" things are kept close to and away from one's body. We demonstrate for the very first time that this same pattern emerges when visual, auditory, tactile, and olfactory perceptual stimuli are placed on 3D physical space. We believe these results are significant in that the CC can be used as a new tool to diagnose emotion-related disorders.

18.
Front Psychol ; 9: 699, 2018.
Article in English | MEDLINE | ID: mdl-29867666

ABSTRACT

We argue that making accept/reject decisions on scientific hypotheses, including a recent call for changing the canonical alpha level from p = 0.05 to p = 0.005, is deleterious for the finding of new discoveries and the progress of science. Given that blanket and variable alpha levels both are problematic, it is sensible to dispense with significance testing altogether. There are alternatives that address study design and sample size much more directly than significance testing does; but none of the statistical tools should be taken as the new magic method giving clear-cut mechanical answers. Inference should not be based on single studies at all, but on cumulative evidence from multiple independent studies. When evaluating the strength of the evidence, we should consider, for example, auxiliary assumptions, the strength of the experimental design, and implications for applications. To boil all this down to a binary decision based on a p-value threshold of 0.05, 0.01, 0.005, or anything else, is not acceptable.

19.
Educ Psychol Meas ; 77(5): 881-895, 2017 Oct.
Article in English | MEDLINE | ID: mdl-29795937

ABSTRACT

We present three strategies to replace the null hypothesis statistical significance testing approach in psychological research: (1) visual representation of cognitive processes and predictions, (2) visual representation of data distributions and choice of the appropriate distribution for analysis, and (3) model comparison. The three strategies have been proposed earlier, so we do not claim originality. Here we propose to combine the three strategies and use them not only as analytical and reporting tools but also to guide the design of research. The first strategy involves a visual representation of the cognitive processes involved in solving the task at hand in the form of a theory or model together with a representation of a pattern of predictions for each condition. The second approach is the GAMLSS approach, which consists of providing a visual representation of distributions to fit the data, and choosing the best distribution that fits the raw data for further analyses. The third strategy is the model comparison approach, which compares the model of the researcher with alternative models. We present a worked example in the field of reasoning, in which we follow the three strategies.

20.
PLoS One ; 11(12): e0166868, 2016.
Article in English | MEDLINE | ID: mdl-27907014

ABSTRACT

We present a new approach for handwritten signature classification and verification based on descriptors stemming from time causal information theory. The proposal uses the Shannon entropy, the statistical complexity, and the Fisher information evaluated over the Bandt and Pompe symbolization of the horizontal and vertical coordinates of signatures. These six features are easy and fast to compute, and they are the input to a One-Class Support Vector Machine classifier. The results are better than state-of-the-art online techniques that employ higher-dimensional feature spaces which often require specialized software and hardware. We assess the consistency of our proposal with respect to the size of the training sample, and we also use it to classify the signatures into meaningful groups.


Subject(s)
Biometry/methods; Handwriting; Pattern Recognition, Automated/methods; Support Vector Machine; Entropy; Humans; Individuality; Information Theory; Software
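
The Bandt and Pompe symbolization mentioned above replaces each short window of a time series with the ordering of its values, and the Shannon entropy is then taken over the resulting pattern frequencies. The sketch below shows only this entropy step; the statistical complexity, the Fisher information, and the classifier are not reproduced. The embedding order and the example series are assumptions chosen for illustration.

```python
import math
from collections import Counter


def ordinal_patterns(series, order=3):
    """Bandt-Pompe symbolization: map each window to its ordinal pattern."""
    patterns = []
    for i in range(len(series) - order + 1):
        window = series[i:i + order]
        # The pattern is the permutation of indices that sorts the window.
        patterns.append(tuple(sorted(range(order), key=lambda k: window[k])))
    return patterns


def permutation_entropy(series, order=3):
    """Shannon entropy of ordinal patterns, normalised to the range [0, 1]."""
    counts = Counter(ordinal_patterns(series, order))
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(math.factorial(order))


# Example series: with order 2 it has four rising and two falling pairs.
series = [4, 7, 9, 10, 6, 11, 3]
print(round(permutation_entropy(series, order=2), 4))
```

Applied separately to the horizontal and vertical coordinates of a signature, each of the three descriptors yields two values, which matches the six features mentioned in the abstract.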