Results 1 - 20 of 55
1.
Multivariate Behav Res ; 59(2): 266-288, 2024.
Article in English | MEDLINE | ID: mdl-38361218

ABSTRACT

The walktrap algorithm is one of the most popular community-detection methods in psychological research. Several simulation studies have shown that it is often effective at determining the correct number of communities and assigning items to their proper community. Nevertheless, it is important to recognize that the walktrap algorithm relies on hierarchical clustering because it was originally developed for networks much larger than those encountered in psychological research. In this paper, we present and demonstrate a computational alternative to the hierarchical algorithm that is conceptually easier to understand. More importantly, we show that better solutions to the sum-of-squares optimization problem that is heuristically tackled by hierarchical clustering in the walktrap algorithm can often be obtained using exact or approximate methods for K-means clustering. Three simulation studies and analyses of empirical networks were completed to assess the impact of better sum-of-squares solutions.
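
The sum-of-squares criterion mentioned in the abstract can be illustrated with a minimal sketch. The function name `within_cluster_ss` and the toy coordinates are assumptions made here for illustration and are not taken from the article; the walktrap algorithm applies the criterion to distances derived from random walks, not to raw coordinates like these.

```python
def within_cluster_ss(points, labels):
    """Sum of squared Euclidean distances from each point to its cluster centroid."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # Centroid = coordinate-wise mean of the cluster members.
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        total += sum((m[d] - centroid[d]) ** 2 for m in members for d in range(dim))
    return total


# Hypothetical data: four points forming two tight pairs.
points = [(0.0, 0.0), (0.0, 2.0), (4.0, 0.0), (4.0, 2.0)]
good = within_cluster_ss(points, [0, 0, 1, 1])  # pairs kept together -> 4.0
bad = within_cluster_ss(points, [0, 1, 0, 1])   # pairs split apart   -> 16.0
print(good, bad)
```

A lower value indicates a better solution to the same optimization problem, which is the sense in which K-means methods may improve on a hierarchical heuristic.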


Subject(s)
Algorithms , Computer Simulation , Cluster Analysis
2.
Behav Res Methods ; 55(7): 3566-3584, 2023 Oct.
Article in English | MEDLINE | ID: mdl-36266525

ABSTRACT

The Ising model has received significant attention in network psychometrics during the past decade. A popular estimation procedure is IsingFit, which uses nodewise l1-regularized logistic regression along with the extended Bayesian information criterion to establish the edge weights for the network. In this paper, we report the results of a simulation study comparing IsingFit to two alternative approaches: (1) a nonregularized nodewise stepwise logistic regression method, and (2) a recently proposed global l1-regularized logistic regression method that estimates all edge weights in a single stage, thus circumventing the need for nodewise estimation. MATLAB scripts for the methods are provided as supplemental material. The global l1-regularized logistic regression method generally provided greater accuracy and sensitivity than IsingFit, at the expense of lower specificity and much greater computation time. The stepwise approach showed considerable promise. Relative to the l1-regularized approaches, the stepwise method provided better average specificity for all experimental conditions, as well as comparable accuracy and sensitivity at the largest sample size.


Subject(s)
Logistic Models , Humans , Bayes Theorem , Computer Simulation
3.
Behav Res Methods ; 55(7): 3549-3565, 2023 Oct.
Article in English | MEDLINE | ID: mdl-36258108

ABSTRACT

The modularity index (Q) is an important criterion for many community detection heuristics used in network psychometrics and its subareas (e.g., exploratory graph analysis). Some heuristics seek to directly maximize Q, whereas others, such as the walktrap algorithm, only use the modularity index post hoc to determine the number of communities. Researchers in network psychometrics have typically not employed methods that are guaranteed to find a partition that maximizes Q, perhaps because of the complexity of the underlying mathematical programming problem. In this paper, for networks of the size commonly encountered in network psychometrics, we explore the utility of finding the partition that maximizes Q via formulation and solution of a clique partitioning problem (CPP). A key benefit of the CPP is that the number of communities is naturally determined by its solution and, therefore, need not be prespecified. The results of two simulation studies comparing maximization of Q to two other methods that seek to maximize modularity (fast greedy and Louvain), as well as one popular method that does not (walktrap algorithm), provide interesting insights as to the relative performances of the methods with respect to identification of the correct number of communities and the recovery of underlying community structure.
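
As a rough illustration of the criterion being maximized, the sketch below evaluates the standard (Newman) modularity index for a given partition of an undirected network. It only scores a partition; it does not solve the clique partitioning problem described in the abstract. The function name `modularity` and the six-node example network are assumptions introduced here.

```python
def modularity(adj, labels):
    """Newman modularity Q of a partition of an undirected network.

    adj    : symmetric adjacency (or edge-weight) matrix as a list of lists
    labels : community label for each node
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    two_m = sum(degree)  # twice the total edge weight
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                # Observed weight minus the weight expected by chance.
                q += adj[i][j] - degree[i] * degree[j] / two_m
    return q / two_m


# Hypothetical network: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
q_split = modularity(adj, [0, 0, 0, 1, 1, 1])   # one community per triangle
q_single = modularity(adj, [0, 0, 0, 0, 0, 0])  # everything in one community
print(q_split, q_single)
```

Placing all nodes in a single community gives Q = 0 by construction, so any partition with positive Q captures more within-community weight than expected by chance.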


Subject(s)
Algorithms , Humans , Psychometrics , Computer Simulation
4.
Multivariate Behav Res ; 56(2): 329-335, 2021.
Article in English | MEDLINE | ID: mdl-33960861

ABSTRACT

This reply addresses the commentary by Epskamp et al. (in press) on our prior work on the use of fixed marginals when sampling data to test hypotheses in psychometric network applications. Mathematical results are presented for expected column (e.g., item prevalence) and row (e.g., subject severity) probabilities under three classical sampling schemes in categorical data analysis: (i) fixing the density, (ii) fixing either the row or column marginal, or (iii) fixing both the row and column marginal. It is argued that, while a unidimensional structure may not be the model we want, it is the structure we are confronted with given the binary nature of the data. Interpreting network models in the context of this artifactual structure is necessary, with the preferred solutions being to expand the item sets of disorders and to move away from the use of binary data and their associated constraints.


Subject(s)
Psychometrics , Probability
5.
Multivariate Behav Res ; 56(1): 57-69, 2021.
Article in English | MEDLINE | ID: mdl-32054331

ABSTRACT

Using complete enumeration (e.g., generating all possible subsets of item combinations) to evaluate clustering problems has the benefit of locating globally optimal solutions automatically without the concern of sampling variability. The proposed method is meant to combine clustering variables in such a way as to create groups that are maximally different on a theoretically sound derivation variable(s). After the population of all unique sets is permuted, optimization on some predefined, user-specific function can occur. We apply this technique to optimizing the diagnosis of Alcohol Use Disorder. This is a unique application, from a clustering point of view, in that the decision rule for clustering observations into the "diagnosis" group relies on both the set of items being considered and a predefined threshold on the number of items required to be endorsed for the "diagnosis" to occur. In optimizing diagnostic rules, criteria set sizes can be reduced without a loss of significant information when compared to current and proposed, alternative, diagnostic schemes.
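
The idea of enumerating every item subset together with every endorsement threshold can be sketched as follows. This is only a toy illustration under stated assumptions: the function name `best_rule`, the objective (difference in group means on a single outcome variable), and the data are all hypothetical and are not the procedure or data used in the article.

```python
from itertools import combinations


def best_rule(responses, outcome):
    """Exhaustively search (item subset, threshold) diagnostic rules.

    responses : list of binary item-endorsement tuples, one per person
    outcome   : one numeric value per person (the derivation variable)
    Returns (gap, items, threshold) for the rule that maximizes the
    difference in mean outcome between diagnosed and undiagnosed groups.
    """
    n_items = len(responses[0])
    best = None
    for size in range(1, n_items + 1):
        for items in combinations(range(n_items), size):
            for threshold in range(1, size + 1):
                diagnosed = [sum(r[i] for i in items) >= threshold for r in responses]
                pos = [o for o, d in zip(outcome, diagnosed) if d]
                neg = [o for o, d in zip(outcome, diagnosed) if not d]
                if not pos or not neg:
                    continue  # rule puts everyone in one group
                gap = sum(pos) / len(pos) - sum(neg) / len(neg)
                if best is None or gap > best[0]:
                    best = (gap, items, threshold)
    return best


# Hypothetical data: 4 people, 3 binary criteria, one outcome score each.
responses = [(1, 1, 0), (1, 0, 0), (0, 0, 1), (0, 0, 0)]
outcome = [10, 9, 1, 0]
best = best_rule(responses, outcome)
print(best)  # -> (9.0, (0,), 1): item 0 alone with a threshold of 1
```

Because every rule is evaluated, the returned rule is globally optimal for the chosen objective, at the cost of a search space that grows exponentially with the number of items.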


Subject(s)
Alcoholism , Cluster Analysis , Mental Disorders , Alcoholism/diagnosis , Mental Disorders/diagnosis
6.
Multivariate Behav Res ; 53(1): 57-73, 2018.
Article in English | MEDLINE | ID: mdl-29220584

ABSTRACT

Cohen's κ, a similarity measure for categorical data, has been applied to problems in the data mining field such as cluster analysis and network link prediction. In this paper, a new application is examined: community detection in networks. A new algorithm is proposed that uses Cohen's κ as a similarity measure for each pair of nodes; the κ values are then clustered to detect the communities. This paper defines and tests this method on a variety of simulated and real networks. The results are compared with those from eight other community detection algorithms. Results show this new algorithm is consistently among the top performers in classifying data points both on simulated and real networks. Additionally, this is one of the broadest comparative simulations of community detection algorithms to date.
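
A minimal sketch of κ as a node-pair similarity is given below, treating each node's adjacency row as a binary rating vector. How the article handles details such as the diagonal entries or the subsequent clustering step is not stated in the abstract, so the function `cohen_kappa`, the handling of the degenerate case, and the example network are assumptions.

```python
def cohen_kappa(x, y):
    """Cohen's kappa for two equal-length binary vectors."""
    n = len(x)
    observed = sum(1 for a, b in zip(x, y) if a == b) / n
    px = sum(x) / n
    py = sum(y) / n
    # Agreement expected by chance given each vector's proportion of ones.
    expected = px * py + (1 - px) * (1 - py)
    if expected == 1.0:
        return 0.0  # kappa is undefined here; 0.0 is an assumed convention
    return (observed - expected) / (1 - expected)


# Hypothetical network: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
k_same = cohen_kappa(adj[0], adj[1])   # nodes in the same triangle
k_cross = cohen_kappa(adj[0], adj[5])  # nodes in different triangles
print(k_same, k_cross)
```

In this toy case the same-triangle pair receives a positive κ and the cross-triangle pair a negative one, which is the kind of contrast a clustering step could then exploit.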


Subject(s)
Algorithms , Computer Communication Networks , Social Support , Cluster Analysis , Humans
7.
Behav Res Methods ; 50(6): 2256-2266, 2018 Dec.
Article in English | MEDLINE | ID: mdl-29218590

ABSTRACT

The problem of comparing the agreement of two n × n matrices has a variety of applications in experimental psychology. A well-known index of agreement is based on the sum of the element-wise products of the matrices. Although less familiar to many researchers, measures of agreement based on within-row and/or within-column gradients can also be useful. We provide a suite of MATLAB programs for computing agreement indices and performing matrix permutation tests of those indices. Programs for computing exact p-values are available for small matrices, whereas resampling programs for approximate p-values are provided for larger matrices.
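
The element-wise product index and a resampling-based permutation test can be sketched as follows. The article provides MATLAB programs; this Python version is an independent illustration, and the function names, the exclusion of the main diagonal, and the example matrix are assumptions made here.

```python
import random


def agreement(a, b):
    """Sum of element-wise products of the off-diagonal entries."""
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n) if i != j)


def permutation_p_value(a, b, n_perm=999, seed=0):
    """Approximate p-value from random simultaneous row/column permutations of b."""
    rng = random.Random(seed)
    n = len(a)
    observed = agreement(a, b)
    order = list(range(n))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        rng.shuffle(order)
        # Apply the same reordering to the rows and the columns of b.
        permuted = [[b[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if agreement(a, permuted) >= observed:
            hits += 1
    return observed, hits / (n_perm + 1)


# Hypothetical 4 x 4 symmetric matrix compared with itself.
a = [
    [0, 3, 1, 0],
    [3, 0, 0, 1],
    [1, 0, 0, 3],
    [0, 1, 3, 0],
]
observed, p_value = permutation_p_value(a, a)
print(observed, p_value)
```

For small matrices all n! permutations could be enumerated to obtain an exact p-value; the random sampling above corresponds to the approximate approach suited to larger matrices.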


Subject(s)
Behavioral Research/statistics & numerical data , Data Interpretation, Statistical , Models, Statistical , Software , Humans
8.
Behav Res Methods ; 49(1): 282-293, 2017 Feb.
Article in English | MEDLINE | ID: mdl-26721666

ABSTRACT

Mixture modeling is a popular technique for identifying unobserved subpopulations (e.g., components) within a data set, with Gaussian (normal) mixture modeling being the form most widely used. Generally, the parameters of these Gaussian mixtures cannot be estimated in closed form, so estimates are typically obtained via an iterative process. The most common estimation procedure is maximum likelihood via the expectation-maximization (EM) algorithm. Like many approaches for identifying subpopulations, finite mixture modeling can suffer from locally optimal solutions, and the final parameter estimates are dependent on the initial starting values of the EM algorithm. Initial values have been shown to significantly impact the quality of the solution, and researchers have proposed several approaches for selecting the set of starting values. Five techniques for obtaining starting values that are implemented in popular software packages are compared. Their performances are assessed in terms of the following four measures: (1) the ability to find the best observed solution, (2) settling on a solution that classifies observations correctly, (3) the number of local solutions found by each technique, and (4) the speed at which the start values are obtained. On the basis of these results, a set of recommendations is provided to the user.


Subject(s)
Finite Element Analysis , Normal Distribution , Algorithms , Models, Theoretical , Probability
9.
Multivariate Behav Res ; 51(4): 466-81, 2016.
Article in English | MEDLINE | ID: mdl-27494191

ABSTRACT

It is common knowledge that mixture models are prone to arrive at locally optimal solutions. Typically, researchers are directed to utilize several random initializations to ensure that the resulting solution is adequate. However, it is unknown what factors contribute to a large number of local optima and whether these coincide with the factors that reduce the accuracy of a mixture model. A real-data illustration and a series of simulations are presented that examine the effect of a variety of data structures on the propensity of local optima and the classification quality of the resulting solution. We show that there is a moderately strong relationship between a solution that has a high proportion of local optima and one that is poorly classified.


Subject(s)
Algorithms , Models, Statistical , Computer Simulation
10.
Behav Res Methods ; 48(2): 487-502, 2016 Jun.
Article in English | MEDLINE | ID: mdl-25899042

ABSTRACT

An asymmetric one-mode data matrix has rows and columns that correspond to the same set of objects. However, the roles of the objects frequently differ for the rows and the columns. For example, in a visual alphabetic confusion matrix from an experimental psychology study, both the rows and columns pertain to letters of the alphabet. Yet the rows correspond to the presented stimulus letter, whereas the columns refer to the letter provided as the response. Other examples abound in psychology, including applications related to interpersonal interactions (friendship, trust, information sharing) in social and developmental psychology, brand switching in consumer psychology, journal citation analysis in any discipline (including quantitative psychology), and free association tasks in any subarea of psychology. When seeking to establish a partition of the objects in such applications, it is overly restrictive to require the partitions of the row and column objects to be identical, or even the numbers of clusters for the row and column objects to be the same. This suggests the need for a biclustering approach that simultaneously establishes separate partitions of the row and column objects. We present and compare several approaches for the biclustering of one-mode matrices using data sets from the empirical literature. A suite of MATLAB m-files for implementing the procedures is provided as a Web supplement with this article.


Subject(s)
Behavioral Research/methods , Cluster Analysis , Humans
11.
Soc Networks ; 42: 72-79, 2015 Jul.
Article in English | MEDLINE | ID: mdl-30337771

ABSTRACT

As network data gains popularity for research in various fields, the need for methods to predict future links or find missing links in the data increases. One subset of the methodology used to solve this problem involves creating a similarity measure between each pair of nodes in the network; unfortunately, these methods can be shown to have arbitrary cutoffs and poor performance. To address these shortcomings, we use the adjusted Rand index to create a similarity measure between nodes that has a natural threshold of zero. The effectiveness of this method is then compared to a number of other similarity measures and assessed on a variety of simulated data sets with block model structure and three real network data sets. Under this particular formulation of the adjusted Rand index, information is also provided on dissimilarity. As such, we then go on to test its use for detecting incorrect links in network data, highlighting the dual use of the approach.

12.
Psychol Methods ; 2024 Sep 12.
Article in English | MEDLINE | ID: mdl-39264649

ABSTRACT

Many clustering problems are associated with a particular objective criterion that is sought to be optimized. There are often several methods that can be used to tackle the optimization problem, and one or more of them might guarantee a globally optimal solution. However, it is quite possible that, relative to one or more suboptimal solutions, a globally optimal solution might be less interpretable from the standpoint of psychological theory or be less in accordance with some known (i.e., true) cluster structure. For example, in simulation experiments, it has sometimes been observed that there is not a perfect correspondence between the optimized clustering criterion and recovery of the underlying known cluster structure. This can lead to the misconception that clustering methods with a tendency to produce suboptimal solutions might, in some instances, be preferable to superior methods that provide globally optimal (or at least better locally optimal) solutions. In this article, we present results from simulation studies in the context of K-median clustering where departure from global optimality was carefully controlled. Although the results showed that suboptimal solutions sometimes produced marginally better recovery for experimental cells where the known cluster structure was less well-defined, capriciously accepting inferior solutions is an unwise practice. However, there are instances in which some sacrifice in the optimization criterion value to meet certain desirable constraints or to improve the value of one or more other relevant criteria is principled. (PsycInfo Database Record (c) 2024 APA, all rights reserved).

13.
Psychol Methods ; 2022 Jul 04.
Article in English | MEDLINE | ID: mdl-35786981

ABSTRACT

Most researchers have estimated the edge weights for relative importance networks using a well-established measure of general dominance for multiple regression. This approach has several desirable properties including edge weights that represent R² contributions, in-degree centralities that correspond to R² for each item when using other items as predictors, and strong replicability. We endorse the continued use of relative importance networks and believe they have a valuable role in network psychometrics. However, to improve their utility, we introduce a modified approach that uses best-subsets regression as a preceding step to select an appropriate subset of predictors for each item. The benefits of this modification include: (a) computation time savings that can enable larger relative importance networks to be estimated, (b) a principled approach to edge selection that can significantly improve specificity, (c) the provision of a signed network if desired, (d) the potential use of the best-subsets regression approach for estimating Gaussian graphical models, and (e) possible generalization to best-subsets logistic regression for Ising models. We describe, evaluate, and demonstrate the proposed approach and discuss its strengths and limitations. (PsycInfo Database Record (c) 2022 APA, all rights reserved).

14.
Psychol Methods ; 2022 Jul 07.
Article in English | MEDLINE | ID: mdl-35797161

ABSTRACT

Spectral clustering is a well-known method for clustering the vertices of an undirected network. Although its use in network psychometrics has been limited, spectral clustering has a close relationship to the commonly used walktrap algorithm. In this article, we report results from simulation experiments designed to evaluate the ability of spectral clustering and the walktrap algorithm to recover underlying cluster (or community) structure in networks. The salient findings include: (a) the recovery performance of the walktrap algorithm can be improved by using K-means clustering instead of hierarchical clustering; (b) K-means and K-median clustering led to comparable recovery performance when used to cluster vertices based on the eigenvectors of Laplacian matrices in spectral clustering; (c) spectral clustering using the unnormalized Laplacian matrix generally yielded inferior cluster recovery in comparison to the other methods; (d) when the correct number of clusters was provided for the methods, spectral clustering using the normalized Laplacian matrix led to better recovery than the walktrap algorithm; and (e) when the correct number of clusters was not provided, the walktrap algorithm using the Qw modularity index was better than spectral clustering using the eigengap heuristic at determining the appropriate number of clusters. Overall, both the walktrap algorithm and spectral clustering of the normalized Laplacian matrix are effective for partitioning the vertices of undirected networks, with the latter performing better in most instances. (PsycInfo Database Record (c) 2022 APA, all rights reserved).

15.
Psychometrika ; 87(1): 133-155, 2022 Mar.
Article in English | MEDLINE | ID: mdl-34282531

ABSTRACT

Common outputs of software programs for network estimation include association matrices containing the edge weights between pairs of symptoms and a plot of the symptom network. Although such outputs are useful, it is sometimes difficult to ascertain structural relationships among symptoms from these types of output alone. We propose that matrix permutation provides a simple, yet effective, approach for clarifying the order relationships among the symptoms based on the edge weights of the network. For directed symptom networks, we use a permutation criterion that has classic applications in electrical circuit theory and economics. This criterion can be used to place symptoms that strongly predict other symptoms at the beginning of the ordering, and symptoms that are strongly predicted by other symptoms at the end. For undirected symptom networks, we recommend a permutation criterion that is based on location theory in the field of operations research. When using this criterion, symptoms with many strong ties tend to be placed centrally in the ordering, whereas weakly-tied symptoms are placed at the ends. The permutation optimization problems are solved using dynamic programming. We also make use of branch-search algorithms for extracting maximum cardinality subsets of symptoms that have perfect structure with respect to a selected criterion. Software for implementing the dynamic programming algorithms is available in MATLAB and R. Two networks from the literature are used to demonstrate the matrix permutation algorithms.


Subject(s)
Algorithms , Software , Psychometrics
16.
Psychometrika ; 76(4): 612-33, 2011 Oct.
Article in English | MEDLINE | ID: mdl-27519683

ABSTRACT

Two-mode binary data matrices arise in a variety of social network contexts, such as the attendance or non-attendance of individuals at events, the participation or lack of participation of groups in projects, and the votes of judges on cases. A popular method for analyzing such data is two-mode blockmodeling based on structural equivalence, where the goal is to identify partitions for the row and column objects such that the clusters of the row and column objects form blocks that are either complete (all 1s) or null (all 0s) to the greatest extent possible. Multiple restarts of an object relocation heuristic that seeks to minimize the number of inconsistencies (i.e., 1s in null blocks and 0s in complete blocks) with ideal block structure is the predominant approach for tackling this problem. As an alternative, we propose a fast and effective implementation of tabu search. Computational comparisons across a set of 48 large network matrices revealed that the new tabu-search heuristic always provided objective function values that were better than those of the relocation heuristic when the two methods were constrained to the same amount of computation time.
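
The inconsistency count that both the relocation heuristic and tabu search try to minimize can be sketched as below. Each block is scored against whichever ideal type (complete or null) fits it better, which is one common way to define the criterion; the function name `inconsistencies` and the example matrix are assumptions, and the search procedures themselves are not shown.

```python
def inconsistencies(matrix, row_labels, col_labels):
    """Count deviations from ideal (all-1s or all-0s) blocks in a two-mode partition."""
    ones = {}
    zeros = {}
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            block = (row_labels[i], col_labels[j])
            if value == 1:
                ones[block] = ones.get(block, 0) + 1
            else:
                zeros[block] = zeros.get(block, 0) + 1
    total = 0
    for block in set(ones) | set(zeros):
        # Treat the block as complete or null, whichever needs fewer changes.
        total += min(ones.get(block, 0), zeros.get(block, 0))
    return total


# Hypothetical 4 x 4 attendance matrix (rows: individuals, columns: events).
matrix = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
]
count = inconsistencies(matrix, [0, 0, 1, 1], [0, 0, 1, 1])
print(count)  # -> 2 (one 0 in a complete block, one 1 in a null block)
```

A search heuristic would repeatedly move row or column objects between clusters and keep moves that lower this count.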

17.
Br J Math Stat Psychol ; 74(1): 34-63, 2021 Feb.
Article in English | MEDLINE | ID: mdl-31705539

ABSTRACT

Deterministic blockmodelling is a well-established clustering method for both exploratory and confirmatory social network analysis seeking partitions of a set of actors so that actors within each cluster are similar with respect to their patterns of ties to other actors (or, in some cases, other objects when considering two-mode networks). Even though some of the historical foundations for certain types of blockmodelling stem from the psychological literature, applications of deterministic blockmodelling in psychological research are relatively rare. This scarcity is potentially attributable to three factors: a general unfamiliarity with relevant blockmodelling methods and applications; a lack of awareness of the value of partitioning network data for understanding group structures and processes; and the unavailability of such methods on software platforms familiar to most psychological researchers. To tackle the first two items, we provide a tutorial presenting a general framework for blockmodelling and describe two of the most important types of deterministic blockmodelling applications relevant to psychological research: structural balance partitioning and two-mode partitioning based on structural equivalence. To address the third problem, we developed a suite of software programs that are available as both Fortran executable files and compiled Fortran dynamic-link libraries that can be implemented in the R software system. We demonstrate these software programs using networks from the literature.


Subject(s)
Software , Cluster Analysis
18.
PLoS One ; 16(4): e0247751, 2021.
Article in English | MEDLINE | ID: mdl-33826612

ABSTRACT

There are many psychological applications that require collapsing the information in a two-mode (e.g., respondents-by-attributes) binary matrix into a one-mode (e.g., attributes-by-attributes) similarity matrix. This process requires the selection of a measure of similarity between binary attributes. A vast number of binary similarity coefficients have been proposed in fields such as biology, geology, and ecology. Although previous studies have reported cluster analyses of binary similarity coefficients, there has been little exploration of how cluster memberships are affected by the base rates (percentage of ones) for the binary attributes. We conducted a simulation experiment that compared two-cluster K-median partitions of 71 binary similarity coefficients based on their pairwise correlations obtained under 15 different base-rate configurations. The results reveal that some subsets of coefficients consistently group together regardless of the base rates. However, there are other subsets of coefficients that group together for some base rates, but not for others.


Subject(s)
Algorithms , Computer Simulation , Models, Theoretical
19.
Ophthalmic Plast Reconstr Surg ; 26(4): 305-6, 2010.
Article in English | MEDLINE | ID: mdl-20551856

ABSTRACT

An 87-year-old patient presented with a 6-week history of an isolated progressive destructive nodular eyelid mass, secondary nodular and ulcerative lesions, and regional painful lymphadenopathy. After 4 weeks, fungal cultures demonstrated Sporothrix schenckii. S. schenckii is a rare dimorphic fungus that can occasionally involve the periocular skin. The authors' case demonstrates typical clinical features, emphasizes the delay in diagnosis, and shows effective treatment with oral itraconazole.


Subject(s)
Dermatomycoses/diagnosis , Eye Infections, Fungal/diagnosis , Eyelid Diseases/diagnosis , Sporothrix/isolation & purification , Sporotrichosis/diagnosis , Aged, 80 and over , Antifungal Agents/therapeutic use , Dermatomycoses/drug therapy , Dermatomycoses/microbiology , Eye Infections, Fungal/drug therapy , Eye Infections, Fungal/microbiology , Eyelid Diseases/drug therapy , Eyelid Diseases/microbiology , Female , Humans , Itraconazole/therapeutic use , Sporotrichosis/drug therapy , Sporotrichosis/microbiology
20.
Br J Math Stat Psychol ; 73(3): 375-396, 2020 Nov.
Article in English | MEDLINE | ID: mdl-31512759

ABSTRACT

Most partitioning methods used in psychological research seek to produce homogeneous groups (i.e., groups with low intra-group dissimilarity). However, there are also applications where the goal is to provide heterogeneous groups (i.e., groups with high intra-group dissimilarity). Examples of these anticlustering contexts include construction of stimulus sets, formation of student groups, assignment of employees to project work teams, and assembly of test forms from a bank of items. Unfortunately, most commercial software packages are not equipped to accommodate the objective criteria and constraints that commonly arise for anticlustering problems. Two important objective criteria for anticlustering based on information in a dissimilarity matrix are: a diversity measure based on within-cluster sums of dissimilarities; and a dispersion measure based on the within-cluster minimum dissimilarities. In many instances, it is possible to find a partition that provides a large improvement in one of these two criteria with little (or no) sacrifice in the other criterion. For this reason, it is of significant value to explore the trade-offs that arise between these two criteria. Accordingly, the key contribution of this paper is the formulation of a bicriterion optimization problem for anticlustering based on the diversity and dispersion criteria, along with heuristics to approximate the Pareto efficient set of partitions. A motivating example and computational study are provided within the framework of test assembly.
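
The two anticlustering criteria named in the abstract can be computed for a given partition as in the sketch below. It only evaluates partitions and does not implement the bicriterion heuristics from the article; the function name `diversity_and_dispersion` and the four-object example are assumptions.

```python
from itertools import combinations


def diversity_and_dispersion(dissim, labels):
    """Return (diversity, dispersion) for a partition.

    diversity  : sum of all within-cluster pairwise dissimilarities
    dispersion : smallest within-cluster pairwise dissimilarity
    """
    diversity = 0.0
    dispersion = float("inf")
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            diversity += dissim[i][j]
            dispersion = min(dispersion, dissim[i][j])
    return diversity, dispersion


# Hypothetical objects located at 0, 1, 10 and 11 on a line;
# dissimilarity is the absolute difference between positions.
positions = [0, 1, 10, 11]
dissim = [[abs(p - q) for q in positions] for p in positions]

similar = diversity_and_dispersion(dissim, [0, 0, 1, 1])  # -> (2.0, 1)
mixed_a = diversity_and_dispersion(dissim, [0, 1, 0, 1])  # -> (20.0, 10)
mixed_b = diversity_and_dispersion(dissim, [0, 1, 1, 0])  # -> (20.0, 9)
print(similar, mixed_a, mixed_b)
```

The two mixed partitions tie on diversity but differ on dispersion, a small instance of the situation in which one criterion can be improved with no sacrifice in the other.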


Subject(s)
Cluster Analysis , Models, Statistical , Psychology/statistics & numerical data , Algorithms , Computer Heuristics , Computer Simulation , Educational Measurement/statistics & numerical data , Humans , Neuropsychological Tests/statistics & numerical data , Psychometrics/statistics & numerical data