Results 1 - 13 of 13
1.
J Nonparametr Stat ; 34(3): 628-662, 2022.
Article in English | MEDLINE | ID: mdl-36172077

ABSTRACT

We consider incomplete observations of stochastic processes governing the spread of infectious diseases through finite populations by way of contact. We propose a flexible semiparametric modeling framework with at least three advantages. First, it enables researchers to study the structure of a population contact network and its impact on the spread of infectious diseases. Second, it can accommodate short- and long-tailed degree distributions and detect potential superspreaders, who represent an important public health concern. Third, it addresses the important issue of incomplete data. Starting from first principles, we show when the incomplete-data generating process is ignorable for the purpose of Bayesian inference for the parameters of the population model. We demonstrate the semiparametric modeling framework by simulations and an application to the partially observed MERS epidemic in South Korea in 2015. We conclude with an extended discussion of open questions and directions for future research.

2.
Stat Methods Appl ; 30(5): 1285-1288, 2021.
Article in English | MEDLINE | ID: mdl-34776825

ABSTRACT

The special issue on Statistical Analysis of Networks aspires to convey the breadth and depth of statistical learning with networks, ranging from networks that are observed to networks that are unobserved and learned from data. It includes ten select papers with methodological and theoretical advances, and demonstrates the usefulness of the network paradigm by applications to current problems.

3.
Psychometrika ; 86(2): 378-403, 2021 06.
Article in English | MEDLINE | ID: mdl-33939062

ABSTRACT

Classic item response models assume that all items with the same difficulty have the same response probability among all respondents with the same ability. These assumptions, however, may very well be violated in practice, and it is not straightforward to assess whether these assumptions are violated, because neither the abilities of respondents nor the difficulties of items are observed. An example is an educational assessment where unobserved heterogeneity is present, arising from unobserved variables such as cultural background and upbringing of students, the quality of mentorship and other forms of emotional and professional support received by students, and other unobserved variables that may affect response probabilities. To address such violations of assumptions, we introduce a novel latent space model which assumes that both items and respondents are embedded in an unobserved metric space, with the probability of a correct response decreasing as a function of the distance between the respondent's and the item's position in the latent space. The resulting latent space approach provides an interaction map that represents interactions of respondents and items, and helps derive insightful diagnostic information on items as well as respondents. In practice, such interaction maps enable teachers to detect students from underrepresented groups who need more support than other students. We provide empirical evidence to demonstrate the usefulness of the proposed latent space approach, along with simulation results.


Subject(s)
Educational Measurement, Space Simulation, Humans, Probability, Psychometrics, Surveys and Questionnaires
4.
Comput Stat Data Anal ; 152: 107029, 2020 Dec.
Article in English | MEDLINE | ID: mdl-32834264

ABSTRACT

A class of random graph models is considered, combining features of exponential-family models and latent structure models, with the goal of retaining the strengths of both of them while reducing the weaknesses of each of them. An open problem is how to estimate such models from large networks. A novel approach to large-scale estimation is proposed, taking advantage of the local structure of such models for the purpose of local computing. The main idea is that random graphs with local dependence can be decomposed into subgraphs, which enables parallel computing on subgraphs and suggests a two-step estimation approach. The first step estimates the local structure underlying random graphs. The second step estimates parameters given the estimated local structure of random graphs. Both steps can be implemented in parallel, which enables large-scale estimation. The advantages of the two-step estimation approach are demonstrated by simulation studies with up to 10,000 nodes and an application to a large Amazon product recommendation network with more than 10,000 products.

5.
Soc Networks ; 59: 98-119, 2019 Oct.
Article in English | MEDLINE | ID: mdl-32547745

ABSTRACT

Multilevel network data provide two important benefits for ERG modeling. First, they facilitate estimation of the decay parameters in geometrically weighted terms for degree and triad distributions. Estimating decay parameters from a single network is challenging, so in practice they are typically fixed rather than estimated. Multilevel network data overcome that challenge by leveraging replication. Second, such data make it possible to assess out-of-sample performance using traditional cross-validation techniques. We demonstrate these benefits by using a multilevel network sample of classroom networks from Poland. We show that estimating the decay parameters improves in-sample performance of the model and that the out-of-sample performance of our best model is strong, suggesting that our findings can be generalized to the population of interest.

6.
J R Stat Soc Series B Stat Methodol ; 77(3): 647-676, 2015 06 01.
Article in English | MEDLINE | ID: mdl-26560142

ABSTRACT

Dependent phenomena, such as relational, spatial and temporal phenomena, tend to be characterized by local dependence in the sense that units which are close in a well-defined sense are dependent. In contrast with spatial and temporal phenomena, though, relational phenomena tend to lack a natural neighbourhood structure in the sense that it is unknown which units are close and thus dependent. Owing to the challenge of characterizing local dependence and constructing random graph models with local dependence, many conventional exponential family random graph models induce strong dependence and are not amenable to statistical inference. We take first steps to characterize local dependence in random graph models, inspired by the notion of finite neighbourhoods in spatial statistics and M-dependence in time series, and we show that local dependence endows random graph models with desirable properties which make them amenable to statistical inference. We show that random graph models with local dependence satisfy a natural domain consistency condition which every model should satisfy, but conventional exponential family random graph models do not satisfy. In addition, we establish a central limit theorem for random graph models with local dependence, which suggests that random graph models with local dependence are amenable to statistical inference. We discuss how random graph models with local dependence can be constructed by exploiting either observed or unobserved neighbourhood structure. In the absence of observed neighbourhood structure, we take a Bayesian view and express the uncertainty about the neighbourhood structure by specifying a prior on a set of suitable neighbourhood structures. We present simulation results and applications to two real world networks with 'ground truth'.

7.
Soc Networks ; 37: 42-55, 2014 May.
Article in English | MEDLINE | ID: mdl-24707073

ABSTRACT

The rescue and relief operations triggered by the September 11, 2001 attacks on the World Trade Center in New York City demanded collaboration among hundreds of organisations. To shed light on the response to the September 11, 2001 attacks and help to plan and prepare the response to future disasters, we study the inter-organisational network that emerged in response to the attacks. Studying the inter-organisational network can help to shed light on (1) whether some organisations dominated the inter-organisational network and facilitated communication and coordination of the disaster response; (2) whether the dominating organisations were supposed to coordinate disaster response or emerged as coordinators in the wake of the disaster; and (3) the degree of network redundancy and sensitivity of the inter-organisational network to disturbances following the initial disaster. We introduce a Bayesian framework which can answer the substantive questions of interest while being as simple and parsimonious as possible. The framework allows organisations to have varying propensities to collaborate, while taking covariates into account, and makes it possible to assess whether the inter-organisational network had network redundancy, in the form of transitivity, by using a test which may be regarded as a Bayesian score test. We discuss implications in terms of disaster management.

8.
Ann Appl Stat ; 7(2): 1010-1039, 2013 Dec 10.
Article in English | MEDLINE | ID: mdl-26605002

ABSTRACT

We describe a network clustering framework, based on finite mixture models, that can be applied to discrete-valued networks with hundreds of thousands of nodes and billions of edge variables. Relative to other recent model-based clustering work for networks, we introduce a more flexible modeling framework, improve the variational-approximation estimation algorithm, discuss and implement standard error estimation via a parametric bootstrap approach, and apply these methods to much larger data sets than those seen elsewhere in the literature. The more flexible framework is achieved through introducing novel parameterizations of the model, giving varying degrees of parsimony, using exponential family models whose structure may be exploited in various theoretical and algorithmic ways. The algorithms are based on variational generalized EM algorithms, where the E-steps are augmented by a minorization-maximization (MM) idea. The bootstrapped standard error estimates are based on an efficient Monte Carlo network simulation idea. Last, we demonstrate the usefulness of the model-based clustering framework by applying it to a discrete-valued network with more than 131,000 nodes and 17 billion edge variables.

9.
Br J Math Stat Psychol ; 65(2): 263-81, 2012 May.
Article in English | MEDLINE | ID: mdl-21696381

ABSTRACT

Networks of relationships between individuals influence individual and collective outcomes and are therefore of interest in social psychology, sociology, the health sciences, and other fields. We consider network panel data, a common form of longitudinal network data. In the framework of estimating functions, which includes the method of moments as well as the method of maximum likelihood, we propose score-type tests. The score-type tests share with other score-type tests, including the classic goodness-of-fit test of Pearson, the property that the score-type tests are based on comparing the observed value of a function of the data to values predicted by a model. The score-type tests are most useful in forward model selection and as tests of homogeneity assumptions, and possess substantial computational advantages. We derive one-step estimators which are useful as starting values of parameters in forward model selection and therefore complement the usefulness of the score-type tests. The finite-sample behaviour of the score-type tests is studied by Monte Carlo simulation and compared to t-type tests.


Subject(s)
Models, Statistical, Social Networking, Computer Simulation/statistics & numerical data, Humans, Longitudinal Studies/statistics & numerical data, Models, Economic, Monte Carlo Method, Ownership/statistics & numerical data
10.
J Comput Graph Stat ; 21(4): 856-882, 2012 Dec 01.
Article in English | MEDLINE | ID: mdl-23828720

ABSTRACT

We review the broad range of recent statistical work in social network models, with emphasis on computational aspects of these methods. Particular focus is applied to exponential-family random graph models (ERGM) and latent variable models for data on complete networks observed at a single time point, though we also briefly review many methods for incompletely observed networks and networks observed at multiple time points. Although we mention far more modeling techniques than we can possibly cover in depth, we provide numerous citations to current literature. We illustrate several of the methods on a small, well-known network dataset, Sampson's monks, providing code where possible so that these analyses may be duplicated.

11.
Adv Data Anal Classif ; 5(2): 147-176, 2011 Jul.
Article in English | MEDLINE | ID: mdl-22003370

ABSTRACT

This paper explores time heterogeneity in stochastic actor oriented models (SAOM) proposed by Snijders (Sociological Methodology. Blackwell, Boston, pp 361-395, 2001) which are meant to study the evolution of networks. SAOMs model social networks as directed graphs with nodes representing people, organizations, etc., and dichotomous relations representing underlying relationships of friendship, advice, etc. We illustrate several reasons why heterogeneity should be statistically tested and provide a fast, convenient method for assessment and model correction. SAOMs provide a flexible framework for network dynamics which allow a researcher to test selection, influence, behavioral, and structural properties in network data over time. We show how the forward-selecting, score type test proposed by Schweinberger (Chapter 4: Statistical modeling of network panel data: goodness of fit. PhD thesis, University of Groningen 2007) can be employed to quickly assess heterogeneity at almost no additional computational cost. One step estimates are used to assess the magnitude of the heterogeneity. Simulation studies are conducted to support the validity of this approach. The ASSIST dataset (Campbell et al. Lancet 371(9624):1595-1602, 2008) is reanalyzed with the score type test, one step estimators, and a full estimation for illustration. These tools are implemented in the RSiena package, and a brief walkthrough is provided.

12.
J Am Stat Assoc ; 106(496): 1361-1370, 2011 Dec 01.
Article in English | MEDLINE | ID: mdl-22844170

ABSTRACT

In applications to dependent data, first and foremost relational data, a number of discrete exponential family models has turned out to be near-degenerate and problematic in terms of Markov chain Monte Carlo simulation and statistical inference. We introduce the notion of instability with an eye to characterize, detect, and penalize discrete exponential family models that are near-degenerate and problematic in terms of Markov chain Monte Carlo simulation and statistical inference. We show that unstable discrete exponential family models are characterized by excessive sensitivity and near-degeneracy. In special cases, the subset of the natural parameter space corresponding to non-degenerate distributions and mean-value parameters far from the boundary of the mean-value parameter space turns out to be a lower-dimensional subspace of the natural parameter space. These characteristics of unstable discrete exponential family models tend to obstruct Markov chain Monte Carlo simulation and statistical inference. In applications to relational data, we show that discrete exponential family models with Markov dependence tend to be unstable and that the parameter space of some curved exponential families contains unstable subsets.

13.
Ann Appl Stat ; 4(2): 567-588, 2010 Jun 01.
Article in English | MEDLINE | ID: mdl-25419259

ABSTRACT

A model for network panel data is discussed, based on the assumption that the observed data are discrete observations of a continuous-time Markov process on the space of all directed graphs on a given node set, in which changes in tie variables are independent conditional on the current graph. The model for tie changes is parametric and designed for applications to social network analysis, where the network dynamics can be interpreted as being generated by choices made by the social actors represented by the nodes of the graph. An algorithm for calculating the Maximum Likelihood estimator is presented, based on data augmentation and stochastic approximation. An application to an evolving friendship network is given and a small simulation study is presented which suggests that for small data sets the Maximum Likelihood estimator is more efficient than the earlier proposed Method of Moments estimator.
