Results 1 - 20 of 30
1.
PLoS One ; 16(3): e0247795, 2021.
Article in English | MEDLINE | ID: mdl-33760852

ABSTRACT

Human mortality is in part a function of multiple socioeconomic factors that differ both spatially and temporally. Adjusting for other covariates, the human lifespan is positively associated with household wealth. However, the extent to which mortality in a geographical region is a function of socioeconomic factors in both that region and its neighbors is unclear. There is also little information on the temporal components of this relationship. Using the districts of Hong Kong over multiple census years as a case study, we demonstrate that there are differences in how wealth indicator variables are associated with longevity in (a) areas that are affluent but neighbored by socially deprived districts versus (b) wealthy areas surrounded by similarly wealthy districts. We also show that the inclusion of spatially-distributed variables reduces uncertainty in mortality rate predictions in each census year when compared with a baseline model. Our results suggest that geographic mortality models should incorporate nonlocal information (e.g., spatial neighbors) to lower the variance of their mortality estimates, and point to a more in-depth analysis of sociospatial spillover effects on mortality rates.


Subjects
Mortality, Socioeconomic Factors, Bayes Theorem, Hong Kong/epidemiology, Humans, Statistical Models
4.
PLoS One ; 15(12): e0244245, 2020.
Article in English | MEDLINE | ID: mdl-33332455

ABSTRACT

Allowing members of the crowd to propose novel microtasks for one another is an effective way to combine the efficiencies of traditional microtask work with the inventiveness and hypothesis generation potential of human workers. However, microtask proposal leads to a growing set of tasks that may overwhelm limited crowdsourcer resources. Crowdsourcers can employ methods to utilize their resources efficiently, but algorithmic approaches to efficient crowdsourcing generally require a fixed task set of known size. In this paper, we introduce cost forecasting as a means for a crowdsourcer to use efficient crowdsourcing algorithms with a growing set of microtasks. Cost forecasting allows the crowdsourcer to decide between eliciting new tasks from the crowd or receiving responses to existing tasks based on whether or not new tasks will cost less to complete than existing tasks, efficiently balancing resources as crowdsourcing occurs. Experiments with real and synthetic crowdsourcing data show that cost forecasting leads to improved accuracy. Accuracy and efficiency gains for crowd-generated microtasks hold the promise to further leverage the creativity and wisdom of the crowd, with applications such as generating more informative and diverse training data for machine learning applications and improving the performance of user-generated content and question-answering platforms.


Subjects
Algorithms, Crowdsourcing/methods, Machine Learning, Problem Solving, Task Performance and Analysis, Computer Simulation, Humans
5.
Entropy (Basel) ; 22(3), 2020 Feb 26.
Article in English | MEDLINE | ID: mdl-33286039

ABSTRACT

Contagion models are a primary lens through which we understand the spread of information over social networks. However, simple contagion models cannot reproduce the complex features observed in real-world data, leading to research on more elaborate complex contagion models. A noted feature of complex contagion is social reinforcement: individuals require multiple exposures to information before they begin to spread it themselves. Here we show that the quoter model, a model of the social flow of written information over a network, displays features of complex contagion, including the weakness of long ties and the finding that increased density inhibits rather than promotes information flow. Interestingly, the quoter model exhibits these features despite having no explicit social reinforcement mechanism, unlike complex contagion models. Our results highlight the need to complement contagion models with an information-theoretic view of information spreading to better understand how network properties affect information flow and which ingredients are most necessary when modeling social behavior.

6.
J R Soc Interface ; 17(171): 20200667, 2020 10.
Article in English | MEDLINE | ID: mdl-33050776

ABSTRACT

Creativity is viewed as one of the most important skills in the context of future-of-work. In this paper, we explore how the dynamic (self-organizing) nature of social networks impacts the fostering of creative ideas. We run six trials (N = 288) of a web-based experiment involving divergent ideation tasks. We find that network connections gradually adapt to individual creative performances, as the participants predominantly seek to follow high-performing peers for creative inspirations. We unearth both opportunities and bottlenecks afforded by such self-organization. While exposure to high-performing peers is associated with better creative performances of the followers, we see a counter-effect that choosing to follow the same peers introduces semantic similarities in the followers' ideas. We formulate an agent-based simulation model to capture these intuitions in a tractable manner, and experiment with corner cases of various simulation parameters to assess the generality of the findings. Our findings may help design large-scale interventions to improve the creative aptitude of people interacting in a social network.


Subjects
Creativity, Thinking, Humans, Social Networking
7.
PeerJ Comput Sci ; 6: e296, 2020.
Article in English | MEDLINE | ID: mdl-33816947

ABSTRACT

Non-experts have long made important contributions to machine learning (ML) by contributing training data, and recent work has shown that non-experts can also help with feature engineering by suggesting novel predictive features. However, non-experts have only contributed features to prediction tasks already posed by experienced ML practitioners. Here we study how non-experts can design prediction tasks themselves, what types of tasks non-experts will design, and whether predictive models can be automatically trained on data sourced for their tasks. We use a crowdsourcing platform where non-experts design predictive tasks that are then categorized and ranked by the crowd. Crowdsourced data are collected for top-ranked tasks and predictive models are then trained and evaluated automatically using those data. We show that individuals without ML experience can collectively construct useful datasets and that predictive models can be learned on these datasets, but challenges remain. The prediction tasks designed by non-experts covered a broad range of domains, from politics and current events to health behavior, demographics, and more. Proper instructions are crucial for non-experts, so we also conducted a randomized trial to understand how different instructions may influence the types of prediction tasks being proposed. In general, a better understanding of how non-experts can contribute to ML can further leverage advances in automatic machine learning and has important implications as ML continues to drive workplace automation.

8.
Nat Hum Behav ; 3(2): 122-128, 2019 02.
Article in English | MEDLINE | ID: mdl-30944448

ABSTRACT

Modern society depends on the flow of information over online social networks, and users of popular platforms generate substantial behavioural data about themselves and their social ties [1-5]. However, it remains unclear what fundamental limits exist when using these data to predict the activities and interests of individuals, and to what accuracy such predictions can be made using an individual's social ties. Here, we show that 95% of the potential predictive accuracy for an individual is achievable using their social ties only, without requiring that individual's data. We used information-theoretic tools to estimate the predictive information in the writings of Twitter users, providing an upper bound on the available predictive information that holds for any predictive or machine learning methods. As few as 8-9 of an individual's contacts are sufficient to obtain predictability comparable with that of the individual alone. Distinct temporal and social effects are visible by measuring information flow along social ties, allowing us to better study the dynamics of online activity. Our results have distinct privacy implications: information is so strongly embedded in a social network that, in principle, one can profile an individual from their available social ties even when the individual forgoes the platform completely.


Subjects
Information Theory, Language, Machine Learning, Online Social Networking, Social Behavior, Social Media, Humans
9.
Nat Hum Behav ; 3(2): 195, 2019 Feb.
Article in English | MEDLINE | ID: mdl-30944454

ABSTRACT

The original and corrected figures are shown in the accompanying Publisher Correction.

10.
Chaos ; 28(7): 075304, 2018 Jul.
Article in English | MEDLINE | ID: mdl-30070496

ABSTRACT

We propose a model for the social flow of information in the form of text data, which simulates the posting and sharing of short social media posts. Nodes in a graph representing a social network take turns generating words, leading to a symbolic time series associated with each node. Information propagates over the graph via a quoting mechanism, where nodes randomly copy short segments of text from each other. We characterize information flows from these texts via information-theoretic estimators, and we derive analytic relationships between model parameters and the values of these estimators. We explore and validate the model with simulations on small network motifs and larger random graphs. Tractable models such as ours that generate symbolic data while controlling the information flow allow us to test and compare measures of information flow applicable to real social media data. In particular, by choosing different network structures, we can develop test scenarios to determine whether or not measures of information flow can distinguish between true and spurious interactions, and how topological network properties relate to information flow.
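
The quoting mechanism described in this abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `quoter_step`, the quote probability `q`, the fixed segment length `seg_len`, and the uniform draw of new words from `vocab` are all assumptions made only for illustration.

```python
import random

def quoter_step(texts, neighbors, node, q, vocab, seg_len, rng):
    """Append seg_len words to texts[node] and return them.

    Assumed rule (illustrative only): with probability q the node copies a
    contiguous segment from the text of a randomly chosen neighbor;
    otherwise it generates new words drawn uniformly from vocab.
    """
    # Only neighbors with enough text can be quoted.
    sources = [n for n in neighbors[node] if len(texts[n]) >= seg_len]
    if sources and rng.random() < q:
        src = texts[rng.choice(sources)]
        start = rng.randrange(len(src) - seg_len + 1)
        new_words = src[start:start + seg_len]  # the slice is a copy
    else:
        new_words = [rng.choice(vocab) for _ in range(seg_len)]
    texts[node].extend(new_words)
    return new_words
```

Calling `quoter_step` for each node in turn yields one symbolic time series per node, to which information-flow estimators could then be applied.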

11.
Sci Rep ; 8: 46959, 2018 Mar 19.
Article in English | MEDLINE | ID: mdl-29553130

ABSTRACT

This corrects the article DOI: 10.1038/srep44499.

12.
PLoS One ; 12(8): e0182662, 2017.
Article in English | MEDLINE | ID: mdl-28806413

ABSTRACT

Crowdsourcing works by distributing many small tasks to large numbers of workers, yet the true potential of crowdsourcing lies in workers doing more than performing simple tasks: they can apply their experience and creativity to provide new and unexpected information to the crowdsourcer. One such case is when workers not only answer a crowdsourcer's questions but also contribute new questions for subsequent crowd analysis, leading to a growing set of questions. This growth creates an inherent bias for early questions since a question introduced earlier by a worker can be answered by more subsequent workers than a question introduced later. Here we study how to perform efficient crowdsourcing with such growing question sets. By modeling question sets as networks of interrelated questions, we introduce algorithms to help curtail the growth bias by efficiently distributing workers between exploring new questions and addressing current questions. Experiments and simulations demonstrate that these algorithms can efficiently explore an unbounded set of questions without losing confidence in crowd answers.


Subjects
Communication, Crowdsourcing, Algorithms, Computer Simulation, Internet, Theoretical Models, Probability, User-Computer Interface, Vocabulary
13.
Sci Rep ; 7: 44499, 2017 03 20.
Article in English | MEDLINE | ID: mdl-28317835

ABSTRACT

Increased interconnection between critical infrastructure networks, such as electric power and communications systems, has important implications for infrastructure reliability and security. Others have shown that increased coupling between networks that are vulnerable to internetwork cascading failures can increase vulnerability. However, the mechanisms of cascading in these models differ from those in real systems and such models disregard new functions enabled by coupling, such as intelligent control during a cascade. This paper compares the robustness of simple topological network models to models that more accurately reflect the dynamics of cascading in a particular case of coupled infrastructures. First, we compare a topological contagion model to a power grid model. Second, we compare a percolation model of internetwork cascading to three models of interdependent power-communication systems. In both comparisons, the more detailed models suggest substantially different conclusions, relative to the simpler topological models. In all but the most extreme case, our model of a "smart" power network coupled to a communication system suggests that increased power-communication coupling decreases vulnerability, in contrast to the percolation model. Together, these results suggest that robustness can be enhanced by interconnecting networks with complementary capabilities if modes of internetwork failure propagation are constrained.

14.
R Soc Open Sci ; 3(4): 160007, 2016 Apr.
Article in English | MEDLINE | ID: mdl-27152217

ABSTRACT

Complex problems often require coordinated group effort and can consume significant resources, yet our understanding of how teams form and succeed has been limited by a lack of large-scale, quantitative data. We analyse activity traces and success levels for approximately 150 000 self-organized, online team projects. While larger teams tend to be more successful, workload is highly focused across the team, with only a few members performing most work. We find that highly successful teams are significantly more focused than average teams of the same size, that their members have worked on more diverse sets of projects, and that the members of highly successful teams are more likely to be core members or 'leads' of other teams. The relations between team success and size, focus and especially team experience cannot be explained by confounding factors such as team age or external contributions from non-team members, or by group mechanisms such as social loafing. Taken together, these features point to organizational principles that may maximize the success of collaborative endeavours.

15.
Article in English | MEDLINE | ID: mdl-26565290

ABSTRACT

In an effort to better understand meaning from natural language texts, we explore methods aimed at organizing lexical objects into contexts. A number of these methods for organization fall into a family defined by word ordering. Unlike demographic or spatial partitions of data, these collocation models are of special importance for their universal applicability. While we are interested here in text and have framed our treatment appropriately, our work is potentially applicable to other areas of research (e.g., speech, genomics, and mobility patterns) where one has ordered categorical data (e.g., sounds, genes, and locations). Our approach focuses on the phrase (whether word or larger) as the primary meaning-bearing lexical unit and object of study. To do so, we employ our previously developed framework for generating word-conserving phrase-frequency data. Upon training our model with the Wiktionary, an extensive, online, collaborative, and open-source dictionary that contains over 100,000 phrasal definitions, we develop highly effective filters for the identification of meaningful, missing phrase entries. With our predictions we then engage the editorial community of the Wiktionary and propose short lists of potential missing entries for definition, developing a breakthrough lexical extraction technique and expanding our knowledge of the defined English lexicon of phrases.

16.
Sci Rep ; 5: 12209, 2015 Aug 11.
Article in English | MEDLINE | ID: mdl-26259699

ABSTRACT

With Zipf's law being originally and most famously observed for word frequency, it is surprisingly limited in its applicability to human language, holding over no more than three to four orders of magnitude before hitting a clear break in scaling. Here, building on the simple observation that phrases of one or more words comprise the most coherent units of meaning in language, we show empirically that Zipf's law for phrases extends over as many as nine orders of rank magnitude. In doing so, we develop a principled and scalable statistical mechanical method of random text partitioning, which opens up a rich frontier of rigorous text analysis via a rank ordering of mixed length phrases.


Subjects
Data Mining/methods, Language, Theoretical Models, Humans
17.
Article in English | MEDLINE | ID: mdl-26066216

ABSTRACT

Natural languages are full of rules and exceptions. One of the most famous quantitative rules is Zipf's law, which states that the frequency of occurrence of a word is approximately inversely proportional to its rank. Though this "law" of ranks has been found to hold across disparate texts and forms of data, analyses of increasingly large corpora since the late 1990s have revealed the existence of two scaling regimes. These regimes have thus far been explained by a hypothesis suggesting a separability of languages into core and noncore lexica. Here we present and defend an alternative hypothesis that the two scaling regimes result from the act of aggregating texts. We observe that text mixing leads to an effective decay of word introduction, which we show provides accurate predictions of the location and severity of breaks in scaling. Upon examining large corpora from 10 languages in the Project Gutenberg eBooks collection, we find emphatic empirical support for the universality of our claim.
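
As a small illustration of the rank-frequency relationship stated in this abstract (frequency approximately inversely proportional to rank), the following Python sketch builds a rank-frequency table and compares it with an idealized Zipf curve. It is not the authors' analysis: the helper names, the exponent parameter `alpha`, and the toy corpus are assumptions chosen only to make the relationship concrete.

```python
from collections import Counter

def rank_frequency(words):
    """Return (rank, word, count) tuples, most frequent word first."""
    counts = Counter(words)
    # Sort by descending count; break ties alphabetically for determinism.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ordered, start=1)]

def zipf_expected(top_count, rank, alpha=1.0):
    """Idealized Zipf frequency f(r) = f(1) / r**alpha (alpha = 1 is the classic law)."""
    return top_count / rank ** alpha

# Toy corpus built so that the counts follow f(r) = 12 / r exactly.
toy = ["the"] * 12 + ["of"] * 6 + ["and"] * 4 + ["to"] * 3
table = rank_frequency(toy)
```

On real corpora the measured counts deviate from this idealized curve, and a break in scaling of the kind discussed in the abstract would appear as a change in the fitted exponent beyond some rank.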

18.
Article in English | MEDLINE | ID: mdl-25974553

ABSTRACT

Power lines, roadways, pipelines, and other physical infrastructure are critical to modern society. These structures may be viewed as spatial networks where geographic distances play a role in the functionality and construction cost of links. Traditionally, studies of network robustness have primarily considered the connectedness of large, random networks. Yet for spatial infrastructure, physical distances must also play a role in network robustness. Understanding the robustness of small spatial networks is particularly important with the increasing interest in microgrids, i.e., small-area distributed power grids that are well suited to using renewable energy resources. We study the random failures of links in small networks where functionality depends on both spatial distance and topological connectedness. By introducing a percolation model where the failure probability of each link is proportional to its spatial length, we find that when failures depend on spatial distances, networks are more fragile than expected. Accounting for spatial effects in both construction and robustness is important for designing efficient microgrids and other network infrastructure.
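
The length-dependent failure rule mentioned in this abstract can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' model: the failure probability is taken to be `min(1, q * length)` with a hypothetical proportionality constant `q`, and robustness is judged simply by whether the surviving network stays connected.

```python
import random

def surviving_links(links, q, rng):
    """Keep each (u, v, length) link that does not fail.

    Assumed rule: a link fails independently with probability
    min(1, q * length), so longer links fail more often.
    """
    return [(u, v, d) for u, v, d in links
            if rng.random() >= min(1.0, q * d)]

def is_connected(nodes, links):
    """Depth-first search check that every node is reachable from the first."""
    nodes = list(nodes)
    if not nodes:
        return True
    adj = {n: set() for n in nodes}
    for u, v, _ in links:
        adj[u].add(v)
        adj[v].add(u)
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        n = stack.pop()
        for m in adj[n]:
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return len(seen) == len(nodes)
```

Repeating `surviving_links` over many random draws and averaging `is_connected` would give an estimate of how often a small spatial network remains functional for a given `q`.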

20.
Proc Natl Acad Sci U S A ; 112(8): 2389-94, 2015 Feb 24.
Article in English | MEDLINE | ID: mdl-25675475

ABSTRACT

Using human evaluation of 100,000 words spread across 24 corpora in 10 languages diverse in origin and culture, we present evidence of a deep imprint of human sociality in language, observing that (i) the words of natural human language possess a universal positivity bias, (ii) the estimated emotional content of words is consistent between languages under translation, and (iii) this positivity bias is strongly independent of frequency of word use. Alongside these general regularities, we describe interlanguage variations in the emotional spectrum of languages that allow us to rank corpora. We also show how our word evaluations can be used to construct physical-like instruments for both real-time and offline measurement of the emotional content of large-scale texts.


Subjects
Bias, Emotions, Language, Humans, Time Factors