A Novel Clustering Methodology Based on Modularity Optimisation for Detecting Authorship Affinities in Shakespearean Era Plays.

Naeni, Leila M; Craig, Hugh; Berretta, Regina; Moscato, Pablo

Affiliation

Naeni LM; The Priority Research Centre of Bioinformatics and Information-Based Medicine, The University of Newcastle, Newcastle, New South Wales, Australia.
Craig H; School of Electrical Engineering and Computer Science, Faculty of Engineering and Built Environment, The University of Newcastle, Newcastle, New South Wales, Australia.
Berretta R; School of Built Environment, Faculty of Design, Architecture and Building, University of Technology Sydney, Sydney, Australia.
Moscato P; Centre for Literary and Linguistic Computing, School of Humanities and Social Science, The University of Newcastle, Newcastle, New South Wales, Australia.

PLoS One; 11(8): e0157988, 2016.

Article in English | MEDLINE | ID: mdl-27571416

ABSTRACT

In this study we propose a novel, unsupervised clustering methodology for analyzing large datasets. This efficient methodology converts the general clustering problem into a community detection problem on graphs by using the Jensen-Shannon distance, a dissimilarity measure originating in information theory. Moreover, we use graph-theoretic concepts for the generation and analysis of proximity graphs. Our methodology is based on a newly proposed memetic algorithm (iMA-Net) for discovering clusters of data elements by maximizing the modularity function in proximity graphs of literary works. To test the effectiveness of this general methodology, we apply it to a text corpus dataset, which contains the frequencies of approximately 55,114 unique words across all 168 plays written in the Shakespearean era (16th and 17th centuries), to detect clusters of similar plays. Experimental results and comparison with state-of-the-art clustering methods demonstrate that the new method identifies high-quality clusters that reflect commonalities in the literary style of the plays.
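The pipeline outlined in the abstract (word frequencies, Jensen-Shannon distance, proximity graph, modularity) can be illustrated with a minimal Python sketch. It is not the authors' iMA-Net algorithm: the memetic search is omitted, and the sketch only scores a given partition. The k-nearest-neighbour construction, the toy word counts, and the function names `js_distance`, `knn_graph`, and `modularity` are assumptions made for illustration; the abstract does not specify which proximity graph is used.

```python
import math
from collections import defaultdict


def js_distance(p, q):
    """Jensen-Shannon distance: square root of the JS divergence (base-2 logs)."""
    sp, sq = sum(p), sum(q)
    p = [x / sp for x in p]  # normalise raw counts to probabilities
    q = [x / sq for x in q]
    m = [(a + b) / 2 for a, b in zip(p, q)]  # mixture distribution

    def kl(a, b):
        # Kullback-Leibler divergence; terms with a zero numerator contribute 0.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return math.sqrt(max(0.0, 0.5 * kl(p, m) + 0.5 * kl(q, m)))


def knn_graph(vectors, k):
    """Undirected k-nearest-neighbour graph as a set of edges (i, j) with i < j.

    Assumption: k-NN is only one possible proximity graph.
    """
    n = len(vectors)
    edges = set()
    for i in range(n):
        ranked = sorted(
            (js_distance(vectors[i], vectors[j]), j) for j in range(n) if j != i
        )
        for _, j in ranked[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def modularity(edges, labels):
    """Newman-Girvan modularity Q of a partition of an unweighted graph.

    Assumes the graph has at least one edge.
    """
    m = len(edges)
    internal = defaultdict(int)  # edges inside each community
    deg_sum = defaultdict(int)   # total degree of each community
    for i, j in edges:
        deg_sum[labels[i]] += 1
        deg_sum[labels[j]] += 1
        if labels[i] == labels[j]:
            internal[labels[i]] += 1
    return sum(internal[c] / m - (deg_sum[c] / (2 * m)) ** 2 for c in deg_sum)


# Toy data (hypothetical): four "plays", counts of three words each.
plays = [[10, 1, 1], [9, 2, 1], [1, 10, 1], [1, 9, 2]]
graph = knn_graph(plays, k=1)
good = modularity(graph, [0, 0, 1, 1])  # plays grouped by similar profile
bad = modularity(graph, [0, 1, 0, 1])   # plays grouped across profiles
print(sorted(graph), good, bad)
```

On this toy input the partition that groups plays with similar word profiles scores higher than the one that splits them. A modularity-maximizing method searches over partitions for the highest such score, which is the role the abstract assigns to the memetic algorithm.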

Subjects

Algorithms; Cluster Analysis

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Algorithms / Cluster Analysis Language: En Journal: PLoS One Journal subject: SCIENCE / MEDICINE Year of publication: 2016 Document type: Article Country of affiliation: Australia
