TY - GEN
T1 - Comparing the performance of group detection algorithm in serial and parallel processing environments
AU - Brown, Channing
AU - Poole, Marshall Scott
AU - Ahmed, Iftekhar
AU - Pilny, Andrew
AU - Cai, Dora
AU - Atouba, Yannick
PY - 2012
Y1 - 2012
N2 - Developing algorithms that identify groups within a collection of individuals, when no grouping data is available, has received significant attention because of the need to better understand groups and teams in online environments. This study used space, time, task, and players' virtual behavioral indicators from a game database to develop an algorithm that detects groups over time. The group detection algorithm was primarily developed for a serial processing environment and was later modified to allow parallel processing on Gordon. For a data set representing 192 days of game play (approximately 140 gigabytes of log data), the major steps of the analysis required 266 minutes on a single processor. The same computation required 25 minutes on Gordon with 16 processors. The massive compute nodes and rich shared-memory environment on Gordon improved the performance of our analysis by a factor of roughly 11. Besides demonstrating the potential savings in time and effort, this study also highlights lessons learned in adapting a serial detection algorithm to a parallel environment.
AB - Developing algorithms that identify groups within a collection of individuals, when no grouping data is available, has received significant attention because of the need to better understand groups and teams in online environments. This study used space, time, task, and players' virtual behavioral indicators from a game database to develop an algorithm that detects groups over time. The group detection algorithm was primarily developed for a serial processing environment and was later modified to allow parallel processing on Gordon. For a data set representing 192 days of game play (approximately 140 gigabytes of log data), the major steps of the analysis required 266 minutes on a single processor. The same computation required 25 minutes on Gordon with 16 processors. The massive compute nodes and rich shared-memory environment on Gordon improved the performance of our analysis by a factor of roughly 11. Besides demonstrating the potential savings in time and effort, this study also highlights lessons learned in adapting a serial detection algorithm to a parallel environment.
KW - MMOG
KW - data mining
KW - group detection
KW - online games
KW - serial vs. parallel processing
KW - social computing
KW - virtual groups
UR - http://www.scopus.com/inward/record.url?scp=84865312348&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84865312348&partnerID=8YFLogxK
U2 - 10.1145/2335755.2335817
DO - 10.1145/2335755.2335817
M3 - Conference contribution
AN - SCOPUS:84865312348
SN - 9781450316026
T3 - ACM International Conference Proceeding Series
BT - Proceedings of the XSEDE12 Conference
T2 - 1st Conference of the Extreme Science and Engineering Discovery Environment: Bridging from the eXtreme to the Campus and Beyond, XSEDE12
Y2 - 16 July 2012 through 19 July 2012
ER -