Proceedings Paper

Coalmine: an experience in building a system for social media analytics
Author(s): Joshua S. White; Jeanna N. Matthews; John L. Stacy

Paper Abstract

Social media networks make up a large percentage of the content available on the Internet, and most of the time users spend online today is spent interacting with them. All of the seemingly small pieces of information added by billions of people result in an enormous, rapidly changing dataset. Searching, correlating, and understanding billions of individual posts is a significant technical problem; even the data from a single site such as Twitter can be difficult to manage. In this paper, we present Coalmine, a social network data-mining system. We describe the overall architecture of Coalmine, including the capture, storage, and search components. We also describe our experience with pulling 150-350 GB of Twitter data per day through their REST API. Specifically, we discuss our experience with the evolution of the Twitter data APIs from 2011 to 2012 and present strategies for maximizing the amount of data collected. Finally, we describe our experiences looking for evidence of botnet command and control channels and examining patterns of spam in the Twitter dataset.

Paper Details

Date Published: 7 May 2012
PDF: 11 pages
Proc. SPIE 8408, Cyber Sensing 2012, 84080A (7 May 2012); doi: 10.1117/12.918933
Joshua S. White, Clarkson Univ. (United States)
Jeanna N. Matthews, Clarkson Univ. (United States)
John L. Stacy, Clarkson Univ. (United States)

Published in SPIE Proceedings Vol. 8408:
Cyber Sensing 2012
Igor V. Ternovskiy; Peter Chin, Editor(s)
