
Proceedings Paper

Fractals, malware, and data models
Author(s): Holger M. Jaenisch; Andrew N. Potter; Deborah Williams; James W. Handley

Paper Abstract

We examine the hypothesis that the decision boundary between malware and non-malware is fractal. We introduce a novel encoding method, derived from text mining, that first converts disassembled programs into opstrings and then filters these into a reduced opcode alphabet. The opcodes are enumerated and encoded as real floating-point numbers, which are used to characterize the frequency of occurrence and distribution properties of malware functions for comparison with non-malware functions. We use the concept of invariant moments to characterize the highly non-Gaussian structure of the opcode distributions. We then derive Data Model based classifiers from the identified features, and interpolate and extrapolate the parameter sample space for the derived Data Models. This is done to examine the nature of the classification boundary in parameter space between families of malware and the general non-malware category. Preliminary results strongly support the fractal boundary hypothesis, and a summary of our methods and results is presented here.
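As a rough illustration of the encoding and moment steps described in the abstract, the sketch below filters an opcode sequence into a reduced alphabet, maps each retained opcode to a floating-point value, and computes a few standardized moments of the result. It is not the authors' implementation: the alphabet `REDUCED_ALPHABET`, the mapping to values in (0, 1], the function names, and the choice of mean, standard deviation, skewness, and kurtosis as stand-ins for the paper's invariant moments are all assumptions made for this example.

```python
from collections import Counter
from statistics import fmean

# Hypothetical reduced opcode alphabet; the abstract does not list the real one.
REDUCED_ALPHABET = ["mov", "push", "pop", "call", "jmp", "cmp", "add", "xor"]


def encode_opstring(opcodes):
    """Drop opcodes outside the reduced alphabet; map the rest to floats in (0, 1]."""
    index = {op: i + 1 for i, op in enumerate(REDUCED_ALPHABET)}
    n = len(REDUCED_ALPHABET)
    return [index[op] / n for op in opcodes if op in index]


def opcode_frequencies(opcodes):
    """Relative frequency of each reduced-alphabet opcode in the sequence."""
    kept = [op for op in opcodes if op in REDUCED_ALPHABET]
    if not kept:
        return {}
    counts = Counter(kept)
    return {op: counts[op] / len(kept) for op in REDUCED_ALPHABET}


def standardized_moments(values):
    """Mean, standard deviation, skewness, and kurtosis (population forms).

    These are assumed stand-ins for the invariant moments named in the abstract.
    """
    mean = fmean(values)
    centered = [v - mean for v in values]
    std = fmean([c * c for c in centered]) ** 0.5
    if std == 0.0:
        # A constant sequence has no shape information.
        return {"mean": mean, "std": 0.0, "skewness": 0.0, "kurtosis": 0.0}
    skewness = fmean([c ** 3 for c in centered]) / std ** 3
    kurtosis = fmean([c ** 4 for c in centered]) / std ** 4
    return {"mean": mean, "std": std, "skewness": skewness, "kurtosis": kurtosis}


if __name__ == "__main__":
    # Toy opcode sequence for one disassembled function (made up for illustration).
    function_ops = ["push", "mov", "nop", "call", "xor", "mov", "pop", "jmp"]
    encoded = encode_opstring(function_ops)
    print(opcode_frequencies(function_ops))
    print(standardized_moments(encoded))
```

Feature vectors of this kind, computed per function, are the sort of input a classifier could then be fitted to. The Data Model classifiers and the fractal analysis of the decision boundary are outside the scope of this sketch.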

Paper Details

Date Published: 7 May 2012
PDF: 16 pages
Proc. SPIE 8408, Cyber Sensing 2012, 84080X (7 May 2012); doi: 10.1117/12.941769
Author Affiliations
Holger M. Jaenisch, Sentar, Inc. (United States); Licht Strahl Engineering INC (United States); Johns Hopkins Univ. (United States)
Andrew N. Potter, Sentar, Inc. (United States)
Deborah Williams, Sentar, Inc. (United States)
James W. Handley, Licht Strahl Engineering INC (United States)

Published in SPIE Proceedings Vol. 8408:
Cyber Sensing 2012
Igor V. Ternovskiy; Peter Chin, Editor(s)

© SPIE.