Information hiding through errors: a confusing approach
Author(s): Mercan Topkara; Umut Topkara; Mikhail J. Atallah
A substantial portion of the text available online, such as emails, blogs, and forum posts, tends to contain many typos and ungrammatical abbreviations. It is therefore not surprising that one can hide information in such texts by the judicious injection of typos (broadly construed to include abbreviations and acronyms). What is surprising is that, as this paper demonstrates, this form of embedding can be made quite resilient. The resilience is achieved through computationally asymmetric transformations (CATs): transformations that are inexpensive to carry out, yet whose reversal requires much more extensive semantic analysis (easy for humans, but hard to automate). One example of a CAT is the introduction of typos that are ambiguous, in that they have many possible corrections, which makes them harder to restore automatically to their original form: when considering alternative typos, we prefer ones that are also close to other vocabulary words. Such encodings do not materially degrade the text's meaning because, compared to machines, humans are very good at disambiguation. We use typo confusion matrices and word-level ambiguity to carry out this kind of encoding. Unlike robust synonym substitution, which also cleverly used ambiguity, the task here is harder because typos are very conspicuous and an obvious target for the adversary (synonyms are stealthy, typos are not). Our resilience does not depend on preventing the adversary from correcting without damage; it depends only on a multiplicity of alternative corrections. In fact, even an adversary who has boldly "corrected" all the typos by randomly choosing from the ambiguous alternatives has, on average, destroyed around w/4 of our w-bit mark (and incurred a high cost in terms of the damage done to the meaning of the text).
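The preference for typos that lie close to many vocabulary words can be illustrated with a small Python sketch. This is not the authors' implementation: the abstract says they use typo confusion matrices, whereas this sketch assumes every single-character edit (deletion, transposition, replacement, insertion) is equally plausible. The function names `single_edits`, `ambiguity`, and `most_ambiguous_typo`, as well as the toy vocabulary, are hypothetical and chosen only for illustration.

```python
from string import ascii_lowercase


def single_edits(word):
    """Return every string exactly one edit away from `word`.

    Assumption: uniform edits over lowercase ASCII, standing in for the
    typo confusion matrices mentioned in the abstract.
    """
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    transposes = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    replaces = {a + c + b[1:] for a, b in splits if b for c in ascii_lowercase}
    inserts = {a + c + b for a, b in splits for c in ascii_lowercase}
    return (deletes | transposes | replaces | inserts) - {word}


def ambiguity(typo, vocabulary):
    """Count the vocabulary words reachable from `typo` with one edit.

    Each such word is a candidate "correction" an adversary must choose among.
    """
    return len(single_edits(typo) & vocabulary)


def most_ambiguous_typo(word, vocabulary):
    """Pick a non-word typo of `word` that has the most candidate corrections."""
    candidates = single_edits(word) - vocabulary
    # Highest ambiguity first; ties are broken alphabetically for determinism.
    return min(candidates, key=lambda t: (-ambiguity(t, vocabulary), t))


if __name__ == "__main__":
    # Toy vocabulary (hypothetical); a real system would use a full lexicon.
    vocab = {"cat", "cot", "cut", "car", "can", "cap", "bat", "hat", "at"}
    typo = most_ambiguous_typo("cat", vocab)
    print(typo, sorted(single_edits(typo) & vocab))
```

A human reader resolves such a typo from context, while an automated corrector sees several equally close vocabulary words and has no local way to pick the original one. That gap is the asymmetry the abstract relies on.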

Date Published: 27 February 2007
PDF: 12 pages
Proc. SPIE 6505, Security, Steganography, and Watermarking of Multimedia Contents IX, 65050V (27 February 2007); doi: 10.1117/12.706980
Mercan Topkara, Purdue Univ. (United States)
Umut Topkara, Purdue Univ. (United States)
Mikhail J. Atallah, Purdue Univ. (United States)

Published in SPIE Proceedings Vol. 6505:
Security, Steganography, and Watermarking of Multimedia Contents IX
Edward J. Delp III; Ping Wah Wong, Editor(s)