Thursday May 10th 2012

Linguistic Steganography: Information Hiding in Text

Seminar by Stephen Clark, University Senior Lecturer, University of Cambridge, Computer Lab

Linguistic steganography is concerned with hiding information in natural language text for the purpose of sending secret messages. A related area is natural language watermarking, in which information is added to a text in order to identify it, for example for copyright purposes. Linguistic steganography algorithms hide information by manipulating properties of the text, for example by replacing some words with their synonyms. Unlike image-based steganography, linguistic steganography is in its infancy, with little existing work. In this talk I will motivate the problem, in particular as an interesting application for natural language processing (NLP), and especially for generation. Linguistic steganography is a difficult NLP problem because any change to the cover text must retain the meaning and style of the original in order to prevent detection by an adversary.
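
As a rough illustration of the synonym-substitution idea, the sketch below hides one bit per substitutable word by choosing between two synonyms. It is not the method presented in the talk: the synonym table, the function names embed and extract, and the rule that a word's position in its set encodes the bit are all assumptions made for this example, and no attempt is made to check that a substitution preserves meaning or style in context.

```python
# Hypothetical synonym sets (an assumption for this sketch).
# A word's index inside its set is the hidden bit it encodes.
SYNONYM_SETS = [
    ("big", "large"),
    ("quick", "fast"),
    ("begin", "start"),
]

# Map each word to (its synonym set, the bit it encodes).
LOOKUP = {word: (synset, bit)
          for synset in SYNONYM_SETS
          for bit, word in enumerate(synset)}


def embed(cover, bits):
    """Return a stego text in which substitutable words encode `bits`."""
    remaining = list(bits)
    out = []
    for word in cover.split():
        if word in LOOKUP and remaining:
            synset, _ = LOOKUP[word]
            # Pick the synonym whose index equals the next message bit.
            out.append(synset[remaining.pop(0)])
        else:
            out.append(word)
    if remaining:
        raise ValueError("cover text has too few substitutable words")
    return " ".join(out)


def extract(stego, n_bits):
    """Read back the first `n_bits` bits from the substitutable words."""
    bits = [LOOKUP[word][1] for word in stego.split() if word in LOOKUP]
    return bits[:n_bits]


cover = "we begin with a big idea and a quick test"
stego = embed(cover, [1, 0, 1])
print(stego)
print(extract(stego, 3))
```

A realistic system would also have to decide whether each candidate synonym is acceptable in its context, which is where the NLP difficulty described above comes in.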

I will describe a number of linguistic transformations that we have investigated, including synonym substitution and adjective deletion. For adjective deletion I will describe a novel secret sharing scheme in which many people receive a copy of the original text, each with different adjectives deleted; only when the various texts are combined can the secret message be revealed.
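
The abstract does not give the details of the secret sharing scheme, so the following is only a sketch of one way such a scheme could work, under stated assumptions. It swaps in standard n-of-n XOR secret sharing: each copy of the text deletes a different subset of adjectives, a deleted adjective counts as bit 1, and the secret is the XOR of the bit patterns of all copies. The function names, the explicit list of adjective positions, and the use of the full original token list to read a copy are all hypothetical simplifications.

```python
import secrets


def make_shares(secret_bits, n_shares):
    """Split secret_bits into n_shares bit vectors whose XOR is the secret."""
    shares = [[secrets.randbelow(2) for _ in secret_bits]
              for _ in range(n_shares - 1)]
    last = list(secret_bits)
    for share in shares:
        last = [a ^ b for a, b in zip(last, share)]
    return shares + [last]


def render(tokens, adjective_slots, share):
    """Produce one copy of the text: delete adjective i when share[i] == 1.

    adjective_slots must list token positions in ascending order.
    """
    drop = {slot for slot, bit in zip(adjective_slots, share) if bit}
    return [tok for i, tok in enumerate(tokens) if i not in drop]


def read_share(text, tokens, adjective_slots):
    """Recover a copy's bit vector by checking which adjectives are missing.

    Assumes the full token list is available for alignment (a simplification).
    """
    slots = set(adjective_slots)
    bits, j = [], 0
    for i, tok in enumerate(tokens):
        if i in slots:
            if j < len(text) and text[j] == tok:
                bits.append(0)  # adjective kept
                j += 1
            else:
                bits.append(1)  # adjective deleted
        else:
            j += 1
    return bits


def combine(texts, tokens, adjective_slots):
    """XOR the bit vectors of all copies to reveal the secret."""
    secret = [0] * len(adjective_slots)
    for text in texts:
        bits = read_share(text, tokens, adjective_slots)
        secret = [a ^ b for a, b in zip(secret, bits)]
    return secret


tokens = "the old man bought a small red car".split()
adjective_slots = [1, 5, 6]  # positions of "old", "small", "red"
shares = make_shares([1, 0, 1], 3)
texts = [render(tokens, adjective_slots, s) for s in shares]
print(combine(texts, tokens, adjective_slots))
```

In this construction every proper subset of the copies yields uniformly random bits, so all copies are needed to recover the message; whether the scheme in the talk has that property is not stated here.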

Joint work with Ching-Yun (Frannie) Chang.

Thursday May 10th at 11am in room 4.31/33 at the School of Informatics, University of Edinburgh

