Sunday, 15 April, 2018 UTC


Summary

Fingerprinting text is really very nifty; the ability to encode hidden data within a string of characters opens up a large number of opportunities. For example, someone within your team is leaking confidential information but you don’t know who. Simply send each team member some classified text with their name encoded in it. Wait for it to be leaked, then extract the name from the text — the classic canary trap.
Here’s a method that hides data in text using zero-width characters. Unlike various other ways of text fingerprinting, zero width characters are not removed if the formatting is stripped, making them nearly impossible to get rid of without re-typing the text or using a special tool. In fact you’ll have a hard time detecting them at all – even terminals and code editors won’t display them.
To make the process easy to perform, [Vedhavyas] created a command line utility to embed and extract a payload using any text. Each letter in the secret message is converted to binary, then encoded in zero-width characters. A zero-width-non-joiner character is used for 0, and a zero-width-space character for 1.
[Vedhavyas’] tool was inspired by a post by [Tom], who uses a javascript example (with online demo) to explain what’s going on. This lets you test out the claim that you can paste the text without losing the hidden data. Try pasting it into a text editor. We were able to copy it again from there and retrieve the data, but it didn’t survive being saved and cat’d to the command line.
Of course, to get your encoding game really tight, you should be looking at getting yourself an enigma wristwatch…