I came across a worrying article about discrepancies between viewed and printed PDFs (thanks, Julia Evans). That got me concerned about signing digital documents in non-trivial formats such as PDF.

In short, the article described a corner case in how fonts can be encoded in PDFs that leaves room for different interpretations. It just so happens that one interpretation, used by most viewers, makes text encoded in that font visible, whereas another, used by some printers, scales that text into invisibly fine print, one thousand times smaller than intended.

I immediately pictured the following scenario: you receive a legal document in PDF form to sign, print it, read it, and then proceed to sign it. Instead of signing the paper copy and taking a picture (or using a scanner), people are being encouraged to use digital signatures these days. So, whether using a detached GnuPG signature or one of these JavaScrippled (because they won't work unless you allow them to install and run JavaScript programs on your browser) online services that embed visible and invisible digital artifacts in PDF files, you sign the file, and send it back.

A few days later, you open the PDF file in a viewer, and you're surprised by some commitments spelled out in it that you don't recall seeing or agreeing to before. Indeed, when you go back to the printed version, they aren't there. But your digital signature on the file checks out. What gives?

You may have just fallen victim to a new (?) kind of exploit, akin to getting people to sign pages on paper with parts written in invisible ink.

An initial reaction might be to advise against relying on printouts, and to recommend checking the document with a specific viewer instead. That may thwart the attack above, but not the reverse exploit: the same font scaling ambiguity in the PDF specification could be exploited by laying the text out on the page a thousand times bigger, so that it would scale off the page into invisibility on most viewers, but shrink to regular print size on paper.

In either case, you think you read it before signing it, but a software trick made some of the document invisible to you, so that you'd sign without reading it.
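To make the arithmetic of both variants concrete, here is a toy sketch. It does not model real PDF font handling: the two "renderers", their scale factors, and the legibility threshold are all assumptions made up for illustration; only the factor of one thousand comes from the scenario above.

```python
PAGE_HEIGHT_PT = 792.0  # height of a US Letter page, in points
MIN_LEGIBLE_PT = 4.0    # assumed threshold below which print is unreadable

# Hypothetical renderers: each one multiplies the nominal text size by its
# own reading of the ambiguous scale factor.
RENDERER_SCALE = {"viewer": 1.0, "printer": 0.001}


def apparent_size_pt(nominal_pt, renderer):
    """Size at which a renderer would draw text of the given nominal size."""
    return nominal_pt * RENDERER_SCALE[renderer]


def is_visible(size_pt):
    """Text is readable only if it is neither microscopic nor off the page."""
    return MIN_LEGIBLE_PT <= size_pt <= PAGE_HEIGHT_PT


def visibility(nominal_pt):
    """Map each hypothetical renderer to whether it shows the text."""
    return {
        name: is_visible(apparent_size_pt(nominal_pt, name))
        for name in RENDERER_SCALE
    }


# First scenario: regular-sized text that only the viewer shows.
print(visibility(12.0))     # {'viewer': True, 'printer': False}
# Reverse scenario: oversized text that only the printout shows.
print(visibility(12000.0))  # {'viewer': False, 'printer': True}
```

Whichever renderer you pick to read the document, there is a nominal size that hides text from it while showing that text to the other one.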

But really, I'm trying to get at a far more general problem: most people don't realize how complex it is to render a digital document into an image on the screen or on paper. Nor do they realize how many potential incompatibilities the omissions and underspecifications of a digital format may bring about, making room for parts of a digital document to appear or disappear depending on the implementation of the renderer.

At a time when most people are led (or forced) to allow a third party (or many) to automatically install, on their own computers, modified versions of software they rely on for supposedly-secure communication, or download and run it from a third party web site every time, that may seem like a far-fetched concern.

But for people who take safety and security seriously, this is yet another aspect to worry about, and it's not even a new one. Long ago, I already warned people that, when they sent me files in such proprietary formats as .DOC, although I had tools to view them, there was no way for me to be sure that the rendering that I got was what they meant for me to see. But people didn't trust digital signatures so readily back then.

Some people joke that digital computing's purpose is to solve problems that didn't exist before it introduced them. Unfortunately, it looks like this is one of those cases in which it introduces a problem, but a satisfying solution is not readily available: reencoding and signing the result would invalidate earlier signatures and turn text into paths; viewing with multiple viewers and printers would be exhausting but not exhaustive; relying on the third-party signing software's rendering could make you vulnerable to third-party updates that change the rendering...
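As a rough sketch of what cross-checking renderings could look like, suppose you had somehow obtained the text each renderer actually shows you (say, by transcription or OCR of the screen and of the printout; that step is assumed here, not implemented). The function name and the sample clauses below are hypothetical.

```python
import difflib


def rendering_discrepancies(text_a, text_b):
    """Return the lines that appear in only one of two renderings.

    Lines prefixed with "- " are only in text_a; "+ " only in text_b.
    """
    diff = difflib.ndiff(text_a.splitlines(), text_b.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Hypothetical transcriptions of the same document from two renderers.
on_screen = "Clause 1: A pays B.\nClause 2: B may bill A extra.\n"
on_paper = "Clause 1: A pays B.\n"

for line in rendering_discrepancies(on_screen, on_paper):
    print(line)  # - Clause 2: B may bill A extra.
```

An empty result only says that the renderers you tried agree with each other, which is the "exhausting but not exhaustive" limitation in a nutshell.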

It doesn't seem to be solvable, short of requiring (or extracting) a plain-text reference version, and signing that too (or only). How about we start adopting (or going back to) better practices, including simpler textual document formats?
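A minimal sketch of the plain-text idea, assuming the parties agree on a canonical form for the reference text before signing it. The normalization rules and function names below are my own illustrative choices, not any standard; the resulting file or digest is what you would then cover with a detached signature.

```python
import hashlib
import unicodedata


def canonical_text(text):
    """Normalize text so that cosmetic differences don't change the digest.

    Assumed rules: Unicode NFC, runs of whitespace collapsed to one space,
    any line-ending style accepted, a single trailing newline.
    """
    text = unicodedata.normalize("NFC", text)
    lines = [" ".join(line.split()) for line in text.splitlines()]
    return "\n".join(lines).strip() + "\n"


def reference_digest(text):
    """SHA-256 digest (hex) of the canonical form of the reference text."""
    return hashlib.sha256(canonical_text(text).encode("utf-8")).hexdigest()


agreed = "Clause 1: A pays B.\n"
print(reference_digest(agreed))
```

Since there are no fonts, scale factors, or layout in what gets hashed, there is nothing for a renderer to interpret differently; any added clause changes the digest.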

So blong...