Signatures
Signatures
Putting data out in the environment allows everyone to access it. The problem is that it allows any one to alter and recreate that data in their own vision. Sometimes that is great, other times it is disastrous. I love the idea of someone using public data to make something, but I really don’t like when data is altered to promote a falsehood.
The historic way to protect a document and validate its authenticity was by adding a signature. If I produce something I can sign it. Artist do this for artwork and it was on every letter when those were a common thing. If I endorse the works of others I can also sign it. This is very common in legal documents where multiple parties negotiate terms and language, followed by signatures. Often the contracts themselves are written by our agents (lawyers in this case) and we are just signing to show we agree.
Another feature of signatures is that they can often be placed in a way to make it harder for someone to modify what the signature protects. We don’t leave a large open area between the last paragraph and the signature block for that reason. On more important documents we will often sign or initialize each page to prevent them from being replaced.
Signatures also come in many forms. It can be the stylized strokes for your name with a pen. In some cultures it can be a custom stamp that you press into ink or wax. There is nothing stopping someone from signing a work with their fingerprints other than convention (Davinci left his fingerprints all over his work which has become a signature after the fact for him).
There have always been problems with forgeries and fraud with signatures. Once computers that could work with more than text appeared the problem with the old style of signatures blew up, but so did the solutions.
Digital Signatures
There have been a wide range of digital signatures since the computer was invented. Early on it was little more than the metadata the system kept with each file. Ownership was proven by the system logs and the authenticity was reliant on the trust of the system security. The ability to copy content made this a weak form of proof at best.
With the advent of Public Key Cryptography we obtained the ability to mathematically sign an ordered set of bits. All of these work by first taking the bits that need to be signed and hashing them with a cryptographic hash. The cryptographic hash takes any length of input bits and produces a set number of output bits. Along the way it does operations that provide specific guarantees to make it computationally infeasible to find or create any other set of bits to generates the same output. The person endorsing the data performs a set of operations with a key only they have on that hash. The result is a digital signature.
A digital signature has some really neat concepts. First off, if any single bit of the bits signed change, the signature doesn’t match. The verification does not require access to the private key, only the signers public version of the key. In this case the signer should make the public key available far and wide so that others can verify all of their signatures.
There are a number of options for the cryptographic hashes and the signature algorithms. Some of the earliest were PGP, but industry and governments pushed on towards X.509 that you see in email and web pages today. Both of these use mathematically operations based around either factoring large prime numbers or solving discrete logarithm problems. With sufficiently large versions of each today’s computers would take until the end of time to break these, thus making the signatures very secure, for now.
Quantum computing
There are a number of problems that quantum computers can attack faster than a regular computer. Adiabatic quantum computers like those from D-Wave can run optimization problems at scales that may become truly astounding soon. General quantum computers, like those being developed by Google and IBM, can factor composite numbers that fit within the number of entangled qubits of the machine.
Right now most public key cryptography is in the thousands of bits for key size. At the same time the largest quantum computers are below 100 entangled qubits and loose coherence from heat quickly. While the engineering for quantum computers aren’t even close to attacking public key algorithms, anyone with some sense should wonder for how long that might last.
Post quantum crypto
I’m not going to talk about all the ways people are looking at keeping secrets before and after quantum computers can attack our public key systems. From an anticryptography perspective though I am worried that all of the data out there is going to have its origins questioned when quantum computers can be used to fake signatures.
Think about that for a moment. We trust a lot of digital materials today. Some times because it comes from a government or corporate web site (which is often a reasonable assumption for trust). Maybe the data is held by a trusted archive. More and more though, it should be signed. All of the current signature methods we use today could be attacked and faked in the near future. We don’t know to what scale or at what cost, but it should have us thinking about how to preserve data authenticity today.
Let’s be clear about something up front. Quantum computing does not have any apparent way to attack cryptographic hashes. There are classes of attacks that can weaken the hashes by roughly half their space, but a large enough hash should remain secure no matter how large the quantum computer gets. That is a strong concept that we need to lean on going forward.
Some more good news is that we have some hash based signature algorithms that have been around for a while. Leslie Lamport created a signature system using only hashes in 1979. The system is fairly easy to implement and is secure.
- Select a cryptographic hash function of n bits to use.
- Create a private key by generating 2*n, n bit numbers and making n pairs.
- Create a public key by hashing each number in the private key.
- Sign a message by hashing the message. Each bit of the message hash result is to pick one of the pairs of the private key (0 picks the first half of the pair, 1 picks the second half) and include that number with the message as the signature.
- Verify the signature by hashing the message and each of the numbers in the signature. Check if each hash of a number matches the appropriate first or second pair in the public key.
You can implement this in Python in about an hour. With modern hashes available like SHA512 it is incredibly secure. The downsides include the keys and signatures being large (64KB key and 32KB signature for SHA512), and only being able to use it once. Each time you use it after the first time some of pairs become fully available. In just a few cases the private key will mostly be revealed.
For signing things like a video the signature size is minimal, but the ability to use it only once makes it awkward at best. The good news is that there are answers to fix both.
Winternitz signatures compress the public key, private key and signature to a set of chunks by increasing the computational load as a trade off. The trade off is better than linear allowing for signature mechanisms like WOTS+ to implement with secret keys and signatures of 3-4 KB using public keys around half a KB. The storage burden is on par with a few pages of text, and a light burden for audio, images or programs.
No one wants to carry around all the keys they will ever need even after WOTS+ shrinks them. In the case of a microcontroller, the firmware updates can only be signed as many times as there is room to store the public keys. That adds up quickly even at 512 bytes each. Even for a laptop it can be burdensome when you need to verify thousands of messages from each member of a community.
Merkle trees come to the rescue here. You may have heard of this term in relation to blockchains and the reason is almost identical to how they can be used to work around the one time use issue with Lamport signatures.
To use the Merkle tree we first decide how high it should be. In the diagram above it is 4 nodes high. In that case we generate 8 key pairs, public and private. We publish the root of the Merkle tree as our public key (a3,0 in this case).
Now every time we need to sign something we use the left most key pair. Starting with Y0 we sign the message of interest, but we need to add all of the hashes between the Y0 and the root. In the case of Y0 that is (a0,1), (a1,1), (a2,1). This lets the verifier recreate our root key (a3,0) and prove the authenticity of the Y0 key. We can then sign 8 messages in this case. Doubling the number of messages we can sign adds another node we need to include in the signature. It is fairly cheap to create a key for 1,000 - 1,000,000 messages. Even a billion messages would only be 30x the size of the original signature.
These concepts have been codified in the XMSS standard, IETF RFC 8391. There are very few implementations of XMSS, but you can find them if you look. A couple in Go, a few in Python and at least one in C. While it would be nice to have a few more implementations, this is enough to meet the needs of anticryptography.
Anticryptography meets Post Quantum crypto
I’ve said it before, there is a lot of data running around that does not deserve to be preserved and presented to others. It is important that details we do find important to preserve can be vouched for. Signing those things is one way to preserve them. Standard signatures aren’t going to mean much in the future, especially if somewhere in between now and then all of the private keys are cracked by a quantum computer.
If someone like the Library of Congress, Wikipedia or the Internet Archive wants to sign their holdings they could create a master key to sign each item they hold. Going into the future a reader would only need that root hash to verify any signature on any supposed holding. This prevents the rewriting of history, which is something that must never happen.
I’d like to see XMSS become an easy thing to implement and use. That will allow the field of anticryptography to shore up it’s security going into the future in a way that generations to come will appreciate.