A personally refined transcript of #Hal Finney from 26 years ago made from this video. "https://www.youtube.com/watch?v=e4ha2NTEox0" ""We will next hear from Hal Finney. I want to prove to you that I know a message that hashes to a given hash value using the SHA-1 hash. I don't want to reveal anything about the message to you—this is a zero-knowledge proof. I've written a program to do this, which I will explain. Here's the basic model: Peggy, the prover, knows a message, and Victor, the verifier, is going to verify this proof. She sends a commitment to the message over to Victor, which locks her into the message without revealing any information about it. Next, she calculates the SHA-1 hash of this message, while Victor starts with the message commitment and simultaneously calculates a commitment to the SHA-1 hash. She assists him in this calculation, again without revealing any information about the message. At the end, she has the hash, Victor has the commitment to the hash, and when she opens the commitment, he can verify that it matches what he calculated. This is how he knows that this was the message. The motivation for doing this is that many presentations we hear about say this particular protocol could be done with general-purpose zero-knowledge proofs or multi-party computations, but those techniques can be inefficient or impractical. I wanted to explore just how impractical we really are. I chose SHA because there isn't going to be an elegant zero-knowledge proof for a SHA hash, as the whole algorithm is designed to be complex and irreversible. This serves as a benchmark to assess the state of the art for zero-knowledge proof systems that can handle general and irregular problems like this. The proof system I'm using is one developed by Kramer and Gentry, which will be presented this Thursday right here in Crypto. Their zero-knowledge proofs are efficient and quite flexible. I've implemented two of the generators described in their paper: one suitable for zero-knowledge arguments where privacy is protected unconditionally, and the other for zero-knowledge proofs where the prover is unconditionally bound. The choice depends on the model being used. A nice feature of their system is that you can either commit to values up to the size of the group order in these discrete logarithm systems or do commitments under just the value, which is restricted to a certain range. I make use of that in my program. Another advantage is that it allows for pre-computation of the commitments. The prover and verifier can work together to calculate a pool of 50 commitments ahead of time, which is relatively expensive, and then perform the protocol in a shorter period. Now, just a brief description of what SHA-1 looks like. I won't go into detail, but I'll give you an idea of the operations involved. In this case, the message fits into just one SHA-1 buffer, so that amount of information is linked about it, but not the actual length. The message is padded in a 512-bit buffer, treated as 16 words. The rest of this W array is filled up in an 80-word array of 32-bit words, with each entry being the result of four XORs of earlier entries in the array and rotations. This creates the W array, which is where the input goes. The actual SHA compression function works with five 32-bit words that hold the state. These are processed through a big five-way addition, where five values are combined in an arithmetic sum modulo 2^32. A value from the W array, a constant, and a function that takes three of the words (using selection, parity, or majority function) are involved. These operations are done bitwise, again modulo 2^32. The plan is to transform the program so that instead of having our variables be 32-bit variables, they become arrays of bit commitments for 32 entries. Each variable is still a single variable within C, and when needed, we translate those to the value commitment. This allows for parallel execution: the prover executes the code while the verifier executes almost the same code, working together as I showed earlier. I have a library of primitive operations on these commitment arrays, including Boolean and arithmetic operations. The original program, which has straightforward arithmetic, is translated by substituting function calls into the library to perform the operations. This is all done using cryptographic limits, which handle large integers. I'll show you a sample of the code to illustrate both the form used in the zero-knowledge proof and the original form side by side, highlighting the transformations that need to be made. Originally, this is a fragment of a compression function. Here’s what we’re doing with the app function, which has four different types used. The big addition I showed, where five values come together, corresponds to the code that needs to be converted into the zero-knowledge proof. Each operation has become a function call, and now I have a function that adds the five values. The privacy of the prover is protected, taking 40 minutes to calculate on my machine, which is a 200 MHz Pentium with 6 megabytes of data exchanged. When you actually go to the protocol, it only takes 100 seconds with 22 kilobytes. The zero-knowledge proof takes about twice as many resources. If anyone is interested in seeing how this works, this is my email address right there. Thank you!""
Notes by neoplasma | export