29 Nov 2023

ZK in a nutshell

TOC

ZK SNARKs

Zero-Knowledge: The Verifier learns nothing about the secret inputs.
Succinct: The proof is very small, and the verification process is very fast.
Non-Interactive: The Prover simply sends a proof, and the Verifier can check it without any back-and-forth communication.

Four aspects are involved:

Encoding as a Polynomial Problem

The initial challenge of a zkSNARK is to prove that a program—which can be complex, with loops, conditional logic, and various operations—was executed correctly. To make this verifiable, the program is first “flattened” into a long sequence of basic arithmetic steps (e.g., \(a \times b = c\) or \(a + b = c\)). This process, often called arithmetization, creates an “arithmetic circuit.” This circuit is then transformed into a large, single polynomial equation.

The text describes this final form as \(t(x)h(x) = w(x)v(x)\). In this representation, the polynomials \(t(x)\), \(h(x)\), \(w(x)\), and \(v(x)\) are constructed to represent the entire computation’s “witness” (the secret inputs and intermediate values) and its constraints (the program logic). These polynomials are constructed so that the equation \(t(x)h(x) = w(x)v(x)\) will hold true for a specific set of inputs \(x\) if and only if every single gate in the original arithmetic circuit was computed correctly. The Prover’s task is thus transformed from “I ran this code” to “I know polynomials \(t, h, w, v\) that satisfy this specific equation.”

Succinctness by Random Sampling

The polynomials created in the previous step have a very high degree, proportional to the number of steps in the computation. To verify the Prover’s claim, the Verifier would theoretically need to check that \(t(x)h(x) - w(x)v(x) = 0\) for all possible values of \(x\), which is computationally infeasible.

This is solved by random sampling, based on the Schwartz-Zippel Lemma. This principle states that if two different polynomials are evaluated at a random point \(s\), the probability that they will yield the same result is negligible. Therefore, the Verifier does not check the entire polynomial. Instead, they choose a single, secret, random number \(s\) from a very large field and challenge the Prover to prove the equality only at that one point: \(t(s)h(s) = w(s)v(s)\). This single check is probabilistically sufficient to ensure the entire polynomial equation is correct. This technique reduces a computationally demanding verification problem to a much simpler one, achieving Succinctness.

Homomorphic Encoding / Encryption

This random sampling creates a new problem: how can the Prover compute \(t(s)\), \(h(s)\), \(w(s)\), and \(v(s)\) if \(s\) is the Verifier’s secret? The Prover must not learn \(s\); if they did, they could cheat by creating fake, small polynomials that only work at \(s\). This is solved using Homomorphic Encoding.

Instead of giving \(s\) to the Prover, the Verifier (or a trusted setup) provides encoded versions of powers of \(s\), specifically \(E(s^1), E(s^2), E(s^3), \dots\) up to the highest degree needed. The encoding function \(E\) has a “homomorphic” property that allows for certain computations on the encoded data. Since a polynomial like \(t(x)\) is a weighted sum of powers of \(x\) (e.g., \(c_0 + c_1x + c_2x^2 + \dots\)), the Prover can use the encoded powers \(E(s^i)\) and their knowledge of the coefficients \(c_i\) to compute the encoded result \(E(t(s))\). This allows the Prover to evaluate the polynomials at the secret point \(s\) without knowing its value.

Zero Knowledge

After the previous step, the Prover has the values \(E(t(s))\), \(E(h(s))\), etc. If they simply sent these to the Verifier, the Verifier could still potentially analyze them and reverse-engineer information about the Prover’s secret inputs. To prevent this and achieve Zero Knowledge, the Prover must obfuscate the proof.

This is done by multiplying the proof components by another secret, random number \(k\), which only the Prover knows. Instead of proving \(A = B\) (where \(A = t(s)h(s)\) and \(B = w(s)v(s)\)), the Prover uses \(k\) to prove something equivalent, like \(A \cdot k = B \cdot k\). (The actual cryptography, often involving elliptic curve pairings, is more complex, but this illustrates the fundamental principle). The Verifier can check this “blinded” equation. Because \(k\) is random and unknown to them, they cannot use the final values to deduce the original \(A\) or \(B\). This ensures that the proof is randomized and reveals no information about the secret inputs other than the validity of the statement.

ZK STARKs

Scalable: Can handle very large computations efficiently.
Transparent: Does not require a trusted setup.

Encoding as a Polynomial Problem

Like SNARKs, zk-STARKs also prove a program was executed correctly, but its arithmetization begins with an “execution trace.” This trace, represented by a polynomial \(P(x)\), must satisfy certain constraints at every step. For a computational step \(x\) within a specific domain \(H\), a state transition is valid if it satisfies a set of constraint equations. These can be combined into a single polynomial constraint \(C\), which checks the relationship between the current state \(P(x)\) and the next state \(P(g \cdot x)\) (where \(g\) is a generator for the domain \(H\)). For the entire computation to be correct, this must hold for all steps:

\[ C(x, P(x), P(g \cdot x)) = 0 \quad \text{for all } x \in H \]

To transform this multi-point check into a single polynomial identity, we use a “vanishing polynomial” \(Z_H(x) = \prod_{h \in H}(x-h)\). The Prover must then find a “quotient” polynomial \(D(x)\) such that the following equation holds true for all \(x\):

\[ C(x, P(x), P(g \cdot x)) = Z_H(x) \cdot D(x) \]

The Prover’s task is thus transformed into proving they know polynomials \(P(x)\) and \(D(x)\) that satisfy this specific equation.

Succinctness by Random Sampling

Likewise, the polynomials created in the previous step are far too large for a Verifier to check exhaustively. STARKs also solve this with random sampling. The Prover commits to the polynomials \(P(x)\) and \(D(x)\) (typically via Merkle trees), and the Verifier challenges them with a single random point \(s\) from a very large field.

The Verifier only needs to check that the equation holds at this single point:

\[ C(s, P(s), P(g \cdot s)) = Z_H(s) \cdot D(s) \]

The Prover provides the evaluations \(P(s)\), \(P(g \cdot s)\), and \(D(s)\), along with Merkle proofs to show they correspond to the committed polynomials. This single check is probabilistically sufficient to ensure the entire polynomial equation is correct.

Transparency and Polynomial Integrity

This “commit-then-reveal” process creates a new problem: how can the Verifier trust that the data the Prover committed to actually represents a low-degree polynomial? This is solved using the FRI (Fast Reed-Solomon Interactive Oracle Proof of Proximity) protocol.

FRI works through a recursive process. In each step, the Prover takes a polynomial \(f(x)\) of degree \(d\) and splits it into two smaller polynomials, \(g(x)\) and \(h(x)\), of degree less than \(d/2\), such that:

\[ f(x) = g(x^2) + x \cdot h(x^2) \]

The Verifier provides a random challenge and asks the Prover to combine \(g\) and \(h\) into a new polynomial for the next round. This process repeats, reducing the degree of the polynomial at each step. After several rounds, the degree becomes so small that the Verifier can check it directly. This protocol ensures the integrity of the polynomials without requiring a trusted setup.

Zero Knowledge

To achieve Zero Knowledge, STARKs must also prevent the revealed proof values from leaking information. This is done by adding a random, secret low-degree polynomial to the proof. Instead of committing to the original trace polynomial \(P(x)\), the Prover commits to a masked polynomial \(P'(x)\):

\[ P'(x) = P(x) + Z_H(x) \cdot R(x) \]

Here, \(R(x)\) is a random polynomial known only to the Prover. Because \(Z_H(x)\) is zero across the entire execution trace \(H\), \(P'(x)\) is identical to \(P(x)\) at every crucial step, meaning the proof of correctness is unaffected. However, at the random point \(s\) chosen by the Verifier, the value \(P'(s)\) is randomized by \(R(s)\), effectively masking the original secret values.

From Interactive to Non-Interactive

A detail of the STARK protocol is that its underlying mechanism (FRI) is interactive, requiring a back-and-forth exchange of challenges. For practical use, this is made non-interactive using the Fiat-Shamir heuristic. Instead of a live Verifier, a cryptographic hash function generates the random challenges from the Prover’s previous messages. This allows the Prover to generate the entire proof transcript by themself, which can be verified by anyone without interaction.

Block Chain

WUJS

I'm here

ZK in a nutshell

ZK SNARKs

Encoding as a Polynomial Problem

Succinctness by Random Sampling

Homomorphic Encoding / Encryption

Zero Knowledge

ZK STARKs

Encoding as a Polynomial Problem

Succinctness by Random Sampling

Transparency and Polynomial Integrity

Zero Knowledge

From Interactive to Non-Interactive