
Over the past year, the consensus around how to build zk-VMs has converged towards using the RISC-V architecture as a basis for the byte code these VMs can run. To prove the correctness of the implementation of a zkVM, as well as to give concrete semantics to smart contracts compiled to RISC-V code, we need to have precise specifications telling us what each RISC-V instruction does. Specifications for RISC-V instruction, as well as for other instruction set architectures, exist in the Sail language [1], but to use them in a general purpose interactive theorem prover, we would like to “export” them to our favorite tool Lean [2]. But while Sail front ends for provers like Rocq and Isabelle already exist, an export for Lean is still missing.
Funded by the Ethereum Foundation’s “verified zkEVMs” program, Lindy Labs helped writing a Lean exporter for Sail, and therefore a pathway to having specifications for RISC-V instructions in Lean. This happened in collaboration with the University of Cambridge and Galois Inc.
An up-to-date model of the RISC-V specifications can be found here: https://github.com/opencompl/sail-riscv-lean
To give an example of the translation, the specification of multiplication, as specified in Sail by the following code
function clause execute (MUL(rs2, rs1, rd, mul_op)) = {
let rs1_bits = X(rs1);
let rs2_bits = X(rs2);
let rs1_int = if mul_op.signed_rs1 then signed(rs1_bits) else unsigned(rs1_bits);
let rs2_int = if mul_op.signed_rs2 then signed(rs2_bits) else unsigned(rs2_bits);
let result_wide = to_bits_truncate(2 * xlen, rs1_int * rs2_int);
X(rd) = if mul_op.high
then result_wide[(2 * xlen - 1) .. xlen]
else result_wide[(xlen - 1) .. 0];
RETIRE_SUCCESS
}
is translated to the Lean code
def execute_MUL (rs2 : regidx) (rs1 : regidx) (rd : regidx) (mul_op : mul_op) : SailM ExecutionResult := do
let rs1_bits ← do (rX_bits rs1)
let rs2_bits ← do (rX_bits rs2)
let rs1_int :=
bif mul_op.signed_rs1
then (BitVec.toInt rs1_bits)
else (BitVec.toNat rs1_bits)
let rs2_int :=
bif mul_op.signed_rs2
then (BitVec.toInt rs2_bits)
else (BitVec.toNat rs2_bits)
let result_wide := (to_bits_truncate (l := (2 *i xlen)) (rs1_int *i rs2_int))
(wX_bits rd
(bif mul_op.high
then (Sail.BitVec.extractLsb result_wide ((2 *i xlen) -i 1) xlen)
else (Sail.BitVec.extractLsb result_wide (xlen -i 1) 0)))
(pure RETIRE_SUCCESS)
where SailM is a monad capturing the side effect a sail function can have on the processor (e.g. reading and writing on registers).
Notable challenges in the translation have been the following:
Sail has some very specific language features that are hard to replicate in Lean. A good example for this is the the presence of bidirectional mappings, which are resolved to unidirectional ones before the translation. The handling of side effects of those mappings make it hard to model them succinctly in Lean, and their elaboration remains one of the performance bottlenecks of the RISC-V model.
Sail liberally casts between (non-negative) natural numbers and (signed) integers, something that Lean is much more hesitant to do. Getting this right required quite some plumbing on the Lean side.
Sail is not an inherently total language, so for recursive functions, it doesn’t check for termination. As already the case with the Rocq front end for Sail, we needed to help Lean figure out how to prove termination for some of the translated functions.
We are happy to contribute to the effort of having an ecosystem of components needed to build fully verified zk-VMs and thank the Ethereum Foundation and especially Alex Hicks for their coordination between different groups and companies.
[1] https://github.com/rems-project/sail