Sep 19, 2023
4 min read
Introducing Aegis
There are cases in which automated tools like Certora are not enough to validate that a contract written in Cairo fulfills a given specification. Such cases include functions whose correctness relies on mathematical theorems which are not present or even not expressible in most or all automated tools. Interactive Theorem Provers like Lean can be an option in these cases: they enable the user to write down any sort of mathematical statement in a very expressive language and prove in a way that is guided by the output of the Lean System and the editor. When a proof is finished, it is checked by the theorem prover and added to the canon of its knowledge.
Lean itself doesn't have any concept of programming languages, of STARKs, or of Cairo. But for Cairo 0.1, Jeremy Avigad et al. tweaked the Cairo compiler in a way that the produced output was in the form of lean, containing automatically derived and verified specifications of all Cairo functions involved. These specifications were meant to be refined by the users to match the intended semantics of the functions. The user was then asked to show that the refinement is valid by proving that it is implied by the automatically derived specification.
We used this tool to verify parts of our fixed point library WadRay. Unfortunately, with the advent of Cairo 1.0, this tool was deprecated. That is why we decided to take it on ourselves to build a framework to verify Cairo 1.0 functions.
Introducing Aegis
Instead of tweaking the compiler or analyzing the Cairo syntax itself, we decided to use Cairo's intermediate representation, Sierra, to extract the syntax of a function. Sierra's simple design is generally very much suited to base a semantic definition on.
Suppose for example we wanted to define this admittedly very basic function in Cairo:
fn double(a: felt252) -> felt252 {
a * a
}
The compiled Sierra code would look like this:
type felt252 = felt252;
libfunc dup<felt252> = dup<felt252>;
libfunc felt252_mul = felt252_mul;
libfunc store_temp<felt252> = store_temp<felt252>;
dup<felt252>([0]) -> ([0], [2]);
felt252_mul([2], [0]) -> ([1]);
store_temp<felt252>([1]) -> ([3]);
return([3]);
foo::foo::double@0([0]: felt252) -> (felt252);
In this code, the first block declares the types which are used, the second block lists the internal library functions which are used ("libfuncs"), and the third block invokes these library functions. From the semantics of each libfunc invocation we derive the semantics of the whole function.
Aegis now allows us to verify the semantics (the “intended meaning of the function”). From the user's perspective, applying our verification approach to the above examples looks like this:
aegis_load_file "SierraLean/Tests/foo.cairo"
aegis_spec "foo::foo::double" := fun _ a ρ => ρ = a * a
aegis_sound "foo::foo::double" := fun _ a ρ => by
rintro rfl
rfl
aegis_complete
The aegis_load_file, aegis_spec, aegis_sound, and aegis_complete commands have been implemented by us to make the verification Lean file as concise as possible. Specifically,
aegis_load_file parses a Sierra or Cairo file. If necessary it invokes the Cairo compiler to generate the Sierra file from the given Cairo file.
aegis_spec allows the user to give a specification for the given Sierra function. The specification is a predicate that depends on the function's inputs and outputs as well as a Metadata variable, which holds, as of now, the gas costs for each function.
aegis_sound hands the obligation to prove the specification to the user. The proof goal states that the automatically derived specification of the function implies manual specification given by the user.
aegis_complete signals the completion of the verification of the loaded Sierra file. It checks whether all specifications have been given and throws an error if this is not the case.
Notable Features
The semantics of function calls refer to the manual specification of the called function. So the specification of the called function has to be given before the soundness of the calling function can be proven.
Loops in Sierra are converted to recursive calls, so they are inherently supported because function calls are.
The implicit range checks which Sierra adds as explicit RangeCheck typed inputs and outputs are handled but not equipped with semantics. The same goes for many other implicits.
What's Next
A more stratified handling of polymorphic functions.
More Lean tactics which help to prove the soundness of specifications.
A complete verification of Cairo’s core library (”corelib”).
If you are interested in working with Aegis or have questions about its design and capabilities, don’t hesitate to contact us at info@lindylabs.net!