Factori Analysis; Secure and Enhance Smart Contracts

Blockchain Tezos Michelson Factori dApps Static Analysis Safety Security

Published on 2023/04/06

Factori Analysis; Secure and Enhance Smart Contracts

We are releasing an exciting new feature of Factori, which will be very useful for the Tezos developer community at large.

From the start, Factori has been about automatically deriving as much information as possible from the Michelson code of smart contracts. Working on Michelson ensures that we are language-agnostic. Whether you wrote your contract in Ligo, SmartPy, Archetype, Scaml, or any other high-level language doesn't matter. This makes Factori so powerful.

So far, we've been deriving interfaces: SDKs in four languages (Typescript, Python, C#, and OCaml), a web interface which is essentially a basic free Dapp, pre-built configurations for two blockchain indexers. The reasoning was that during and after writing a smart contract, a developer needs a whole lot of things that they have to build and rebuild tediously and that this could be automated. The powerful programming language manipulation features of OCaml ensured that this could be done robustly and securely.

Now, all of these were syntactic features. You know what OCaml is also very well known for? Semantics! Not just rewriting syntax trees ad nauseam, not just reasoning about the general data structures and entrypoints of the program, but extracting, as far as the theory will let us, the meaning of what the program does.

This is only the beginning of this journey, but it's already very exciting: today, we present the analyze and lint commands of factori, and on this occasion, we release version 0.6. If you don't know what Factori is, please check out our previous blog articles about factori. Or better: have a look at our freshly published doc:

https://functori.com/docs/factori/

What `factori lint` does

The lint command checks for the structure of a Michelson file and returns warnings if it does not fit the standard we set. It can be used as follows:

$ factori lint michelson <path-to-a-michelson-file>

For instance, here is a list of warnings the linter could return:

Returns a warning if there is no annotation on a field of the storage

Warning: At line 19 characters 55 to 62, there is no annotation on type address

If the same annotation is used for two different types

Warning: At line 34 characters 100 to 180, the token_metadata annotation was defined like a (big_map %token_metadata nat (pair (nat %token_id) (map %token_info string bytes))). But At line 20 characters 109 to 142 the same annotation is of type (map %token_metadata string bytes)

Factori has a set of forbidden annotations (among them all the Michelson instructions)
```
Warning: At line 38 characters 118 to 119, aDD is a forbidden annotation.
```
Most Michelson smart contracts have the same structure (parameter, storage, code, views). factori lint returns a warning if the contract does not follow this structure
```
Warning: The first section of a contract was expected to be the parameter but got the storage section
```

What `factori analyze` does

The analyze command checks for common issues we have come to identify as part of our auditing work. It currently handles the following:

Unused storage variables detection: It can happen that a field in the storage of a contract is unused, which leads to unnecessary costs;
Use of set, list, map in storage: This might be dangerous as these data structures are unbounded and will be fully deserialized every time the contract is called. If they become too big, deserialization gas limits might be hit, preventing any future call to the smart contract.
ABS detection: The ABS instruction can break a contract's logic if it is (mis)used to convert integers to natural numbers without checking if the integer is negative. IS_NAT should be preferred to handle the failing case explicitly.
Reentrancy detection: To prevent reentrancy, the analysis returns a warning if it detects potential reentrancy with the TRANSFER_TOKENS Michelson instruction.
Permission analysis: Factori detects the conditions which lead to a failure, and then prints them for each entrypoint.

The factori analyze command can be used on a Michelson file:

$ factori analyze michelson <path-to-a-michelson-file>

Or given a KT1 address deployed on some Tezos network:

$ factori analyze KT1 <kt1-address> --network ghostnet

Contract example

An example is worth a thousand words. Let's consider the (toy) mligo contract below. In our next blog post, we will show and examine together the analysis of a bigger real-world contract, a FA2, but for now, we really want to showcase the features in a small and legible setting.

It defines a storage type that contains a set of admin addresses and two natural numbers, x and y. The contract has three entrypoints: Add, Replace, and Reset.

The Add entrypoint adds an integer value to the current value of x, and sets y to the result, as long as the amount of tez sent with the transaction is not zero. The contract fails if the result exceeds a predefined maximum value. The Replace entrypoint replaces the current value of x with an absolute value of an integer parameter, but only if the sender's address is in the set of admin addresses. The contract also sends a transaction with the current contract's balance to the sender's address. The Reset entrypoint sets the value of y to 0.

The contract has been originated on Ghostnet with address: KT1LDZqBXtHjHtUawTwR6jcUSZBNyp2KPVVL

type storage = { admins : address set; x : nat; y : nat }

type parameter =
  | Add of int
  | Replace of int
  | Reset

let (max_nat : nat) = 100n

let lower_than_max_nat (i : nat) : nat =
  if i > max_nat then (failwith "MAX_NAT_REACHED" : nat)
  else i

let is_admin (addr : address) (map : address set) : unit =
  if Set.mem addr map then ()
  else (failwith "NOT_ADMIN" : unit)

(** Main access point that dispatches to the entrypoints according to
    the smart contract parameter. *)
let main ((action,store) : parameter * storage) : operation list * storage =
  match action with
  | Add n ->
    let amount = Tezos.get_amount () in
    let () = if amount = 0mutez then (failwith "NO_TEZ_FOR_ADD" : unit) else () in
    let elt = (abs n) + store.x in
    [], { store with y = lower_than_max_nat elt }
  | Replace n ->
    let addr : address = Tezos.get_sender () in
    let () = is_admin addr store.admins in
    let ctrt : unit contract = Tezos.get_contract addr in
    let balance : tez = Tezos.get_balance () in
    let op : operation = Tezos.transaction () balance ctrt in
    [op], { store with x = abs n }
  | Reset -> [], { store with y = 0n }

Analyzing the toy contract

First, set up or update Factori. If you don't want to install Factori from source, there is a docker image and a script for using it. You should use version 0.6:

$ wget https://gitlab.com/functori/dev/factori/-/snippets/2509259/raw/main/factori.sh -O factori
$ chmod +x factori
$ docker pull registry.gitlab.com/functori/dev/factori:0.6

And now you may use the script factori as an executable. You can check the version:

$ factori --version
0.6 (9f5aef7dccbf885ff52760231882877b9e3a175b)

More detailed instructions can be found in the Factori documentation.

Performing the analyses described above is as simple as running the command:

$ factori analyze kt1 KT1LDZqBXtHjHtUawTwR6jcUSZBNyp2KPVVL --network ghostnet

This should display a bunch of information. Here is a step-by-step analysis of the messages which should be printed in your terminal.

Found unbounded data structure with non-lazy deserialization in the storage
set is used once in the storage
admins field is a set in the storage

This message means that the contract uses a set data structure in its storage, specifically in the admins field. This is dangerous because a set can grow arbitrarily.

There are 2 uses of the ABS instruction
There is a CALL_CONTRACT in this contract, check for reentrancy

The analysis detects two things:

the contract uses the ABS instruction, which is not the right way to convert an int to a nat (one should use IS_NAT);
the contract does a contract call, which could lead to a reentrancy attack.

The following fields of the storage may be unused:
var_5 -> (nat%y)

The message above indicates that the y field of the storage may not be used in the contract. This could suggest that the field is unnecessary and can be removed to optimize the contract.

Result of permission analysis for entrypoint add, parameter : (int%add)

  Fail with "NO_TEZ_FOR_ADD" if ((AMOUNT = 0))
 Else
  Fail with "MAX_NAT_REACHED" if (((ABS ((int%add)) + storage.x) > 100))

The analysis shows that the add entrypoint will fail with the message NO_TEZ_FOR_ADD if the transaction amount is 0. Otherwise, it will fail with the message MAX_NAT_REACHED if the sum of (ABS int%add) and storage.x is greater than 100.

Result of permission analysis for entrypoint replace, parameter : (int%replace)

  Fail with "NOT_ADMIN" if (not (SENDER in storage.admins))
 Else
  Fail with "bad address for get_contract" if (GET_CONTRACT (SENDER) is None)

The replace entrypoint will fail with the message NOT_ADMIN if the sender is not in the admins set of the storage. Otherwise, it will fail with the message bad address for get_contract if the GET_CONTRACT instruction returns None when called on the address of the sender.

Result of permission analysis for entrypoint reset, parameter : (unit%reset)

The reset entrypoint does not appear to have any permission checks.

Future Directions

Before concluding the post, we want to mention some future work that could be conducted to extend the set of analyses:

(Generalized) Dead code detection. Currently, it's only done for failures that are not reachable;
Coverage tests generation;
Refined reentrancy check, taking into account whether the storage is modified by the entrypoint;
Better support of loops (note that for now, the analysis may stumble on them);
Exploring whether invariants remain true after some randomly generated calls to entrypoints (a kind of bounded model checking);
More refined warnings about unbounded structures. E.g.:
- critical when users can actively make them grow;
- medium when only an admin can write into them;
- high when the data structure is inside a value of a big_map indexed by, e.g., a user: they might block themselves out of the contract, without blocking the contract as a whole.

Conclusion

There are many possibilities opened with this new feature. Be aware that it will not work in all cases for all contracts. Also, note that you should always do your due diligence and not rely solely on Factori's output formulas. But it can be very useful as a diagnosis tool for smart contracts!

If you're developing a contract, you can immediately check that what you think are the conditions for your entrypoints to fail or succeed are actually what the Michelson code implements. You can also be quickly warned if your storage has unused fields, or some dangerous data structures (typically: maps instead of big_maps).

If you're working on some contract whose high-level source code you don't have access to, or you're not sure you understand, or you're not sure was compiled properly, you can check what condition just caused a failure, or which conditions need to be specified to call a particular entrypoint.

Please let us know if you enjoy this new feature! Also, remember that Functori provides meticulous smart contract audits based on years of experience in smart contract development and review, Tezos core development, by our engineers with PhDs in formal methods. Here is an example of the FlameDefi audit.