How To Use Custom Detectors Effectively
This page provides examples on how to accomplish common tasks using Vanguard custom detectors and PAQL queries:
- Answering questions such as "Which contracts can I call to transfer ERC20 tokens?"
- Scanning your code base for vulnerabilities specific to your project.
While the mechanics of and interactions between various PAQL features will be explained, the details such as the classes and properties will not. Thus, we recommend that you consult the Solidity dialect reference as you go through each example.
Discovering Behavior With Informational Queries
PAQL queries can be used to provide specific, detailed information about the behavior of a given codebase. Typically, this involves constructing an informational query through the following process:
- Formulating a "problem statement" or "prompt" that specifies what information we wish to know.
- Identifying the various "actors" involved in the prompt.
- Identifying the "assumptions" that these actors must satisfy.
- Identifying the behavior that we are interested in.
- Finally, translating the actors, assumptions, and behavior into a PAQL
query.
The actors become variables in a
FINDclause, and behavior becomeWHEREclauses, and the assumptions becomeINclauses orWHEREclauses.
The examples below demonstrate how this process can be used to construct PAQL queries.
Example: Listing functions that modify storage variables
When reviewing code that involves complicated functions, it is useful to know what storage variables may be modified by those functions. Specifically, a problem statement for this is:
Which functions may contain writes to storage variables? List all such functions and storage writes.
- Actors: a function
f, a storage writew - Assumptions:
woccurs inf. - Target behavior: (none).
FIND
-- actors
Function f,
StorageWrite w IN f, -- assumption
For example, consider the following Solidity code:
contract Example {
uint256 a;
uint256 b;
mapping(address => uint256) m;
function extFun() external {
a = 10;
b = 11;
foo();
bar(10);
}
function foo() internal {
b += 5;
}
function bar(uint256 arg) public {
m[msg.sender] = arg;
}
}
The above query would return the following results:
- The function
extFunand the writea = 10. - The function
extFunand the writeb = 11. - The function
fooand the writeb = b + 5 - The function
barand the writem[msg.sender] = arg
A few other useful variations of this query:
-
To only retrieve the writes to specific variables, add a
WHEREclause that uses theStorageWrite.locationproperty. For example, this will restrict the writes to those involving themmapping:FIND
Function f,
StorageWrite w IN f,
WHERE
w.location == "m[*]", -- an additional assumption -
To only retrieve the writes inside of a specific contract, add a
Contractvariable and changefto iterate over the contract.FIND
Contract c,
Function f IN c,
StorageWrite w IN f,
WHERE
c.name == "Example", -- an additional assumption -
The
StorageWriteclass can iterate over other constructs like contracts, expressions, and statements. This allow us to drop theFunction f, for example:FIND
Contract c,
StorageWrite w IN c,
WHERE
c.name == "Example", -- an additional assumption
Example: Listing external function side effects
In the previous example, the query retrieved pairs of functions and storage
writes that occured directly in that function.
This meant that the query did not include extFun as writing to m[msg.sender] = arg, even though extFun calls g which does perform that write.
When trying to understand the behavior of a contract, we usually are more interested in all side effects that occur when invoking external functions. For example:
For every external function, which writes to storage variables could occur when that function is called? List all such functions and storage writes.
- Actors: a function
f, a storage writew - Assumptions:
fis externally callable,wis reachable fromf. - Target behavior: (none).
FIND
-- actors
Function f,
StorageWrite w IN f.reachable, -- assumption
WHERE
f.isExternallyCallable, -- assumption
This updated query has two changes.
First, the f.isExternallyCallable clause has been added to ensure that only
externally callable functions (i.e., of external or public visibility) are
searched.
Second, the storage writes will be retrieved from the Function.reachable
property of f, which cause internal calls to be "followed" when searching
through each function f.
When running the updated query on the Solidity code in the previous example, the results are:
- The
extFunand the writea = 10. - The
extFunand the writeb = 11. - The
extFunand the writeb = b + 5. - The
extFunand the writem[msg.sender] = arg - The
barand the writem[msg.sender] = arg
To also retrieve the function that directly contains the write, the variable declarations can instead be organized as
Function f,
Function g IN f.reachable,
StorageWrite w IN g,
Example: Listing all token transfers that can be executed
In large code bases, it is often difficult to understand what external calls may actually be executed for contracts with complicated inheritence hierarchies or call stacks. Consider the following Solidity code:
Example Solidity code (expand me)
pragma solidity >= 0.8.0;
interface IERC20 {
function transfer(address to, uint256 amount) external;
}
contract Grandparent {
IERC20 tokenA;
function _doThingCallback() virtual internal {
tokenA.transfer(msg.sender, 100);
}
function doThing() virtual public {
_doThingCallback();
}
}
contract Parent is Grandparent {
IERC20 tokenB;
function _doThingCallback() internal virtual override {
tokenB.transfer(msg.sender, 50);
super._doThingCallback();
}
}
contract Child is Parent {
IERC20 tokenC;
function _doThingCallback() internal virtual override {
tokenC.transfer(msg.sender, 75);
}
function doThing() public virtual override {
_doThingCallback();
Grandparent.doThing();
}
}
A question such as "What ERC20 token transfers does each contract's
toThing() function perform?" may be difficult to answer correctly by only
looking at the code, but it can be answered effectively using a PAQL query:
- Actors: a contract
contract, a functionf, external callcall - Assumptions:
contractcontainsf, which can executecallf's name isdoThing
- Target behavior:
callinvokestransfer.
FIND
-- actors
Contract contract,
Function f IN contract, -- assumption
ExternalCall call IN f.reachable, -- assumption
WHERE
-- assumptions
f.name == "doThing",
-- target behavior
call.name == "transfer",
Executing this query gives the following information:
Grandparent.doThing()executes the transfer totokenAinGrandparent._doThingCallback().Parent.doThing()executes (1) the transfer totokenBinParent._doThingCallback()and (2) the transfer totokenAinGrandparent._doThingCallback().Child.doThing()executes only the transfer totokenCinChild._doThingCallback(). Note that it is reported twice, because the assumptions onfare too lenient: when searchingChild, the query findsf = Child.doThing()(externally visible) as well as the inheritedf = Grandparent.doThing()(internally visible), both of which will invoke the virtualChild._doThingCallback().
It's also possible to retrieve the token storage variables involved in the transfer; see the value flow example for how to do this.
Checking For Errors
Custom detectors are particularly useful for searching for vulnerabilities in code. This is generally achieved by writing informational queries, which were demonstrated in previous section.
Queries that search for vulnerabilities are typically structured using one of two patterns:
- For all actors satisfying assumptions, raise a warning when negative (bad) behavior happens.
- For all actors satisfying assumptions, raise a warning when positive (good) behavior does not happen.
The first pattern only requires writing an informational query expressing the bad behavior to search for.
The second pattern is more complicated: it first requires writing an informational query that finds "good" behavior to search for, and then the query must be transformed into a version that searches for cases where the good behavior is missing.
Examples of these patterns are described below.
Example (positive): Using the EXISTS operator to check for missing access controls
Access controls on functions are typically implemented using require statements like
require(msg.sender == owner).
For complicated functions that may make many nested internal calls, it could be
easy to forget to insert the appropriate access controls.
Queries can provide a very simple way to check that the access controls are indeed present. Consider the following problem statement as an example:
Which
mintexternal functions may execute require statements involving theownerstorage variable? Provide a list of such functions and require statements.
- Actors:
- a function
f - a require statement
req - a storage read
rd
- a function
- Assumptions:
fis themintfunctionfis externally callablereqis reachable fromfreq's condition is influenced by the value read inrd
- Target behavior:
rdreads from theownerstorage variable
FIND
-- actors
Function f,
RequireLike req IN f.reachable,
StorageRead rd IN req.backwardSlices,
WHERE
-- assumptions
f.isExternallyCallable,
f.name == "mint",
-- target behavior
rd.location == "owner",
While this query reports the places where the access controls are present, we are oftentimes more interested when they are not present.
This can be achieved by "negating" the target behavior through a two-step
process.
First, we must translate the query into an EXISTS form over the variables
that match the access controls:
FIND
-- actors
Function f,
WHERE
-- assumptions
f.isExternallyCallable,
f.name == "mint",
-- target behavior
EXISTS
RequireLike req IN f.reachable,
StorageRead rd IN req.backwardSlices
WHERE {
rd.location == "owner",
}
Note that this version searches for functions where the require is present, rather than searching for all occurrences.
Finally, we will insert a ! operator in front of the EXISTS, which results
in a query that is equivalent to the problem statement:
Which
mintfunctions will not execute a require statement involving theowner? Provide a list of such functions.
FIND
-- actors
Function f,
WHERE
-- assumptions
f.isExternallyCallable,
f.name == "mint",
-- target behavior
!EXISTS
RequireLike req IN f.reachable,
StorageRead rd IN req.backwardSlices
WHERE {
rd.location == "owner",
}
The RequireLike req declaration also has to be moved into the EXISTS,
because the presence of the require is part of the target behavior.
Otherwise, the query would flag all require statements in the mint function
that are not influenced by the owner storage variable.
For example, flagging require(amount > 0) would not make sense!
Example (negative): Using value flow to detect arbitrary transferFroms
In many cases involving function calls, we also care about what values are
used as arguments to those calls.
For example, ERC20 transferFroms whose from arguments are supplied by the
external function's caller may be vulnerable to arbitrary transferFrom
attacks, as attackers can exploit ERC20 approvals granted to the contract to
"steal" tokens.
The problem of finding such vulnerabilities can be stated as follows:
Which external functions may invoke external calls to
transferFrom(address,address,uint256)such that thefromargument of the transfer may come from a parameter of the external function? Provide a list of such functions and call sites.
- Actors:
- a function
f - an external call
c - a call argument
callArg
- a function
- Assumptions:
fis externally callablecis reachable fromfcwill invoke the targettransferFrom(address,address,uint256)callArgis the first argument ofc
- Target behavior: there is a function parameter
paramsuch that:parambelongs tof- the value of
callArgdepends onparam
FIND
-- actors
Function f,
ExternalCall c IN f.reachable,
CallArgument callArg IN c,
WHERE
-- assumptions
f.isExternallyCallable,
callArg.argIndex == 0,
c.signature == "transferFrom(address,address,uint256)",
-- target behavior
EXISTS
Argument param IN callArg.backwardSlices,
WHERE {
param.function == f,
},
The above query works by recursively searching the "dependencies"
(callArg.backwardSlices) of the first argument of the transferFrom call
(callArg.argIndex == 0).
In particular, the values searched in the query are restricted only to function
parameters (Argument arg).
Lastly, the query restricts arg.function == f to ensure that the from indeed
comes from the external function, not just in the function containing the
external call.
To also output the function parameters that are involved (rather than only
checking that there is one), combine the EXISTS into the overall FIND.
FIND
...,
Argument param IN callArg.backwardSlices,
WHERE
...,
param.function == f,
Note that we say "function parameters" in plural form because it is possible
that the from address may be computed conditionally, and we would want to
catch all such parameters that may be involved!