This document describes:
- How to set up cross-translation-unit analysis (CTU) with the Clang Static Analyzer (CSA) in Zircon;
- The work done by Kareem Khazem during his internship; and
- The remaining work needed to get CTU fully supported on Zircon.
Summary: Download the source for Clang, and apply several non-mainline patches to it before compiling it. Run my wrapper script around the analysis tools. Download the CodeChecker
tool; use it to digest the results of the analysis, and start a web server to view the results with a web interface.
There are two patchsets to be aware of:
- The Samsung patchset, which is an enormous patch adding AST merging support to Clang. It consists mostly of additions to
lib/AST/ASTImporter.cpp
. There is also a (primitive, not working very well) set of tools for CTU analysis undertools/xtu-build/*
. This patchset is based on an old revision of Clang; this fact, as well as its large size, makes it very difficult to rebase wholesale onto tip-of-tree (ToT) Clang. - The Ericsson patchset, which includes a subset of Samsung’s AST merging work and also adds several new tools (
tools/xtu-build-new/*
andtools/scan-build-py/*
) that allow for CTU analysis. The xtu-build-new tools improve on, and are somewhat different to, Samsung’s xtu-build tools. This patchset is much newer than the Samsung one, and the authors are making an effort to keep it rebased on ToT.
We will be patching Clang with Ericsson’s patchset, since the AST merging work rebases cleanly and we also get the newer analysis tools. However, note that CTU support for Zircon is incomplete; in some cases, the Samsung patchset contains code that provides the required functionality (more details below).
- Download and build Clang and LLVM as usual.
- In a separate directory, clone Ericsson’s fork of Clang and switch to the ctu-master branch.
- Download this script into Ericsson’s fork and run it. It should dump a series of patches into a patches directory. I purposely only dump the commits from the beginning of Ericsson’s changes until 1bb3636, which was the latest revision during my internship.
- If you want more up-to-date changes from Ericsson, you can experiment with changing 1bb3636 to HEAD in the script. Make sure to skip commits that merge upstream commits into the ctu-master branch by specifying additional ranges in the script. git log --graph can be helpful to determine what the upstream commits vs. Ericsson’s commits are, I use
git log --graph --decorate --date=relative --format=format:'%C(green)%h%C(yellow) %s%C(reset)%w(0,6,6)%C(bold green)\n%C(cyan)%G? %C(bold red)%aN%C(reset) %cr%C(reset)%w(0,0,0)\n%-D\n%C(reset)' --all
- Apply the generated patches to upstream Clang (not the Ericsson fork) one at a time.
for p in $(ls $PATCH_DIR/*.patch | sort -n); do git am < $p; done
- Apply Kareem Khazem’s patches that are listed below if they haven’t already landed
- Re-build upstream Clang & LLVM.
Summary: Run my wrapper script. This builds Zircon normally, then builds it again but dumping serialised ASTs instead of object files, and then finally analyses each file using the dumped ASTs to achieve CTU.
First, the story backwards:
Non-CTU static analysis analyzes the AST of each TU; any function calls to external functions are treated as opaque. Roughly, CTU analysis seeks to substitute the opaque function call node with the AST of that function’s implementation.
Thus, a CTU analysis will start analyzing an AST as usual, but when it encounters a function call node, it will try to merge in the AST for that function. This relies on the AST for the function already having been serialized to disk beforehand, so that the analyzer can re-load the AST into memory. It also relies on support for AST merging, which is what the Samsung patch to ASTImporter.cpp
(and the Ericsson patch derived from it) is for.
In order to serialize the ASTs to disk, we need to emulate the real build process. The way to do this is to actually do a real build of Zircon while recording the compiler invocations; this allows us to ‘play back’ the invocations, but with the compiler flags modified to dump AST files rather than object files.
So to summarise, forwards this time:
- Build zircon using Clang, and wrap the build process in a program like bear in order to record compiler invocations and generate a JSON compilation database.
- Replay the same compilation steps, but dumping AST files instead of object files. This is what the xtu-build.py tool does.
- Perform static analysis as usual, but deserialize the AST of every called function when needed. This is what the xtu-analyze.py tool does at the top level, by invoking tools in the scan-build-py/libscanbuild directory through the thin scan-build replacement written by the Ericsson team.
These steps are captured in the Fuchsia wrapper mentioned below. The result of all this is a directory full of reports, in Apple plist format, which contain details of reported bugs.
There are two sets of tools for running cross-translation-unit analysis:
- The tools under
tools/xtu-build-new
are the top-level scripts. Since the underlying analyzer can fail (i.e. due to the CSA crashing), I’ve patchedxtu-analyze.py
(in Ericsson’s branch) so that it dumps the output of the analyzer (stdout/stderr, not the report) to a file. The output goes in$OUT_DIR/{passes,fails}
depending on the return code of the analyzer, where$OUT_DIR
is the directory passed to the-o
argument ofxtu-analyze.py
. The particularly helpful part of those files is the second line that starts withanalyze: DEBUG: exec command in
, which is emitted by thelibscanbuild
tools (next bullet point). That command is the actual invocation to the CSA after the long, tedious process of modifying its command line. Therefore, it’s the command that you will need if you want to run the CSA on a troublesome file using gdb. - The tools under
tools/scan-build-py
are a bird’s nest of tools to wrap around the actual invocation to Clang. They are responsible for modifying the command line. I’m not too familiar with them, and haven’t had to interfere with them in the past.
This very small shell script wraps the Ericsson xtu-build-new
wrappers. To do a complete analysis of Zircon, make sure to clean first, and specify the correct path to your build of Clang. Then, in the zircon directory:
make clean && ./run.sh
In order to build only the kernel, specify a TARGET
as an environment variable:
make clean && TARGET=./build-zircon-pc-x64/zircon.elf ./run.sh
The script also requires clangify.py to be in the zircon directory with executable bit set. After the analysis has finished, there will be a .result-xtu
directory, containing:
- A bunch of Apple plist files, which are the bug reports;
- A fails directory, containing the std{out,err} of analyzer invocations that returned non-zero;
- A passes directory, containing the std{out,err} of analyzer invocations that returned 0.
At the moment, the only way of parsing the plist reports and viewing them with a web interface is by using the CodeChecker tool, which is developed at Ericsson and used for code comprehension and many other tasks. CodeChecker needs a large number of dependencies installed, and it’s best to install them with pip or npm or whatever rather than using apt-get. In short, after performing the analysis and dumping the plists into .result-xtu, you can invoke CodeChecker plist
to parse the plists:
CodeChecker plist -d .result-xtu -n 2016-12-12T21:47_uniq_name -j 48
The argument to -n
needs to be unique on each invocation of CodeChecker plist
, as it represents a single parse run. CodeChecker complains otherwise. Then, run CodeChecker server
to start a webserver on localhost:8001
, which will display the reports of all previous parsing runs.
The Samsung patchset was authored by Aleksei Sidorin and his team. Aleksei is quite knowledgeable about ASTImporter.cpp
and other AST merging aspects, and was very helpful. He and Sean Callanan were happy to review my AST Importer patches. Aleksei also gave a relevant talk about summary-based interprocedural analysis at the 2016 LLVM Developers Meeting.
The Ericsson patchset was authored by Gábor Horváth and his team. Gábor was very helpful with advice on how to run CTU analysis with the xtu-build-new
tools.
I (Kareem Khazem) am also happy to help out where I can.
The LLVM irc channel can also be helpful.
Upstream Clang has been very receptive to receiving patches for Zircon-specific Clang checkers. The MutexInInterruptContext checker is one example (ported from an LLVM pass written by Farid Molazem Tabrizi), as are the SpinLockChecker and MutexChecker. Potential reviewers for Clang checks are Devin Coughlin (from Apple), Artem Dergachev (on Aleksei Sidorin’s team at Samsung) and Anna Zaks (also at Apple).
These checkers are typically opt-in, meaning that you need to pass a flag to the analyzer to enable them: something like -analyzer-checker=optin.zircon.MutexInInterruptContext
.
If those patches haven’t landed in Clang, you will need to apply them. To use them for analyzing Zircon with the Ericsson wrapper scripts, you should modify the Fuchsia wrapper script by adding the option -e optin.zircon.MutexInInterruptContext
to the invocation of xtu-analyze.py
at the end of the file. The patch for MutexInInterruptContext
has a test suite, which can be used as an example of what the analysis is capable of.
The upstream CSA crashes on the vast majority of Zircon files. This section describes some of the problems that Kareem Khazem encountered and their fixes.
The Clang Static Analyzer is unable to import a lot of Zircon code, due to not having implemented support for importing certain kinds of AST nodes. Patches to support these nodes are listed here:
AtomicType | Patch merged into upstream |
CXXDependentScopeMemberExpr | https://reviews.llvm.org/D26904 |
UnresolvedLookupExpr | https://reviews.llvm.org/D27033 |
DependentSizedArray | |
CXXUnresolvedConstructExpr | |
UsingDecl | https://reviews.llvm.org/D27181 |
UsingShadowDecl | https://reviews.llvm.org/D27181 |
FunctionTemplateDecl | https://reviews.llvm.org/D26904 |
In general, when implementing support for new node types, one must implement a VisitNode
function in ASTImporter.cpp
, and also unit tests and functional tests; Kareem’s patches above contain examples. There are still quite a few unsupported AST nodes remaining; grep the analyzer output directory for error: cannot import unsupported AST node
.
The Ericsson patchset contains only a subset of the ASTImporter
code in the Samsung patchset. In some cases, the Visit
function for an unsupported node can be taken straight from the Samsung patchset. However, the Samsung patchset does not include any tests, so it will still be necessary to write tests before the support for that node is upstreamed.
A lot of the code in ASTImporter.cpp
is buggy. Sometimes Aleksei has private patches for issues, like this one, so it’s worth giving him (a-sid) a quick ping on IRC. My strategy for debugging is to look through the wrapper output for the second string starting with analyze: DEBUG: exec command in
(followed by the actual command line of the analyzer), and running that command line through gdb. It often takes only a few hours to track down where a segfault is coming from.
This is a double-free of oldparent
, which is declared uninitialized on system/ulib/fs/vfs.c:vfs_rename
. Two lines later, vfs_walk
(same file) is called with oldparent
as its second argument. It is possible to return from vfs_walk
without assigning to oldparent
by entering the for loop and hitting the return r
statement on the first loop. If the value of r
is greater than zero, then we go to the else if
statement, which calls vn_release
on oldparent
(which is still uninitialized).
This is a use-after-free. The path is:
kernel/kernel/thread.c:thread_detach_and_resume
- Call
thread_detach(t)
- Return
thread_join(t, NULL, 0)
- free
t
and returnNO_ERROR
- free
- Return
NO_ERROR
- Return
- Check for error is 1false1
- Call
thread_resume(t)
, which has been freed.thread_resume
then accessest
’s fields.
- Call
- The CSA cannot resolve the implementation of functions that are called through function pointers. This means that it cannot make any assumptions about what the return value of the function might be, nor any effects that the function might have on output parameters.
- There are several classes of function whose implementations are not accessible to the analyzer. Again, the analyzer cannot know that such functions touch their output arguments, so they will spuriously report that the following code reads from a garbage value:
struct timeval tv; gettimeofday(&tv, NULL); printf("%d\n", tv.tv_usec); // [SPURIOUS REPORT] Access to // uninitialized variable ‘tv’
- Some kinds of functions that are vulnerable to this kind of imprecision include:
- System calls (like
gettimeofday
) - Compiler builtins (like
memcpy
)
- System calls (like