GuBPI is a tool for automatically computing guaranteed bounds on the posterior distributions denoted by probabilistic programs, as presented at PLDI 2022 [1]. GuBPI (pronounced "guppy") stands for Guaranteed Bounds for Posterior Inference.
You can read the details in our paper:
Beutner, Ong, Zaiser. Guaranteed Bounds for Posterior Inference in Universal Probabilistic Programming. PLDI 2022. https://arxiv.org/abs/2204.02948
GuBPI enables the computation of sound bounds on the denotation of a program. The input to GuBPI is a file that contains both the probabilistic program in question (we discuss the syntax of our language, SPCF, below) and additional configuration parameters that determine the depth and precision of the analysis. GuBPI performs a symbolic analysis of the input program. In particular, it analyses each symbolic path with respect to an interval-based semantics (see Sections 3 and 4 in [1]). In case linear optimization is applicable, GuBPI performs analysis with an optimized interval-based semantics (see Section 6.4 in [1]) and uses Vinci to discharge volume computations.
src/
: the source code of GuBPI (written in F#).vinci/
: a modified version of the VINCI tool (written in C/C++), used to compute the volume of polytopes.benchmarks/
: examples of probabilistic programs with expected output.plotting/
: a tool to generate prettier plots of the computed bounds. (GuBPI itself produces different plots using XPlot.)
To make use of the optimized interval-based semantics, GuBPI makes use of VINCI to compute the volume of a convex polytope. VINCI is run as a subprocess, so the VINCI executable should be located in the same folder as the GuBPI executable (see the build instructions below).
In order to build this project, you need the .NET 6 SDK (we tested version 6.0.200), a C/C++ Complier (we tested gcc) and GNU make (we tested GNU Make 3.81).
To install make and gcc, you can install the build-essential
package (on Linux) or Xcode (on macOS).
You can find instructions on how to install the .NET SDK here.
To build GuBPI, run the following.
cd vinci
make
cd ..
cd src
dotnet build -c "Release" -o ../app
cd ..
cp vinci/vinci ./app
Afterward, you can run GuBPI via
./app/GuBPI inputfile.spcf
where inputfile.spcf
is the (path to the) file to be analysed.
GuBPI will place the output files in the output/
folder.
Explanation The above commands build VINCI (located in vinci/
) using make
.
In the second step we build GuBPI located in src/
and place the executable in the app/
folder (by running dotnet build -c "Release" -o ../app
).
Note that the executable (app/GuBPI
) is not standalone and needs the .dll
files located in the same folder.
In particular, to copy the executable to different location, you need to copy the entire app/
folder.
In order for GuBPI to run the vinci executable (built in the first step), vinci must be in the same directory as GuBPI (ensured by cp vinci/vinci ./app
).
Now the tool can be executed by running the ./app/GuBPI
executable provided with the SPCF file to be analyzed.
We also provide a Dockerfile to build GuBPI.
This requires a working version of docker (we tested version 20.10.2).
To construct a docker image, run docker build -t gubpi .
Afterwards, the image gubpi
should be listed when you run docker images
.
If you get a permission denied
error, you may have to run docker with superuser rights, i.e. sudo docker ...
.
The Docker image can be used similarly to the executable above.
To run GuBPI on a example program, run docker run -v $PWD/benchmarks/:/benchmarks/ -v $PWD/output/:/output/ -it gubpi inputfile.spcf
where inputfile.spcf
is (the path to) the SPCF program for which you want to compute bounds.
Note that for this command the input must be (a path to) a file in the benchmarks/
folder.
The output of GuBPI will, again, be located in the ./output
folder.
To analyse the problem in benchmarks/NonRecursive/coinBias/coinBias.spcf
, you can either run
./app/GuBPI ./benchmarks/NonRecursive/coinBias/coinBias.spcf
in case you build GuBPI manually. Or run
docker run -v $PWD/benchmarks/:/benchmarks/ -v $PWD/output/:/output/ -it gubpi /benchmarks/NonRecursive/coinBias/coinBias.spcf
in case you have build the gubpi
Docker image.
The 5 output files of GuBPI will be located in the output/
folder (see details on the output below). To have a first look at the bounds computed by GuBPI, open the coinBias-norm.html
file which displays the plots in your browser.
In order to plot the bounds the way they look in [1], you can use the plotting script in the plotting/
folder (see plotting/README.md
for details).
GuBPI analyses programs in a functional language called Statistical PCF (SPCF).
The .spcf
file passed to GuBPI contains both the program to be analysed and the parameters used to guide the analysis by GuBPI.
We explain the syntax of SPCF, the supported hyperparameters and the output below.
The syntax of SPCF supports the following features.
- Numerical Constants:
5.3
- Variables:
x
- Lambda-Abstraction:
\x. x
- Fixpoint-Abstraction:
fix f x. x
- Application:
(\x. x) 4
- Conditional
if x then y else z
- Samples from different distributions:
sample uniform(0, 1)
. The supported distribution includeuniform(a, b)
,normal(mean, std_dev)
,truncnormal(mean, std_dev, left, right)
,beta(alpha, beta)
. Heretruncnormal(mean, std_dev, left, right)
samples from a normal distribution that is truncated to the interval[left, right]
. - Scores:
score(4)
- Primitive Functions:
add(4, 5)
. The supported functions areneg(x)
,add(x, y)
,sub(x, y)
,mul(x, y)
,div(x, y)
,exp(x)
,log(x)
,pdfnormal(mean, std_dev, x)
. As syntactic sugar it is also possible to write addition, subtraction and multiplication via+, -, *
in infix notation. - Lets:
let x = 3 in x + 5
- recursive lets:
letrec f x = f x in f 0
- Tuples:
(|1, 2|)
- Matches on tuples:
let x, y = (|1, 2|) in x + y
- Lists:
[1, 1, 0, 1, 0]
,[1, 2 | restList]
- Matches on Lists:
match [1, 1, 2] | [] -> 0 | [x | xs] -> x
Lines beginning with a #
are comments
The language constructs should be self-explanatory.
Note that SPCF is a strongly typed language and GuBPI enforces that the program is well-typed.
We recommend having a look at the various examples in benchmarks/
to get an overview.
Note that the performance of GuBPI depends on the structure of the program, the used distributions, and primitive functions.
There are programs that GuBPI cannot currently handle.
Comment lines at the beginning of a .spcf
file determine the the configuration parameters for GuBPI.
The following options are available:
-
# method boxes
: forces GuBPI to use the pure interval-based semantics. If this option is not set, GuBPI uses the linear optimization when applicable and otherwise resorts to the pure interval-based semantics. If# method boxes
is set,# splits 100000
gives the number of splits (here: 100000) that should be performed by GuBPI. -
# depth 20
: specifies the depth of the symbolic exploration (here: 20). -
# discretization -5 5 0.25
: specifies the histogram bins for which we would like to obtain bounds. The first number is the left bound, the second number the right bound and the third number is the step size (the size of each bucket). So the example# discretization -5 5 0.25
will analyse the buckets[-5, -4.75], [-4.75, -4.5], ..., [4.75, 5]
. -
# epsilonScore 0.1
: sets the precision with which to analyse the scoring in the optimized semantics. Each score statement is partition into boxes of sizeepsilonScore
. The smallerepsilonScore
, the finer the scores are split. -
# epsilonVar 0.2
: sets the precision with which to analyse sample variables from non-uniform distributions in the optimized semantics. The smallerepsilonVar
, the finer the variables are split. -
# outputSplitProgress 0
,# outputCurrentPath 0
,# outputCurrentArea 0
: control the amount of progress information GuBPI prints to the terminal, i.e., outputs which symbolic path is currently analysed (outputCurrentPath
), which bucket of the discretization is analysed (outputCurrentArea
) and what split of the variables is currently worked on (outputSplitProgress
). A0
turns the output off, a1
turns it on.
If no parameters are set, GuBPI uses the default ones.
The parameters have a direct impact on the speed and results of GuBPI.
If you increase the depth
and decrease epsilonScore
and epsilonVar
, then precision will increase (but so will the running time).
Making them too small may cause GuBPI to fail to terminate within a reasonable amount of time.
For the benchmarks from the paper, we already provide good values for those parameters in the .spcf
files.
When called on an SPCF program {problem}.spcf
, GuBPI will output 5 files.
{problem}-unnorm.bounds
: contains the bounds for the unnormalized denotation computed for each bucket.{problem}-norm.bounds
: contains the bounds for the normalized denotation computed for each bucket.{problem}-unnorm.html
: contains a plot of the bounds on the unnormalized denotation (using XPlot, a plotting library in F#). The plots from the paper can be generated from the bounds file using the tool in theplotting/
folder.{problem}-norm.html
: similar to{problem}-unnorm.html
but using the normalized denotation.{problem}.clj
: contains a translation of the SPCF term to an Anglican program. Anglican is a probabilistic programming language implemented in Clojure. This allows one to experiment with Anglican's inference algorithms on the example programs without the need for a manual translation.
[1] Beutner, Ong, Zaiser. Guaranteed Bounds for Posterior Inference in Universal Probabilistic Programming. PLDI 2022. https://arxiv.org/abs/2204.02948