SyReNN: Symbolic Representations for Neural Networks
SyReNN (pronounced Siren) is a library for analyzing deep neural networks by enumerating the linear regions of piecewise-linear (eg. ReLU) functions on one- and two-dimensional input "restriction domains of interest."
In particular, this repository contains the code described and utilized in the papers:
Computing Linear Restrictions of Neural Networks, Conference on Neural Information Processing Systems (NeurIPS) 2019
A Symbolic Neural Network Representation and its Application to Understanding, Verifying, and Patching Networks
Links: arXiv
Correcting Deep Neural Networks with Small, Generalizing Patches, NeurIPS 2019 Workshop on Safety and Robustness in Decision Making
SyReNN: A Tool for Analyzing Deep Neural Networks, Tools and Algorithms for the Construction and Analysis of Systems (TACAS), 2021
Provable Repair of Deep Neural Networks, Programming Language Design and Implementation (PLDI), 2021
In the papers, we often use mathematical notation that can be hard/impossible to use in ASCII-encoded code. Because of this, we use slightly different terminology in this repository than in the papers. Here is a rough translation dictionary
We provide two ways to build the code, either locally or in a Docker container. See each sub-section below for the prerequisites.
We recommend at least 16 GB of memory and 8 physical processor cores for
running the library, although smaller machines should be able to run it. The
reference environment is an AWS EC2 c5.metal
instance using runexec
to
limit use to only the first 16 cores as shown in .bazelrc
.
Note that because we compile (almost) everything from source, the first build
will likely take a long time; at least a few minutes. Subsequent builds
should be much faster. By default, each make
initiated by Bazel will try to use
$(nproc) / 4
processors available on your machine. However, Bazel will also
parallelize the build, so this could result in a multiple of that number of
threads. If you find your machine hanging or crashing, modify .bazelrc
to specify a lower number of cores.
build-essential
for Ubuntu).libcairo2
, libffi-dev
, zlib1g-dev
, zip
, and
libgmp3-dev
packages are required for the Python code (but usually come
pre-installed with most desktop Linux distributions).pysyrenn
package from PyPI. It is, however,
required to run any of the Python code in this repository with Bazel,
including the Python tests.Alternatively, a Docker container is provided to simplify the build and running
process. To use it, first build the image with ./docker_build.sh
then
prepend ./docker_run.sh
to all of the commands below. For example, instead
of make start_server
, use ./docker_run.sh make start_server
. Everything
should be handled transparently.
NOTE: Benchexec is currently not supported under the Docker container due to permission errors. It is possible to resolve this by removing all of the user-related lines from Dockerfile, but that will likely cause issues with Bazel and generated file permissions.
NOTE: Docker-in-docker does not seem to currently be working. This means
some builds will fail, although you should be able to run make start_server
,
bazel test //pysyrenn/...
, and bazel test //syrenn_server/...
. See this
issue for a suggested fix.
There are two major things to configure before using the code:
syrenn_server/server.cc
, modify the consexpr size_t memory_usage = ...;
threshold to suit the desired memory usage. Please
note that this is a rough upper-bound on memory usage per-call by the server
and not exact. A reasonable default is half of the RAM available on your
machine.benchexec
installed and would like run under
runexec
, the last line in .bazelrc
can be uncommented and updated
with the desired parameters. WARNING: benchexec
does not seem to be
correctly flushing terminal input to the process, so if you use this you
will need to pipe the input in (eg. echo -e "1\n" | make integrated_gradients_experiment
). WARNING: The rules in
third_party/ use benchexec
within their Docker
containers, so please comment out the runexec
line in .bazelrc
before running any of the rules in third_party/.If you are only interested in reproducing the experiments from our papers, they are all included in the experiments/ directory and can be run using Bazel:
make start_server
to start the C++ server.make {experiment}_experiment
where {experiment} is the name of the experiment
you wish to run. See Makefile for a list of supported
experiments. make all_experiments
will run all experiments, but be
warned this may take some time.After running an experiment, results should appear in the directory
experiments/results
as a .exp.tgz
archive (currently-running,
incomplete, or errored experiments may store partial results in a sub-directory
of experiments/results
). See experiments/README.md
for more information about the experiments.
Portions of the experiments from [2] use prior work
ReluPlex and
ERAN; scripts and documentation for running
those are under third_party/
.
SyReNN can also be used as a library ("PySyReNN") in your own Python project. Please note that we only test against Python 3.7.4.
make start_server
to start the C++
server.pip3 install --user pysyrenn
to install the Python front-end and client
library.You should now be able to import pysyrenn
and use the front-end. See
pysyrenn/frontend/README.md,
pysyrenn/helpers/README.md, and
experiments for usage examples.
We have included a number of correctness tests for the library. All tests can
be run with the command bazel test //...
. HTML coverage reports for the
PySyReNN front-end can be generated with make pysyrenn_coverage
(an
htmlcov/
directory should be created).
There are three main pieces to the library:
pysyrenn
pip package. It is subdivided into two sub-directories:
Layer
s and a Network
containing a sequence of layers. It
supports loading a Network from ERAN
and ONNX file formats. The symbolic
representation of a network can then be computed, given a bounded
restriction domain of interest in V-representation.pysyrenn
.A few other directories also exist:
models/BUILD
. ACAS Xu models are downloaded and converted to the ERAN
format by the Bazel genrule translate-acas-models
and Python script
translate_acas_model.py
. SyReNN-compatible VRL models are included in
this repository under vrl/eran/*.eran
along with the code used to
generate them from the official VRL repository.pysyrenn
.As mentioned in our papers, we utilize floating point arithmetic throughout the codebase. This can cause small errors in the computation of vertex positions, meaning that the symbolic representation produced by the server may not be perfectly accurate (although, in practice, it is quite accurate --- see our papers for examples).
Furthermore, as the MKL-DNN library used for efficient convolutions utilizes different algorithms and thus produces different results on different machines, exact numerical reproducibility is not guaranteed when executing on different machines. However, all reported results should be reproducible within a reasonable tolerance. If you are unable to reproduce some results, please contact us.
Finally, we note that many models have "non-linear hotspots" where floating point issues become particularly important, usually when the restriction domain of interest includes the origin (which causes most activations throughout the network to hover about 0, right on the edge between linear regions). For both numerical precision and performance, when possible, we recommend modifying queries to the transformer server to avoid the origin. An example of this is in experiments/model_checking.py from [2], where we can a-priori guarantee a particular region about the origin maps back into the initial space, and thus avoid computing the symbolic representation in that region.
Please see pysyrenn/frontend/README.md and pysyrenn/helpers/README.md for documentation. The best usage examples are in experiments/, however there are also useful examples in pysyrenn/frontend/tests and pysyrenn/helpers/tests.
Information about supported layers can be found in syrenn_server/LayerSupport.md