Shadow is a discrete-event network simulator that directly executes real application code, enabling you to simulate distributed systems with thousands of network-connected processes in realistic and scalable private network experiments using your laptop, desktop, or server running Linux.
The primary user-facing changes that we've made in this release include:
fork, vfork, execve, and other related syscalls.
--use-new-tcp. The stack is not yet recommended for default use because it is still missing important TCP features such as congestion control, but work on it continues.
no_std Rust code.
More details about specific changes are below. We also have a much more detailed writeup of many of these changes in our most recent discussion post: https://github.com/shadow/shadow/discussions/3187
ERROR-level log lines are now logged to stderr in addition to stdout if stdout is not a tty but stderr is. This helps make errors more visible in the common case that stdout is redirected to a log file but stderr is not. This can currently be disabled via the (unstable) option log-errors-to-tty.
fork syscall and fork-like invocations of the clone and clone3 syscalls.
execve syscall.
sendmsg, recvmsg, and shutdown for UDP sockets.
MSG_TRUNC and MSG_PEEK as recv syscall argument flags for UDP sockets.
MSG_TRUNC as a recv syscall return flag for UDP and Unix sockets.
SO_DOMAIN, SO_PROTOCOL, and SO_ACCEPTCONN socket options for TCP and UDP sockets.
SIOCGSTAMP ioctl for TCP and UDP sockets.
/dev/shm to be executable. (This requirement was actually removed in v3.0.0.)
sched_getaffinity. This bug was previously mostly latent due to an incorrectly generated libc syscall wrapper, though would have affected managed programs that made the syscall without going through libc.
sched_yield syscalls.
recv (or similar syscalls) on a TCP or UDP socket with an invalid memory address.
FIONREAD ioctl for UDP sockets.
read and recv syscalls when called with 0-length buffers.
connect is called on a listening unix or tcp socket. (#3191)
Thanks to @stevenengler, @sporksmith, @robgjansen, @RWails for their contributions to this release!
The dev team had accumulated a large set of breaking changes that would require a major version bump. In this release, we have focused on clearing our breaking changes queue and merging those improvements. Because these are breaking changes, this release has bumped our major version from 2 to 3. This release also significantly improves the runtime performance compared to Shadow 2.5.0.
Shadow no longer implicitly searches its working directory for executables to be run under the simulation. If you wish to specify a process path relative to Shadow's working directory, prefix that path with ./.
Shadow now supports YAML merge keys and extension fields. This allows you to combine YAML maps using the << key.
Example:
# an "extension field" that we use to store common host options
x-host-client: &host-client
  bandwidth_up: 10Mbps
  bandwidth_down: 10Mbps

hosts:
  client1:
    # merge the fields from the extension field above
    <<: *host-client
    processes: ...
  client2:
    <<: *host-client
    processes: ...
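The effect of a merge key can be pictured as a dictionary merge in which keys written directly on the host take precedence over merged ones. The following is a minimal Python sketch, assuming the config has already been parsed into plain dictionaries. `apply_merge_key` is a hypothetical helper for illustration, not part of Shadow or of any YAML library, and it only handles a single merged map (YAML also allows a list of maps).

```python
def apply_merge_key(mapping):
    """Resolve one `<<` entry: merged keys act as defaults,
    and keys written explicitly in the mapping override them."""
    merged = dict(mapping.get("<<", {}))
    merged.update({k: v for k, v in mapping.items() if k != "<<"})
    return merged


# Hypothetical values mirroring the extension field in the example above.
host_client = {"bandwidth_up": "10Mbps", "bandwidth_down": "10Mbps"}

# client1 merges the shared options but overrides bandwidth_up itself.
client1 = apply_merge_key(
    {"<<": host_client, "bandwidth_up": "20Mbps", "processes": []}
)
print(client1)
```

The shared map is left untouched, so the same anchor can be merged into any number of hosts.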
Removed the quantity options for hosts and processes. It's now recommended to use YAML anchors and merge keys instead.
Shadow 2.x:
hosts:
  client:
    quantity: 3
    processes: ...
Shadow 3.x:
hosts:
  client1: &client
    processes: ...
  # copy all fields from 'client1'
  client2: *client
  # copy all fields from 'client1' and add additional fields
  client3:
    <<: *client
    ip_addr: 152.21.4.24
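For configs with many hosts, the same rewrite can be done mechanically before writing the file back out. This is a rough Python sketch, assuming the 2.x config has been loaded into a dictionary. `expand_quantity` is a hypothetical helper, and the `name1`..`nameN` naming simply follows the example above rather than any behavior guaranteed by Shadow. It handles only the host-level quantity, not the per-process one.

```python
import copy


def expand_quantity(hosts):
    """Return a new hosts map where each host with `quantity: N`
    is replaced by N independent copies named name1..nameN."""
    expanded = {}
    for name, host in hosts.items():
        host = copy.deepcopy(host)
        quantity = host.pop("quantity", None)
        if quantity is None:
            # Hosts without a quantity are kept under their original name.
            expanded[name] = host
            continue
        for i in range(1, quantity + 1):
            expanded[f"{name}{i}"] = copy.deepcopy(host)
    return expanded


old_hosts = {
    "client": {"quantity": 3, "processes": []},
    "server": {"processes": []},
}
new_hosts = expand_quantity(old_hosts)
print(list(new_hosts))
```

Emitting full copies instead of anchors makes the output longer, but it keeps the result valid for any YAML writer.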
Renamed the host_defaults field to host_option_defaults and renamed the host's options field to host_options.
Shadow 2.x:
host_defaults:
  ...
hosts:
  client:
    options:
      ...
Shadow 3.x:
host_option_defaults:
  ...
hosts:
  client:
    host_options:
      ...
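The rename is a pure key change, so it can be scripted. This is a small Python sketch, assuming the config is available as a dictionary. `rename_host_option_fields` is a hypothetical name, not a tool shipped with Shadow.

```python
def rename_host_option_fields(config):
    """Return a copy of a 2.x config dict using the 3.x field names."""
    migrated = dict(config)
    if "host_defaults" in migrated:
        migrated["host_option_defaults"] = migrated.pop("host_defaults")
    if "hosts" in migrated:
        hosts = {}
        for name, host in migrated["hosts"].items():
            host = dict(host)
            if "options" in host:
                host["host_options"] = host.pop("options")
            hosts[name] = host
        migrated["hosts"] = hosts
    return migrated


old_config = {
    "host_defaults": {"log_level": "debug"},
    "hosts": {"client": {"options": {"log_level": "info"}}},
}
new_config = rename_host_option_fields(old_config)
print(new_config)
```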
Removed the host pcap_directory configuration option and replaced it with a new pcap_enabled option.
Shadow 2.x:
hosts:
  client:
    options:
      pcap_directory: ./
Shadow 3.x:
hosts:
  client:
    host_options:
      pcap_enabled: true
Host names are restricted to the patterns documented in hostname(7).
The process environment configuration option now takes a map instead of a semicolon-delimited string.
Shadow 2.x:
hosts:
  client:
    processes:
    - path: curl
      environment: ENV_A=1;ENV_B=foo
Shadow 3.x:
hosts:
  client:
    processes:
    - path: curl
      environment:
        ENV_A: "1"
        ENV_B: foo
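Converting an existing 2.x environment string into the map form is straightforward. This is a minimal Python sketch. `environment_to_map` is a hypothetical helper, and it assumes that neither names nor values contain a literal semicolon.

```python
def environment_to_map(env_string):
    """Split a 2.x style 'A=1;B=foo' string into a {name: value} map.
    Values are kept as strings, matching the quoted "1" in the 3.x example."""
    env = {}
    for entry in env_string.split(";"):
        if not entry:
            continue  # tolerate a trailing or doubled semicolon
        name, sep, value = entry.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in environment entry: {entry!r}")
        env[name] = value
    return env


env_map = environment_to_map("ENV_A=1;ENV_B=foo")
print(env_map)
```

Only the first `=` in each entry is treated as the separator, so values that contain `=` survive intact.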
The per-process option stop_time has been replaced with shutdown_time. When set, the signal specified by shutdown_signal (a new option) will be sent to the process at the specified time. While Shadow previously sent SIGKILL at a process's stop_time, the default shutdown_signal is SIGTERM to better support graceful shutdown.
Shadow 2.x:
hosts:
  client:
    processes:
    - path: curl
      stop_time: 10s
Shadow 3.x:
hosts:
  client:
    processes:
    - path: curl
      shutdown_time: 10s
      shutdown_signal: SIGKILL
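Because the default signal changed, a faithful migration has to set shutdown_signal explicitly whenever the old SIGKILL behavior is wanted. This is a short Python sketch of that rule. `migrate_stop_time` and its `keep_sigkill` flag are hypothetical names used only for illustration.

```python
def migrate_stop_time(process, keep_sigkill=True):
    """Return a copy of a 2.x process entry using shutdown_time.
    With keep_sigkill=True the 2.x behavior (SIGKILL) is preserved;
    otherwise the process gets the new default signal (SIGTERM)."""
    migrated = dict(process)
    if "stop_time" in migrated:
        migrated["shutdown_time"] = migrated.pop("stop_time")
        if keep_sigkill:
            migrated["shutdown_signal"] = "SIGKILL"
    return migrated


old_process = {"path": "curl", "stop_time": "10s"}
print(migrate_stop_time(old_process))
print(migrate_stop_time(old_process, keep_sigkill=False))
```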
A new expected_final_state allows you to specify the expected state of the process at the end of the simulation. The supported states are exited, signaled, or running. If any process is not in the correct state at the end of the simulation, Shadow will return a non-zero exit code. The default expected_final_state is exited with code 0.
In Shadow 2.x the behaviour was to consider any processes which exited with code 0, OR which were still running at the end of the simulation, as a success. Shadow 3.x does not support this specific behaviour, and you must choose a single state.
Example:
hosts:
  server:
    processes:
    - path: nginx
      # we expect nginx to run until the end of the simulation
      expected_final_state: running
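The difference between the 2.x and 3.x success criteria described above can be written as two small predicates. This is only an illustrative Python sketch. The tuple representation of a final state (for example `("exited", 0)` or `("running",)`) is an assumption made for this example and is not how Shadow stores process state.

```python
def success_2x(final_state):
    """2.x rule: exiting with code 0 OR still running both count as success."""
    return final_state in (("exited", 0), ("running",))


def success_3x(final_state, expected=("exited", 0)):
    """3.x rule: the final state must equal the single expected state."""
    return final_state == expected


# A server that is still running at the end of the simulation:
print(success_2x(("running",)))                          # accepted implicitly
print(success_3x(("running",)))                          # rejected by default
print(success_3x(("running",), expected=("running",)))   # accepted when declared
```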
Added support for a parallelism value of 0, which allows Shadow to choose a reasonable parallelism (we currently use the number of physical cores in Shadow's affinity/cgroup). The default value for parallelism has also been changed from 1 to 0.
It is now an error to set a process' shutdown_time or start_time to be after the simulation's stop_time.
Sub-second configuration values are now allowed for all time-related options, including start_time, stop_time, etc.
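A config can be checked for the new time constraint before it is handed to Shadow. The Python sketch below is a rough pre-flight check under stated assumptions. `to_ns` and `late_process_times` are hypothetical helpers, and the unit table covers only a few spellings chosen for illustration, not Shadow's full time grammar. Bare integers are treated as seconds.

```python
import re

# Illustrative subset of time units, expressed in nanoseconds (an assumption).
_UNITS_NS = {"ns": 1, "us": 1_000, "ms": 1_000_000, "s": 1_000_000_000}


def to_ns(value):
    """Convert an int (seconds) or a string such as '500ms' to nanoseconds."""
    if isinstance(value, int):
        return value * _UNITS_NS["s"]
    match = re.fullmatch(r"\s*(\d+)\s*([a-z]+)\s*", value)
    if match is None or match.group(2) not in _UNITS_NS:
        raise ValueError(f"unsupported time value: {value!r}")
    return int(match.group(1)) * _UNITS_NS[match.group(2)]


def late_process_times(config):
    """List (host, process index, field) for every process time
    that falls after the simulation's stop_time."""
    stop = to_ns(config["general"]["stop_time"])
    problems = []
    for host_name, host in config.get("hosts", {}).items():
        for index, process in enumerate(host.get("processes", [])):
            for field in ("start_time", "shutdown_time"):
                if field in process and to_ns(process[field]) > stop:
                    problems.append((host_name, index, field))
    return problems


example = {
    "general": {"stop_time": "10s"},
    "hosts": {
        "client": {
            "processes": [
                {"path": "curl", "start_time": "500ms", "shutdown_time": "11s"}
            ]
        }
    },
}
print(late_process_times(example))
```

Working in integer nanoseconds keeps sub-second values such as 500ms exact and avoids floating-point comparisons.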
Removed and updated various experimental options including use_shim_syscall_handler, interface_qdisc, and use_extended_yaml.
<data-dir>/hosts/<hostname>/) are no longer prefixed with the hostname. For example a file that was previously named shadow.data/hosts/server/server.curl.1000.stdout is now named shadow.data/hosts/server/curl.1000.stdout.
.exitcode file has been removed due to its confusing semantics, and the new expected_final_state attribute replacing its primary use-case.
Shadow's scheduler is very performance-sensitive and needs to run tasks on worker threads with low latency. We added a spinloop in the scheduler that significantly improves Shadow's runtime performance. Some simulations see more than a 2x runtime performance improvement (for example 160 minutes to 47 minutes in a 5% Tor network simulation).
We have removed several of our supported platforms. Specifically, we've dropped support for Ubuntu 18.04, Fedora 34/35/36, and CentOS Stream 8. We've also dropped support for Clang, and set a minimum-supported Linux kernel version of 5.4, which requires installing a backports kernel on Debian 10.
We've updated our "stability guarantees" document with the following changes:
MSG_TRUNC flag for unix sockets. https://github.com/shadow/shadow/pull/2841
TIMER_ABSTIME flag for clock_nanosleep. https://github.com/shadow/shadow/pull/2854
--profile, --include, and --library setup script options.
epoll_pwait2 syscall.
clone3 syscall. Thread libraries we're aware of that use clone3 were gracefully falling back to clone, but eventually they may not do so. This also reduces noise in shadow's log about an unimplemented syscall being attempted.
/dev/shm to be executable.
fork, so any simulation has a fixed number of processes, all of which are explicitly specified in shadow's config.
clear_child_tid attribute set. This is unlikely to have affected most software running under Shadow, since most thread APIs use this attribute.
clock_nanosleep and nanosleep from ENOSYS to ENOTSUP.
execve syscall will now get an error instead of escaping the Shadow simulation. https://github.com/shadow/shadow/issues/2718
getcwd with an incorrect wrapper that was returning -1 instead of NULL on errors.
epoll_ctl with an unknown operation will return EINVAL.
host_options to undo any changes made to host_option_defaults.
Thanks to contributions from @robgjansen, @stevenengler, @sporksmith, @jtracey, @dependabot
The dev team had accumulated a large set of breaking changes that would require a major version bump. In this release, we have focused on clearing our breaking changes queue and merging those improvements. Because these are breaking changes, this release has bumped our major version from 2 to 3.
This release is marked as a pre-release because, although our CI tests are passing, we haven't had as long of a testing period as we usually do. Additionally, we have some additional internal improvements we intend to make prior to the full 3.0.0 release. We believe this pre-release should be stable, but please file issues for any bugs you find. Thanks!
./.
use_extended_yaml, has been removed.
pcap_directory configuration option and replaced it with a new pcap_enabled option.
<data-dir>/hosts/<hostname>/) are no longer prefixed with the hostname. For example a file that was previously named shadow.data/hosts/server/server.curl.1000.stdout is now named shadow.data/hosts/server/curl.1000.stdout.
clang C compiler is no longer supported.
stop_time has been replaced with shutdown_time. When set, the signal specified by shutdown_signal (a new option) will be sent to the process at the specified time. While shadow previously sent SIGKILL at a process's stop_time, the default shutdown_signal is SIGTERM to better support graceful shutdown.
cmake has been bumped from 3.2 to 3.13.4.
glib has been bumped from 2.32 to 2.58.
environment configuration option now takes a map instead of a semicolon-delimited string.
quantity options for hosts and processes. It's now recommended to use YAML anchors and merge keys instead.
host_defaults configuration field to host_option_defaults and renamed the host's options field to host_options.
expected_final_state. https://github.com/shadow/shadow/pull/2886
.exitcode file has been removed due to its confusing semantics, and the new expected_final_state attribute replacing its primary use-case. https://github.com/shadow/shadow/pull/2906
MSG_TRUNC flag for unix sockets. https://github.com/shadow/shadow/pull/2841
TIMER_ABSTIME flag for clock_nanosleep. https://github.com/shadow/shadow/pull/2854
use_shim_syscall_handler has been removed. This optimization is now always enabled.
stop_time or start_time to be after the simulation's stop_time.
start_time, stop_time, etc.
--profile, --include, and --library setup script options.
epoll_pwait2 syscall.
clear_child_tid attribute set. This is unlikely to have affected most software running under Shadow, since most thread APIs use this attribute.
clock_nanosleep and nanosleep from ENOSYS to ENOTSUP.
execve syscall will now get an error instead of escaping the Shadow simulation. https://github.com/shadow/shadow/issues/2718
getcwd with an incorrect wrapper that was returning -1 instead of NULL on errors.
epoll_ctl with an unknown operation will return EINVAL.
fork, so any simulation has a fixed number of processes, all of which are explicitly specified in shadow's config.
sendmsg() and recvmsg() for tcp wrapper by @stevenengler in https://github.com/shadow/shadow/pull/2805
sendmsg() and recvmsg() by @stevenengler in https://github.com/shadow/shadow/pull/2811
msghdr by @stevenengler in https://github.com/shadow/shadow/pull/2815
ForeignPtr by @stevenengler in https://github.com/shadow/shadow/pull/2827
read_vals by @stevenengler in https://github.com/shadow/shadow/pull/2829
write method to the memory manager by @stevenengler in https://github.com/shadow/shadow/pull/2830
tid_address by @stevenengler in https://github.com/shadow/shadow/pull/2834
io::write_partial by @stevenengler in https://github.com/shadow/shadow/pull/2838
pcap_directory config option to pcap_enabled by @stevenengler in https://github.com/shadow/shadow/pull/2840
MSG_TRUNC support for unix sockets by @stevenengler in https://github.com/shadow/shadow/pull/2841
RecvmsgReturn.bytes_read field to return_val by @stevenengler in https://github.com/shadow/shadow/pull/2845
TIMER_ABSTIME flag for clock_nanosleep by @stevenengler in https://github.com/shadow/shadow/pull/2854
ALL_SHADOW_TESTS from cmake by @stevenengler in https://github.com/shadow/shadow/pull/2855
use_extended_yaml config option from docs by @stevenengler in https://github.com/shadow/shadow/pull/2861
use_legacy_working_dir option by @stevenengler in https://github.com/shadow/shadow/pull/2862
&AtomicI32, not the &Arc by @stevenengler in https://github.com/shadow/shadow/pull/2866
quantity config option by @stevenengler in https://github.com/shadow/shadow/pull/2868
quantity config option by @stevenengler in https://github.com/shadow/shadow/pull/2873
host_defaults and options config fields by @stevenengler in https://github.com/shadow/shadow/pull/2883
CStr::from_bytes_until_nul by @stevenengler in https://github.com/shadow/shadow/pull/2896
epoll_pwait2 by @stevenengler in https://github.com/shadow/shadow/pull/2894
EINVAL for unknown epoll_ctl option by @stevenengler in https://github.com/shadow/shadow/pull/2903
In this release, we continue our transition from C to Rust. Most of the changes included in the release are backend changes that support our continued Rust migration. In particular, we've made important progress on migrating some of Shadow's core components, including Host, Process, Thread, and networking code. We also fixed some bugs and made some other changes to improve the experience for users as described below.
This release is intended to be the last stable release in the v2.x series. We have accumulated a fair number of issues that require a major version bump to complete as described in this discussion post, so we intend to take care of these issues within the next couple of weeks. As a result, the next stable release will mark the start of the v3.x series.
preload_spin_max and use_explicit_block_message. These options were to support an execution model where Shadow workers ran on different CPU cores than the managed threads they were controlling, and each side would "spin" while waiting for a message from the other side. After extensive benchmarking we found that this was rarely a significant win, and dropped support for this behavior while migrating the core IPC functionality to Rust.
DescriptorHandle and return Result from descriptor table methods by @stevenengler in https://github.com/shadow/shadow/pull/2691
LegacyFile when the last LegacyFileCounter is dropped by @stevenengler in https://github.com/shadow/shadow/pull/2698
plot-shadow.py by @stevenengler in https://github.com/shadow/shadow/pull/2702
InetSockets in the network interface by @stevenengler in https://github.com/shadow/shadow/pull/2713
LegacyFiles by @stevenengler in https://github.com/shadow/shadow/pull/2715
TcpSocket to LegacyTcpSocket by @stevenengler in https://github.com/shadow/shadow/pull/2741
getsockopt() and setsockopt() for tcp wrapper by @stevenengler in https://github.com/shadow/shadow/pull/2742
Worker::thread_id() to Worker::worker_id() by @stevenengler in https://github.com/shadow/shadow/pull/2746
shutdown() for tcp wrapper by @stevenengler in https://github.com/shadow/shadow/pull/2745
listen() for tcp wrapper by @stevenengler in https://github.com/shadow/shadow/pull/2763
connect() for tcp wrapper by @stevenengler in https://github.com/shadow/shadow/pull/2764
accept() for tcp wrapper by @stevenengler in https://github.com/shadow/shadow/pull/2768
Option by @stevenengler in https://github.com/shadow/shadow/pull/2784
SysCallHandler by @stevenengler in https://github.com/shadow/shadow/pull/2789
Socket to use sendmsg and recvmsg by @stevenengler in https://github.com/shadow/shadow/pull/2797
File to use readv and writev by @stevenengler in https://github.com/shadow/shadow/pull/2798
In this release, we continue our transition from C to Rust. Most of the changes included in the release are backend changes that support our continued Rust migration. However, we also fixed many bugs and made some other changes to improve the experience for users as described below.
We intend additional work following this release to focus on changes to some of Shadow's core networking components, including the TCP stack and other facilities for forwarding packets between nodes. This is somewhat higher risk work that could result in bugs that affect Shadow's network performance and stability. We are issuing this v2.4.0 release now to ensure that users have a stable version of Shadow that they can use while we work on the high risk networking code.
epoll_ctl. https://github.com/shadow/shadow/pull/2586
$PATH and not ~/.local/bin. https://github.com/shadow/shadow/pull/2572
rust-toolchain.toml file. https://github.com/shadow/shadow/pull/2614
sched_{get,set}affinity syscalls. https://github.com/shadow/shadow/pull/2602
/sys/devices/system/cpu/possible and /sys/devices/system/cpu/online. https://github.com/shadow/shadow/pull/2602
examples/ directory. https://github.com/shadow/shadow/pull/2637, https://github.com/shadow/shadow/pull/2659
Transport by @stevenengler in https://github.com/shadow/shadow/pull/2578
SyscallHandler::legacy_syscall helper function by @stevenengler in https://github.com/shadow/shadow/pull/2588
InetSocket enum and placeholder TcpSocket struct by @stevenengler in https://github.com/shadow/shadow/pull/2589
TCP wrapper by @stevenengler in https://github.com/shadow/shadow/pull/2595
getsockname/getpeername() for tcp wrapper by @stevenengler in https://github.com/shadow/shadow/pull/2599
ioctl() for tcp wrapper by @stevenengler in https://github.com/shadow/shadow/pull/2600
Host::setup() by @stevenengler in https://github.com/shadow/shadow/pull/2606
NetworkNamespace object by @stevenengler in https://github.com/shadow/shadow/pull/2607
sysconf(_SC_NPROCESSORS_*) by @trinity-1686a in https://github.com/shadow/shadow/pull/2602
bind() for tcp wrapper by @stevenengler in https://github.com/shadow/shadow/pull/2611
TcpSocket to network interface as LegacySocket by @stevenengler in https://github.com/shadow/shadow/pull/2626
_test_implicit_bind test by @stevenengler in https://github.com/shadow/shadow/pull/2630
Process by @stevenengler in https://github.com/shadow/shadow/pull/2649
sched_{get,set}affinity bug by @stevenengler in https://github.com/shadow/shadow/pull/2657
syscallhandler_epoll_ctl by @stevenengler in https://github.com/shadow/shadow/pull/2660
InetSocket in the tracker by @stevenengler in https://github.com/shadow/shadow/pull/2671
Full Changelog: https://github.com/shadow/shadow/compare/v2.3.0...v2.4.0
Shadow v2.3.0 is a minor release that contains many bug fixes as well as a large push to convert more code from C to Rust; ~54% of our code is now written in Rust compared to just 39% in C. We have incorporated many improvements to Shadow's design as we migrate to Rust, making the code easier to understand, better tested, and easier to maintain. We plan to continue our focus on migrating code to Rust in our next release.
--tmpfs /dev/shm:rw,nosuid,nodev,exec,size=1024g rather than --shm-size=1024g to mount /dev/shm as executable. This fixes errors when the managed process maps executable pages. https://github.com/shadow/shadow/issues/2400
pkg-config to locate glib, instead of a custom cmake module. This is the recommended way of getting the appropriate glib compile flags, and works better in non-standard layouts such as in a guix environment.
setup script now has a --search option, which can be used to add additional directories to search for pkg-config files, C headers, and libraries. It obsoletes the options --library and --include.
mmap to fail when called on a file descriptor that was opened with O_NOFOLLOW. https://github.com/shadow/shadow/pull/2353
PATH. Previously these were interpreted as relative to the current directory. For backwards compatibility, Shadow will currently prefer a binary in that location if one is found but log a warning. Such cases should be disambiguated by using an absolute path or prefixing with ./.
thread-per-host to thread-per-core, which has better performance on most machines.
experimental.host_heartbeat_interval defaults to "1 sec"), but the format of these messages is not stable.
shadow.data/sim-stats.json file.
brk, mmap, munmap, mremap, mprotect, open, and openat syscalls.
examples/ directory.
O_WRONLY to O_RDONLY).
readv and writev syscalls, and added support for preadv and pwritev.
ifa_netmask field in getifaddrs() to improve compatibility with Node.js applications. https://github.com/shadow/shadow/pull/2456
PR_SET_DUMPABLE, allowing it to work for programs that try to disable memory inspection. https://github.com/shadow/shadow/pull/2370
Thanks also to Shadow devs @sporksmith, @stevenengler, and @robgjansen!
The full changelog can be viewed here: https://github.com/shadow/shadow/compare/v2.2.0...v2.3.0
Shadow v2.2.0 is a rather small minor release that contains mostly bug fixes but also some new support for dup()ing file descriptors. We believe that our bug fixes improved Shadow's stability enough to warrant a release.
Rust became the majority language in Shadow in this release, and we plan to focus the next release on continuing our Rust migration.
Here is a log of the primary user-facing changes we made since the previous release:
We have removed ptrace-mode, and the associated experimental options use-o-n-waitpid-workaround and --interpose-method. ptrace-mode was an alternative to Shadow's current interposition mechanism that uses LD_PRELOAD and seccomp. This change should be transparent to most users, since it hasn't been the default for several releases, and was only accessible via experimental options. See https://github.com/shadow/shadow/issues/1945
dup() and related syscalls are now supported for all file descriptors
Fixed behavior when multiple threads are blocked in epoll_wait on the same epoll file description. https://github.com/shadow/shadow/issues/2260
Fixed bugs causing timerfd_settime to not reset the internal timer's expiration count (https://github.com/shadow/shadow/pull/2279), and not cancel previously scheduled timer-fire events (https://github.com/shadow/shadow/pull/2282).
Fixed a panic when patching the VDSO in newer kernels, such as those in Ubuntu 22.04. https://github.com/shadow/shadow/issues/2273
Fixed the errno returned from calling connect() on a unix socket. This fixes a getaddrinfo() test failure on some systems. https://github.com/shadow/shadow/issues/2286
Fixed minor memory leaks. https://github.com/shadow/shadow/pull/2249
LegacyDescriptor by @stevenengler in https://github.com/shadow/shadow/pull/2240
descriptor_ functions to legacydesc_ by @stevenengler in https://github.com/shadow/shadow/pull/2241
ownerProcess field from LegacyDescriptor by @stevenengler in https://github.com/shadow/shadow/pull/2245
Manager with a rust version by @stevenengler in https://github.com/shadow/shadow/pull/2277
LegacyDescriptor to LegacyFile by @stevenengler in https://github.com/shadow/shadow/pull/2284
ENOENT for unix socket connect() with pathname address by @stevenengler in https://github.com/shadow/shadow/pull/2287
getaddrinfo() error on some systems by @stevenengler in https://github.com/shadow/shadow/pull/2292
debug_panic macro by @stevenengler in https://github.com/shadow/shadow/pull/2294
Full Changelog: https://github.com/shadow/shadow/compare/v2.1.0...v2.2.0
Shadow v2.1.0 is a minor release following our significant redesign of Shadow in v2.0.0. See the v2.0.0 release notes for more details about Shadow's new multi-process architecture.
This v2.1.0 release has largely focused on improving support for running various types of applications in Shadow while smoothing some of the rough edges introduced in v2.0.0.
We plan to focus our next release on Rust migration, and we expect that Rust will become Shadow's primary programming language in v2.2.0!
general.progress)
host_defaults.pcap_capture_size)
general.model_unblocked_syscall_latency). This feature allows Shadow to escape some "busy loops" it couldn't before, avoiding deadlock in e.g. some versions of curl, iperf, libopenblas, and the golang runtime.
--debug-hosts option to make debugging managed processes easier
select()
getitimer()
setitimer()
SYS_rseq
O_DIRECT flag (packet mode) support for pipes
ioctl() support for pipes
listen() can be called more than once for TCP sockets to set the backlog
ioctl() file flag handling for regular files
TCP_NODELAY can be enabled for TCP sockets
TCP_CONGESTION socket option
getservbyname_r()
getaddrinfo()
We've made many other internal improvements, added new test cases, and expanded our documentation.
Version 2 is the first major version bump since Shadow v1.0.0 was tagged over a decade ago!
In Shadow v2, we completely redesigned the architecture for executing and interacting with applications running in Shadow. To understand the importance of the redesign, let's first look at the previous design and its limitations.
In the previous version 1, the underlying architecture was that Shadow would load applications as plugins into the Shadow process space, and as of v1.12.0 the plugins were loaded into independent namespaces in the Shadow process space. This underlying plugin-based architecture had several limitations:
Compatibility: The domain of supported applications was limited to those that are compiled as position-independent libraries (PIC) or executables (PIE) that export their symbols to the dynamic symbol table (rdynamic), are dynamically linked to libc, and make all system calls through libc. Rebuilding applications so that they could be loaded into Shadow was tedious, and impossible if the source code was not available (e.g., closed-source software).
Correctness: Relying solely on preloading as a mechanism to intercept system calls is unreliable because only dynamically linked functions (e.g., those in libc) can be intercepted using LD_PRELOAD; system calls invoked via statically linked code or assembly instructions could leak outside of the simulation and cause errors.
Maintainability: A custom dynamic loader was required to load more than 16 plugin namespaces at once, and a portable threading library was used to support multi-threaded applications (these used to account for 62k LoC in Shadow). libc functions with nontrivial functionality would need to be reimplemented in order to intercept the system calls they make.
All of these issues led to stability problems in Shadow and limited its use to niche applications like Tor.
In version 2, we designed a new architecture for executing and interacting with applications running in Shadow.
In our new design, applications are executed as standard Linux processes and hooked into the simulation through the system call interface using standard kernel facilities (primarily using preloading with seccomp as a backstop). This design overcomes many of the limitations of the previous plugin architecture:
The simulator can now execute any existing application without rebuilding it, given just a binary executable and its command line arguments.
Linux kernel subsystems guarantee reliable process isolation and correct system call interception.
The maintenance of a custom loader, threading libraries, and reimplemented libc functions is no longer required.
The new design allows Shadow to focus on supporting core functionality, i.e. system calls, rather than designing and maintaining code to work around issues brought about by the v1 design limitations. Separating Shadow from applications through a system call interface will enable Shadow to become a much more general purpose tool supporting a large number of use-cases.
One of our primary concerns while considering a new design was performance: Shadow is designed to run large-scale distributed systems, and we wanted to make sure that any inter-process communication overhead necessitated by the new multi-process design would not significantly detract from Shadow's target use-cases.
We're happy to report that the performance in our new v2 design is in most cases comparable to or faster than the performance of the v1 design! This means the v2 design is an all-around win relative to the v1 design, and it should significantly improve the Shadow user experience.
During the development of the new architecture, we began the process of migrating Shadow's programming language from C to Rust to further improve stability and correctness. We have made big improvements in this regard, prioritizing user-facing changes into this v2.0.0 release. The most noticeable user-facing change is that we updated our subsystems for specifying command line arguments and Shadow config files. We hope our new yaml-based config files are easier to manually read and edit than the old xml format.
We have further Rust migration tasks planned in future releases. We don't think the remaining migration work will be very noticeable from a user perspective, but please bear with us as we dust some cobwebs and continue our transition to a safer language.
Our new design is the focus of a research paper that will appear at the 2022 USENIX Annual Technical Conference. See our design document for more details on the new v2 design and a reference to our published research article.