Lab 4: Isolation
Problem 1: Python
To get some intuition for what we mean by strong isolation, in this
exercise we will take a look at Python. Python does not provide
isolation, but to clarify what exactly that means, your job in this
assignment will be to demonstrate this in several different
scenarios. Specifically, the grader for this problem, grader.py, chooses
a random value secret, and calls one of your attack functions,
attack_one() through attack_four(), that you will need to implement in
attack.py.
Your attack function must figure out the correct secret value and
return it.
Solve the four attack puzzles. You can do anything you want in attack.py,
but you may not modify grader.py. This problem will require you to
understand what is shared between your function’s execution environment and
the secret chosen by the grader scenario code, and take advantage of it.
There are many ways to solve each scenario. Some solutions could work
for multiple scenarios. There are more than 4 substantially different
ways to solve the scenarios, though, so try to find distinct solutions.
You may need to look into Python internals to figure out how to solve these puzzles. Some attacks can be constructed using Python’s inspect and gc packages, though there are also attacks that don’t use either of these packages.
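As a minimal illustration of what "shared" can mean here, the sketch below shows a callee reading a local variable of its caller through the inspect module. The functions toy_caller() and toy_callee() are hypothetical and are not taken from grader.py; the real scenarios may store the secret in different places.

```python
import inspect


def toy_callee():
    # Step one frame up the call stack and look at the caller's locals.
    caller_frame = inspect.currentframe().f_back
    return caller_frame.f_locals.get("secret")


def toy_caller():
    secret = 12345  # never passed to toy_callee() as an argument
    return toy_callee()


leaked = toy_caller()
```

Nothing in the language prevents toy_callee() from reaching into its caller's frame, which is the kind of missing isolation this problem asks you to demonstrate.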
Problem 2: WebAssembly
In this lab, you will use and escape from a WebAssembly sandbox. This will teach you to think about the kinds of vulnerabilities that can arise in trying to isolate untrusted code.
Code base
The code for this lab consists of three major parts:
- The WebAssembly runtime. The runtime is responsible for executing WebAssembly code, in a way that is isolated from the rest of the system except for well-defined interfaces. WebAssembly code consists of well-defined instructions, executed by this runtime; if you are curious, you can play around with individual WebAssembly instructions in MDN’s interactive documentation, such as this page on the xor instruction, but this level of understanding is not necessary for this lab.
  We are using a somewhat inefficient but pure Python-based WebAssembly interpreter.
- The WASI (WebAssembly System Interface) module. This interface gives the WebAssembly module access to things outside of its isolated box, such as being able to access files. The functions provided by the WASI module are precisely the “well-defined interfaces” that the WebAssembly module is allowed to invoke. WASI is nominally specified here, but you will probably find it easier to just read our implementation instead.
  For security, the WASI module ensures that the WebAssembly code cannot access arbitrary parts of the system. More specifically, the WASI module is given a root directory for the sandbox (say, something like /tmp/sandbox), and the WebAssembly code should have access to all of the files and directories under /tmp/sandbox but should not be able to get out of that directory.
  The WASI module is implemented in wasi.py. Our WASI module has some security issues in it, and it will be your goal to exploit them.
- The shell that runs inside the WebAssembly runtime. We have provided a simple shell, resembling the Unix shell, that will run inside the WebAssembly sandbox. The shell implementation is in tinysh.c, for your reference, but you will run the pre-compiled WebAssembly executable of this shell, tinysh.wasm.
  The shell is compiled together with the wasi-libc library, which turns standard C and POSIX operations like malloc and open into appropriate calls to the (simpler and narrower) WASI interface, but you should not need to dig into wasi-libc for this lab.
Shell commands
Your specific job will be to run the shell inside the WebAssembly sandbox
and come up with shell commands that will let you access a secret file
called secret.txt one level of directory up from the sandbox. To get
started, run make shell and try entering some commands:
nickolai@sonora:~/6.1600/lab-master/lab4$ make shell
. venv/bin/activate && python3 interactive.py
$ ls
.
..
$ cat ../secret.txt
open: Operation not permitted
$ help
unknown command help; available commands are:
echo pwd cd ls cat mkdir rmdir rm touch mv cp ln fd_list fd_open fd_openat fd_close
fd_read
$
Here you can see that the shell starts out with an empty directory, and
trying to naively read ../secret.txt does not work: the WASI module
prevents it. You can also see there are a number of Unix-like commands
available to you, as well as some lower-level commands that manipulate
file descriptors:
- fd_list lists the currently open file descriptors.
- fd_open opens a path name as a new file descriptor.
- fd_openat opens a path name relative to an existing directory file descriptor, as a new file descriptor.
- fd_close closes a file descriptor.
- fd_read reads and prints the data from a file descriptor (much like cat).
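For intuition about what "relative to an existing directory file descriptor" means, here is a small sketch using the ordinary Unix interface from Python rather than the lab's shell. It assumes a Unix-like host where os.open() supports the dir_fd argument; the helper read_relative() and the file name hello.txt are made up for the example.

```python
import os
import tempfile


def read_relative(dirfd, name):
    # Open `name` relative to the directory that `dirfd` refers to,
    # which is the same idea as openat() in C.
    fd = os.open(name, os.O_RDONLY, dir_fd=dirfd)
    try:
        return os.read(fd, 4096)
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "hello.txt"), "wb") as f:
        f.write(b"hi\n")
    dirfd = os.open(tmp, os.O_RDONLY)  # a file descriptor for the directory itself
    try:
        data = read_relative(dirfd, "hello.txt")
    finally:
        os.close(dirfd)
```

The name passed to read_relative() is interpreted starting from whatever directory dirfd refers to, not from the process's current directory.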
Part 1: warm-up
For part 1, come up with a sequence of shell commands that read the
contents of the secret from ../secret.txt, and save your commands to
solution-1.txt. Hint: think about using a symlink. You can check
your answer using make grade.
Part 2: file descriptor invariants
For part 2, come up with a sequence of shell commands that read the contents
of the secret from ../secret.txt without using the ln command to
create any symlinks, and save your commands to solution-2.txt.
In the absence of symlinks, you will have to uncover and exploit a deeper problem in how our WASI module works. The problem is more of an issue with the design rather than the low-level implementation, so you might be able to figure out the attack just from the following description of how it works, although you may find it useful to refer to its source code if something is unclear. The mistake is related to the invariant that the WASI module tries to maintain about file descriptors.
The WASI module maintains a map of open file descriptors in self.fds,
translating from an integer value (which is seen by the code running
inside the sandbox) to a Python object representing that file. For open
files and directories, that Python object is an OpenFile(). The key
invariant is that OpenFile.depth is supposed to represent the number
of levels of directory from that file or directory to the root of the
sandbox. This depth value is used by OpenFile.check_path() to make
sure that, whenever the sandbox asks to open a file, the path name being
opened does not contain more .. components than the current depth,
so that opening the path does not escape the sandbox’s root.
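The rule described above can be sketched as a small standalone function. This is only a model of the description in this handout, not the code in wasi.py: the name path_stays_inside() is hypothetical, and the real OpenFile.check_path() may differ in its details.

```python
def path_stays_inside(depth, path):
    """Model of the check: allow at most `depth` '..' components in `path`."""
    dotdots = sum(1 for part in path.split("/") if part == "..")
    return dotdots <= depth


# At the sandbox root (depth 0), any '..' is rejected.
at_root = path_stays_inside(0, "../secret.txt")

# One level below the root (depth 1), a single '..' is allowed.
one_level_down = path_stays_inside(1, "../sibling.txt")
```

Note that the answer depends entirely on the depth value being accurate for the directory the path is resolved against.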
In WASI, all operations that open a file by pathname use the path_open()
function provided by WASI to the sandbox. This function always works
relative to a file descriptor of a starting directory. When the sandboxed
code opens an absolute pathname, such as open("/hello/world.txt", ...),
the wasi-libc implementation finds the file descriptor of the sandbox’s
root directory (corresponding to the Preopen() file descriptor in
the Wasi() constructor), and invokes, roughly, path_open(root_fd,
"hello/world.txt"). But if the sandboxed code invokes openat(dirfd,
"world.txt"), which means open the name world.txt in whatever directory
corresponds to dirfd, wasi-libc does not need to find the sandbox
root fd, and passes the arguments directly to WASI’s path_open().
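The two cases above can be modeled roughly as follows. This is a toy model of the translation described in the text, not wasi-libc's actual code; the function names and the root fd value 3 are assumptions chosen for illustration.

```python
def model_open_absolute(path, root_fd):
    """Model of open() on an absolute path: rewrite it relative to the root fd."""
    if not path.startswith("/"):
        raise ValueError("this toy model only handles absolute paths")
    return (root_fd, path.lstrip("/"))


def model_openat(dirfd, path):
    """Model of openat(): the arguments go to path_open() unchanged."""
    return (dirfd, path)


ROOT_FD = 3  # hypothetical fd number for the sandbox root
absolute_args = model_open_absolute("/hello/world.txt", ROOT_FD)
openat_args = model_openat(7, "world.txt")
```

Each returned tuple stands for the (directory fd, relative path) pair that would reach path_open().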
You can see that the depth value for the sandbox root directory starts
out at 0, as specified in the Preopen() constructor.
To help you figure out what the bug is and how to exploit it, first think about which WASI operations might violate the above invariant, and then about how you could take advantage of the invariant being violated.
Part 3: sandboxing using WebAssembly
For this last part, your job is to use WebAssembly to execute some code
in isolation from the rest of your system, so that even if this code were
buggy, it would not be able to access anything else on your system.
We provide you with an implementation of SHA-256 in C, in sha.c.
This code is also compiled into a WebAssembly module, sha-export.wasm.
The C implementation of SHA-256 provides a function, SHA256(const
unsigned char *d, size_t n, unsigned char *md), which computes the hash
of n bytes of input data at address d and stores the resulting hash
at address md. Your job is to implement the sha256() function in
sandbox.py to call SHA256() from sha-export.wasm and return the
result. You will need to think carefully about how to provide the input
data and obtain the output data from this function, given the strong
isolation of the WebAssembly module’s memory.
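To make the data-passing problem concrete, here is a toy model of the general pattern: copy the input into the module's memory, call the function with addresses, and copy the result back out. It does not use the lab's WebAssembly runtime. ToyModule is a hypothetical stand-in whose "guest" function is backed by hashlib, and the address 1024 is an arbitrary assumption; with a real module you would need to choose addresses that do not collide with the module's own data.

```python
import hashlib


class ToyModule:
    """Stand-in for a WebAssembly instance: private linear memory plus one export."""

    def __init__(self, size=65536):
        self.memory = bytearray(size)

    def SHA256(self, d, n, md):
        # The guest function only receives integer addresses into its own memory.
        digest = hashlib.sha256(bytes(self.memory[d:d + n])).digest()
        self.memory[md:md + 32] = digest


def toy_sha256(module, data, in_addr=1024):
    out_addr = in_addr + len(data)
    # Host -> guest: copy the input bytes into the module's memory.
    module.memory[in_addr:in_addr + len(data)] = data
    module.SHA256(in_addr, len(data), out_addr)
    # Guest -> host: copy the 32-byte digest back out.
    return bytes(module.memory[out_addr:out_addr + 32])


digest = toy_sha256(ToyModule(), b"abc")
```

The host never hands the guest a Python object. Everything crosses the boundary as bytes written into, or read out of, the module's memory.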