From d0c9729af21c0b47339c3357d995885b3eab017f Mon Sep 17 00:00:00 2001
From: Ralf Jung <post@ralfj.de>
Date: Mon, 22 Jul 2019 17:27:39 +0200
Subject: [PATCH 01/12] add basic memory interface

---
 wip/memory-interface.md | 110 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 110 insertions(+)
 create mode 100644 wip/memory-interface.md
diff --git a/wip/memory-interface.md b/wip/memory-interface.md
new file mode 100644
index 00000000..877ca2d2
--- /dev/null
+++ b/wip/memory-interface.md
@@ -0,0 +1,110 @@
+# Rust Memory Interface
+
+**Note:** This document is not normative nor endorsed by the UCG WG. Its purpose is to be the basis for discussion and to set down some key terminology.
+
+The purpose of this document is to describe the interface between a Rust program and memory.
+This interface is a key part of the Rust Abstract Machine: it lets us separate concerns by splitting the Machine (i.e., its specification) into two pieces, connected by this well-defined interface:
+* The *expression/statement semantics* of Rust boils down to explaining which "memroy events" (calls to the memory interface) happen in which order.
+* The Rust *memory model* explains which interactions with the memory are legal (the others are UB), and which values can be returned by reads.
+
+The interface is also opinionated in several ways; this is not intended to be able to support *any imaginable* memory model, but rather start the process of reducing the design space of what we consider a "reasonable" memory model for Rust.
+For example, it explicitly acknowledges that pointers are not just integers and that uninitialized memory is special (both are true for C and C++ as well but you have to read the standard very careful, and consult non-normative defect report responses, to see this).
+Another key property of the interface presented below is that it is *untyped*.
+This encodes the fact that in Rust, *operations are typed, but memory is not*---a key difference to C and C++ with their type-based strict aliasing rules.
+
+## Pointers
+
+One key question a memory model has to answer is *what is a pointer*.
+It might seem like the answer is just "an integer of appropriate size", but [that is not the case][pointers-complicated].
+So we will leave this question open, and treat `Pointer` as an "associated type" of the memory interface
+
+## Bytes
+
+The unit of communication between the memory model and the rest of the program is a *byte*.
+Again the question of "what is a byte" is not as trivial as it might seem; beyond `u8` values we have to represent `Pointer`s and [uninitialized memory][uninit].
+We define the `Byte` type (in terms of an arbitrary `Pointer` type) as follows:
+
+```rust
+enum Byte<Pointer> {
+    /// The "normal" case: a (frozen, initialized) integer in `0..256`.
+    Raw(u8),
+    /// An uninitialized byte.
+    Uninit,
+    /// One byte of a pointer.
+    Pointer {
+        /// The pointer of which this is a byte.
+        ptr: Pointer,
+        /// Which byte of the pointer this is.
+        /// `idx` will always be in `0..size_of::<usize>()`.
+        idx: u8,
+    }
+}
+```
+
+## Memory interface
+
+The Rust memory interface is described by the following (not-yet-complete) trait definition:
+
+```rust
+/// *Note*: All memory operations can be non-deterministic, which means that
+/// executing the same operation on the same memory can have different results.
+/// We also let all operations potentially mutated memory. For example, reads
+/// actually do change the current state when considering concurrency or
+/// Stacked Borrows.
+trait Memory {
+    /// The type of pointer values.
+    type Pointer;
+
+    /// The type of memory errors (i.e., ways in which the program can cause UB
+    /// by interacting with memory).
+    type Error;
+
+    /// Create a new allocation.
+    fn allocate(&mut self, size: u64, align: u64) -> Result<Self::Pointer, Self::Error>;
+
+    /// Remove an allocation.
+    fn deallocate(&mut self, ptr: Self::Pointer, size: u64, align: u64) -> Result<(), Self::Error>;
+
+    /// Write some bytes to memory.
+    fn write(&mut self, ptr: Self::Pointer, bytes: Vec<Byte<Self::Pointer>>) -> Result<(), Self::Error>;
+
+    /// Read some bytes from memory.
+    fn read(&mut self, ptr: Self::Pointer, len: u64) -> Result<Vec<Byte<Self::Pointer>>, Self::Error>;
+
+    /// Offset the given pointer.
+    fn offset(&mut self, ptr: Self::Pointer, offset: u64, mode: OffsetMode) -> Result<Self::Pointer, Self::Error>;
+
+    /// Cast the given pointer to an integer.
+    fn ptr_to_int(&mut self, ptr: Self::Pointer) -> Result<u64, Self::Error>;
+
+    /// Cast the given integer to a pointer.
+    fn int_to_ptr(&mut self, int: u64) -> Result<Self::Pointer, Self::Error>;
+}
+
+/// The rules applying to this pointer offset operation.
+enum OffsetMode {
+    /// Wrapping offset; never UB.
+    Wrapping,
+    /// Non-wrapping offset; UB if it wraps.
+    NonWrapping,
+    /// In-bounds offset; UB if it wraps or if old and new pointer are not both
+    /// in-bounds of the same allocation (details are specified by the memory
+    /// model).
+    Inbounds,
+}
+```
+
+This is a very basic memory interface that is incomplete in at least the following ways:
+
+* To implement rules like "dereferencing a null, unaligned, or dangling raw pointer is UB" (even if no memory access happens), there needs to be a way to do an "alignment, bounds and null-check".
+* There needs to be some way to do alignment checks -- either using the above operation, or by adding `align` parameters to `read` and `write`.
+* To represent concurrency, many operations need to take a "thread ID" and `read` and `write` need to take an [`Ordering`].
+* To represent [Stacked Borrows], there needs to be a "retag" operation, and that one will in fact be "lightly typed" (it cares about `UnsafeCell`).
+* Maybe we want operations that can compare pointers without casting them to integers.
+
+But I think it can still be useful to provide some basic terminology and grounds for further discussion.
+
+[pointers-complicated]: https://www.ralfj.de/blog/2018/07/24/pointers-and-bytes.html
+[uninit]: https://www.ralfj.de/blog/2019/07/14/uninit.html
+[`Ordering`]: https://doc.rust-lang.org/nightly/core/sync/atomic/enum.Ordering.html
+[Stacked Borrows]: stacked-borrows.md

From 499fa9ee1090f489c9f1d4ecc95a4b252dcd780d Mon Sep 17 00:00:00 2001
From: Ralf Jung <post@ralfj.de>
Date: Wed, 24 Jul 2019 21:39:36 +0200
Subject: [PATCH 02/12] begin with value domain

---
 wip/memory-interface.md |  29 ++++++----
 wip/value-domain.md     | 120 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 137 insertions(+), 12 deletions(-)
 create mode 100644 wip/value-domain.md

diff --git a/wip/memory-interface.md b/wip/memory-interface.md
index 877ca2d2..bfde5c61 100644
--- a/wip/memory-interface.md
+++ b/wip/memory-interface.md
@@ -16,6 +16,7 @@ This encodes the fact that in Rust, *operations are typed, but memory is not*---
 
 One key question a memory model has to answer is *what is a pointer*.
 It might seem like the answer is just "an integer of appropriate size", but [that is not the case][pointers-complicated].
+This becomes even more prominent with aliasing models such as [Stacked Borrows].
 So we will leave this question open, and treat `Pointer` as an "associated type" of the memory interface
 
 ## Bytes
@@ -31,7 +32,7 @@ enum Byte<Pointer> {
     /// An uninitialized byte.
     Uninit,
     /// One byte of a pointer.
-    Pointer {
+    PtrFragment {
         /// The pointer of which this is a byte.
         ptr: Pointer,
         /// Which byte of the pointer this is.
@@ -41,6 +42,12 @@ enum Byte<Pointer> {
 }
 ```
 
+The purpose of `PtrFragment` is to be able to have a byte-wise representation of a `Pointer`.
+On a 32-bit system, the sequence of 4 bytes representing `ptr: Pointer` is:
+```
+[PtrFragment { ptr, idx: 0 }, PtrFragment { ptr, idx: 1 }, PtrFragment { ptr, idx: 2 }, PtrFragment { ptr, idx: 3 }]
+```
+
 ## Memory interface
 
 The Rust memory interface is described by the following (not-yet-complete) trait definition:
@@ -51,34 +58,32 @@ The Rust memory interface is described by the following (not-yet-complete) trait
 /// We also let all operations potentially mutated memory. For example, reads
 /// actually do change the current state when considering concurrency or
 /// Stacked Borrows.
+/// And finally, all operations are fallible (they return `Result`); if they
+/// fail, that means the program caused UB.
 trait Memory {
     /// The type of pointer values.
     type Pointer;
 
-    /// The type of memory errors (i.e., ways in which the program can cause UB
-    /// by interacting with memory).
-    type Error;
-
     /// Create a new allocation.
-    fn allocate(&mut self, size: u64, align: u64) -> Result<Self::Pointer, Self::Error>;
+    fn allocate(&mut self, size: u64, align: u64) -> Result<Self::Pointer, Error>;
 
     /// Remove an allocation.
-    fn deallocate(&mut self, ptr: Self::Pointer, size: u64, align: u64) -> Result<(), Self::Error>;
+    fn deallocate(&mut self, ptr: Self::Pointer, size: u64, align: u64) -> Result<(), Error>;
 
     /// Write some bytes to memory.
-    fn write(&mut self, ptr: Self::Pointer, bytes: Vec<Byte<Self::Pointer>>) -> Result<(), Self::Error>;
+    fn write(&mut self, ptr: Self::Pointer, bytes: Vec<Byte<Self::Pointer>>) -> Result<(), Error>;
 
     /// Read some bytes from memory.
-    fn read(&mut self, ptr: Self::Pointer, len: u64) -> Result<Vec<Byte<Self::Pointer>>, Self::Error>;
+    fn read(&mut self, ptr: Self::Pointer, len: u64) -> Result<Vec<Byte<Self::Pointer>>, Error>;
 
     /// Offset the given pointer.
-    fn offset(&mut self, ptr: Self::Pointer, offset: u64, mode: OffsetMode) -> Result<Self::Pointer, Self::Error>;
+    fn offset(&mut self, ptr: Self::Pointer, offset: u64, mode: OffsetMode) -> Result<Self::Pointer, Error>;
 
     /// Cast the given pointer to an integer.
-    fn ptr_to_int(&mut self, ptr: Self::Pointer) -> Result<u64, Self::Error>;
+    fn ptr_to_int(&mut self, ptr: Self::Pointer) -> Result<u64, Error>;
 
     /// Cast the given integer to a pointer.
-    fn int_to_ptr(&mut self, int: u64) -> Result<Self::Pointer, Self::Error>;
+    fn int_to_ptr(&mut self, int: u64) -> Result<Self::Pointer, Error>;
 }
 
 /// The rules applying to this pointer offset operation.
diff --git a/wip/value-domain.md b/wip/value-domain.md
new file mode 100644
index 00000000..e63e7e48
--- /dev/null
+++ b/wip/value-domain.md
@@ -0,0 +1,120 @@
+# Rust Value Domain
+
+**Note:** This document is not normative nor endorsed by the UCG WG. Its purpose is to be the basis for discussion and to set down some key terminology.
+
+The purpose of this document is to describe what the set of *all possible values* is in Rust.
+This is an important definition: one key element of a Rust specification will be to define the [representation relation][representation] of every type.
+This relation relates values with lists of bytes, so before we can even start specifying the relation we have to specify the involved domains.
+`Byte` is defined as part of [the memory interface][memory-interface]; this document is about defining `Value`.
+
+[representation]: https://github.com/rust-lang/unsafe-code-guidelines/blob/master/reference/src/glossary.md#representation
+[memory-interface]: memory-interface.md
+
+## Values
+
+The Rust value domain is described by the following (incomplete) type definition (in terms of an arbitrary `Pointer` type instantiated by [the memory interface][memory-interface]):
+
+```rust
+enum Value<Pointer> {
+    /// A mathematical integer.
+    Int(BigInt),
+    /// A Boolean value.
+    Bool(bool),
+    /// A pointer.
+    Ptr(Pointer),
+    /// A zero-sized "unit".
+    Unit,
+    /// An uninitialized value.
+    Uninit,
+    /// An n-tuple.
+    Tuple(Vec<Self>),
+    /// A variant of a sum type.
+    Variant {
+        idx: u64,
+        data: Box<Self>,
+    },
+    /* ... */
+}
+```
+
+As Rust grows, we might expand this definition. That is okay; all previously defined representation relations are still well-defined when the domain grows, the newly added values will just not be valid for old types as one would expect.
+
+## Example value relations
+
+We show some examples for how one might want to use this `Value` domain to define the value relation for a type.
+
+### `bool`
+
+The value relation for `bool` relates `Bool(true)` with `[Raw(0)]` and `Bool(false)` with [`Raw(1)`], and that's it.
+
+### `()`
+
+The value relation for the `()` type relates `Unit` with the empty list `[]`, and that's it.
+
+### `!`
+
+The value relation for the `!` type is empty: nothing is related to anything at this type.
+
+### `#[repr(C)] struct Pair<T, U>(T, U)`
+
+The value relation for `Pair` us based on the value relations for `T` and `U`.
+A value `Tuple([t, u])` (assuming we can use array notation to "match" on `Vec`) is represented by a list of bytes `rt ++ pad1 ++ ru ++ pad2` (using `++` for list concatenation) if:
+
+* `t` is represented by `rt` at type `T`.
+* `u` is represented by `ru` at type `U`.
+* The length of `rt ++ pad1` is equal to the offset of the `U` field.
+* The length of the entire list `rt ++ pad1 ++ ru ++ pad2` is equal to the size of `Pair<T, U>`.
+
+This relation demonstrates that value of type `Pair` are always 2-tuples (aka, pairs).
+It also shows that the actual content of the padding bytes is entirely irrelevant, we only care to have the right number of them to "pad" `ru` to the right place and to "pad" the entire list to have the right length.
+
+### `&T`/`&mut T`
+
+Reference types are tricky.
+But a possible value relation for sized `T` is:
+A value `Ptr(ptr)` is related to `[PtrFragment { ptr, idx: 0 }, ..., PtrFragment { ptr, idx: N-1 }]` where `N == size_of::<usize>()` if `ptr` is non-NULL and aligned to `align_of::<T>()`.
+
+### `u8`
+
+For the value representation of integer types, there are two different reasonable choices.
+Certainly, a value `Int(i)` where `i` in `0..256` is related to `[Raw(i as u8)]`.
+
+And then, maybe, we also want to additionally say that value `Uninit` is related to `[Uninit]`.
+This essentially corresponds to saying that uninitialized memory is a valid representation of a `u8` value (namely, the uninitialized value).
+
+## The role of the value representation in the operational semantics
+
+One key use of the value representation is to define a "typed" interface to memory:
+
+```rust
+trait TypedMemory: Memory {
+    /// Write a value of the given type to memory.
+    fn typed_write(&mut self, ptr: Self::Pointer, val: Value, ty: Type) -> Result<(), Error>;
+
+    /// Read a value of the given type.
+    fn typed_read(&mut self, ptr: Self::Pointer, ty: Type) -> Result<Value, Error>;
+}
+```
+
+here, `Type` is some representation of the Rust type system (akin to [`Ty`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc/ty/type.Ty.html) in the compiler).
+We can implement `TypedMemory` for any `Memory` as follows:
+* For `typed_write`, pick any representation of `val` for `ty`, and call `Memory::write`. If no representation exists, we have UB.
+* For `typed_read`, read `ty.size()` many bytes from memory, and then determine which value this list of bytes represents. If it does not represent any value, we have UB.
+
+In particular, this defines the semantics of a "typed copy": when an assignment is executed in Rust (at some type `T`), or when an argument of type `T` is passed to a function or a return value of type `T` returned from a function, this is called a "typed copy".
+Its semantics is basically to do a `typed_read` at `T` followed by a `typed_write` at `T`.
+In particular, this means doing a "typed copy" at `T` of some memory that has no valid representation at `T` is undefined behavior!
+This also means that for types that have padding, the "typed copy" does not preserve the padding bytes.
+
+## Relation to validity invariant
+
+One way we *could* also use the value representation (and the author things this is exceedingly elegant) is to define the validity invariant.
+Certainly, it is the case that if a list of bytes is not related to any value for a given type `T`, then that list of bytes is *invalid* for `T` and it should be UB to produce such a list of bytes at type `T`.
+We could decide that this is an "if and only if", i.e., that the validity invariant for a type is exactly "must be in the value representation".
+For many types this is likely what we will do anyway (e.g., for `bool` and `!` and `()` and integers), but for references, this choice would mean that *validity of the reference cannot depend on what memory looks like*---so "dereferencable" and "points to valid data" cannot be part of the validity invariant for references.
+The reason this is so elegant is that, as we have seen above, a "typed copy" already very naturally is UB when the memory that is copied is not a valid representation of `T`.
+This means we do not even need a special clause in our specification for the validity invariant---in fact, the term does not even have to appear in the specification---as everything juts falls out of how a "typed copy" applies the value representation twice.
+
+Justifying the `dereferencable` LLVM attribute is, in this case, left to the aliasing model (e.g. [Stacked Borrows]), just like that is needed to justify the `noalias` attribute.
+
+[Stacked Borrows]: stacked-borrows.md

From 648ec0226f6fb2ebc791b4f6df257ecd25ee748e Mon Sep 17 00:00:00 2001
From: Ralf Jung <post@ralfj.de>
Date: Thu, 25 Jul 2019 11:28:40 +0200
Subject: [PATCH 03/12] ptr-to-byte

---
 wip/memory-interface.md | 34 ++++++++++++++++++++++++++++------
 wip/value-domain.md     |  9 +++++----
 2 files changed, 33 insertions(+), 10 deletions(-)

diff --git a/wip/memory-interface.md b/wip/memory-interface.md
index bfde5c61..ccc292bb 100644
--- a/wip/memory-interface.md
+++ b/wip/memory-interface.md
@@ -11,6 +11,7 @@ The interface is also opinionated in several ways; this is not intended to be ab
 For example, it explicitly acknowledges that pointers are not just integers and that uninitialized memory is special (both are true for C and C++ as well but you have to read the standard very careful, and consult non-normative defect report responses, to see this).
 Another key property of the interface presented below is that it is *untyped*.
 This encodes the fact that in Rust, *operations are typed, but memory is not*---a key difference to C and C++ with their type-based strict aliasing rules.
+At the same time, the memory model provides a *side-effect free* way to turn pointers into "raw bytes", which is *not* [the direction C++ is moving towards](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2364.pdf), so we might have to revisit this choice later.
 
 ## Pointers
 
@@ -36,7 +37,7 @@ enum Byte<Pointer> {
         /// The pointer of which this is a byte.
         ptr: Pointer,
         /// Which byte of the pointer this is.
-        /// `idx` will always be in `0..size_of::<usize>()`.
+        /// `idx` will always be in `0..PTR_SIZE`.
         idx: u8,
     }
 }
@@ -48,6 +49,21 @@ On a 32-bit system, the sequence of 4 bytes representing `ptr: Pointer` is:
 [PtrFragment { ptr, idx: 0 }, PtrFragment { ptr, idx: 1 }, PtrFragment { ptr, idx: 2 }, PtrFragment { ptr, idx: 3 }]
 ```
 
+Based on the `PtrToInt` trait (see below), we can turn every initialized `Byte` into an integer in `0..256`:
+
+```rust
+impl<Pointer: PtrToInt> Byte<Pointer> {
+    fn as_int(self) -> Option<u8> {
+        match self {
+            Byte::Raw(int) => Some(int),
+            Byte::Uninit => None,
+            Byte::PtrFragment { ptr, idx } =>
+                ptr.get_byte(idx),
+        }
+    }
+}
+```
+
 ## Memory interface
 
 The Rust memory interface is described by the following (not-yet-complete) trait definition:
@@ -55,14 +71,17 @@ The Rust memory interface is described by the following (not-yet-complete) trait
 ```rust
 /// *Note*: All memory operations can be non-deterministic, which means that
 /// executing the same operation on the same memory can have different results.
-/// We also let all operations potentially mutated memory. For example, reads
+/// We also let all operations potentially mutate memory. For example, reads
 /// actually do change the current state when considering concurrency or
 /// Stacked Borrows.
 /// And finally, all operations are fallible (they return `Result`); if they
 /// fail, that means the program caused UB.
 trait Memory {
     /// The type of pointer values.
-    type Pointer;
+    type Pointer: Copy + PtrToInt;
+
+    /// The size of pointer values.
+    const PTR_SIZE: u64;
 
     /// Create a new allocation.
     fn allocate(&mut self, size: u64, align: u64) -> Result<Self::Pointer, Error>;
@@ -79,13 +98,16 @@ trait Memory {
     /// Offset the given pointer.
     fn offset(&mut self, ptr: Self::Pointer, offset: u64, mode: OffsetMode) -> Result<Self::Pointer, Error>;
 
-    /// Cast the given pointer to an integer.
-    fn ptr_to_int(&mut self, ptr: Self::Pointer) -> Result<u64, Error>;
-
     /// Cast the given integer to a pointer.
     fn int_to_ptr(&mut self, int: u64) -> Result<Self::Pointer, Error>;
 }
 
+/// The `Pointer` type must know how to extract its bytes, *without any access to the `Memory`*.
+trait PtrToInt {
+    /// Get the `idx`-th byte of the pointer.  `idx` must be in `0..PTR_SIZE`.
+    fn get_byte(self, idx: u8) -> u8;
+}
+
 /// The rules applying to this pointer offset operation.
 enum OffsetMode {
     /// Wrapping offset; never UB.
diff --git a/wip/value-domain.md b/wip/value-domain.md
index e63e7e48..b637b451 100644
--- a/wip/value-domain.md
+++ b/wip/value-domain.md
@@ -4,7 +4,8 @@
 
 The purpose of this document is to describe what the set of *all possible values* is in Rust.
 This is an important definition: one key element of a Rust specification will be to define the [representation relation][representation] of every type.
-This relation relates values with lists of bytes, so before we can even start specifying the relation we have to specify the involved domains.
+This relation relates values with lists of bytes: it says, for a given value and list of bytes, if that value is represented by that list.
+However, before we can even start specifying the relation, we have to specify the involved domains.
 `Byte` is defined as part of [the memory interface][memory-interface]; this document is about defining `Value`.
 
 [representation]: https://github.com/rust-lang/unsafe-code-guidelines/blob/master/reference/src/glossary.md#representation
@@ -45,7 +46,7 @@ We show some examples for how one might want to use this `Value` domain to defin
 
 ### `bool`
 
-The value relation for `bool` relates `Bool(true)` with `[Raw(0)]` and `Bool(false)` with [`Raw(1)`], and that's it.
+The value relation for `bool` relates `Bool(b)` with `[bb]` if and only if `bb.as_int() == Some(if b { 1 } else { 0 })`.
 
 ### `()`
 
@@ -72,12 +73,12 @@ It also shows that the actual content of the padding bytes is entirely irrelevan
 
 Reference types are tricky.
 But a possible value relation for sized `T` is:
-A value `Ptr(ptr)` is related to `[PtrFragment { ptr, idx: 0 }, ..., PtrFragment { ptr, idx: N-1 }]` where `N == size_of::<usize>()` if `ptr` is non-NULL and aligned to `align_of::<T>()`.
+A value `Ptr(ptr)` is related to `[PtrFragment { ptr, idx: 0 }, ..., PtrFragment { ptr, idx: PTR_SIZE-1 }]` if `ptr` is non-NULL and appropriately aligned (defining alignment is left open for now).
 
 ### `u8`
 
 For the value representation of integer types, there are two different reasonable choices.
-Certainly, a value `Int(i)` where `i` in `0..256` is related to `[Raw(i as u8)]`.
+Certainly, a value `Int(i)` where `i` in `0..256` is related to `[b]` if `b.as_int() == Some(i)`.
 
 And then, maybe, we also want to additionally say that value `Uninit` is related to `[Uninit]`.
 This essentially corresponds to saying that uninitialized memory is a valid representation of a `u8` value (namely, the uninitialized value).

From ca891ac9199afecce16a752487194aa47fc1bc15 Mon Sep 17 00:00:00 2001
From: Ralf Jung <post@ralfj.de>
Date: Thu, 25 Jul 2019 11:58:49 +0200
Subject: [PATCH 04/12] more editing

---
 wip/memory-interface.md | 22 +++++++++++-----------
 wip/value-domain.md     | 31 +++++++++++++++++++------------
 2 files changed, 30 insertions(+), 23 deletions(-)

diff --git a/wip/memory-interface.md b/wip/memory-interface.md
index ccc292bb..8cd6dabe 100644
--- a/wip/memory-interface.md
+++ b/wip/memory-interface.md
@@ -4,21 +4,22 @@
 
 The purpose of this document is to describe the interface between a Rust program and memory.
 This interface is a key part of the Rust Abstract Machine: it lets us separate concerns by splitting the Machine (i.e., its specification) into two pieces, connected by this well-defined interface:
-* The *expression/statement semantics* of Rust boils down to explaining which "memroy events" (calls to the memory interface) happen in which order.
+* The *expression/statement semantics* of Rust boils down to explaining which "memroy events" (calls to the memory interface) happen in which order. This part of the specification is *pure* in the sense that it has no "state": everything that needs to be remembered from one expression evaluation to the next is communicated through memory.
 * The Rust *memory model* explains which interactions with the memory are legal (the others are UB), and which values can be returned by reads.
 
-The interface is also opinionated in several ways; this is not intended to be able to support *any imaginable* memory model, but rather start the process of reducing the design space of what we consider a "reasonable" memory model for Rust.
-For example, it explicitly acknowledges that pointers are not just integers and that uninitialized memory is special (both are true for C and C++ as well but you have to read the standard very careful, and consult non-normative defect report responses, to see this).
+The interface shown below is also opinionated in several ways.
+It is not intended to be able to support *any imaginable* memory model, but rather start the process of reducing the design space of what we consider a "reasonable" memory model for Rust.
+For example, it explicitly acknowledges that pointers are not just integers and that uninitialized memory is special (both are true for C and C++ as well but you have to read the standard very careful, and consult [non-normative defect report responses](http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_260.htm), to see this).
 Another key property of the interface presented below is that it is *untyped*.
-This encodes the fact that in Rust, *operations are typed, but memory is not*---a key difference to C and C++ with their type-based strict aliasing rules.
-At the same time, the memory model provides a *side-effect free* way to turn pointers into "raw bytes", which is *not* [the direction C++ is moving towards](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2364.pdf), so we might have to revisit this choice later.
+This implies that in Rust, *operations are typed, but memory is not*---a key difference to C and C++ with their type-based strict aliasing rules.
+At the same time, the memory model provides a *side-effect free* way to turn pointers into "raw bytes", which is *not* [the direction C++ is moving towards](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2364.pdf), and we might have to revisit this choice later if it turns out to not be workable.
 
 ## Pointers
 
 One key question a memory model has to answer is *what is a pointer*.
 It might seem like the answer is just "an integer of appropriate size", but [that is not the case][pointers-complicated].
 This becomes even more prominent with aliasing models such as [Stacked Borrows].
-So we will leave this question open, and treat `Pointer` as an "associated type" of the memory interface
+So the interface will leave it up to the concrete instance to answer this question, and carry `Pointer` as an associated type.
 
 ## Bytes
 
@@ -58,7 +59,7 @@ impl<Pointer: PtrToInt> Byte<Pointer> {
             Byte::Raw(int) => Some(int),
             Byte::Uninit => None,
             Byte::PtrFragment { ptr, idx } =>
-                ptr.get_byte(idx),
+                Some(ptr.get_byte(idx)),
         }
     }
 }
@@ -96,9 +97,10 @@ trait Memory {
     fn read(&mut self, ptr: Self::Pointer, len: u64) -> Result<Vec<Byte<Self::Pointer>>, Error>;
 
     /// Offset the given pointer.
-    fn offset(&mut self, ptr: Self::Pointer, offset: u64, mode: OffsetMode) -> Result<Self::Pointer, Error>;
+    fn offset(&mut self, ptr: Self::Pointer, offset: u64, mode: OffsetMode)
+        -> Result<Self::Pointer, Error>;
 
-    /// Cast the given integer to a pointer.
+    /// Cast the given integer to a pointer.  (The other direction is handled by `PtrToInt` below.)
     fn int_to_ptr(&mut self, int: u64) -> Result<Self::Pointer, Error>;
 }
 
@@ -129,8 +131,6 @@ This is a very basic memory interface that is incomplete in at least the followi
 * To represent [Stacked Borrows], there needs to be a "retag" operation, and that one will in fact be "lightly typed" (it cares about `UnsafeCell`).
 * Maybe we want operations that can compare pointers without casting them to integers.
 
-But I think it can still be useful to provide some basic terminology and grounds for further discussion.
-
 [pointers-complicated]: https://www.ralfj.de/blog/2018/07/24/pointers-and-bytes.html
 [uninit]: https://www.ralfj.de/blog/2019/07/14/uninit.html
 [`Ordering`]: https://doc.rust-lang.org/nightly/core/sync/atomic/enum.Ordering.html
diff --git a/wip/value-domain.md b/wip/value-domain.md
index b637b451..a8fa24e4 100644
--- a/wip/value-domain.md
+++ b/wip/value-domain.md
@@ -23,8 +23,6 @@ enum Value<Pointer> {
     Bool(bool),
     /// A pointer.
     Ptr(Pointer),
-    /// A zero-sized "unit".
-    Unit,
     /// An uninitialized value.
     Uninit,
     /// An n-tuple.
@@ -34,6 +32,8 @@ enum Value<Pointer> {
         idx: u64,
         data: Box<Self>,
     },
+    /// A "bag of raw bytes".
+    RawBag(Vec<Byte<Pointer>>),
     /* ... */
 }
 ```
@@ -46,11 +46,12 @@ We show some examples for how one might want to use this `Value` domain to defin
 
 ### `bool`
 
-The value relation for `bool` relates `Bool(b)` with `[bb]` if and only if `bb.as_int() == Some(if b { 1 } else { 0 })`.
+The value relation for `bool` relates `Bool(b)` with `[r]` if and only if `r.as_int() == Some(if b { 1 } else { 0 })`.
+(`as_int` is defined in [the memory interface][memory-interface].)
 
 ### `()`
 
-The value relation for the `()` type relates `Unit` with the empty list `[]`, and that's it.
+The value relation for the `()` type relates the empty tuple `Tuple([])` (assuming we can use array notation to "match" on `Vec`) with the empty byte list `[]`, and that's it.
 
 ### `!`
 
@@ -58,16 +59,17 @@ The value relation for the `!` type is empty: nothing is related to anything at
 
 ### `#[repr(C)] struct Pair<T, U>(T, U)`
 
-The value relation for `Pair` us based on the value relations for `T` and `U`.
-A value `Tuple([t, u])` (assuming we can use array notation to "match" on `Vec`) is represented by a list of bytes `rt ++ pad1 ++ ru ++ pad2` (using `++` for list concatenation) if:
+The value relation for `Pair` is based on the value relations for `T` and `U`.
+A value `Tuple([t, u])` is represented by a list of bytes `rt ++ pad1 ++ ru ++ pad2` (using `++` for list concatenation) if:
 
 * `t` is represented by `rt` at type `T`.
 * `u` is represented by `ru` at type `U`.
 * The length of `rt ++ pad1` is equal to the offset of the `U` field.
 * The length of the entire list `rt ++ pad1 ++ ru ++ pad2` is equal to the size of `Pair<T, U>`.
 
-This relation demonstrates that value of type `Pair` are always 2-tuples (aka, pairs).
-It also shows that the actual content of the padding bytes is entirely irrelevant, we only care to have the right number of them to "pad" `ru` to the right place and to "pad" the entire list to have the right length.
+This relation specifies that values of type `Pair` are always 2-tuples (aka, pairs).
+It also says that the actual content of the padding bytes is entirely irrelevant, we only care to have the right number of them to "pad" `ru` to the right place and to "pad" the entire list to have the right length.
+So, for example when considering `Pair<u8, u16>`, the value `Tuple[42, 119]` is represented on a little-endian target by `[Raw(42), byte, Raw(119), Raw(0)]` for *any* `byte: Byte`.
 
 ### `&T`/`&mut T`
 
@@ -80,9 +82,14 @@ A value `Ptr(ptr)` is related to `[PtrFragment { ptr, idx: 0 }, ..., PtrFragment
 For the value representation of integer types, there are two different reasonable choices.
 Certainly, a value `Int(i)` where `i` in `0..256` is related to `[b]` if `b.as_int() == Some(i)`.
 
-And then, maybe, we also want to additionally say that value `Uninit` is related to `[Uninit]`.
+And then, maybe, we also want to additionally say that value `Uninit` is related to byte list `[Uninit]`.
 This essentially corresponds to saying that uninitialized memory is a valid representation of a `u8` value (namely, the uninitialized value).
 
+### `union`
+
+The `union` type does not even try to interpret memory, so for a `union` of size `n`, the value relation says that for any byte list `bytes` of that length, `RawBag(bytes)` is related to `bytes`.
+(Note however that [this definition might not be implementable](https://github.com/rust-lang/unsafe-code-guidelines/issues/156).)
+
 ## The role of the value representation in the operational semantics
 
 One key use of the value representation is to define a "typed" interface to memory:
@@ -97,7 +104,7 @@ trait TypedMemory: Memory {
 }
 ```
 
-here, `Type` is some representation of the Rust type system (akin to [`Ty`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc/ty/type.Ty.html) in the compiler).
+Here, `Type` is some representation of the Rust type system (akin to [`Ty`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc/ty/type.Ty.html) in the compiler).
 We can implement `TypedMemory` for any `Memory` as follows:
 * For `typed_write`, pick any representation of `val` for `ty`, and call `Memory::write`. If no representation exists, we have UB.
 * For `typed_read`, read `ty.size()` many bytes from memory, and then determine which value this list of bytes represents. If it does not represent any value, we have UB.
@@ -109,13 +116,13 @@ This also means that for types that have padding, the "typed copy" does not pres
 
 ## Relation to validity invariant
 
-One way we *could* also use the value representation (and the author things this is exceedingly elegant) is to define the validity invariant.
+One way we *could* also use the value representation (and the author thinks this is exceedingly elegant) is to define the validity invariant.
 Certainly, it is the case that if a list of bytes is not related to any value for a given type `T`, then that list of bytes is *invalid* for `T` and it should be UB to produce such a list of bytes at type `T`.
 We could decide that this is an "if and only if", i.e., that the validity invariant for a type is exactly "must be in the value representation".
 For many types this is likely what we will do anyway (e.g., for `bool` and `!` and `()` and integers), but for references, this choice would mean that *validity of the reference cannot depend on what memory looks like*---so "dereferencable" and "points to valid data" cannot be part of the validity invariant for references.
 The reason this is so elegant is that, as we have seen above, a "typed copy" already very naturally is UB when the memory that is copied is not a valid representation of `T`.
 This means we do not even need a special clause in our specification for the validity invariant---in fact, the term does not even have to appear in the specification---as everything juts falls out of how a "typed copy" applies the value representation twice.
 
-Justifying the `dereferencable` LLVM attribute is, in this case, left to the aliasing model (e.g. [Stacked Borrows]), just like that is needed to justify the `noalias` attribute.
+Justifying the `dereferencable` LLVM attribute is, in this case, left to the aliasing model (e.g. [Stacked Borrows]), just like the `noalias` attribute.
 
 [Stacked Borrows]: stacked-borrows.md

From 168249a3775f5cdfcc1a68c18120cdcde7ea2119 Mon Sep 17 00:00:00 2001
From: Ralf Jung <post@ralfj.de>
Date: Thu, 25 Jul 2019 19:16:37 +0200
Subject: [PATCH 05/12] tweaks

---
 wip/memory-interface.md | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/wip/memory-interface.md b/wip/memory-interface.md
index 8cd6dabe..5166633b 100644
--- a/wip/memory-interface.md
+++ b/wip/memory-interface.md
@@ -4,8 +4,8 @@
 
 The purpose of this document is to describe the interface between a Rust program and memory.
 This interface is a key part of the Rust Abstract Machine: it lets us separate concerns by splitting the Machine (i.e., its specification) into two pieces, connected by this well-defined interface:
-* The *expression/statement semantics* of Rust boils down to explaining which "memroy events" (calls to the memory interface) happen in which order. This part of the specification is *pure* in the sense that it has no "state": everything that needs to be remembered from one expression evaluation to the next is communicated through memory.
-* The Rust *memory model* explains which interactions with the memory are legal (the others are UB), and which values can be returned by reads.
+* The *expression/statement semantics* of Rust boils down to explaining which "memroy events" (calls to the memory interface) happen in which order - expressed as calls to the methods of this interface, and reactions to its return values. This part of the specification is *pure* in the sense that it has no "state": everything that needs to be remembered from one expression evaluation to the next is communicated through memory.
+* The Rust *memory model* explains which interactions with the memory are legal (the others are UB), and which values can be returned by reads. A memory model is defined by implementing the memory interface.
 
 The interface shown below is also opinionated in several ways.
 It is not intended to be able to support *any imaginable* memory model, but rather start the process of reducing the design space of what we consider a "reasonable" memory model for Rust.
@@ -20,6 +20,9 @@ One key question a memory model has to answer is *what is a pointer*.
 It might seem like the answer is just "an integer of appropriate size", but [that is not the case][pointers-complicated].
 This becomes even more prominent with aliasing models such as [Stacked Borrows].
 So the interface will leave it up to the concrete instance to answer this question, and carry `Pointer` as an associated type.
+Practically speaking, `Pointer` will be some representation of an "address", plus [provenance] information.
+
+[provenance]: https://github.com/rust-lang/unsafe-code-guidelines/blob/master/reference/src/glossary.md#pointer-provenance
 
 ## Bytes
 

From 53b899ea0daa2d682914dcac362bc6123da44052 Mon Sep 17 00:00:00 2001
From: Ralf Jung <post@ralfj.de>
Date: Thu, 25 Jul 2019 20:16:29 +0200
Subject: [PATCH 06/12] address many review comments

---
 wip/memory-interface.md | 39 +++++++++++++++++++++++++--------------
 wip/value-domain.md     | 22 ++++++++++++----------
 2 files changed, 37 insertions(+), 24 deletions(-)

diff --git a/wip/memory-interface.md b/wip/memory-interface.md
index 5166633b..a3dfcc9d 100644
--- a/wip/memory-interface.md
+++ b/wip/memory-interface.md
@@ -9,9 +9,9 @@ This interface is a key part of the Rust Abstract Machine: it lets us separate c
 
 The interface shown below is also opinionated in several ways.
 It is not intended to be able to support *any imaginable* memory model, but rather start the process of reducing the design space of what we consider a "reasonable" memory model for Rust.
-For example, it explicitly acknowledges that pointers are not just integers and that uninitialized memory is special (both are true for C and C++ as well but you have to read the standard very careful, and consult [non-normative defect report responses](http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_260.htm), to see this).
+For example, it explicitly acknowledges that pointers are not just integers and that uninitialized memory is special (both are true for C and C++ as well but you have to read the standard very careful, and consult [defect report responses](http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_260.htm), to see this).
 Another key property of the interface presented below is that it is *untyped*.
-This implies that in Rust, *operations are typed, but memory is not*---a key difference to C and C++ with their type-based strict aliasing rules.
+This implies that in Rust, *operations are typed, but memory is not* - a key difference to C and C++ with their type-based strict aliasing rules.
 At the same time, the memory model provides a *side-effect free* way to turn pointers into "raw bytes", which is *not* [the direction C++ is moving towards](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2364.pdf), and we might have to revisit this choice later if it turns out to not be workable.
 
 ## Pointers
@@ -19,7 +19,7 @@ At the same time, the memory model provides a *side-effect free* way to turn poi
 One key question a memory model has to answer is *what is a pointer*.
 It might seem like the answer is just "an integer of appropriate size", but [that is not the case][pointers-complicated].
 This becomes even more prominent with aliasing models such as [Stacked Borrows].
-So the interface will leave it up to the concrete instance to answer this question, and carry `Pointer` as an associated type.
+So we leave it up to the memory model to answer this question, and make `Pointer` an associated type.
 Practically speaking, `Pointer` will be some representation of an "address", plus [provenance] information.
 
 [provenance]: https://github.com/rust-lang/unsafe-code-guidelines/blob/master/reference/src/glossary.md#pointer-provenance
@@ -27,7 +27,7 @@ Practically speaking, `Pointer` will be some representation of an "address", plu
 ## Bytes
 
 The unit of communication between the memory model and the rest of the program is a *byte*.
-Again the question of "what is a byte" is not as trivial as it might seem; beyond `u8` values we have to represent `Pointer`s and [uninitialized memory][uninit].
+Again, the question of "what is a byte" is not as trivial as it might seem; beyond `u8` values we have to represent `Pointer`s and [uninitialized memory][uninit].
 We define the `Byte` type (in terms of an arbitrary `Pointer` type) as follows:
 
 ```rust
@@ -39,6 +39,7 @@ enum Byte<Pointer> {
     /// One byte of a pointer.
     PtrFragment {
         /// The pointer of which this is a byte.
+        /// That is, the byte is a fragment of this pointer.
         ptr: Pointer,
         /// Which byte of the pointer this is.
         /// `idx` will always be in `0..PTR_SIZE`.
@@ -47,10 +48,15 @@ enum Byte<Pointer> {
 }
 ```
 
-The purpose of `PtrFragment` is to be able to have a byte-wise representation of a `Pointer`.
+The purpose of `PtrFragment` is to enable a byte-wise representation of a `Pointer`.
 On a 32-bit system, the sequence of 4 bytes representing `ptr: Pointer` is:
 ```
-[PtrFragment { ptr, idx: 0 }, PtrFragment { ptr, idx: 1 }, PtrFragment { ptr, idx: 2 }, PtrFragment { ptr, idx: 3 }]
+[
+    PtrFragment { ptr, idx: 0 },
+    PtrFragment { ptr, idx: 1 },
+    PtrFragment { ptr, idx: 2 },
+    PtrFragment { ptr, idx: 3 },
+]
 ```
 
 Based on the `PtrToInt` trait (see below), we can turn every initialized `Byte` into an integer in `0..256`:
@@ -73,13 +79,18 @@ impl<Pointer: PtrToInt> Byte<Pointer> {
 The Rust memory interface is described by the following (not-yet-complete) trait definition:
 
 ```rust
+/// All operations are fallible, so they return `Result`.  If they fail, that
+/// means the program caused UB. What exactly the `UndefinedBehavior` type is
+/// does not matter here.
+type Result<T=()> = std::result::Result<T, UndefinedBehavior>;
+
 /// *Note*: All memory operations can be non-deterministic, which means that
 /// executing the same operation on the same memory can have different results.
 /// We also let all operations potentially mutate memory. For example, reads
 /// actually do change the current state when considering concurrency or
 /// Stacked Borrows.
-/// And finally, all operations are fallible (they return `Result`); if they
-/// fail, that means the program caused UB.
+/// This is pseudo-Rust, so we just use fully owned types everywhere for
+/// symmetry and simplicity.
 trait Memory {
     /// The type of pointer values.
     type Pointer: Copy + PtrToInt;
@@ -88,23 +99,23 @@ trait Memory {
     const PTR_SIZE: u64;
 
     /// Create a new allocation.
-    fn allocate(&mut self, size: u64, align: u64) -> Result<Self::Pointer, Error>;
+    fn allocate(&mut self, size: u64, align: u64) -> Result<Self::Pointer>;
 
     /// Remove an allocation.
-    fn deallocate(&mut self, ptr: Self::Pointer, size: u64, align: u64) -> Result<(), Error>;
+    fn deallocate(&mut self, ptr: Self::Pointer, size: u64, align: u64) -> Result;
 
     /// Write some bytes to memory.
-    fn write(&mut self, ptr: Self::Pointer, bytes: Vec<Byte<Self::Pointer>>) -> Result<(), Error>;
+    fn write(&mut self, ptr: Self::Pointer, bytes: Vec<Byte<Self::Pointer>>) -> Result;
 
     /// Read some bytes from memory.
-    fn read(&mut self, ptr: Self::Pointer, len: u64) -> Result<Vec<Byte<Self::Pointer>>, Error>;
+    fn read(&mut self, ptr: Self::Pointer, len: u64) -> Result<Vec<Byte<Self::Pointer>>>;
 
     /// Offset the given pointer.
     fn offset(&mut self, ptr: Self::Pointer, offset: u64, mode: OffsetMode)
-        -> Result<Self::Pointer, Error>;
+        -> Result<Self::Pointer>;
 
     /// Cast the given integer to a pointer.  (The other direction is handled by `PtrToInt` below.)
-    fn int_to_ptr(&mut self, int: u64) -> Result<Self::Pointer, Error>;
+    fn int_to_ptr(&mut self, int: u64) -> Result<Self::Pointer>;
 }
 
 /// The `Pointer` type must know how to extract its bytes, *without any access to the `Memory`*.
diff --git a/wip/value-domain.md b/wip/value-domain.md
index a8fa24e4..7be69fc0 100644
--- a/wip/value-domain.md
+++ b/wip/value-domain.md
@@ -17,28 +17,30 @@ The Rust value domain is described by the following (incomplete) type definition
 
 ```rust
 enum Value<Pointer> {
-    /// A mathematical integer.
+    /// A mathematical integer, used for `i*`/`u*` types.
     Int(BigInt),
-    /// A Boolean value.
+    /// A Boolean value, used for `bool`.
     Bool(bool),
-    /// A pointer.
+    /// A pointer value, used for (thin) references and raw pointers.
     Ptr(Pointer),
     /// An uninitialized value.
     Uninit,
-    /// An n-tuple.
+    /// An n-tuple, used for arrays, structs, tuples, SIMD vectors.
     Tuple(Vec<Self>),
-    /// A variant of a sum type.
+    /// A variant of a sum type, used for enums.
     Variant {
-        idx: u64,
+        idx: BigInt,
         data: Box<Self>,
     },
-    /// A "bag of raw bytes".
+    /// A "bag of raw bytes", used for unions.
     RawBag(Vec<Byte<Pointer>>),
     /* ... */
 }
 ```
 
-As Rust grows, we might expand this definition. That is okay; all previously defined representation relations are still well-defined when the domain grows, the newly added values will just not be valid for old types as one would expect.
+The point of this type is to capture the mathematical concepts that are represented by the data we store in memory.
+The definition is likely incomplete, and even if it was complete now, we might expand it as Rust grows.
+That is okay; all previously defined representation relations are still well-defined when the domain grows, the newly added values will just not be valid for old types as one would expect.
 
 ## Example value relations
 
@@ -97,10 +99,10 @@ One key use of the value representation is to define a "typed" interface to memo
 ```rust
 trait TypedMemory: Memory {
     /// Write a value of the given type to memory.
-    fn typed_write(&mut self, ptr: Self::Pointer, val: Value, ty: Type) -> Result<(), Error>;
+    fn typed_write(&mut self, ptr: Self::Pointer, val: Value, ty: Type) -> Result;
 
     /// Read a value of the given type.
-    fn typed_read(&mut self, ptr: Self::Pointer, ty: Type) -> Result<Value, Error>;
+    fn typed_read(&mut self, ptr: Self::Pointer, ty: Type) -> Result<Value>;
 }
 ```
 

From 91648b1f95eb6b814efb281f7953d34256be9eb5 Mon Sep 17 00:00:00 2001
From: Ralf Jung <post@ralfj.de>
Date: Thu, 25 Jul 2019 20:26:59 +0200
Subject: [PATCH 07/12] just assume a memory model is in scope

---
 wip/memory-interface.md | 5 ++++-
 wip/value-domain.md     | 6 +++---
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/wip/memory-interface.md b/wip/memory-interface.md
index a3dfcc9d..fab056ef 100644
--- a/wip/memory-interface.md
+++ b/wip/memory-interface.md
@@ -28,7 +28,7 @@ Practically speaking, `Pointer` will be some representation of an "address", plu
 
 The unit of communication between the memory model and the rest of the program is a *byte*.
 Again, the question of "what is a byte" is not as trivial as it might seem; beyond `u8` values we have to represent `Pointer`s and [uninitialized memory][uninit].
-We define the `Byte` type (in terms of an arbitrary `Pointer` type) as follows:
+We define the `Byte` type as follows, where `Pointer` will later be instantiated with the `Memory::Pointer` associated type.
 
 ```rust
 enum Byte<Pointer> {
@@ -137,6 +137,9 @@ enum OffsetMode {
 }
 ```
 
+We will generally assume we have a particular memory model in scope, and freely refer to its `PTR_SIZE` and `Pointer` items.
+We will also write `Byte` for `Byte<Pointer>`.
+
 This is a very basic memory interface that is incomplete in at least the following ways:
 
 * To implement rules like "dereferencing a null, unaligned, or dangling raw pointer is UB" (even if no memory access happens), there needs to be a way to do an "alignment, bounds and null-check".
diff --git a/wip/value-domain.md b/wip/value-domain.md
index 7be69fc0..4cc96482 100644
--- a/wip/value-domain.md
+++ b/wip/value-domain.md
@@ -13,10 +13,10 @@ However, before we can even start specifying the relation, we have to specify th
 
 ## Values
 
-The Rust value domain is described by the following (incomplete) type definition (in terms of an arbitrary `Pointer` type instantiated by [the memory interface][memory-interface]):
+The Rust value domain is described by the following (incomplete) type definition (using the `Pointer` type defined by [the memory interface][memory-interface]):
 
 ```rust
-enum Value<Pointer> {
+enum Value {
     /// A mathematical integer, used for `i*`/`u*` types.
     Int(BigInt),
     /// A Boolean value, used for `bool`.
@@ -33,7 +33,7 @@ enum Value<Pointer> {
         data: Box<Self>,
     },
     /// A "bag of raw bytes", used for unions.
-    RawBag(Vec<Byte<Pointer>>),
+    RawBag(Vec<Byte>),
     /* ... */
 }
 ```

From 3785d4c680225046c9a8781f893c018cc72ab4d7 Mon Sep 17 00:00:00 2001
From: Ralf Jung <post@ralfj.de>
Date: Thu, 25 Jul 2019 20:27:56 +0200
Subject: [PATCH 08/12] typo

---
 wip/memory-interface.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/wip/memory-interface.md b/wip/memory-interface.md
index fab056ef..87c7aa07 100644
--- a/wip/memory-interface.md
+++ b/wip/memory-interface.md
@@ -4,7 +4,7 @@
 
 The purpose of this document is to describe the interface between a Rust program and memory.
 This interface is a key part of the Rust Abstract Machine: it lets us separate concerns by splitting the Machine (i.e., its specification) into two pieces, connected by this well-defined interface:
-* The *expression/statement semantics* of Rust boils down to explaining which "memroy events" (calls to the memory interface) happen in which order - expressed as calls to the methods of this interface, and reactions to its return values. This part of the specification is *pure* in the sense that it has no "state": everything that needs to be remembered from one expression evaluation to the next is communicated through memory.
+* The *expression/statement semantics* of Rust boils down to explaining which "memory events" (calls to the memory interface) happen in which order - expressed as calls to the methods of this interface, and reactions to its return values. This part of the specification is *pure* in the sense that it has no "state": everything that needs to be remembered from one expression evaluation to the next is communicated through memory.
 * The Rust *memory model* explains which interactions with the memory are legal (the others are UB), and which values can be returned by reads. A memory model is defined by implementing the memory interface.
 
 The interface shown below is also opinionated in several ways.

From 17611a13fc4269269000112d9db3ae09f958a1a8 Mon Sep 17 00:00:00 2001
From: Ralf Jung <post@ralfj.de>
Date: Thu, 25 Jul 2019 20:29:58 +0200
Subject: [PATCH 09/12] add NonNullU8

---
 wip/value-domain.md | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/wip/value-domain.md b/wip/value-domain.md
index 4cc96482..4574b380 100644
--- a/wip/value-domain.md
+++ b/wip/value-domain.md
@@ -87,6 +87,11 @@ Certainly, a value `Int(i)` where `i` in `0..256` is related to `[b]` if `b.as_i
 And then, maybe, we also want to additionally say that value `Uninit` is related to byte list `[Uninit]`.
 This essentially corresponds to saying that uninitialized memory is a valid representation of a `u8` value (namely, the uninitialized value).
 
+### `NonNullU8`
+
+`NonNullU8` is basically the same as `u8` except that we remove the 0, and certainly do not consider `Uninit` a valid value.
+A value `Int(i)` where `i` in `1..256` is related to `[b]` if `b.as_int() == Some(i)`, and that's it.
+
 ### `union`
 
 The `union` type does not even try to interpret memory, so for a `union` of size `n`, the value relation says that for any byte list `bytes` of that length, `RawBag(bytes)` is related to `bytes`.

From 334c8b696579506e21197cb75b4ca9a7131375f5 Mon Sep 17 00:00:00 2001
From: Ralf Jung <post@ralfj.de>
Date: Thu, 25 Jul 2019 20:31:23 +0200
Subject: [PATCH 10/12] linebreaks

---
 wip/memory-interface.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/wip/memory-interface.md b/wip/memory-interface.md
index 87c7aa07..9ba8219a 100644
--- a/wip/memory-interface.md
+++ b/wip/memory-interface.md
@@ -4,8 +4,10 @@
 
 The purpose of this document is to describe the interface between a Rust program and memory.
 This interface is a key part of the Rust Abstract Machine: it lets us separate concerns by splitting the Machine (i.e., its specification) into two pieces, connected by this well-defined interface:
-* The *expression/statement semantics* of Rust boils down to explaining which "memory events" (calls to the memory interface) happen in which order - expressed as calls to the methods of this interface, and reactions to its return values. This part of the specification is *pure* in the sense that it has no "state": everything that needs to be remembered from one expression evaluation to the next is communicated through memory.
-* The Rust *memory model* explains which interactions with the memory are legal (the others are UB), and which values can be returned by reads. A memory model is defined by implementing the memory interface.
+* The *expression/statement semantics* of Rust boils down to explaining which "memory events" (calls to the memory interface) happen in which order - expressed as calls to the methods of this interface, and reactions to its return values.
+  This part of the specification is *pure* in the sense that it has no "state": everything that needs to be remembered from one expression evaluation to the next is communicated through memory.
+* The Rust *memory model* explains which interactions with the memory are legal (the others are UB), and which values can be returned by reads.
+  A memory model is defined by implementing the memory interface.
 
 The interface shown below is also opinionated in several ways.
 It is not intended to be able to support *any imaginable* memory model, but rather start the process of reducing the design space of what we consider a "reasonable" memory model for Rust.

From d871de683f5487278f4cf844e215e514c1792509 Mon Sep 17 00:00:00 2001
From: Ralf Jung <post@ralfj.de>
Date: Thu, 25 Jul 2019 22:25:12 +0200
Subject: [PATCH 11/12] NonNullU8 is actually called NonZeroU8

---
 wip/value-domain.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/wip/value-domain.md b/wip/value-domain.md
index 4574b380..6b3cad2f 100644
--- a/wip/value-domain.md
+++ b/wip/value-domain.md
@@ -87,9 +87,9 @@ Certainly, a value `Int(i)` where `i` in `0..256` is related to `[b]` if `b.as_i
 And then, maybe, we also want to additionally say that value `Uninit` is related to byte list `[Uninit]`.
 This essentially corresponds to saying that uninitialized memory is a valid representation of a `u8` value (namely, the uninitialized value).
 
-### `NonNullU8`
+### `NonZeroU8`
 
-`NonNullU8` is basically the same as `u8` except that we remove the 0, and certainly do not consider `Uninit` a valid value.
+`NonZeroU8` is basically the same as `u8` except that we remove the 0, and certainly do not consider `Uninit` a valid value.
 A value `Int(i)` where `i` in `1..256` is related to `[b]` if `b.as_int() == Some(i)`, and that's it.
 
 ### `union`

From d63559b8c8085c19e31114e95164cafbabd7a799 Mon Sep 17 00:00:00 2001
From: Ralf Jung <post@ralfj.de>
Date: Fri, 26 Jul 2019 11:14:57 +0200
Subject: [PATCH 12/12] cite cerberus

---
 wip/value-domain.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/wip/value-domain.md b/wip/value-domain.md
index 6b3cad2f..0c840a5d 100644
--- a/wip/value-domain.md
+++ b/wip/value-domain.md
@@ -112,6 +112,8 @@ trait TypedMemory: Memory {
 ```
 
 Here, `Type` is some representation of the Rust type system (akin to [`Ty`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc/ty/type.Ty.html) in the compiler).
+This interface is inspired by [Cerberus](https://www.cl.cam.ac.uk/~pes20/cerberus/).
+
 We can implement `TypedMemory` for any `Memory` as follows:
 * For `typed_write`, pick any representation of `val` for `ty`, and call `Memory::write`. If no representation exists, we have UB.
 * For `typed_read`, read `ty.size()` many bytes from memory, and then determine which value this list of bytes represents. If it does not represent any value, we have UB.