document what unsafety means #9258

Merged
merged 1 commit into from
Sep 18, 2013
74 changes: 63 additions & 11 deletions doc/rust.md
@@ -962,24 +962,76 @@ parameters to allow methods with that trait to be called on values
of that type.


#### Unsafe functions

Unsafe functions are those containing unsafe operations that are not contained in an [`unsafe` block](#unsafe-blocks).
Such a function must be prefixed with the keyword `unsafe`.
#### Unsafety

Unsafe operations are those that potentially violate the memory-safety guarantees of Rust's static semantics.
Specifically, the following operations are considered unsafe:

The following language level features cannot be used in the safe subset of Rust:

- Dereferencing a [raw pointer](#pointer-types).
- Casting a [raw pointer](#pointer-types) to a safe pointer type.
- Calling an unsafe function.
- Calling an unsafe function (including an intrinsic or foreign function).
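
As a hedged illustration of the three operations listed above, here is a minimal sketch in present-day Rust syntax (which differs from the 2013 syntax this diff targets). The helper names `read_raw`, `read_via_raw` and `reborrow_via_raw` are hypothetical and exist only for this example.

```rust
/// Hypothetical helper, not part of the documented change.
///
/// # Safety
/// `p` must be non-null, aligned, and point to an initialized `i32`.
unsafe fn read_raw(p: *const i32) -> i32 {
    // Dereferencing a raw pointer is one of the listed unsafe operations.
    unsafe { *p }
}

fn read_via_raw(value: &i32) -> i32 {
    // Creating a raw pointer from a reference is safe.
    let p: *const i32 = value;
    // Calling an unsafe function is only permitted inside `unsafe`.
    unsafe { read_raw(p) }
}

fn reborrow_via_raw(value: &i32) -> &i32 {
    let p: *const i32 = value;
    // Turning a raw pointer back into a safe reference is also unsafe:
    // the compiler cannot check that `p` is valid for the new lifetime.
    unsafe { &*p }
}

fn main() {
    let x = 42;
    println!("{}", read_via_raw(&x));
    println!("{}", reborrow_via_raw(&x));
}
```

Removing any of the `unsafe` blocks in the two safe functions makes the compiler reject the program, which is the sense in which these features are unavailable in the safe subset.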

##### Unsafe blocks
##### Unsafe functions

A block of code can also be prefixed with the `unsafe` keyword, to permit a sequence of unsafe operations in an otherwise-safe function.
This facility exists because the static semantics of Rust are a necessary approximation of the dynamic semantics.
When a programmer has sufficient conviction that a sequence of unsafe operations is actually safe, they can encapsulate that sequence (taken as a whole) within an `unsafe` block. The compiler will consider uses of such code "safe", to the surrounding context.
Unsafe functions are functions that are not safe in all contexts and/or for all possible inputs.
Such a function must be prefixed with the keyword `unsafe`.
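
A function that is "not safe for all possible inputs" might look like the following sketch, written in present-day Rust. The function `byte_at_unchecked` is a hypothetical example; `slice::get_unchecked` is the standard-library method it relies on.

```rust
/// Hypothetical example of a function that is only safe for some inputs.
///
/// # Safety
/// `index` must be less than `data.len()`; otherwise the read is out of
/// bounds and the behavior is undefined.
unsafe fn byte_at_unchecked(data: &[u8], index: usize) -> u8 {
    // No bounds check is performed here; the caller promised `index` is valid.
    unsafe { *data.get_unchecked(index) }
}

fn main() {
    let data = [10u8, 20, 30];
    // The caller upholds the contract: 1 < data.len().
    let b = unsafe { byte_at_unchecked(&data, 1) };
    println!("{b}");
}
```

The `unsafe` keyword on the function moves the obligation to the caller: the precondition cannot be expressed in the type system, so it is stated in the documentation instead.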

##### Unsafe blocks

A block of code can also be prefixed with the `unsafe` keyword, to permit calling `unsafe` functions
or dereferencing raw pointers within a safe function.

When a programmer has sufficient conviction that a sequence of potentially unsafe operations is
actually safe, they can encapsulate that sequence (taken as a whole) within an `unsafe` block. The
compiler will consider uses of such code safe, in the surrounding context.
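
The encapsulation described above can be sketched as a safe function whose body contains an `unsafe` block. This is an illustrative example in present-day Rust, loosely modeled on the standard library's `split_at_mut`; the name `split_at_mut_sketch` is hypothetical.

```rust
use std::slice;

/// Splits one mutable slice into two non-overlapping mutable slices.
/// The borrow checker cannot prove the halves are disjoint, so the
/// body uses raw pointers, but the function as a whole is safe to call.
fn split_at_mut_sketch(values: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
    let len = values.len();
    let ptr = values.as_mut_ptr();
    // This check is what makes the unsafe code below sound for every input.
    assert!(mid <= len);
    unsafe {
        (
            slice::from_raw_parts_mut(ptr, mid),
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}

fn main() {
    let mut v = [1, 2, 3, 4, 5];
    let (left, right) = split_at_mut_sketch(&mut v, 2);
    left[0] = 10;
    right[0] = 30;
    println!("{:?}", v);
}
```

Callers never write `unsafe`; the sequence of raw-pointer operations, taken as a whole, is safe because of the bounds check that precedes it.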

Unsafe blocks are used to wrap foreign libraries, make direct use of hardware or implement features
not directly present in the language. For example, Rust provides the language features necessary to
implement memory-safe concurrency in the language but the implementation of tasks and message
passing is in the standard library.

Rust's type system is a conservative approximation of the dynamic safety requirements, so in some
cases there is a performance cost to using safe code. For example, a doubly-linked list is not a
tree structure and can only be represented with managed or reference-counted pointers in safe code.
By using `unsafe` blocks to represent the reverse links as raw pointers, it can be implemented with
only owned pointers.
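
A minimal sketch of the doubly-linked list design mentioned above, assuming present-day Rust where `Box` plays the role of the owned pointer. The types `Node` and `List` and their methods are hypothetical and deliberately incomplete (no removal, no iterator); the forward links own the nodes and the reverse links are raw pointers.

```rust
use std::ptr;

struct Node {
    value: i32,
    next: Option<Box<Node>>, // owning forward link
    prev: *mut Node,         // non-owning reverse link (raw pointer)
}

struct List {
    head: Option<Box<Node>>,
    tail: *mut Node, // null when the list is empty
}

impl List {
    fn new() -> Self {
        List { head: None, tail: ptr::null_mut() }
    }

    fn push_back(&mut self, value: i32) {
        let node = Box::new(Node { value, next: None, prev: self.tail });
        // Pick the slot that will own the new node.
        let slot: &mut Option<Box<Node>> = if self.tail.is_null() {
            &mut self.head
        } else {
            // SAFETY: a non-null `tail` always points at the last node,
            // which is heap-allocated and owned by this list.
            unsafe { &mut (*self.tail).next }
        };
        *slot = Some(node);
        // Take the raw pointer only after the node is in its final place.
        self.tail = slot.as_deref_mut().map_or(ptr::null_mut(), |n| n as *mut Node);
    }

    /// Walks the raw reverse links from the tail to the head.
    fn values_back_to_front(&self) -> Vec<i32> {
        let mut out = Vec::new();
        let mut cur = self.tail;
        while !cur.is_null() {
            // SAFETY: every non-null `prev`/`tail` pointer refers to a node
            // that is still owned by the list while `&self` is borrowed.
            let node = unsafe { &*cur };
            out.push(node.value);
            cur = node.prev;
        }
        out
    }
}

fn main() {
    let mut list = List::new();
    for v in [1, 2, 3] {
        list.push_back(v);
    }
    println!("{:?}", list.values_back_to_front());
}
```

The public methods are safe: the invariant that `tail` and every `prev` point at live nodes is maintained entirely inside the `impl`, so no reference counting or managed pointers are needed.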

##### Behavior considered unsafe

This is a list of behavior which is forbidden in all Rust code. Type checking provides the guarantee
that these issues are never caused by safe code. An `unsafe` block or function is responsible for
never invoking this behaviour or exposing an API making it possible for it to occur in safe code.

* Data races
* Dereferencing a null/dangling raw pointer
* Mutating an immutable value/reference, if it is not marked as non-`Freeze`
* Reads of [undef](http://llvm.org/docs/LangRef.html#undefined-values) (uninitialized) memory
* Breaking the [pointer aliasing rules](http://llvm.org/docs/LangRef.html#pointer-aliasing-rules)
with raw pointers (a subset of the rules used by C)
* Invoking undefined behavior via compiler intrinsics:
    * Indexing outside of the bounds of an object with `std::ptr::offset` (`offset` intrinsic), with
      the exception of one byte past the end which is permitted.
Member

Maybe add a sentence here clarifying that "indexing" means the mere act of calculating such a pointer. People might think you're referring to dereferencing an out-of-bounds pointer if they don't read it carefully.

"Yes, the mere act of calculating such an offset pointer is undefined behavior, even if you never attempt to dereference it"

Contributor Author

Creating a pointer pointing outside the bounds of an object is safe as long as you don't make it with std::ptr::offset.

Member

Right... that's why I made that comment on the line that talks about ptr::offset. :)

    * Using `std::ptr::copy_nonoverlapping_memory` (`memcpy32`/`memcpy64` intrinsics) on
      overlapping buffers
* Invalid values in primitive types, even in private fields/locals:
    * Dangling/null pointers in non-raw pointers, or slices
    * A value other than `false` (0) or `true` (1) in a `bool`
    * A discriminant in an `enum` not included in the type definition
    * A value in a `char` which is a surrogate or above `char::MAX`
    * non-UTF-8 byte sequences in a `str`
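
Two items from the list above can be sketched in present-day Rust, whose standard-library names differ from the 2013 ones in this diff. The sketch assumes the modern pointer methods `add` (in-bounds arithmetic, unsafe) and `wrapping_add` (safe to compute, unsafe to dereference), plus the checked constructors `char::from_u32` and `std::str::from_utf8`. All function names are hypothetical.

```rust
/// Computes the one-past-the-end pointer, which in-bounds pointer
/// arithmetic permits. The result must not be dereferenced.
fn one_past_end(data: &[u8]) -> *const u8 {
    // SAFETY: `data.len()` elements past the start is exactly one past
    // the end of the same allocation, the single permitted exception.
    unsafe { data.as_ptr().add(data.len()) }
}

/// Computes a pointer far outside the object without in-bounds arithmetic.
/// Merely creating it is fine; dereferencing it would not be.
fn far_past_end(data: &[u8]) -> *const u8 {
    data.as_ptr().wrapping_add(data.len() + 100)
}

/// Safe code cannot produce an invalid `char`: the checked constructor
/// rejects surrogates and values above `char::MAX`.
fn is_valid_char(code: u32) -> bool {
    char::from_u32(code).is_some()
}

/// Likewise, a `str` can only be built from bytes that pass UTF-8 validation.
fn is_valid_utf8(bytes: &[u8]) -> bool {
    std::str::from_utf8(bytes).is_ok()
}

fn main() {
    let data = [1u8, 2, 3];
    println!("{:p} {:p}", one_past_end(&data), far_past_end(&data));
    println!("{} {}", is_valid_char(0xD800), is_valid_utf8(&[0xFF]));
}
```

The distinction raised in the review thread shows up in the first two functions: the undefined behavior comes from using in-bounds pointer arithmetic to leave the object, not from the existence of an out-of-bounds pointer value.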

##### Behaviour not considered unsafe

This is a list of behaviour not considered *unsafe* in Rust terms, but that may be undesired.

* Deadlocks
* Reading data from private fields (`std::repr`, `format!("{:?}", x)`)
* Leaks due to reference count cycles, even in the global heap
* Exiting without calling destructors
* Sending signals
* Accessing/modifying the file system
* Unsigned integer overflow (well-defined as wrapping)
* Signed integer overflow (well-defined as two's complement representation wrapping)
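
Two of the items above, sketched in present-day Rust under stated assumptions: the explicit `wrapping_add` methods are used so that the wrapping result does not depend on build settings, and `Rc`/`RefCell` stand in for the reference-counted pointers of the time. The `Cycle` type and the function names are hypothetical.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Unsigned overflow wraps around to zero.
fn wrap_unsigned(x: u8) -> u8 {
    x.wrapping_add(1)
}

/// Signed overflow wraps in two's complement representation.
fn wrap_signed(x: i8) -> i8 {
    x.wrapping_add(1)
}

struct Cycle {
    other: RefCell<Option<Rc<Cycle>>>,
}

/// Builds two reference-counted nodes that point at each other and
/// returns the strong count of the first one. When the local handles
/// go out of scope the counts never reach zero, so both nodes leak.
/// This is undesirable, but it is entirely safe code.
fn leak_cycle() -> usize {
    let a = Rc::new(Cycle { other: RefCell::new(None) });
    let b = Rc::new(Cycle { other: RefCell::new(Some(Rc::clone(&a))) });
    *a.other.borrow_mut() = Some(Rc::clone(&b));
    Rc::strong_count(&a)
}

fn main() {
    println!("{} {}", wrap_unsigned(u8::MAX), wrap_signed(i8::MAX));
    println!("strong count inside the cycle: {}", leak_cycle());
}
```

Neither function needs an `unsafe` block, which matches the claim that these outcomes fall outside what Rust means by unsafety.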

#### Diverging functions
