Rust - Unsafe Rust
Overview
Estimated time: 45–55 minutes
Understand when and how to use unsafe Rust for low-level programming, interfacing with C libraries, and performance-critical code. Learn about raw pointers, unsafe blocks, and the responsibilities that come with bypassing Rust's safety guarantees.
Learning Objectives
- Understand when unsafe code is necessary and appropriate.
- Use raw pointers and unsafe blocks safely.
- Interface with C libraries using Foreign Function Interface (FFI).
- Implement unsafe traits and write safe abstractions.
- Follow best practices for unsafe code.
Prerequisites
What is Unsafe Rust?
Safe vs Unsafe Rust
Understanding the boundary between safe and unsafe code:
fn main() {
// This is safe Rust - compiler guarantees memory safety
let mut numbers = vec![1, 2, 3, 4, 5];
let first = &numbers[0];
// numbers.push(6); // This would cause a compile error
println!("First: {}", first);
// This requires unsafe - we're telling the compiler we know what we're doing
unsafe {
let raw_ptr = numbers.as_mut_ptr();
*raw_ptr = 42; // Direct memory access
}
println!("Modified vector: {:?}", numbers);
// The unsafe operations
println!("Unsafe operations available:");
println!("1. Dereference raw pointers");
println!("2. Call unsafe functions");
println!("3. Access/modify static mutable variables");
println!("4. Implement unsafe traits");
println!("5. Access fields of unions");
}
When to Use Unsafe
Common scenarios where unsafe code is necessary:
// 1. Implementing fundamental data structures
struct MyVec<T> {
ptr: *mut T,
len: usize,
capacity: usize,
}
impl<T> MyVec<T> {
fn new() -> Self {
MyVec {
ptr: std::ptr::null_mut(),
len: 0,
capacity: 0,
}
}
fn push(&mut self, value: T) {
if self.len == self.capacity {
self.resize();
}
unsafe {
// Write to uninitialized memory
std::ptr::write(self.ptr.add(self.len), value);
}
self.len += 1;
}
fn resize(&mut self) {
let new_capacity = if self.capacity == 0 { 1 } else { self.capacity * 2 };
unsafe {
let layout = std::alloc::Layout::array::<T>(new_capacity).unwrap();
let new_ptr = if self.capacity == 0 {
std::alloc::alloc(layout) as *mut T
} else {
let old_layout = std::alloc::Layout::array::<T>(self.capacity).unwrap();
std::alloc::realloc(self.ptr as *mut u8, old_layout, layout.size()) as *mut T
};
if new_ptr.is_null() {
panic!("Allocation failed");
}
self.ptr = new_ptr;
self.capacity = new_capacity;
}
}
}
// 2. FFI with C libraries (assumes the `libc` crate is listed as a dependency)
extern "C" {
fn strlen(s: *const libc::c_char) -> libc::size_t;
fn malloc(size: libc::size_t) -> *mut libc::c_void;
fn free(ptr: *mut libc::c_void);
}
// 3. Performance-critical code that needs to bypass bounds checks
fn fast_sum_slice(slice: &[i32]) -> i32 {
let mut sum = 0;
let ptr = slice.as_ptr();
let len = slice.len();
unsafe {
for i in 0..len {
sum += *ptr.add(i); // No bounds checking
}
}
sum
}
fn main() {
// Example usage
let numbers = [1, 2, 3, 4, 5];
let sum = fast_sum_slice(&numbers);
println!("Fast sum: {}", sum);
}
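Before reaching for pointer arithmetic to skip bounds checks, it helps to compare against the alternatives. The sketch below is illustrative only: the function names `sum_iter` and `sum_unchecked` are not from the lesson. It places an iterator-based sum, which has no per-element index check to begin with, next to `get_unchecked`, the standard library's unchecked indexing method. Whether either is measurably faster than plain indexing depends on the optimizer, so it is worth benchmarking before committing to unsafe.

```rust
/// Safe version: iterators walk the slice without per-element index checks.
fn sum_iter(slice: &[i32]) -> i32 {
    slice.iter().sum()
}

/// Unsafe version using `get_unchecked`, which skips the bounds check.
fn sum_unchecked(slice: &[i32]) -> i32 {
    let mut sum = 0;
    for i in 0..slice.len() {
        // SAFETY: `i` is always less than `slice.len()` because of the loop range.
        sum += unsafe { *slice.get_unchecked(i) };
    }
    sum
}
```

Both functions return the same result for any slice; only the second one places a proof obligation on the author.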
Raw Pointers
Creating and Using Raw Pointers
Raw pointers provide direct memory access without safety guarantees:
fn main() {
let mut num = 42;
// Create raw pointers from references
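let raw_mut: *mut i32 = &mut num;
// Derive the const pointer from the mutable one. Creating it from a separate `&num`
// and then taking `&mut num` could invalidate it under the aliasing model Miri checks.
let raw_const: *const i32 = raw_mut;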
println!("Raw const pointer: {:p}", raw_const);
println!("Raw mut pointer: {:p}", raw_mut);
// Safe operations on raw pointers (no dereferencing)
println!("Pointer is null: {}", raw_const.is_null());
println!("Pointer address: {:p}", raw_const);
// Unsafe operations
unsafe {
// Dereference raw pointers
println!("Value through const pointer: {}", *raw_const);
// Modify through mutable pointer
*raw_mut = 100;
println!("Modified value: {}", *raw_mut);
// Pointer arithmetic
let arr = [1, 2, 3, 4, 5];
let ptr = arr.as_ptr();
for i in 0..arr.len() {
println!("Element {}: {}", i, *ptr.add(i));
}
}
// Creating pointers from arbitrary addresses (very dangerous!)
let address = 0x12345678usize;
let ptr = address as *const i32;
// Don't dereference arbitrary pointers!
// unsafe { println!("{}", *ptr); } // This would likely crash
println!("Final num value: {}", num);
}
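The null check shown above can be folded into the dereference itself. As a minimal sketch (the helper name `read_or` is hypothetical), `as_ref` on a raw pointer returns `None` for null and `Some(&T)` otherwise. It does not validate anything beyond null, so the function still has to be `unsafe` and document what the caller must guarantee.

```rust
/// Reads the value behind `ptr`, or returns `default` if the pointer is null.
///
/// # Safety
///
/// If `ptr` is non-null, it must be properly aligned and point to a live,
/// initialized `i32` for the duration of the call.
unsafe fn read_or(ptr: *const i32, default: i32) -> i32 {
    // `as_ref` turns a raw pointer into `Option<&T>`: `None` for null.
    // SAFETY: the caller guarantees that a non-null `ptr` is valid to read.
    match unsafe { ptr.as_ref() } {
        Some(value) => *value,
        None => default,
    }
}
```

A dangling or misaligned pointer passes the null check, which is why the safety contract sits in the documentation and not in the code.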
Pointer Arithmetic and Memory Layout
Working with memory layout and pointer calculations:
use std::mem;
#[repr(C)]
struct Point {
x: f64,
y: f64,
}
fn analyze_memory_layout() {
let points = vec![
Point { x: 1.0, y: 2.0 },
Point { x: 3.0, y: 4.0 },
Point { x: 5.0, y: 6.0 },
];
println!("Point size: {} bytes", mem::size_of::<Point>());
println!("Point alignment: {} bytes", mem::align_of::<Point>());
unsafe {
let ptr = points.as_ptr();
// Access as raw bytes
let byte_ptr = ptr as *const u8;
println!("First point as bytes:");
for i in 0..mem::size_of::<Point>() {
print!("{:02x} ", *byte_ptr.add(i));
}
println!();
// Pointer arithmetic
for i in 0..points.len() {
let point_ptr = ptr.add(i);
println!("Point {}: ({}, {})", i, (*point_ptr).x, (*point_ptr).y);
// Access individual fields
let x_ptr = &(*point_ptr).x as *const f64;
let y_ptr = &(*point_ptr).y as *const f64;
println!(" X at {:p}: {}", x_ptr, *x_ptr);
println!(" Y at {:p}: {}", y_ptr, *y_ptr);
}
}
}
fn main() {
analyze_memory_layout();
// Manual memory management example
unsafe {
// Allocate memory for 5 integers
let layout = std::alloc::Layout::array::<i32>(5).unwrap();
let ptr = std::alloc::alloc(layout) as *mut i32;
if ptr.is_null() {
panic!("Allocation failed");
}
// Initialize the memory
for i in 0..5 {
std::ptr::write(ptr.add(i), i as i32 * 10);
}
// Read back the values
println!("Allocated array:");
for i in 0..5 {
println!(" [{}] = {}", i, *ptr.add(i));
}
// Clean up - must deallocate what we allocated
std::alloc::dealloc(ptr as *mut u8, layout);
}
}
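With `#[repr(C)]`, fields are laid out in declaration order, so field offsets can be computed directly. The following sketch assumes a `Point` defined as above and uses `std::ptr::addr_of!`, which produces a raw pointer to a field without creating an intermediate reference. The helper name `field_offsets` is made up for illustration.

```rust
use std::ptr::addr_of;

#[repr(C)]
struct Point {
    x: f64,
    y: f64,
}

/// Returns the byte offsets of `x` and `y` within a `Point`.
fn field_offsets() -> (usize, usize) {
    let p = Point { x: 1.0, y: 2.0 };
    let base = addr_of!(p) as usize;
    // `addr_of!` yields a raw pointer to the field without creating a reference.
    let x_off = addr_of!(p.x) as usize - base;
    let y_off = addr_of!(p.y) as usize - base;
    (x_off, y_off)
}
```

No `unsafe` is needed here because the pointers are only converted to addresses and never dereferenced.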
Unsafe Functions and Traits
Defining Unsafe Functions
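Functions that rely on invariants the compiler cannot check are marked unsafe, which shifts responsibility for those invariants to the caller. A function that upholds the invariants itself can contain unsafe blocks and still expose a safe signature. The same idea applies to traits whose implementors must uphold such invariants: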
// Unsafe function - caller must ensure safety invariants
/// # Safety
///
/// `ptr` must be non-null, aligned, and valid for reads of `len` consecutive i32 values.
unsafe fn dangerous_function(ptr: *const i32, len: usize) -> i32 {
let mut sum = 0;
for i in 0..len {
sum += *ptr.add(i); // Out-of-bounds read if `len` is too large
}
sum
}
// Safe wrapper that ensures safety
fn safe_sum_array(slice: &[i32]) -> i32 {
// SAFETY: a slice's pointer is valid for reads of exactly `slice.len()` elements
unsafe { dangerous_function(slice.as_ptr(), slice.len()) }
}
// Unsafe trait - implementor must uphold safety invariants
unsafe trait UnsafeTrait {
fn dangerous_method(&self);
}
// Implementing unsafe trait requires unsafe impl
struct MyStruct;
unsafe impl UnsafeTrait for MyStruct {
fn dangerous_method(&self) {
println!("Implementing dangerous method safely");
}
}
// Example: Custom allocator interface
use std::alloc::GlobalAlloc; // The trait must be in scope to call System.alloc / System.dealloc
struct SimpleAllocator;
unsafe impl std::alloc::GlobalAlloc for SimpleAllocator {
unsafe fn alloc(&self, layout: std::alloc::Layout) -> *mut u8 {
// This is a simplified example - real allocators are complex
std::alloc::System.alloc(layout)
}
unsafe fn dealloc(&self, ptr: *mut u8, layout: std::alloc::Layout) {
std::alloc::System.dealloc(ptr, layout)
}
}
fn main() {
let numbers = [1, 2, 3, 4, 5];
// Using safe wrapper
let sum = safe_sum_array(&numbers);
println!("Sum: {}", sum);
// Using unsafe function directly
unsafe {
let sum = dangerous_function(numbers.as_ptr() as *mut i32, numbers.len());
println!("Direct unsafe sum: {}", sum);
}
// Using unsafe trait
let my_struct = MyStruct;
my_struct.dangerous_method();
}
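The `UnsafeTrait` above has no real invariant attached, so it may help to see a case where the `unsafe` marker carries weight. In this sketch, the trait name `Zeroable` and the function `zeroed` are hypothetical. Implementing the trait is a promise that the all-zero bit pattern is a valid value of the type, and generic safe code then relies on that promise.

```rust
/// Marker trait: implementors promise that the all-zero bit pattern is a
/// valid value of the type. The compiler cannot check this, hence `unsafe`.
unsafe trait Zeroable: Sized {}

// SAFETY: zero is a valid value for these integer types and for byte arrays.
unsafe impl Zeroable for u32 {}
unsafe impl Zeroable for i64 {}
unsafe impl<const N: usize> Zeroable for [u8; N] {}

/// Safe to call because `T: Zeroable` guarantees zeroed memory is a valid `T`.
fn zeroed<T: Zeroable>() -> T {
    // SAFETY: guaranteed by the `Zeroable` contract.
    unsafe { std::mem::zeroed() }
}
```

An incorrect `unsafe impl Zeroable` for a reference type would make `zeroed` produce a null reference, which is undefined behavior. That is the kind of obligation `unsafe impl` asks the implementor to accept.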
FFI (Foreign Function Interface)
Calling C Functions
Interface with C libraries using extern blocks:
use std::ffi::{CStr, CString};
use std::os::raw::{c_char, c_int, c_void};
// Declare external C functions
extern "C" {
fn strlen(s: *const c_char) -> usize;
fn strcpy(dest: *mut c_char, src: *const c_char) -> *mut c_char;
fn malloc(size: usize) -> *mut c_void;
fn free(ptr: *mut c_void);
fn printf(format: *const c_char, ...) -> c_int;
}
fn main() {
// Working with C strings
let rust_string = "Hello from Rust!";
let c_string = CString::new(rust_string).expect("CString::new failed");
unsafe {
// Call C strlen function
let len = strlen(c_string.as_ptr());
println!("C strlen result: {}", len);
// Call C printf function
let format = CString::new("C printf: %s\n").unwrap();
printf(format.as_ptr(), c_string.as_ptr());
// Manual memory management with C malloc/free
let size = 100;
let ptr = malloc(size) as *mut c_char;
if !ptr.is_null() {
// Copy string to allocated memory
strcpy(ptr, c_string.as_ptr());
// Convert back to Rust string
let copied_cstr = CStr::from_ptr(ptr);
let copied_string = copied_cstr.to_string_lossy();
println!("Copied string: {}", copied_string);
// Free the allocated memory
free(ptr as *mut c_void);
}
}
// Safe wrapper for C string operations
fn safe_c_strlen(s: &str) -> usize {
let c_string = CString::new(s).expect("CString::new failed");
unsafe { strlen(c_string.as_ptr()) }
}
println!("Safe wrapper result: {}", safe_c_strlen("Test string"));
}
// Define C-compatible functions that can be called from C
#[no_mangle]
pub extern "C" fn rust_add(a: c_int, b: c_int) -> c_int {
a + b
}
#[no_mangle]
pub extern "C" fn rust_process_array(ptr: *mut c_int, len: usize) {
if ptr.is_null() {
return;
}
unsafe {
for i in 0..len {
*ptr.add(i) *= 2;
}
}
}
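Most of the risk in the FFI examples above sits at the string boundary, so it can be isolated in two small helpers. This is a sketch using only `std::ffi`; the names `to_c_string` and `from_c_ptr` are made up for illustration. The first direction is fully safe and fails on interior NUL bytes. The second has to trust the pointer and is therefore `unsafe`.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Converts a Rust string to an owned C string, or `None` if it contains
/// an interior NUL byte (which C would treat as the end of the string).
fn to_c_string(s: &str) -> Option<CString> {
    CString::new(s).ok()
}

/// Copies a NUL-terminated C string into an owned Rust `String`.
///
/// # Safety
///
/// `ptr` must be non-null and point to a valid NUL-terminated string that
/// stays alive and unmodified for the duration of the call.
unsafe fn from_c_ptr(ptr: *const c_char) -> String {
    // SAFETY: upheld by the caller, as documented above.
    let c_str = unsafe { CStr::from_ptr(ptr) };
    // Invalid UTF-8 is replaced with U+FFFD instead of causing an error.
    c_str.to_string_lossy().into_owned()
}
```

The `CString` must outlive any pointer obtained from `as_ptr`, so keep it bound to a variable for as long as the C side may read it.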
Union Types
Working with C-style Unions
Unions allow multiple interpretations of the same memory:
use std::mem;
#[repr(C)]
union MyUnion {
integer: u32,
float: f32,
bytes: [u8; 4],
}
fn main() {
let mut my_union = MyUnion { integer: 0x41424344 };
unsafe {
println!("As integer: 0x{:08x}", my_union.integer);
println!("As float: {}", my_union.float);
println!("As bytes: {:?}", my_union.bytes);
// Modify through different fields
my_union.float = 3.14159;
println!("After setting float:");
println!("As integer: 0x{:08x}", my_union.integer);
println!("As float: {}", my_union.float);
println!("As bytes: {:?}", my_union.bytes);
// Byte manipulation
my_union.bytes[0] = 0xFF;
println!("After modifying first byte:");
println!("As integer: 0x{:08x}", my_union.integer);
println!("As float: {}", my_union.float);
}
// Tagged union for type safety
#[repr(C)]
union Value {
int_val: i32,
float_val: f32,
bool_val: bool,
}
#[repr(C)]
struct TaggedValue {
tag: u8, // 0 = int, 1 = float, 2 = bool
value: Value,
}
impl TaggedValue {
fn new_int(val: i32) -> Self {
TaggedValue {
tag: 0,
value: Value { int_val: val },
}
}
fn new_float(val: f32) -> Self {
TaggedValue {
tag: 1,
value: Value { float_val: val },
}
}
fn as_int(&self) -> Option<i32> {
if self.tag == 0 {
unsafe { Some(self.value.int_val) }
} else {
None
}
}
fn as_float(&self) -> Option<f32> {
if self.tag == 1 {
unsafe { Some(self.value.float_val) }
} else {
None
}
}
}
let tagged_int = TaggedValue::new_int(42);
let tagged_float = TaggedValue::new_float(3.14);
println!("Tagged int: {:?}", tagged_int.as_int());
println!("Tagged float: {:?}", tagged_float.as_float());
}
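When a union is used only to reinterpret the bits of a number, the standard library offers safe equivalents. The sketch below (wrapper names are illustrative) uses `f32::to_bits`, `f32::from_bits`, and `u32::to_le_bytes`. Unlike reading the `bytes` field of the union, `to_le_bytes` gives the same byte order on every platform.

```rust
/// Reinterprets an `f32` as its raw IEEE 754 bits, with no `unsafe` needed.
fn float_bits(value: f32) -> u32 {
    value.to_bits()
}

/// Rebuilds an `f32` from raw bits.
fn float_from_bits(bits: u32) -> f32 {
    f32::from_bits(bits)
}

/// Splits a `u32` into bytes in little-endian order, independent of the host.
fn int_bytes_le(value: u32) -> [u8; 4] {
    value.to_le_bytes()
}
```

Unions remain the right tool when matching a C declaration across FFI; for pure bit inspection these methods avoid the unsafe block entirely.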
Safe Abstractions
Building Safe APIs on Unsafe Foundations
Create safe interfaces that encapsulate unsafe code:
use std::ptr;
use std::alloc::{alloc, dealloc, Layout};
// Safe vector implementation using unsafe code internally
pub struct SafeVec<T> {
ptr: *mut T,
len: usize,
capacity: usize,
}
impl<T> SafeVec<T> {
pub fn new() -> Self {
SafeVec {
ptr: ptr::null_mut(),
len: 0,
capacity: 0,
}
}
pub fn with_capacity(capacity: usize) -> Self {
if capacity == 0 {
return Self::new();
}
let layout = Layout::array::<T>(capacity).unwrap();
let ptr = unsafe { alloc(layout) as *mut T };
if ptr.is_null() {
panic!("Allocation failed");
}
SafeVec {
ptr,
len: 0,
capacity,
}
}
pub fn push(&mut self, value: T) {
if self.len == self.capacity {
self.resize();
}
unsafe {
ptr::write(self.ptr.add(self.len), value);
}
self.len += 1;
}
pub fn pop(&mut self) -> Option<T> {
if self.len == 0 {
None
} else {
self.len -= 1;
unsafe {
Some(ptr::read(self.ptr.add(self.len)))
}
}
}
pub fn get(&self, index: usize) -> Option<&T> {
if index < self.len {
unsafe {
Some(&*self.ptr.add(index))
}
} else {
None
}
}
pub fn len(&self) -> usize {
self.len
}
pub fn capacity(&self) -> usize {
self.capacity
}
fn resize(&mut self) {
let new_capacity = if self.capacity == 0 { 1 } else { self.capacity * 2 };
let new_layout = Layout::array::<T>(new_capacity).unwrap();
let new_ptr = unsafe { alloc(new_layout) as *mut T };
if new_ptr.is_null() {
panic!("Allocation failed");
}
unsafe {
if !self.ptr.is_null() {
// Copy existing elements
ptr::copy_nonoverlapping(self.ptr, new_ptr, self.len);
// Deallocate old memory
let old_layout = Layout::array::<T>(self.capacity).unwrap();
dealloc(self.ptr as *mut u8, old_layout);
}
}
self.ptr = new_ptr;
self.capacity = new_capacity;
}
}
impl<T> Drop for SafeVec<T> {
fn drop(&mut self) {
// Drop all elements
while let Some(_) = self.pop() {}
// Deallocate memory
if !self.ptr.is_null() {
unsafe {
let layout = Layout::array::<T>(self.capacity).unwrap();
dealloc(self.ptr as *mut u8, layout);
}
}
}
}
// Safe usage of unsafe code
fn main() {
let mut vec = SafeVec::new();
// All operations are safe from the user's perspective
vec.push(1);
vec.push(2);
vec.push(3);
println!("Length: {}", vec.len());
println!("Capacity: {}", vec.capacity());
while let Some(value) = vec.pop() {
println!("Popped: {}", value);
}
// Demonstrate safety - this returns None instead of crashing
println!("Get index 0 from empty vec: {:?}", vec.get(0));
}
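A smaller instance of the same pattern is splitting one mutable slice into two halves, which the borrow checker cannot verify on its own. The sketch below uses a hypothetical name, `split_at_mut_sketch`; the standard library already provides `split_at_mut` for real use. The runtime check on `mid` is what allows the function to present a safe signature.

```rust
use std::slice;

/// Splits one mutable slice into two non-overlapping mutable halves.
/// Hypothetical helper for illustration; std's `split_at_mut` does the same job.
fn split_at_mut_sketch<T>(values: &mut [T], mid: usize) -> (&mut [T], &mut [T]) {
    let len = values.len();
    let ptr = values.as_mut_ptr();
    // This check is what makes the unsafe block below sound.
    assert!(mid <= len, "mid out of bounds");
    // SAFETY: `ptr` is valid for `len` elements, `mid <= len`, and the two
    // ranges [0, mid) and [mid, len) do not overlap.
    unsafe {
        (
            slice::from_raw_parts_mut(ptr, mid),
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}
```

Callers can mutate both halves at once without any `unsafe` of their own, and an out-of-range `mid` panics instead of corrupting memory.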
Best Practices
Guidelines for Safe Unsafe Code
- Document safety requirements: Clearly state what the caller must ensure
- Minimize unsafe scope: Keep unsafe blocks as small as possible
- Create safe abstractions: Wrap unsafe code in safe APIs
- Test thoroughly: Unsafe code requires extensive testing
- Use tools: Miri, AddressSanitizer, and other tools help catch bugs
// Good: Clear documentation and minimal unsafe scope
/// Sums elements in a slice using unsafe pointer arithmetic.
///
/// # Safety
///
/// This function is safe because it only accesses elements within
/// the slice bounds, which are guaranteed by the slice type.
fn unsafe_sum_slice(slice: &[i32]) -> i32 {
let mut sum = 0;
let len = slice.len();
if len == 0 {
return 0;
}
let ptr = slice.as_ptr();
// SAFETY: We know ptr is valid for `len` elements because it comes from a slice
unsafe {
for i in 0..len {
sum += *ptr.add(i);
}
}
sum
}
// Bad: Large unsafe block without clear justification
fn bad_example(data: &mut [i32]) {
unsafe {
// Too much code in unsafe block
let ptr = data.as_mut_ptr();
let len = data.len();
for i in 0..len {
*ptr.add(i) *= 2;
}
// More operations...
if len > 0 {
*ptr = 100;
}
// Even more code...
}
}
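The smallest unsafe scope is none at all. As a sketch, the work done by `bad_example` can be written entirely in safe Rust. The function name `double_then_mark_first` is made up here, and the behavior is assumed to be exactly what the bad example shows: double every element, then set the first to 100.

```rust
/// Doubles every element, then overwrites the first one with 100.
fn double_then_mark_first(data: &mut [i32]) {
    for value in data.iter_mut() {
        *value *= 2;
    }
    // `first_mut` returns `None` for an empty slice, so no length check is needed.
    if let Some(first) = data.first_mut() {
        *first = 100;
    }
}
```

Checking for a safe formulation like this first is a useful habit before deciding how to narrow an unsafe block.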
Common Pitfalls
Mistakes to Avoid
- Null pointer dereference: Always check for null before dereferencing
- Use after free: Don't access memory after it's been deallocated
- Buffer overflows: Ensure pointer arithmetic stays within bounds
- Data races: Raw pointers and static mutable variables bypass the borrow checker, so synchronizing concurrent access is your responsibility
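For the data-race pitfall in particular, shared counters and flags rarely need unsafe code. The following sketch (the function name `count_in_parallel` is illustrative) uses `AtomicUsize` behind an `Arc` in place of a `static mut` counter, so concurrent increments are well defined.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Spawns `threads` workers that each increment a shared counter
/// `per_thread` times, and returns the final count.
fn count_in_parallel(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // Atomic read-modify-write: no data race, no `unsafe`.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    // Joining every worker ensures all increments are visible here.
    counter.load(Ordering::Relaxed)
}
```

The same loop written against a `static mut` integer would compile inside an unsafe block but could lose updates, since unsynchronized concurrent writes are undefined behavior.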
Checks for Understanding
- What are the five things you can do in unsafe Rust that you can't do in safe Rust?
- When should a function be marked unsafe, and why?
- What's the difference between *const T and *mut T?
- How do you create a safe abstraction over unsafe code?
Answers
- Dereference raw pointers, call unsafe functions, access/modify static mutable variables, implement unsafe traits, access union fields
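- A function should be marked unsafe when its callers must uphold invariants the compiler cannot check; the keyword warns callers and requires explicit acknowledgment with an unsafe block. A function that upholds those invariants internally can keep a safe signature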
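- *const T is a raw pointer that does not allow writing through it directly; *mut T is a raw pointer that does. Neither carries any guarantee of validity or aliasing, and one can be cast to the other
- Encapsulate unsafe operations in functions that maintain safety invariants and expose only safe APIs to users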