Demystifying Rust: Implementing Box from Scratch

February 17, 2024

For a language that touts its low level capabilities, rust has a very rich and complete standard library. In fact, you effectively never directly interact with the allocator, and its actually very discouraged.

The reason for that, of course, is that its pretty easy to mess up. As such, allocation management is perfect for showing off the advantage of the unsafe system: it explicitly marks when you should be careful and gives you a very small scope to prove the soundness of. Its much easier to prove 2 or 3 lines of code will never segfault, versus an entire C++ class. Smart pointers (Box, Arc, Mutex) utilize this tool to create sound abstractions over allocation management, and in this article we'll go over how. Lets begin!

Step 0: Learn the Allocator

Rust's allocator API is hidden behind the aptly named feature:

#![feature(allocator_api)]
#![feature(non_null_convenience)] //more methods for a type later on

This gives us a lot of things, but for this we just need to worry about Global, Layout, and the Allocator trait:

use std::alloc::{Layout, Global, Allocator};

Lets get started and re-invent Box: a type that allocates a single item. All it is is a pointer!

pub struct Box<T>(*mut T);

impl<T> Box<T> {
    pub fn new(item: T) -> Self {
        todo!()
    }
}

The thing we interface with is Global. Because of the way it works, there isn't a single static to refer to, we need to just use default() to get a handle to it [^1]. Lets take a look at the allocator trait. Immediately, we find the most important function:

fn allocate(&self, layout: Layout) -> Result<NonNull<[u8]>, AllocError>

This takes something called a Layout, and returns something called NonNull. Lets cover these.

The Layout Type

Theres actually a lot more info about a pointer than just the exact address it points to. We also need to know how big the item is, so that we know how many bytes to read. Usually this is static and provided by the type, but with dynamically sized types like [T] it needs to be carried around with the pointer itself (something called a “wide-pointer”). On top of that, pointers have something called an alignment, which requires that they begin on an address divisible by a power of two[^2]. Layout carries all of this info for us, and actually provides a method to generate it from a type:

let layout = Layout::new::<T>();

The NonNull Type

Pointers shouldn't be null. Rust provides functional error handling schemes like Option that are very preferred over sentinel values for a multitude of reasons. This leads to the situation where OxO is a representable value, but should never be constructed. NonNull enforces this invariant: it asserts that it will never be null. The main caveat of Option is that we need to carry along another entire boolean byte just to keep track of which variant it is. However, with NonNull we now have a forbidden value that the compiler promises will never exist! As such, NonNull provides whats called niche optimization: Option will use the forbidden 0x0 value as the discriminant of None, and means that it doesn't need any more size than the type itself. Cool stuff!

assert_eq!(size_of::<NonNull<u8>>(), size_of::<Option<NonNull<u8>>>());

Lets change the pointer type in our box, as it'll always be valid.

pub struct Box<T>(NonNull<T>);

Getting Memory

Back to the function! Lets get the allocator and give it a layout of our type:

let layout = Layout::new::<T>();
let ptr = Global::default().allocate(layout).unwrap();
let ptr = ptr.cast::<T>();

Now we have a pointer to some of our own data on the heap! First thing you'll notice is that we actually need to manually cast the pointer to the type we need, as allocate simply returns a slice of bytes. Lets not forget to actually write our data into that allocation:

unsafe { ptr.write(item) };

This is unsafe as we need to guarantee that we are allowed to write to that space in memory. Since we just allocated that memory with a layout generated from T, we know that this is okay.

Getting rid of the Memory

We have a situation now where the default drop actually doesn't suffice. All rust does right now is delete the pointer when the box gets dropped. But the allocator doesn't know that happened! This is called a memory leak, where data has been allocated on the heap, and all pointers to that memory are destroyed without reallocating. Our data is doomed to forever sit untouched on the heap, alone and forgotten. Luckily, rust has destructors. Implementing the Drop trait lets us run arbitrary code before the data gets destroyed, so this is the perfect place to deallocate!

impl<T> Drop for Box<T> {
    fn drop(&mut self) {
        unsafe { self.0.drop_in_place() };

        unsafe { 
            Global::default().deallocate(
                self.0.cast(), Layout::new::<T>()
            ) 
        };
    }
}

This is pretty similar to when we created the box, just in reverse. Once again, we need to provide a NonNull<u8>, which is why we call cast(), and the layout of our type. We also want to make sure that whatever deconstructor T has is run, with drop_in_place. This method is unsafe because it doesn't actually remvove the data in that memory, it simply runs the Drop implementation. Reading that memory from that point forward is undefined behavior.

And thats all she wrote! Along with a simple Deref impl, we've created a safe abstraction for storing data on the heap!

impl<T> Box<T> {
    pub fn new(item: T) -> Self {
        let layout = Layout::new::<T>();
        let ptr = Global::default().allocate(layout).unwrap();
        let ptr = ptr.cast::<T>();
        unsafe { ptr.write(item) };

        Self(ptr)
    }
}

impl<T> Drop for Box<T> {
    fn drop(&mut self) {
        unsafe { self.0.drop_in_place() };
        unsafe { 
            Global::default().deallocate(
                self.0.cast(), Layout::new::<T>()
            ) 
        };
    }
}

impl<T> Deref for Box<T> {
    type Target = T;

    fn deref(&self) -> &Self::Target {
        //we know that this is safe because we can only get a Box from a safe function
        unsafe { self.0.as_ref() }
    }
}

Alternatively, you could store the allocator in the struct, as it takes up no size.
Alignment is admittedly something I barely understand, I simply take it as a weird optimization that hardware does, and it really doesn't like when you break it.

thank you for reading <3 i plan to write a part 2 soon that implements vec. i figured it would be better to teach the allocator api in its own smaller article. sorry to those who read this before i added the deallocation part. oops...