Passing nothing is surprisingly difficult

David Benjamin (2024-01-15)

My day job is in browsers and cryptography, not compilers, yet I often find that I need to spend more of my time working through the semantics of programming languages than using them. This post discusses a thorny cross-language issue between C, C++, and Rust. In short:

As FFI challenges inherently span multiple languages, I wrote this mostly to have one common reference that describes the mismatch.

Slices

All three languages allow working with slices, or contiguous sequences of objects in memory. (Also called spans, but we’ll use “slices” here.) A slice is typically a pointer and a length, (start, count), giving count objects from start, of some type T.

A slice can also be specified by two pointers, (start, end), giving the objects from start (inclusive) to end (exclusive). This is better for iteration because only one value needs to be adjusted to advance, but the length is less available. C++ iterator pairs are a generalization of this form, and Rust slice iterators use this internally. The two forms can be converted with end = start + count and count = end - start, using C-style pointer arithmetic where everything is scaled by the object size. We’ll primarily discuss (start, count), but this duality means slices are closely related to pointer arithmetic.

In C and C++, slices are, at best, library types built out of pointers and lengths. Sometimes functions just take or return pointer and length separately, but still use it to represent a slice of memory. In Rust, slices are language primitives, but the underlying components are exposed for unsafe code and FFIs. To work with each of these, we must understand what combinations of pointers and lengths are valid.

This is straightforward for a positive-length slice: start must point within an allocation where there are at least count objects of type T. But suppose we want an empty (length zero) slice. What are the valid representations of an empty slice?

Certainly we can point start within (or just past) some array of Ts and set count to zero. But we may want to make an empty slice without an existing array. For example, a default-constructed std::span<T>() in C++ or &[] in Rust. What are our options then? In particular:

  1. Can an empty slice be (nullptr, 0)?
  2. Can an empty slice be (alignof(T), 0), or some other aligned address that doesn’t correspond to an allocation?

The second question may seem odd to C and C++ folks, but Rust folks may recognize it as std::ptr::NonNull::dangling.

C++

C++ is the easiest to discuss, as it has a formal specification and is (almost) self-consistent.

First, (nullptr, 0) is a valid empty slice in C++. The STL’s types routinely return it, and the language is compatible with it:

C++ defines APIs like std::span in terms of pointer addition for the (start, count) form, and iterator pairs in terms of pointer subtraction for the (start, end) form, so this is both necessary and sufficient.

Moreover, it would be impractical for C++ to forbid (nullptr, 0). C++ code routinely needs to interact with APIs that specify slices as individual components. Given such an API, no one writes code like this:

void takes_a_slice(const uint8_t *in, size_t len);

uint8_t placeholder;
takes_a_slice(&placeholder, 0);

It is much more natural to use nullptr:

void takes_a_slice(const uint8_t *in, size_t len);

takes_a_slice(nullptr, 0);

This means, to be practical, functions like takes_a_slice must accept (nullptr, 0). For implementing such functions to be practical, the underlying language primitives must then also accept (nullptr, 0).

As for the (alignof(T), 0) question, pointer addition and subtraction require the pointers point to some object and that the operation stays within the bounds of that object (or one past the end). C++ does not define there to be an object at alignof(T), so this is not allowed, instead producing Undefined Behavior. This has no immediate usability concern (no one is going to write reinterpret_cast<uint8_t*>(1) to call takes_a_slice), but we’ll see later that it has some consequences for Rust FFIs.

(Update 2024-01-16: Added the following paragraph, as this shorthand seems to have been unclear.)

Of course, in principle, nothing stops takes_a_slice from defining its own unique rules for these corner cases. Beyond what the type system naturally provides, user code will rarely formally define semantics, and we must instead look to fuzzier conventions and expectations. Sadly, in C and C++, these fuzzier aspects often include well-definedness. When all the slice-adjacent language primitives consistently use the same preconditions for slice components, a naively written function will match. This is then a reasonable default interpretation for takes_a_slice’s preconditions. When this post discusses “the” rules for slices for C++ or C, it is in part a shorthand for this emergent convention.

However, C++ is merely almost self-consistent. C++ picks up memcpy and other functions from C’s standard library, complete with C’s semantics…

C

C is messier. For the same reasons discussed above, it is impractical to reject (NULL, 0). However, C’s language rules make it difficult for a function to accept it. C does not have C++’s special cases for (T*)NULL + 0 and (T*)NULL - (T*)NULL. See clauses 8 and 9 of section 6.5.6 of N2310. memcpy and the rest of the C standard library similarly forbid (NULL, 0).

I think this should be considered a bug in the C specification, and compilers should not optimize based on it. In BoringSSL, we found the C rules so unusable that we resorted to wrapping the standard library with n != 0 checks. The pointer arithmetic rules are similarly a tax on development. Moreover, C++ inherits C’s standard library (but not its pointer rules), including this behavior. In Chromium’s C++ code, memcpy has been the single biggest blocker to enabling UBSan.

Fortunately, there is hope. Nikita Popov and Aaron Ballman have written a proposal to fix this in C. (Thank you!) While it won’t make C and C++ safe by any stretch of imagination, this is an easy step to fix an unforced error.

Note that, apart from contrived examples with deleted null checks, the current rules do not actually help the compiler meaningfully optimize code. A memcpy implementation cannot rely on pointer validity to speculatively read because, even though memcpy(NULL, NULL, 0) is undefined, slices at the end of a buffer are fine:

char buf[16];
memcpy(dst, buf + 16, 0);

If buf were at the end of a page with nothing allocated afterwards, a speculative read from memcpy would break.

Rust

Rust does not allow (nullptr, 0). Functions like std::slice_from_raw_parts require the pointer to be non-null. This comes from Rust treating types like &[T] and *[T] as analogous to &T and *T. They are “references” and “pointers” that are represented as (start, count) pairs. Rust requires every pointer type to have a “null” value outside its reference type. This is used in enum layout optimizations. For example, Option::<&[T]> has the same size as &[T] because None uses this null value.

Unfortunately, Rust chose (nullptr, 0) for the null slice pointer, which means the empty slice, &[], cannot use it. That left Rust having to invent an unusual convention: some non-null, aligned, but otherwise dangling pointer, usually (alignof(T), 0).

Is pointer arithmetic defined for this slice? From what I can tell, the answer appears to be no! (Update 2024-01-16: It sounds like this is in the process of being defined!)

Pointer arithmetic in Rust is spelled with the methods add, sub_ptr, and offset_from, which the standard library defines in terms of allocated objects. That means, for pointer arithmetic to work with alignof(T), there must be zero-size slices allocated at every non-zero address. Moreover, offset_from requires the two dangling pointers derived from the same slice to point to the “same” of these objects. While the third bullet here, second sentence, says casting literals gives a pointer that is “valid for zero-sized accesses”, it says nothing about allocated objects or pointer arithmetic.

Ultimately, these semantics come from LLVM. The Rustonomicon has more to say on this (beginning “The other corner-case…”). It concludes that, while there are infinitely many zero-size types at 0x01, Rust conservatively assumes alias analysis does not allow offsetting alignof(T) with zero for positive-sized types. This means Rust pointer arithmetic rules are incompatible with Rust empty slices. But recall that slice iteration and pointer arithmetic are deeply related. The Rustonomicon’s sample iterator uses pointer arithmetic, but needs to guard addition with cap == 0 in into_iter and cast to usize in size_hint.

This is too easy for programmers to forget. Indeed the real Rust slice iterator does pointer arithmetic unconditionally (pointer addition, pointer subtraction, behind some macros). This suggests Rust slice iterators are unsound.

FFIs

Beyond self-consistency concerns, all this means Rust and C++ slices are incompatible. Not all Rust (start, count) pairs can be passed into C++ and vice versa. C’s issues make its situation less clear, but the natural fix is to bring it in line with C++.

This means Rust FFI is not “zero-cost”. Passing slices between C/C++ and Rust requires conversions in both directions to avoid Undefined Behavior.

More important (to me) than performance, this is a safety and ergonomics problem. Programmers cannot be expected to memorize language specifications. If given a &[T] and trying to call a C API, the natural option is to use as_ptr, but that will return a C/C++-incompatible output. Most Rust crates I’ve seen which wrap C/C++ APIs do not convert and are unsound as a result.

This is particularly an issue because C and C++’s (more serious) safety problems cause real user harm and need addressing. But there is half a century of existing C and C++ code. We cannot realistically address this with a new language without good FFI. What makes for good FFI? At a bare minimum, I think calling a C or C++ function from Rust should not be dramatically less safe than calling that function from C or C++.

Wishlist

Empty lists should not be so complicated. We could change C, C++, and Rust in a few ways to improve things:

Make C accept nullptr

See Nikita Popov and Aaron Ballman’s proposal.

Fix Rust’s slice representation

All the alignof(T) problems ultimately come from Rust’s unusual empty slice representation. This falls out of Rust’s need for a “null” *[T] value that is not a &[T] value. Rust could have chosen any of a number of unambiguously unused representations, such as (nullptr, 1), (nullptr, -1), or (-1, -1).

While this would be a significant change now, with compatibility implications to work through, I think it is worth seriously considering. It would address the root cause of this mess, fixing a soundness hazard in not just Rust FFI, but Rust on its own. This hazard is real enough that Rust’s standard library hits it.

This is also the only option I see that fully meets Rust’s “zero-cost” FFI goals. Even if we make C and C++ accept (alignof(T), 0) from Rust (see below), any slices passed from C/C++ to Rust may still be (nullptr, 0).

Define pointer arithmetic for invalid pointers

If Rust leaves its slice representation as is, we instead should define pointer arithmetic for NonNull::dangling(). Expecting low-level Rust code to guard all pointer offsets is impractical.

Update 2024-01-16: Happily, it sounds like this is already in the process of being defined!

Where nullptr is a single value which could just be special-cased, there are many alignof(T) values. It seems one would need to define it in terms of the actual allocated objects. This is well beyond my expertise, so I’ve likely gotten all the details and terminology wrong, but one possibility is, in the vein of PNVI-ae-udi:

One subtlety, however, is that cast_ival_to_ptrval might not give back @empty if there is an object at that address. Giving back a concrete provenance means picking up pointer-zapping semantics, which would be undesirable here. For alignof(T), that shouldn’t happen if the maximum alignment is under a page and the bottom page is never allocated. But Rust allows not just alignof(T) but any non-zero integer literal, even if some allocation happens to exist at that address. (Perhaps we could use the “user-disambiguation” aspect and say all integer-to-pointer casts may additionally disambiguate to @empty? Would that impact the compiler’s aliasing analysis?)

I think this complexity demonstrates why nullptr is a much better choice for an empty slice than a dangling pointer. Pointer arithmetic with nullptr is easy to define, and nullptr cannot alias a real allocation.

If Rust (and LLVM) accepted invalid pointers, it would fix the soundness issues within Rust, but not with FFIs. If the C and C++ standards also picked this up, it would partially fix FFIs. We could then directly pass Rust slices into C and C++, but not in the other direction. Directly passing C and C++ slices into Rust can only be fixed by changing Rust to accept (nullptr, 0) form.

(Outside of Rust FFI, there’s no reason to use alignof(T) as a pointer in C/C++, so I do not know how plausible it would be for C/C++ to accept it.) Update 2024-01-16: Nelson Elhage reminded me that non-null sentinel pointers are sometimes used to allocate zero bytes. While C forbids malloc from doing this (malloc(0) must return either nullptr or a unique non-null pointer), other allocators might reasonably pick this option. It makes error checks more uniform without actually reserving address space. So there is a non-Rust reason to allow these pointers in C and C++.

FFI helpers in Rust standard library

If the languages’ slice representations cannot be made compatible, we’re still left with safety hazards in Rust FFI. In that case, Rust’s standard library should do more to help programmers pick the right operations: Add analogs of slice::from_raw_parts, slice::as_ptr, etc., that use the C and C++ representation, converting internally as needed. Document existing functions very clear warnings that they cannot be used for FFI. Finally, audit all existing calls in crates.io, as the majority of existing calls are likely for FFI.

For slice::from_raw_parts, we could go further and fix the existing function itself. This would be backwards-compatible, but adds unnecessary conversions to non-FFI uses. That said, if the crates.io audit reveals mostly FFI uses, that conversion may be warranted. For non-FFI uses, a type signature incorporating std::ptr::NonNull would have been more appropriate anyway.

This would improve things, but it’s an imperfect solution. We’d still sacrifice zero-cost FFI, and we’d still rely on programmers to read the warnings and realize the natural options are incorrect.

Acknowledgements

Thanks to Alex Chernyakhovsky, Alex Gaynor, Dana Jansens, Adam Langley, and Miguel Young de la Sota for reviewing early iterations of this post. Any mistakes in here are my own.