The other day I made a joke on twitter, and learned some interesting things about raw pointers in Rust.
The abridged joke goes something like this:
Yosh: What do you mean Rust doesn't ship with rand built-in?
Me: ASLR to the rescue!
fn main() { let rand = main as usize; dbg!(rand); }
Part 1: Explaining the joke
Explaining the joke is bad form, but there is some valuable technical detail here.
The most important thing is ASLR (Address Space Layout Randomization). When software has memory safety bugs like buffer overflows, it's easy for an attacker to blast hostile data into a process's stack. That hostile data could replace the address that the currently running function will return to, giving the attacker the ability to execute some arbitrary instructions.
This would be bad. One of the mitigations that engineers came up with is to have each program execute at a randomized virtual address, to make such attacks harder. It's debatable whether this is effective at turning away attacks, but that's the goal, and ASLR is enabled on almost every operating system in use today.
Let me annotate my joke program a little bit:
fn main() {
    // By calling this "rand", I'm pretending that this line of code is
    // a random number generator.
    let rand = main as usize;
    //         ^       ^
    //         |       \-- that address as a pointer-sized integer
    //         |
    //         \-- the address of the main function
    dbg!(rand); // print out the result
}
This program assumes that the host operating system uses ASLR, so that the program is loaded at a random address. Our program observes the address that main is located at, and uses that as our random value.
It's reasonable to wonder whether the address of main might be a static value, or whether the Rust compiler might bake in some fixed address rather than computing the real one at runtime. This isn't the case, though: experimentally we can verify that the value does change on each execution.
This is a dirty hack, and I don't recommend doing this in real programs. ASLR isn't a good random number generator. The address doesn't change that much, and under some conditions may not change at all. Even in the best circumstances, a program can only acquire one random value this way, so two different modules both using this trick would use the same value. Real random number generators are fast and readily available (at least, on any platform capable of using ASLR). Please use a well-regarded RNG instead of a hack like this.
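To make the "only one value" limitation concrete, here is a small sketch. The module names `parser` and `network` and their `seed` functions are made up for illustration; they stand in for two unrelated parts of a program that both reach for the same trick.

```rust
// Two hypothetical modules that each try to get a "random" seed from
// the address of `main`.
mod parser {
    pub fn seed() -> usize {
        super::main as usize
    }
}

mod network {
    pub fn seed() -> usize {
        super::main as usize
    }
}

fn main() {
    let a = parser::seed();
    let b = network::seed();
    // `main` sits at one address for the whole run, so both modules
    // end up with the same seed.
    assert_eq!(a, b);
    println!("parser seed:  {a:#x}");
    println!("network seed: {b:#x}");
}
```

Running the binary several times should show the printed value changing between runs on a system with ASLR enabled, while the two seeds stay identical to each other within any single run.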
Part 2: I learn things about Rust pointers
The code above isn't exactly what I posted on twitter. The original post looked like this:
fn main() {
    let rand = main as *const fn() as usize;
    dbg!(rand);
}
This wasn't well-written code. The two-step cast is just habit, because of situations like this:
fn print_address(int_ref: &u32) {
    let px = int_ref as *const u32 as usize;
    dbg!(px);
}
In many cases Rust won't allow us to cast a reference address directly to an integer; we have to go by way of a raw pointer.
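A minimal sketch of that two-step cast, returning the address instead of printing it. The helper name `address_of` is an assumption made for this example, not something from the original post.

```rust
/// Returns the address a reference points to, as an integer.
fn address_of(int_ref: &u32) -> usize {
    // Writing `int_ref as usize` here is rejected by the compiler,
    // so the cast goes through a raw pointer first.
    int_ref as *const u32 as usize
}

fn main() {
    let value: u32 = 42;
    let addr = address_of(&value);
    // Asking twice gives the same answer, and the address of a `u32`
    // is a multiple of its alignment.
    assert_eq!(addr, address_of(&value));
    assert_eq!(addr % std::mem::align_of::<u32>(), 0);
    println!("value lives at {addr:#x}");
}
```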
Function pointers don't work the same way, though.
I didn't realize that there is no such thing as a "raw function pointer" in Rust. fn() is itself a pointer type, so *const fn() is a raw pointer to a function pointer, which doesn't make sense in this context.
Since there is no syntax for "raw function pointer", the compiler will let you substitute any other raw pointer type. Several tutorials use foo as *const () (pointer to unit) to temporarily hold an untyped function pointer.
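Here is a rough sketch of the difference. The function `target` and the two helpers are hypothetical names for this example. It checks that going through `*const ()` yields the same integer as the direct cast, and shows what a genuine `*const fn()` points at.

```rust
// A stand-in function whose address we inspect.
fn target() {}

/// Function item -> pointer to unit -> integer.
fn code_address_via_unit() -> usize {
    target as *const () as usize
}

/// Function item -> integer, with no raw pointer in between.
fn code_address_direct() -> usize {
    target as usize
}

fn main() {
    assert_eq!(code_address_via_unit(), code_address_direct());

    // `fn()` is already a pointer, so on mainstream platforms it is
    // pointer-sized.
    assert_eq!(std::mem::size_of::<fn()>(), std::mem::size_of::<usize>());

    // A genuine `*const fn()` points at a variable that holds a function
    // pointer, not at the function's code.
    let f: fn() = target;
    let p: *const fn() = &f;
    println!(
        "code at {:#x}, variable `f` at {:#x}",
        code_address_via_unit(),
        p as usize
    );
}
```

The last two addresses printed will normally differ: one is in the program's code, the other is a local variable on the stack.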
For those expecting the usual Rust guard rails, it's surprising that the compiler allows casting between arbitrary raw pointer types outside of an unsafe block. This feels really dangerous: even though we can't dereference the pointer outside of an unsafe block, creating a raw pointer usually implies that an unsafe block will eventually do something with it. I kind of wish that this pointer casting required unsafe, just because this code should send up red flags, and probably deserves a close look during code review.
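A small sketch of where the compiler draws the line. The function name `reinterpret` is made up for this example, and the `u32`-to-`f32` pairing was chosen only because it happens to be sound to read back.

```rust
/// Reinterprets the bits of a `u32` as an `f32` by way of a pointer cast.
fn reinterpret(bits: u32) -> f32 {
    let p = &bits as *const u32;
    // Safe code: the compiler accepts this cast between unrelated
    // raw pointer types without complaint.
    let q = p as *const f32;
    // Only the dereference needs `unsafe`. It is sound here because
    // `u32` and `f32` have the same size and alignment, and every bit
    // pattern is a valid `f32`.
    unsafe { *q }
}

fn main() {
    // 0x3f80_0000 is the IEEE 754 bit pattern for 1.0.
    assert_eq!(reinterpret(0x3f80_0000), 1.0);
    println!("{}", reinterpret(0x3f80_0000));
}
```

In real code `f32::from_bits` does the same job without any pointers. The point of the sketch is only that the suspicious-looking line is the cast, while the line the compiler flags is the dereference.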
I feel a little bad that while making a joke by doing something that's a little evil, I accidentally inserted a really evil cast that is highly misleading to the reader.
If I wanted to go all-in on the evil cast I could do something like this:
fn main() {
    let rand = main as *const rand::rngs::OsRng as usize;
    dbg!(rand);
}
A few people pointed out that some real programs do try to harvest some randomness from the program's address. There's probably a place for a dirty hack like that, but I would feel a bit icky if I ever published code like that myself.
Thanks to @yoshuawuyts for setting up the joke, and to @eddyb for pointing out my pointer mistake, and everyone else who commented!