Back to Posts

Rust Is Cool

October 15, 2020  /  ~14-minute read

Home | Blog | Portfolio | Contact

I recently set out to learn the Rust programming language. Why? Because new languages are fun and maybe I’ll use it some day.

It also so happens that lots of smart people from the functional programming communities I lurk in also like Rust . This was fascinating to me at first; my experience with functional programming has been about as far away from machine code as you can imagine. Meanwhile, I understood Rust to be something of a modernized C++.

Curiosity thoroughly piqued, I cracked open “The Book” and immersed myself in Rust for several days. I’ve now come out the other side, and would like to my share my impressions of the language.

Type System

It became clear very quickly why Rust was attracting my FP associates: Rust has an algebraic type system , that is, a way to specify new types as collections and alternatives of other types. We call these new types “algebraic data types,” or ADTs. ADTs go hand-in-hand with one of my favorite programming language features: pattern matching — which I’m happy to report is supported by Rust.

I’m a big ADTs because they let you specify all the possible outcomes of a function (including errors) and the type checker will make sure that any calling code handles all of them. For a quick example (that almost compiles):

enum FirstLineResult {
    Result(String),
    EmptyList,
}

fn first_line(lines: Vec<String>) -> FirstLineResult {
    if lines.is_empty() {
        EmptyList
    } else {
        Result(list[0])
    }
}

fn my_business_logic(lines: Vec<String>) {
    let message = match first_line(lines) {
        Result(s) => format!("The first line was {}!", s),
        EmptyFile => String::from("Dunno, it was empty"),
    };
    println!("{}", message);
}

Examples like this one have led me to consider algebraic data to be something like “domain-specific control flow.” FirstLineResult documents the various alternative outcomes of our computation, first_line, in the language of the domain, including what information is available in each case: if you get a Result back, then there’s a String available to you — otherwise you had an EmptyList and you don’t know anything else. Moreover, the compiler will make sure you’ve handled all the cases: we can’t treat a FirstLineResult like a string, and we can’t pattern match on it without handling EmptyFile. This means, for all intents and purposes, that this part of our application will always be in a known state and will only have to reason about valid, expected data (more on this “safe functional core” here and here ).

Of course, FirstLineResult is a specific instance of a very common and much more general pattern. Rust’s generics gives us a repeatable and expressive way to leverage this pattern and others like it.

Generics

“Computations which might fail” are ubiquitous, and they have a name in Rust: Option. We could have written FirstLineResult as:

type FirstLineResult = Option<String>;

// for reference, our previous definition:
// enum FirstLineResult {
//     Result(String),
//     EmptyFile,
// }

where

enum Option<T> {
    Some(T),
    None,
}
// notice the structural similarity to `enum FirstLineResult`

You either “have some T” or you “have none.”

In this expression, T is a “type variable” or “type parameter.” I often hear “generic type” refer to a type, like Option, that has one or more type parameters. Generic types are a great tool for abstraction, letting you encapsulate general patterns in a reusable way. What’s more, I find, is that they’re also a great help for correctness. To illustrate, here’s a cute example (which also casually illustrates Rust’s support for higher-order functions, another killer feature):

fn wat(f: fn(f64) -> f64, xs: Vec<f64>) -> Vec<f64> {
    let mut result = Vec::new();
    for x in xs {
        result.push(???);
    }
    result
}

What should I push to my result? There are so many things you can do to an f64 like x — negate it, divide it by two, replace it with pi — how could we possibly say which one is right? Even if you guessed that wat is map and we should simply push f(x), it’s not immediately obvious, especially for machines (who we might be asking to write some code for us).

Generics1 take our options away by abstracting over irrelevant types:

fn wat2<A, B>(f: fn(A) -> B, xs: Vec<A>) -> Vec<B> {
    let mut result = Vec::new();
    for x in xs {
        result.push(???);
    }
    result
}

We need to get our hands on a B to push into our result (a vector of Bs). Working backwards through the types and terms we have in scope, there’s only one way for us to get a B: by calling f. But f will only give us a B if we give it an A. Luckily, we have an A in scope right now: x. The types have naturally led us to the correct implementation2:

fn my_map<A, B>(f: fn(A) -> B, xs: Vec<A>) -> Vec<B> {
    let mut result = Vec::new();
    for x in xs {
        result.push(f(x));
    }
    result
}

Nothing in wat was actually specific to f64s, so generics let us abstract over that type and forget all the minutiae and baggage that came with it.

There are of course cases where want to know a little about A or B or T. For example, we can’t yet write:

fn negate_all<T>(xs: Vec<T>) -> Vec<T> {
    let mut result = Vec::new();
    for x in xs {
        result.push(- x);
    }
    result
}

because Rust doesn’t know how to negate a T:

error[E0600]: cannot apply unary operator `-` to type `T`
 --> src/lib.rs:4:21
  |
4 |         result.push(- x);
  |                     ^^^ cannot apply unary operator `-`

For more information about this error, try `rustc --explain E0600`.

Luckily for us, the Rust compiler has fantastic error reporting, and the explanation for E0600 is:

An unary operator was used on a type which doesn’t implement it.

Remember, we don’t know anything about T. negate_all’s type claims “for any T, if you give me a vector of Ts, I’ll give you one back.” The thing is, that’s a lie: negate_all only works for Ts it knows how to negate.

We can refine which T’s negate_all refers to with a “trait bound.” (In Rust, a trait is basically an interface; a collection of methods a concrete type might implement.) Now we can clarify our earlier claim, specifying that negate_all works for any T which implements negation3:

fn negate_all<T: Neg>(xs: Vec<T>) -> Vec<T> { /* ... */ }

This turns out to be a wildly useful way to write functions. Rather than starting with a concrete type and all the baggage (i.e. extraneous operations) that comes with it, you can start with some T. As you write your implementation, you recognize which operations you actually do need to perform, and constrain T accordingly. This workflow is something of a trait-driven extension to the point above about generics supporting correctness.

Error Propagation

Another joy of Rust is its automatic error propagation. To explain what makes it such a joy, I’ll do my best to explain a problem in Haskell with which I only have topical, second-hand familiarity.

Earlier, we saw how algebraic data can help us encode and handle all the possible outcomes of a function, including errors. We saw the venerable Option type, and now I’ll introduce its sister, Result:

enum Result<T, E> {
    Ok(T),
    Err(E),
}

Like Option, Result has two variants: one for success and one for failure. Unlike Option, Result’s error case includes a piece of data about the error. This E can be a simple string with an error message, but best practice dictates that E too should be an ADT, one enumerating the ways things could go wrong. This way, callers can pattern match on the error and handle each case appropriately:

enum FileOpenError {
    NotFound,
    PermissionDenied,
}

fn fopen(path: String) -> Result<String, FileOpenError> { /* ... */ }

fn main() {
    match fopen("foo.txt") {
        Ok(contents) => println!("Foo says: {}", contents),
        Err(NotFound) => println!("Couldn't find foo.txt"),
        Err(PermissionDenied) => println!("I'm sorry, Dave"),
    }
}

In Haskell, Option and Result are called Maybe and Either, respectively. If you’ll excuse the biggest hand-wave of the decade, Maybe and Either also “automate” error propagation:

data Maybe    t = Just t  | Nothing
data Either e t = Ok t    | Err e  -- <- different names IRL

data FileOpenError = NotFound | PermissionDenied

fopen       :: String -> Either FileOpenError String
count_lines :: String -> Either FileOpenError Int

do_stuff :: Either String FileOpenError
do_stuff = do
  foo_contents <- fopen "foo.txt"
  foo_length   <- count_lines "foo.txt"
  return (foo_contents ++ show foo_length)

Notice how we didn’t have to pattern match on (that is, manually handle) the results of fopen or count_lines; when composed like this, operations of type Either e will short-circuit on the first operation to return an Err e, returning that error.

The challenge is that while t can vary from one operation to the next (String then Int above) e must be fixed if we’re to take advantage of this automatic error propagation. For example, this doesn’t type-check:

data FileOpenError = NotFound | PermissionDenied
data DbError = ConnectionRefused | Timeout

fopen      :: FilePath -> Either FileOpenError String
some_query :: String   -> Either DbError       String

do_stuff2 = do
  username <- fopen "alice_info.txt"
  alice    <- some_query username
  return alice

Our compiler will tell us:

• Couldn't match type ‘DbError’ with ‘FileOpenError’
  Expected type: Either FileOpenError String
    Actual type: Either DbError String

I’ll spare you the details, but the machinery which lets us sequence Either operations with that handy short-circuiting/ automatic error propagation is called “bind,” and it looks like this:

bind :: Either e a -> (a -> Either e b) -> Either e b
--      ^^^^^^^^^^ first op
--                          ^^^^^^^^^^ second op (can use "a" if it wants)
--               resulting sequence of ops ^^^^^^^^^^

Notice that while the first and second operations may or may not share a result type (sometimes a ~ b, sometimes not), e stays the same. A lot of ink has been spilled on this topic, so I’d encourage you to conduct some research on your own.

For now, we have some context on the problem, so we can finally return to how this all plays out in Rust. Rust also has a special notation which concisely encapsulates this automatic error propagation:

fn fopen(path: String)      -> Result<String, FileOpenError>
fn count_lines(path: String -> Result<u32,    FileOpenError>

fn do_stuff() -> Result<String, FileOpenError> {
    foo_contents = fopen("foo.txt")?;  // <-- notice the question mark
    foo_length   = count_lines("foo.txt")?;
    Ok(format!("{}{}", foo_contents, foo_length))
}

No more pattern matching! Now let’s try to vary e:

fn do_stuff2() -> Result<String, ???> {
  let username = fopen("alice_info.txt")?;
  let alice    = some_query(username)?;
  alice
}

It’s not immediately obvious what new “overall” error type should be. do_stuff2 could get a FileOpenError or a DbError. Well, we’ve seen that Rust can express types which are “either this or that,” so let’s just say that directly and see how the compiler reacts:

enum MyError {
    MyFileError(FileOpenError),
    MyDbError(DbError)
}

fn do_stuff3() -> Result<String, MyError> {
  let username = fopen("alice_info.txt")?;
  let alice    = some_query(username)?;
  alice
}

Rust tells us:

error[E0277]: `?` couldn't convert the error to `MyError`
  --> src/lib.rs:24:41
   |
24 | let username = fopen("alice_info.txt")?;
   |                                       ^ the trait `From<FileOpenError>` is not implemented for `MyError`
   |
   = note: the question mark operation (`?`) implicitly performs a
     conversion on the error value using the `From` trait

For more information about this error, try `rustc --explain E0277`.

This is another example of Rust’s outstanding error reporting (and we get a similar one for our use of ? with some_query). In contrast to Haskell, Rust’s error propagation machinery gives us a hook we can use to explain how to go from one e to the next! As the complier explained, this is the From trait. When we implement that trait for the composite MyError4, the code successfully complies.

In the relatively minuscule amount of Rust I’ve written to date, I’ve found that this concise error propagation syntax has encouraged me not to just throw away inconvenient errors (and open the door to runtime crashes) by letting “business” code easily delegate to purpose-built error handling code.

Memory Management

Remarkably, I’m over 2,000 words into a blog post about Rust and I haven’t even used the word “memory” yet.

Right on the tin, Rust makes it clear that it’s designed to help you make the most of your machine’s memory:

Rust is blazingly fast and memory-efficient: with no runtime or garbage collector[.]

Rust’s rich type system and ownership model guarantee memory-safety and thread-safety — enabling you to eliminate many classes of bugs at compile-time.

rust-lang.org

A garbage collector (GC) is a subroutine built in to many modern languages that automatically cleans up memory the main program doesn’t need anymore. It saves programmers the headache of manual memory management and obsoletes some classes of memory bugs. However, a GC can be prohibitively expensive in terms of CPU cycles and memory in the performance-critical situations Rust was geared towards.

“Traditionally,” no GC means manual memory management (a la C/C++). Rust breaks this tradition, finding a middle path with its (quite unique) ownership system.

I, as a Rust novice, encourage you to take my summary of ownership with a grain of salt and to follow the link above to The Book. That said, here’s how I conceptualize ownership: the Rust compiler has a global view of your program and can see all the operations a particular datum might be subject to and where. It uses this information to reject data-flows that can result in bad behavior. For example, to prevent a class of bad behavior known as “data races,” Rust lets you keep only one mutable (think “writable”) reference to an object at a time. This rule and many others, which are baked into the compiler, prevent data from being mutated out from under you when you were expecting it not to change. Rust has a panoply of reference and pointer types to express different sharing and thread-safety semantics — allowing you to explicitly specify how any particular data may be used, shared, copied, or transported.

Learning the ownership system felt very similar to (re)learning static typing through Haskell a number of years ago. I was coming from Python, where most of the exploration of a solution space takes place at run time, playing with various sample inputs, seeing how the program behaves, and adjusting it accordingly. I was used to finding a solution by debugging a running program, so you can imagine my frustration starting up with Haskell, where the whole program must type-check before you can run it. Eventually, I learned to explore solutions in the type system, where GHC is extremely well-equipped to assist you. This switch forced me to premeditate my solutions much more deeply; to understand how exactly subcomponents would fit together, which functions would be able to perform I/O, and so on. I feel a similar switch coming on with Rust’s ownership system. Now though, rather than premeditating the responsibilities and relationships of my program’s subcomponents, I’m premeditating who will use which data and how. Rather than letting a runtime figure it out for me, I have to be conscious of when bytes should be copied, references shared, and ownership transferred.

I’m really looking forward to digging in deeper, hopefully with a proper project. Just like learning a new language or programming paradigm, I think Rust’s ownership system is so unique that learning it will result in important and meaningful growth for anyone who engages with it, even if we don’t end up using the language.

Conclusion

I have a few more, semi-formed thoughts on Rust, but this post is tortuously long already. For now, I’m very optimistic about the language. You can’t sneak any nonsense past the compiler, but it will help you express your ideas correctly. I’m looking forward to writing a lot more Rust soon!