Fishy Words


Rust's HashMap

The interface for the rust hashmap is really cool. It’s an example of how the language uses good abstractions to combine speed and safety.

A hashmap is a simple interface: you can insert, retrieve, delete, check-membership, and iterate-over. For example, here’s an example python program that prints which letters appear an odd number of times in a string:

characters = "abca"
counter = dict()         # create

for c in characters:
    if c in counter:     # membership
        counter[c] += 1  # update
    else:
        counter[c] = 1   # insert

for key,val in counter.items(): # iterate
    if val % 2 == 1:
      print(f"{key} was an odd char!")

Note how we have too lookup the key twice: once to check membership, and then once to insert it.

Rust the language builds on the same dream of C++: to create a high level language that compiles into optimal machine code. It tries to squeeze every performance edge it can, while staying at a high level.

The first abstraction rust has is the match branching. This eliminates the double lookup if they key is in the map!

fn main() {
  let characters = "abca";
  let mut counter = std::collections::HashMap::new();

  for c in characters.chars() {
    match counter.get_mut(&c) {          // lookup
      Some(val) => { *val += 1; },       //     and modify!
      None => { counter.insert(c, 1); }, // insert
    }
  }

  for (key, val) in &counter {
    if val % 2 == 1 {
      println!("{} was an odd char!", key);
    }
  }
}

Rust has one more trick to get even faster. Notice how Some branch contains the location we want to insert at, which allows us to increment val directly instead of looking it up!

We need something like that for the None branch, something to represent the empty location to insert into.

Thus bringing us to rust’s Entry API:

use std::collections::HashMap;
use std::collections::hash_map::Entry; // entry

fn main() {
  let characters = "abca";
  let mut counter = HashMap::new();

  for c in characters.chars() {
    match counter.entry(c) {
      Entry::Occupied(mut entry) => {  // modify entry
        *entry.get_mut() += 1;
      }
      Entry::Vacant(entry) => {        // insert entry
        entry.insert(1);
      }
    }
  }

  for (key, val) in &counter {
    if val % 2 == 1 {
      println!("{} was an odd char!", key);
    }
  }
}

That’s how the right abstraction allows rust to combine speed and safety