The New MongoDB Rust Driver

Rust NYC

July 21, 2015

Kevin Yeh && Sam Rossi

@kyeahokay | @saghm

"Why is MongoDB exploring a native Rust driver?"

¯\_(ツ)_/¯

What we've learned about the language

The core goals and tenets of Rust

Why you'd want to use the language

What's on the menu?

MongoDB Drivers
Safety: The Ownership System
Concurrency: Threading with mutability
Usability: Making developers happy

MongoDB Drivers

[Diagram: "Your Code" and "Servers", with the driver layer between them labeled "What this talk is on"]

MongoDB Overview

Data is stored in "BSON" documents
- "BSON" -> "binary JSON"
Documents are grouped into collections
Databases can have multiple collections

How is the data stored?

Database

Collection

Document

How to talk to a driver: The CRUD API

All basic functionality can be described by the "CRUD" spec:

Create - insert new documents
Read - query existing documents
Update - change existing documents
Delete - remove existing documents

Example: Baseball Player Database

How would you query all the players on a certain team?

{
    "_id" : ObjectId("55a02f52648dca06dce7e5d0"),
    "first_name" : "Jose",
    "last_name" : "Alvarez",
    "bats" : "L",
    "throws" : "L",
    "team" : "LAA",
    "position" : "P",
    "avg" : null,
    "tags" : [ ]
}

Sample Document

Basic Steps to Query Data

Select the database and collection

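// Step 1: select the database and collection
let db = client.db("mlb");
let coll = db.collection("players");

// Step 2.1: "filter" selects which documents to return
let filter = Some(doc! { "team" => team });

// Step 2.2: "projection" selects which fields of each document to return
let mut options = FindOptions::new();
options.projection = Some(doc! {
    "_id" => 0,
    "first_name" => 1,
    "last_name" => 1,
    "position" => 1
});

// Step 3: check the result for success
match coll.find(filter, Some(options)) {
    Ok(cursor) => Ok(cursor),
    Err(e) => err_as_string!(e),
}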

Note: The code segment

doc! {
    "team" => "BOS"
}

produces the BSON equivalent to the JSON object

{ "team": "BOS" }

(more on that later)


1. Select the database and collection
2. Set the query options
   2.1. "filter" → which documents to select
   2.2. "projection" (optional) → which fields from the document to return
3. Check the result for success
4. Use the results as needed

Basic Steps to Query Data

let mut string = "{\"result\":[".to_owned();

for (i, doc_result) in cursor.enumerate() {
    match json_string_from_doc_result(doc_result) {
        Ok(json_string) => {
            let new_string = if i == 0 {
                json_string
            } else {
                format!(",{}", json_string)
            };

            string.push_str(&new_string);
        },
        Err(e) => return e,
    }
}

string.push_str("]}");

`Cursor` implements `Iterator`

For Instance:

Building with Rust and MongoDB

GridFS and Web Applications

Migrating to Rust 1.0

The Three Core Tenets of Rust

Safety through the ownership system
Concurrency through core structs and features
Usability through standard traits and recursive macros
...and speed!

Struct Referencing: Looks Good?

Ownership

Rust guarantees speed and safety at runtime by enforcing ownership and lifetimes at compile time.

Since the Client object is borrowed, it must provably live at least as long as the Database that holds the reference. The borrow is tagged with the lifetime 'a.
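
A minimal sketch of the borrowed-client layout described above. The `Client` and `Database` types here are stand-ins written for illustration, not the driver's real definitions.

```rust
// Stand-in types for illustration only.
struct Client {
    host: String,
}

// `Database` borrows its `Client`, so the borrow carries the lifetime 'a.
struct Database<'a> {
    client: &'a Client,
    name: String,
}

impl Client {
    // The returned Database cannot outlive the Client it was created from.
    fn db<'a>(&'a self, name: &str) -> Database<'a> {
        Database { client: self, name: name.to_owned() }
    }
}

fn describe(db: &Database<'_>) -> String {
    format!("{}/{}", db.client.host, db.name)
}

fn main() {
    let client = Client { host: "localhost:27017".to_owned() };
    let db = client.db("mlb");
    println!("{}", describe(&db));
    // Dropping `client` before the last use of `db` would be a compile error.
}
```

The catch is that every type holding a `Database` now needs a lifetime parameter too, which is what makes this design awkward across threads.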

Arc

Atomic Reference Count is your friend.

How do we pass an Arc of self?

Make questionable design decisions.
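
One possible way to "pass an Arc of self", sketched under assumptions: alias the client type to an `Arc` and hang the methods on a trait implemented for that alias. The names `ClientInner` and `ThreadedClient` are illustrative, and the actual driver design may differ.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for the client's real state.
struct ClientInner {
    host: String,
}

// With this alias, trait methods receive `&Arc<ClientInner>` as `self`.
type Client = Arc<ClientInner>;

struct Database {
    client: Client,
    name: String,
}

trait ThreadedClient {
    fn db(&self, name: &str) -> Database;
}

impl ThreadedClient for Client {
    fn db(&self, name: &str) -> Database {
        // `self` is already an Arc, so sharing it is a reference-count bump.
        Database { client: self.clone(), name: name.to_owned() }
    }
}

fn main() {
    let client: Client = Arc::new(ClientInner { host: "localhost:27017".to_owned() });
    let db = client.db("mlb");
    // The Database owns its handle, so it can move to another thread.
    let worker = thread::spawn(move || format!("{}/{}", db.client.host, db.name));
    println!("{}", worker.join().unwrap());
}
```

The trade-off is that users must import the trait to call `db`, which is arguably the questionable part.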

The Three Core Tenets of Rust

Safety through the ownership system
Concurrency through core structs and features
Usability through standard traits and recursive macros

Mutability in Rust

By default, all variables in Rust are immutable.

However, mutability is a bit different in Rust...

Interior and Exterior Mutability

By default, all structs and variables in Rust have exterior immutability.

Immutable structures can still hold mutable components, as long as they follow the ownership system of Rust:

You may have one or the other of these two kinds of borrows, but not both at the same time:

- one or more references (&T) to a resource.
- exactly one mutable reference (&mut T).
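
A small sketch of interior mutability using `RefCell` from the standard library, which applies the same borrow rules at runtime instead of at compile time. The `OpLog` type is made up for the example.

```rust
use std::cell::RefCell;

// Hypothetical type: an operation log that can be appended to through `&self`.
struct OpLog {
    ops: RefCell<Vec<String>>,
}

impl OpLog {
    // Takes `&self`, yet still mutates the inner Vec.
    fn record(&self, op: &str) {
        self.ops.borrow_mut().push(op.to_owned());
    }
}

fn main() {
    let log = OpLog { ops: RefCell::new(Vec::new()) }; // note: not `mut`
    log.record("find");
    log.record("insert");

    {
        let shared = log.ops.borrow(); // a shared borrow (&T)
        println!("{} ops recorded", shared.len());
        // While a shared borrow is alive, a mutable borrow is refused.
        assert!(log.ops.try_borrow_mut().is_err());
    }
    // Once the shared borrow is gone, the mutable borrow succeeds.
    assert!(log.ops.try_borrow_mut().is_ok());
}
```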

Mutexes and RwLocks

One approach to guaranteeing these rules at compile time is to use RAII locks.
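
A short sketch of RAII locking with the standard library's `Mutex` and `RwLock`. The data can only be reached through a guard, and dropping the guard releases the lock. The function name and host strings are illustrative.

```rust
use std::sync::{Arc, Mutex, RwLock};
use std::thread;

fn count_with_mutex(threads: usize, per_thread: usize) -> usize {
    let total = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let total = Arc::clone(&total);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // The guard unlocks the mutex when it goes out of scope.
                    let mut guard = total.lock().unwrap();
                    *guard += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let result = *total.lock().unwrap();
    result
}

fn main() {
    println!("total = {}", count_with_mutex(4, 1000));

    // RwLock: many readers or one writer, checked by the lock itself.
    let hosts = RwLock::new(vec!["a:27017".to_owned()]);
    {
        let reader = hosts.read().unwrap();
        println!("{} host(s)", reader.len());
    } // read guard dropped here
    hosts.write().unwrap().push("b:27017".to_owned());
    println!("{} host(s)", hosts.read().unwrap().len());
}
```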

Connection Pools

How we learned to handle mutability, the hard way.

Motivation for Connection Pools

Each database operation requires a connection to the database.

Sometimes, they take a long time.

Connection pools handle the creation and reuse of sockets connected to a MongoDB instance.

Traditional Capped Pool

Fixed-length arrays of locks and sockets.

Acquire socket lock, Use socket, Release socket lock.

Traditional Capped Pool

An immutable connection pool requires explicit lock and socket initialization on creation.

Poisoned socket locks: Permanently kills the connection and limits the pool size.

A mutable connection pool requires a mutex to guarantee thread safety.

Ordered unlocking: If the pool is locked before the socket, it cannot be released before the socket.

Hyper

Explicit extraction of lock-free sockets!

Lock the pool

Pop a stream (S)

Return the stream with a pool reference

Hyper

The Good:

Variable number of lazily-connected sockets
Almost lock-free
One master lock on the connection pool

The Bad:

The number of open sockets is uncapped.

Recycling vulnerability: Old sockets can be dropped as new ones are constantly made to take their place.
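
A rough sketch of the pattern described above (lock the pool, pop a stream, return it with a pool reference). This is not Hyper's actual code. `Stream` is an in-memory stand-in for a socket and all names are assumptions.

```rust
use std::sync::{Arc, Mutex};

// Stand-in for a TCP stream so the sketch runs without a server.
struct Stream {
    id: usize,
}

struct Inner {
    idle: Vec<Stream>,
    opened: usize,
}

#[derive(Clone)]
struct Pool {
    inner: Arc<Mutex<Inner>>,
}

// A checked-out stream that remembers which pool it came from.
struct PooledStream {
    stream: Option<Stream>,
    pool: Pool,
}

impl Pool {
    fn new() -> Pool {
        Pool { inner: Arc::new(Mutex::new(Inner { idle: Vec::new(), opened: 0 })) }
    }

    fn acquire(&self) -> PooledStream {
        let mut inner = self.inner.lock().unwrap(); // lock the pool
        let stream = match inner.idle.pop() {       // pop a stream
            Some(stream) => stream,
            None => {
                // Lazily "connect". Nothing caps `opened` in this design.
                inner.opened += 1;
                Stream { id: inner.opened }
            }
        };
        PooledStream { stream: Some(stream), pool: self.clone() }
    }
}

impl PooledStream {
    fn id(&self) -> usize {
        self.stream.as_ref().map(|s| s.id).unwrap_or(0)
    }
}

impl Drop for PooledStream {
    // Dropping the handle puts the stream back in its pool.
    fn drop(&mut self) {
        if let Some(stream) = self.stream.take() {
            self.pool.inner.lock().unwrap().idle.push(stream);
        }
    }
}

fn main() {
    let pool = Pool::new();
    let a = pool.acquire();
    let b = pool.acquire();
    drop(a);
    let c = pool.acquire(); // reuses the stream that `a` returned
    println!("b = {}, c = {}", b.id(), c.id());
}
```

The pool lock is held only while popping or pushing, so the streams themselves are used without any lock.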

Our Friend, the Condvar

Master pool lock with variable-length, lock-free sockets.

If no sockets are available and we are capped at the number of open sockets, wait on the condition variable until we've been repopulated.

Thank you, Condvar.
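
A sketch of how a capped pool might wait on a condition variable, assuming an in-memory `Stream` stand-in and made-up names. It illustrates the idea and is not the driver's implementation.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

struct Stream {
    id: usize,
}

struct Inner {
    idle: Vec<Stream>,
    opened: usize,
    cap: usize,
}

#[derive(Clone)]
struct Pool {
    shared: Arc<(Mutex<Inner>, Condvar)>,
}

struct PooledStream {
    stream: Option<Stream>,
    pool: Pool,
}

impl Pool {
    fn with_cap(cap: usize) -> Pool {
        let inner = Inner { idle: Vec::new(), opened: 0, cap };
        Pool { shared: Arc::new((Mutex::new(inner), Condvar::new())) }
    }

    fn acquire(&self) -> PooledStream {
        let (lock, available) = &*self.shared;
        let mut inner = lock.lock().unwrap();
        loop {
            if let Some(stream) = inner.idle.pop() {
                return PooledStream { stream: Some(stream), pool: self.clone() };
            }
            if inner.opened < inner.cap {
                inner.opened += 1;
                let stream = Stream { id: inner.opened };
                return PooledStream { stream: Some(stream), pool: self.clone() };
            }
            // Capped and empty: sleep until a stream is returned.
            inner = available.wait(inner).unwrap();
        }
    }
}

impl PooledStream {
    fn id(&self) -> usize {
        self.stream.as_ref().map(|s| s.id).unwrap_or(0)
    }
}

impl Drop for PooledStream {
    fn drop(&mut self) {
        if let Some(stream) = self.stream.take() {
            let (lock, available) = &*self.pool.shared;
            lock.lock().unwrap().idle.push(stream);
            available.notify_one(); // wake one waiting thread
        }
    }
}

fn main() {
    let pool = Pool::with_cap(1);
    let first = pool.acquire();
    let waiter = {
        let pool = pool.clone();
        thread::spawn(move || pool.acquire().id()) // blocks until `first` is dropped
    };
    thread::sleep(Duration::from_millis(50));
    drop(first);
    println!("waiter got stream {}", waiter.join().unwrap());
}
```

The loop around `wait` matters because condition variables can wake spuriously, so the pool state is rechecked every time.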

The Three Core Tenets of Rust

Safety through the ownership system
Concurrency through core structs and features
Usability through standard traits and recursive macros

Recursive Macros

The Dark Ages

Not very readable!

let mut update = bson::Document::new();
let mut set = bson::Document::new();

set.insert("director".to_owned(), Bson::String("Robert Zemeckis".to_owned()));
update.insert("$set".to_owned(), Bson::Document(set));

First try

Do we really need a second macro?

let update = doc! {
    "$set" => nested_doc! {
        "director" => Bson::String("Robert Zemeckis".to_owned())
    }
};

#[macro_export]
macro_rules! doc {
    ( $( $k:expr => $v: expr),* ) => {
        {
            let mut doc = Document::new();
            $(
                doc.insert($k.to_owned(), $v);
            )*
            doc
        }
    };
}

#[macro_export]
macro_rules! nested_doc {
    ( $( $k:expr => $v: expr),* ) => {
        Bson::Document(doc!(
            $( $k => $v),*
        ))
    }
}

Not quite there yet...

Explicit types still needed

let update = doc! {
    "$set" => {
        "director" => (Bson::String("Robert Zemeckis".to_owned()))
    }
};

#[macro_export]
macro_rules! add_to_doc {
    ($doc:expr, $key:expr => ($val:expr)) => {{
        $doc.insert($key.to_owned(), $val);
    }};

    ($doc:expr, $key:expr => [$($val:expr),*]) => {{
        let vec = vec![$($val),*];
        $doc.insert($key.to_owned(), Bson::Array(vec));
    }};

    ($doc:expr, $key:expr => { $($k:expr => $v:tt),* }) => {{
        $doc.insert($key.to_owned(), Bson::Document(doc! {
            $(
                $k => $v
            ),*
        }));
    }};
}

#[macro_export]
macro_rules! doc {
    ( $($key:expr => $val:tt),* ) => {{
        let mut document = Document::new();

        $(
            add_to_doc!(document, $key => $val);
        )*

        document
    }};
}

Ahh...much better!

let doc1 = doc! { "tags" => ["a", "b", "c"] };
let doc2 = doc! { "tags" => ["a", "b", "d"] };
let doc3 = doc! { "tags" => ["d", "e", "f"] };

coll.insert_many(vec![doc1.clone(), doc2.clone(), doc3.clone()], false, None)
    .ok().expect("Failed to execute insert_many command.");

// Build aggregation pipeline to unwind tag arrays and group distinct tags
let project = doc! { "$project" => { "tags" => 1 } };
let unwind = doc! { "$unwind" => ("$tags") };
let group = doc! { "$group" => { "_id" => "$tags" } };

#[macro_export]
macro_rules! bson {
    ([$($val:tt),*]) => {{
        let mut array = Vec::new();

        $(
            array.push(bson!($val));
        )*

        $crate::Bson::Array(array)
    }};

    ([$val:expr]) => {{
        $crate::Bson::Array(vec!(::std::convert::From::from($val)))
    }};

    ({ $($k:expr => $v:tt),* }) => {{
        $crate::Bson::Document(doc! {
            $(
                $k => $v
            ),*
        })
    }};

    ($val:expr) => {{
        ::std::convert::From::from($val)
    }};
}

#[macro_export]
macro_rules! doc {
    () => {{ $crate::Document::new() }};

    ( $($key:expr => $val:tt),* ) => {{
        let mut document = $crate::Document::new();

        $(
            document.insert($key.to_owned(), bson!($val));
        )*

        document
    }};
}

Auto-converted types!
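
To try the recursive macros without the bson crate, here is a trimmed, self-contained sketch. The `Bson` enum and `Document` alias below are minimal stand-ins (a `BTreeMap` rather than the crate's real document type), so treat them as assumptions.

```rust
use std::collections::BTreeMap;

// Minimal stand-ins for the bson crate's types.
#[derive(Debug, Clone, PartialEq)]
enum Bson {
    String(String),
    I32(i32),
    Array(Vec<Bson>),
    Document(Document),
}

type Document = BTreeMap<String, Bson>;

// The `From` impls are what make the final macro rule "auto-convert".
impl From<&str> for Bson {
    fn from(s: &str) -> Bson { Bson::String(s.to_owned()) }
}

impl From<i32> for Bson {
    fn from(n: i32) -> Bson { Bson::I32(n) }
}

macro_rules! bson {
    // [a, b, c] becomes an array, converting each element recursively.
    ([$($val:tt),*]) => {{
        let mut array = Vec::new();
        $( array.push(bson!($val)); )*
        Bson::Array(array)
    }};
    // { k => v, ... } becomes a nested document via doc!.
    ({ $($k:expr => $v:tt),* }) => {{
        Bson::Document(doc! { $( $k => $v ),* })
    }};
    // Anything else goes through From.
    ($val:expr) => {{ Bson::from($val) }};
}

macro_rules! doc {
    ( $($key:expr => $val:tt),* ) => {{
        let mut document = Document::new();
        $( document.insert($key.to_owned(), bson!($val)); )*
        document
    }};
}

fn main() {
    let project = doc! { "$project" => { "tags" => 1 } };
    let tagged = doc! { "tags" => ["a", "b", "c"] };
    println!("{:?}", project);
    println!("{:?}", tagged);
}
```

Because each value is captured as a single token tree, a multi-token expression still has to be wrapped in parentheses, as in `("$tags")` above.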

and a little bit about traits...

High focus for Rust 1.0: Extendability

Get core features down, provide traits for the community.
A large portion of core Rust features relies on generic traits.
Flexible functional-style programming in a largely imperative language.

a few useful traits...

Conversion Traits: Auto-convert types.
Encoder Traits: Auto-encode structs.
Error Traits: Handle errors unobtrusively.
Dereference Traits: Utilize the auto-dereferencing system.
Iterator Traits: Iterate naturally over custom structs.
I/O Traits: Perform I/O ops over custom structs.
...and many, many more.

Conversion Traits

Rust makes it easy to auto-convert types.
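
As an illustration, implementing `From` for each source type gives a matching `Into` for free, so an API can accept plain Rust values. The `Value` enum and `to_pair` function are hypothetical.

```rust
// Hypothetical value type standing in for something like a BSON value.
#[derive(Debug, PartialEq)]
enum Value {
    Text(String),
    Int(i64),
    Flag(bool),
}

impl From<&str> for Value {
    fn from(s: &str) -> Value { Value::Text(s.to_owned()) }
}

impl From<i64> for Value {
    fn from(n: i64) -> Value { Value::Int(n) }
}

impl From<bool> for Value {
    fn from(b: bool) -> Value { Value::Flag(b) }
}

// Accepting `Into<Value>` lets callers skip the explicit wrapping.
fn to_pair<V: Into<Value>>(key: &str, value: V) -> (String, Value) {
    (key.to_owned(), value.into())
}

fn main() {
    println!("{:?}", to_pair("team", "BOS"));
    println!("{:?}", to_pair("number", 34i64));
    println!("{:?}", to_pair("active", true));
}
```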

Encoder Traits

Encoding traits allow arbitrary structs to be converted.

BSON can be decoded into structs and vice versa without any matching involved.

Error Traits

All errors in Rust are handled by Result<T, E>.

Defining your own Error type is easy!

Conversion traits automatically coerce different types of errors.
Error traits provide easy access to coerced error information.

Unobtrusive error handling and control paths.
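
A sketch of a custom error type with a conversion from a standard library error. `DriverError` and `parse_address` are invented for the example. It uses the `?` operator, which postdates this talk and does the job `try!` did in Rust 1.0.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

// Hypothetical error type for the example.
#[derive(Debug)]
enum DriverError {
    BadPort(ParseIntError),
    MissingHost,
}

impl fmt::Display for DriverError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DriverError::BadPort(e) => write!(f, "bad port: {}", e),
            DriverError::MissingHost => write!(f, "missing host"),
        }
    }
}

impl Error for DriverError {}

// This conversion is what lets `?` coerce a ParseIntError automatically.
impl From<ParseIntError> for DriverError {
    fn from(e: ParseIntError) -> DriverError { DriverError::BadPort(e) }
}

fn parse_address(addr: &str) -> Result<(String, u16), DriverError> {
    let mut parts = addr.splitn(2, ':');
    let host = parts
        .next()
        .filter(|h| !h.is_empty())
        .ok_or(DriverError::MissingHost)?;
    let port = match parts.next() {
        Some(p) => p.parse::<u16>()?, // ParseIntError -> DriverError via From
        None => 27017,                // MongoDB's default port
    };
    Ok((host.to_owned(), port))
}

fn main() {
    println!("{:?}", parse_address("localhost:27017"));
    match parse_address("localhost:abc") {
        Ok(addr) => println!("{:?}", addr),
        Err(e) => println!("error: {}", e),
    }
}
```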

Dereferencing Traits

To provide concurrency and safety, Rust relies on a number of wrapping structs: Arc, Box, MutexGuard, Ref, etc.

An auto-dereferencing system makes this as transparent and user-friendly as possible.
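
A small sketch of auto-dereferencing in action: a custom `Deref` impl plus the standard wrappers, with method calls reaching straight through all of them. `Hosts` is a made-up wrapper type.

```rust
use std::ops::Deref;
use std::sync::{Arc, Mutex};

// Hypothetical wrapper around a list of host names.
struct Hosts {
    names: Vec<String>,
}

impl Deref for Hosts {
    type Target = Vec<String>;

    fn deref(&self) -> &Vec<String> {
        &self.names
    }
}

fn main() {
    let hosts = Hosts { names: vec!["a:27017".to_owned(), "b:27017".to_owned()] };
    // `len` is a Vec method, reached through Hosts via Deref.
    println!("{} hosts", hosts.len());

    let shared = Arc::new(Mutex::new(hosts));
    // Arc -> Mutex -> MutexGuard -> Hosts -> Vec, with no explicit `*`.
    let guard = shared.lock().unwrap();
    println!("first host: {:?}", guard.first());
}
```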

Iterator Traits

Iterate with control.
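
As a sketch of why this is useful for a cursor: implementing `Iterator` lets the type decide when to fetch more data, while callers get `for` loops, `enumerate`, `take` and friends for free. This `Cursor` is a toy that reads from in-memory batches, not the driver's type.

```rust
use std::collections::VecDeque;

// Toy cursor: `pending` stands in for batches still on the server.
struct Cursor {
    pending: VecDeque<Vec<String>>,
    buffer: VecDeque<String>,
}

impl Iterator for Cursor {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        while self.buffer.is_empty() {
            // Stand-in for a round trip that asks for the next batch.
            let batch = self.pending.pop_front()?;
            self.buffer.extend(batch);
        }
        self.buffer.pop_front()
    }
}

// Helper for building a cursor in the example.
fn cursor_over(batches: Vec<Vec<&str>>) -> Cursor {
    let pending: VecDeque<Vec<String>> = batches
        .into_iter()
        .map(|batch| batch.into_iter().map(str::to_owned).collect::<Vec<String>>())
        .collect();
    Cursor { pending, buffer: VecDeque::new() }
}

fn main() {
    let cursor = cursor_over(vec![vec!["Alvarez", "Ortiz"], vec!["Pedroia"]]);
    for (i, name) in cursor.enumerate() {
        println!("{}: {}", i, name);
    }
}
```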

I/O Traits

Read and write anything.
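
A sketch of the idea: code written against `Read` and `Write` works with any type that implements them. `Loopback` is an in-memory stand-in for a socket, and the length-prefixed frame format is invented for the example.

```rust
use std::io::{self, Read, Write};

// In-memory stand-in for a socket: bytes written can be read back.
struct Loopback {
    data: Vec<u8>,
    pos: usize,
}

impl Write for Loopback {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.data.extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

impl Read for Loopback {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = buf.len().min(self.data.len() - self.pos);
        buf[..n].copy_from_slice(&self.data[self.pos..self.pos + n]);
        self.pos += n;
        Ok(n)
    }
}

// Generic over any writer: a 4-byte little-endian length, then the payload.
fn write_frame<W: Write>(out: &mut W, payload: &[u8]) -> io::Result<()> {
    out.write_all(&(payload.len() as u32).to_le_bytes())?;
    out.write_all(payload)
}

// Generic over any reader.
fn read_frame<R: Read>(input: &mut R) -> io::Result<Vec<u8>> {
    let mut len = [0u8; 4];
    input.read_exact(&mut len)?;
    let mut payload = vec![0u8; u32::from_le_bytes(len) as usize];
    input.read_exact(&mut payload)?;
    Ok(payload)
}

fn main() -> io::Result<()> {
    let mut socket = Loopback { data: Vec::new(), pos: 0 };
    write_frame(&mut socket, b"ping")?;
    let reply = read_frame(&mut socket)?;
    println!("{}", String::from_utf8_lossy(&reply));
    Ok(())
}
```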

The Three Core Tenets of Rust

...and speed!

FFI and Unsafety

When the Rust compiler isn't enough.

ObjectId

How do I get the machine id?

How do I save a static generated id?

Driver Status

Donesies ✓

bson library
- OID generation
- stable macro
connection strings
wire protocol
CRUD, commands, and cursors
bulk writes
connection pooling
error handling
automated testing suites

Up Next

replica set failover
server discovery and monitoring (SDAM)
SCRAM-SHA-1 auth
shard tagging, indexes, and other server commands

Learn more...

The Rust Book

Wrapper Types in Rust: Choosing Your Guarantees

Error Handling in Rust

Some notes on Send and Sync

A Practical Intro to Macros in Rust 1.0
