The New MongoDB Rust Driver

Rust NYC

July 21, 2015

 

Kevin Yeh && Sam Rossi

@kyeahokay | @saghm

"Why is MongoDB exploring a native Rust driver?"

¯\_(ツ)_/¯

  • What we've learned about the language

  • The core goals and tenets of Rust

  • Why you'd want to use the language

What's on the menu?

  • MongoDB Drivers
  • Safety: The Ownership System
  • Concurrency: Threading with mutability
  • Usability: Making developers happy

MongoDB Drivers

[Diagram: Your Code, MongoDB Drivers, Servers. The drivers are what this talk is on.]

MongoDB Overview

  • Data is stored in "BSON" documents
    • "BSON" -> "binary JSON"
  • Documents are grouped into collections
  • Databases can have multiple collections

How is the data stored?

[Diagram: each database holds several collections, and each collection holds several documents.]

How to talk to a driver: The CRUD API

All basic functionality can be described by the "CRUD" spec:

Create - insert new documents

Read - query existing documents

Update - change existing documents

Delete - remove existing documents
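To make the four operations concrete, here is a minimal in-memory sketch. Everything in it is hypothetical and for illustration only: `Doc`, `Collection`, the method names and the `player` helper are not the driver's API, and a real driver sends these operations to a server rather than keeping a `Vec`.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a BSON document: field name -> string value.
type Doc = HashMap<String, String>;

// Hypothetical in-memory collection (assumption: a real driver talks to a server).
#[derive(Default)]
struct Collection {
    docs: Vec<Doc>,
}

impl Collection {
    // Create: insert a new document.
    fn insert_one(&mut self, doc: Doc) {
        self.docs.push(doc);
    }

    // Read: return every document whose `key` field equals `value`.
    fn find(&self, key: &str, value: &str) -> Vec<&Doc> {
        self.docs
            .iter()
            .filter(|d| d.get(key).map(String::as_str) == Some(value))
            .collect()
    }

    // Update: set `field` to `new_value` on every match; returns how many changed.
    fn update_many(&mut self, key: &str, value: &str, field: &str, new_value: &str) -> usize {
        let mut changed = 0;
        for d in self.docs.iter_mut() {
            if d.get(key).map(String::as_str) == Some(value) {
                d.insert(field.to_owned(), new_value.to_owned());
                changed += 1;
            }
        }
        changed
    }

    // Delete: remove every match; returns how many were removed.
    fn delete_many(&mut self, key: &str, value: &str) -> usize {
        let before = self.docs.len();
        self.docs.retain(|d| d.get(key).map(String::as_str) != Some(value));
        before - self.docs.len()
    }
}

// Hypothetical helper that builds a two-field document.
fn player(last_name: &str, team: &str) -> Doc {
    let mut d = Doc::new();
    d.insert("last_name".to_owned(), last_name.to_owned());
    d.insert("team".to_owned(), team.to_owned());
    d
}
```

Inserting two players and calling `find("team", "BOS")` returns only the documents whose `team` field matches, which is the shape of query the driver example in this talk performs against a real server.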

Example: Baseball Player Database

How would you query all the players on a certain team?

{
    "_id" : ObjectId("55a02f52648dca06dce7e5d0"),
    "first_name" : "Jose",
    "last_name" : "Alvarez",
    "bats" : "L",
    "throws" : "L",
    "team" : "LAA",
    "position" : "P",
    "avg" : null,
    "tags" : [ ]
}

Sample Document

Basic Steps to Query Data

  1. Select the database and collection
  2. Set the query options
    1. "filter" → which documents to select
    2. "projection" (optional) → which fields from the document to return
  3. Check the result for success

// Step 1: select the database and collection
let db = client.db("mlb");
let coll = db.collection("players");

// Step 2.1: filter on the requested team
let filter = Some(doc! { "team" => team });

// Step 2.2: project only the fields to return
let mut options = FindOptions::new();
options.projection = Some(doc! {
    "_id" => 0,
    "first_name" => 1,
    "last_name" => 1,
    "position" => 1
});

// Step 3: check the result for success
match coll.find(filter, Some(options)) {
    Ok(cursor) => Ok(cursor),
    Err(e) => err_as_string!(e),
}

Note: The code segment:

doc! {
    "team" => "BOS"
}

produces the BSON equivalent to the JSON object:

{ "team": "BOS" }

(more on that later)

Basic Steps to Query Data

  1. Select the database and collection
  2. Set the query options
    1. "filter" → which documents to select
    2. "projection" (optional) → which fields from the document to return
  3. Check the result for success
  4. Use the results as needed

// Step 4: build a JSON string from the documents the cursor yields
let mut string = "{\"result\":[".to_owned();

for (i, doc_result) in cursor.enumerate() {
    match json_string_from_doc_result(doc_result) {
        Ok(json_string) => {
            // Separate every element after the first with a comma
            let new_string = if i == 0 {
                json_string
            } else {
                format!(",{}", json_string)
            };

            string.push_str(&new_string);
        },
        Err(e) => return e,
    }
}

string.push_str("]}");

`Cursor` implements `Iterator`

For instance:

  • Building with Rust and MongoDB

  • GridFS and Web Applications

  • Migrating to Rust 1.0

The Three Core Tenets of Rust

  • Safety through the ownership system

  • Concurrency through core structs and features

  • Usability through standard traits and recursive macros 

  • ...and speed!

Struct Referencing: Looks Good?

Ownership

Rust guarantees speed and safety at runtime by enforcing ownership and lifetimes at compile time.

Since the Client object is borrowed, it must live at least as long as the Database that holds the reference. The lifetime parameter ('a) expresses this.
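A minimal sketch of the borrowed-client design described above. The field names, the `db` signature and the `namespace` helper are assumptions for illustration; the driver's real structs carry more state.

```rust
// Hypothetical, simplified client.
struct Client {
    host: String,
}

// A Database borrows its Client. The lifetime 'a ties the two together:
// the compiler rejects any program where the Client is dropped first.
struct Database<'a> {
    client: &'a Client,
    name: String,
}

impl Client {
    fn db<'a>(&'a self, name: &str) -> Database<'a> {
        Database { client: self, name: name.to_owned() }
    }
}

impl<'a> Database<'a> {
    // Hypothetical helper that reads through the borrowed client.
    fn namespace(&self) -> String {
        format!("{}/{}", self.client.host, self.name)
    }
}
```

This compiles and works, but every struct that holds a `Database` now needs a lifetime parameter too, and a borrowed reference cannot be moved into a thread that may outlive the client.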

Arc

Atomic Reference Count is your friend.
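As a rough illustration of why `Arc` helps, the sketch below shares one value across several threads by cloning the reference-counted handle. The `Client` struct and the function name are hypothetical.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical, simplified client.
struct Client {
    host: String,
}

// Each thread gets its own Arc handle to the same Client; the Client is
// freed only when the last handle is dropped.
fn hosts_seen_by_threads(n: usize) -> Vec<String> {
    let client = Arc::new(Client { host: "localhost:27017".to_owned() });
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let client = Arc::clone(&client); // bumps the atomic count, no deep copy
            thread::spawn(move || client.host.clone())
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

No lifetime parameters are needed here: ownership is shared, so no handle can outlive the data it points to.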

How do we pass an Arc of self?

Make questionable design decisions.
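The slide does not spell out the design decision, so the following is only one possible approach, sketched under that assumption: make the public client type an alias for the `Arc` itself and put its methods on a trait implemented for that alias. Inside those methods `self` already is the `Arc`, so it can be cloned into child objects. All names here are illustrative.

```rust
use std::sync::Arc;

struct ClientInner {
    host: String,
}

// Assumption: the public handle is the Arc itself.
type Client = Arc<ClientInner>;

struct Database {
    client: Client, // owned, reference-counted handle; no lifetime parameter
    name: String,
}

// Methods live on a trait so they can be implemented for Arc<ClientInner>.
trait ThreadedClient {
    fn connect(host: &str) -> Self;
    fn db(&self, name: &str) -> Database;
}

impl ThreadedClient for Client {
    fn connect(host: &str) -> Client {
        Arc::new(ClientInner { host: host.to_owned() })
    }

    fn db(&self, name: &str) -> Database {
        // `self` is &Arc<ClientInner>, so cloning it yields another handle.
        Database { client: Arc::clone(self), name: name.to_owned() }
    }
}
```

The trade-off is that users must import the trait to call the methods, which is arguably what makes the decision "questionable".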

The Three Core Tenets of Rust

  • Safety through the ownership system

  • Concurrency through core structs and features

  • Usability through standard traits and recursive macros 

Mutability in Rust

By default, all variables in Rust are immutable.
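A tiny sketch of the default (the function name is made up for the example):

```rust
fn increment_copy() -> i32 {
    let x = 5; // immutable binding
    // x += 1; // would not compile: `x` is not declared `mut`
    let mut y = x; // `mut` opts in to mutation
    y += 1;
    y
}
```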

 

 

 

 

However, mutability is a bit different in Rust...

 

Interior and Exterior Mutability

By default, all structs and variables in Rust have exterior immutability.

 

Immutable structures can still hold mutable components, as long as they follow the ownership system of Rust:

 

You may have one or the other of these two kinds of borrows, but not both at the same time:

  • one or more references (&T) to a resource.

  • exactly one mutable reference (&mut T).
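The sketch below illustrates both points under simple assumptions: the borrow rule on a plain variable, and an immutable struct that still holds a mutable component through `RefCell`, which checks the same rule at runtime. `Stats`, `record` and `shared_then_exclusive` are hypothetical names.

```rust
use std::cell::RefCell;

// The struct is used through a shared reference, yet `hits` can change:
// RefCell moves the borrow check from compile time to runtime.
struct Stats {
    hits: RefCell<Vec<u32>>,
}

fn record(stats: &Stats, value: u32) -> usize {
    stats.hits.borrow_mut().push(value); // exclusive borrow, released at end of statement
    stats.hits.borrow().len() // shared borrow
}

fn shared_then_exclusive() -> i32 {
    let mut n = 1;
    {
        let a = &n; // any number of shared references...
        let b = &n;
        assert_eq!(*a + *b, 2);
    }
    let m = &mut n; // ...or exactly one mutable reference, never both at once
    *m += 1;
    n
}
```

While a `borrow()` is still alive, `try_borrow_mut()` on the same `RefCell` returns an error instead of handing out a second, conflicting reference.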

 

Mutexes and RwLocks

One approach to guaranteeing these rules at compile time is to use RAII locks.
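A short sketch of the two standard-library locks, with hypothetical function names. In both cases the data is reachable only through a guard, and the lock is released when that guard goes out of scope.

```rust
use std::sync::{Arc, Mutex, RwLock};
use std::thread;

// Mutex: one thread at a time, comparable to holding `&mut T`.
fn parallel_count(threads: usize, per_thread: usize) -> usize {
    let total = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let total = Arc::clone(&total);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    let mut guard = total.lock().unwrap();
                    *guard += 1;
                    // `guard` is dropped at the end of each iteration, releasing the lock
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let result = *total.lock().unwrap();
    result
}

// RwLock: many readers or one writer.
fn read_then_write() -> Vec<i32> {
    let data = RwLock::new(vec![1, 2]);
    let len = data.read().unwrap().len(); // shared access; guard released at end of statement
    data.write().unwrap().push(len as i32 + 1); // exclusive access
    data.into_inner().unwrap()
}
```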

Connection Pools

How we learned to handle mutability,

the hard way.

Motivation for Connection Pools

 

Each database operation requires a connection to the database.

Sometimes, they take a long time.

 

Connection pools handle the creation and reuse of sockets connected to a MongoDB instance.

Traditional Capped Pool

Fixed-length arrays of locks and sockets.

Acquire socket lock, Use socket, Release socket lock.
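A minimal sketch of this layout, assuming a hypothetical `Socket` type in place of a real TCP stream. Each slot has its own lock; acquiring a socket means finding a slot whose lock is currently free.

```rust
use std::sync::{Mutex, MutexGuard};

// Hypothetical stand-in for a TCP stream to the server.
struct Socket {
    id: usize,
    uses: usize,
}

struct CappedPool {
    sockets: Vec<Mutex<Socket>>, // fixed length, one lock per socket
}

impl CappedPool {
    // Every lock and socket has to be created up front.
    fn new(size: usize) -> CappedPool {
        CappedPool {
            sockets: (0..size).map(|id| Mutex::new(Socket { id, uses: 0 })).collect(),
        }
    }

    // Acquire the first socket whose lock is free; None if all are busy.
    // A poisoned lock also fails `try_lock`, so that slot is skipped from then on.
    fn acquire(&self) -> Option<MutexGuard<'_, Socket>> {
        self.sockets.iter().find_map(|slot| slot.try_lock().ok())
    }
}
```

Using the socket happens through the returned guard, and dropping the guard releases the socket lock.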

Traditional Capped Pool

An immutable connection pool requires explicit lock and socket initialization on creation.

  • Poisoned socket locks: Permanently kills the connection and limits the pool size.

 

A mutable connection pool requires a mutex to guarantee thread safety.

  • Ordered unlocking: If the pool is locked before the socket, it cannot be released before the socket.

Hyper

Explicit extraction of lock-free sockets!

  1. Lock the pool
  2. Pop a stream (S)
  3. Return the stream with a pool reference
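The three steps can be sketched as follows. This is not Hyper's actual code: `Stream`, `Inner`, `Pool` and `PooledStream` are hypothetical types written only to show the pattern of popping a lock-free stream and returning it wrapped with a handle to its pool.

```rust
use std::sync::{Arc, Mutex};

// Hypothetical stand-in for a TCP stream to the server.
struct Stream {
    id: usize,
}

struct Inner {
    idle: Vec<Stream>, // the sockets themselves carry no locks
    created: usize,
}

// Cloning the pool only clones the Arc, so all clones share one master lock.
#[derive(Clone)]
struct Pool {
    inner: Arc<Mutex<Inner>>,
}

// A checked-out stream bundled with a reference back to its pool.
struct PooledStream {
    stream: Option<Stream>,
    pool: Pool,
}

impl Pool {
    fn new() -> Pool {
        Pool { inner: Arc::new(Mutex::new(Inner { idle: Vec::new(), created: 0 })) }
    }

    fn acquire(&self) -> PooledStream {
        let mut inner = self.inner.lock().unwrap(); // 1. lock the pool
        let stream = match inner.idle.pop() {       // 2. pop a stream
            Some(stream) => stream,
            None => {
                // Nothing idle: "connect" lazily. Nothing caps how many get created.
                inner.created += 1;
                Stream { id: inner.created }
            }
        };
        // 3. return the stream with a pool reference
        PooledStream { stream: Some(stream), pool: self.clone() }
    }
}

impl PooledStream {
    fn id(&self) -> usize {
        self.stream.as_ref().map(|s| s.id).unwrap_or(0)
    }
}

impl Drop for PooledStream {
    // Dropping the wrapper hands the stream back to the pool.
    fn drop(&mut self) {
        if let Some(stream) = self.stream.take() {
            self.pool.inner.lock().unwrap().idle.push(stream);
        }
    }
}
```

The master lock is held only while popping or pushing, never while a stream is in use.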

Hyper

The Good:

  • Variable number of lazily-connected sockets

  • Almost lock-free

  • One master lock on the connection pool

 

The Bad:

  • The number of open sockets is uncapped.

  • Recycling vulnerability: Old sockets can be dropped as new ones are constantly made to take their place.

Our Friend, the Condvar

Master pool lock with variable-length, lock-free sockets.

If no sockets are available and the pool has reached its cap on open sockets, wait on the condition variable until a socket is returned to the pool.
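A sketch of that idea using `std::sync::Condvar`. The types and field names are hypothetical and error handling is reduced to `unwrap`; the driver's actual pool may differ in detail.

```rust
use std::sync::{Arc, Condvar, Mutex};

// Hypothetical stand-in for a TCP stream to the server.
struct Stream {
    id: usize,
}

struct Inner {
    idle: Vec<Stream>,
    open: usize, // streams created so far, idle or checked out
}

#[derive(Clone)]
struct Pool {
    state: Arc<(Mutex<Inner>, Condvar)>, // master lock plus its condition variable
    cap: usize,
}

struct PooledStream {
    stream: Option<Stream>,
    pool: Pool,
}

impl Pool {
    fn new(cap: usize) -> Pool {
        let inner = Inner { idle: Vec::new(), open: 0 };
        Pool { state: Arc::new((Mutex::new(inner), Condvar::new())), cap }
    }

    fn acquire(&self) -> PooledStream {
        let (lock, available) = &*self.state;
        let mut inner = lock.lock().unwrap();
        loop {
            if let Some(stream) = inner.idle.pop() {
                return self.wrap(stream);
            }
            if inner.open < self.cap {
                inner.open += 1;
                let stream = Stream { id: inner.open };
                return self.wrap(stream);
            }
            // At the cap with nothing idle: release the lock and sleep
            // until a stream is handed back, then check again.
            inner = available.wait(inner).unwrap();
        }
    }

    fn wrap(&self, stream: Stream) -> PooledStream {
        PooledStream { stream: Some(stream), pool: self.clone() }
    }
}

impl PooledStream {
    fn id(&self) -> usize {
        self.stream.as_ref().map(|s| s.id).unwrap_or(0)
    }
}

impl Drop for PooledStream {
    // Return the stream and wake one waiting thread.
    fn drop(&mut self) {
        if let Some(stream) = self.stream.take() {
            let (lock, available) = &*self.pool.state;
            lock.lock().unwrap().idle.push(stream);
            available.notify_one();
        }
    }
}
```

Re-checking the condition in a loop after `wait` returns matters because condition variables may wake spuriously, and another thread may have taken the stream first.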

Thank you, Condvar.

The Three Core Tenets of Rust

  • Safety through the ownership system

  • Concurrency through the core structs and features

  • Usability through standard traits and recursive macros 

Recursive Macros

The Dark Ages

Not very readable!

let mut update = bson::Document::new();
let mut set = bson::Document::new();

set.insert("director".to_owned(), Bson::String("Robert Zemeckis".to_owned()));
update.insert("$set".to_owned(), Bson::Document(set));

First try

Do we really need a second macro?

let update = doc! {
    "$set" => nested_doc! {
        "director" => Bson::String("Robert Zemeckis".to_owned())
    }
};

#[macro_export]
macro_rules! doc {
    ( $( $k:expr => $v:expr ),* ) => {
        {
            let mut doc = Document::new();
            $(
                doc.insert($k.to_owned(), $v);
            )*
            doc
        }
    };
}

#[macro_export]
macro_rules! nested_doc {
    ( $( $k:expr => $v:expr ),* ) => {
        Bson::Document(doc!(
            $( $k => $v),*
        ))
    }
}

Not quite there yet...

Explicit types still needed

let update = doc! {
    "$set" => {
        // Plain values must be wrapped in parentheses to form a single token tree
        "director" => (Bson::String("Robert Zemeckis".to_owned()))
    }
};

#[macro_export]
macro_rules! add_to_doc {
    ($doc:expr, $key:expr => ($val:expr)) => {{
        $doc.insert($key.to_owned(), $val);
    }};

    ($doc:expr, $key:expr => [$($val:expr),*]) => {{
        let vec = vec![$($val),*];
        $doc.insert($key.to_owned(), Bson::Array(vec));
    }};

    ($doc:expr, $key:expr => { $($k:expr => $v:tt),* }) => {{
        $doc.insert($key.to_owned(), Bson::Document(doc! {
            $(
                $k => $v
            ),*
        }));
    }};
}

#[macro_export]
macro_rules! doc {
    ( $($key:expr => $val:tt),* ) => {{
        let mut document = Document::new();

        $(
            add_to_doc!(document, $key => $val);
        )*

        document
    }};
}

Ahh...much better!

let doc1 = doc! { "tags" => ["a", "b", "c"] };
let doc2 = doc! { "tags" => ["a", "b", "d"] };
let doc3 = doc! { "tags" => ["d", "e", "f"] };

coll.insert_many(vec![doc1.clone(), doc2.clone(), doc3.clone()], false, None)
    .ok().expect("Failed to execute insert_many command.");

// Build aggregation pipeline to unwind tag arrays and group distinct tags
let project = doc! { "$project" => { "tags" => 1 } };
let unwind = doc! { "$unwind" => ("$tags") };
let group = doc! { "$group" => { "_id" => "$tags" } };

#[macro_export]
macro_rules! bson {
    ([$($val:tt),*]) => {{
        let mut array = Vec::new();

        $(
            array.push(bson!($val));
        )*

        $crate::Bson::Array(array)
    }};

    ([$val:expr]) => {{
        $crate::Bson::Array(vec!(::std::convert::From::from($val)))
    }};

    ({ $($k:expr => $v:tt),* }) => {{
        $crate::Bson::Document(doc! {
            $(
                $k => $v
            ),*
        })
    }};

    ($val:expr) => {{
        ::std::convert::From::from($val)
    }};
}

#[macro_export]
macro_rules! doc {
    () => {{ $crate::Document::new() }};

    ( $($key:expr => $val:tt),* ) => {{
        let mut document = $crate::Document::new();

        $(
            document.insert($key.to_owned(), bson!($val));
        )*

        document
    }};
}

Auto-converted types!

and a little bit about traits...

High focus for Rust 1.0: Extensibility

  • Get core features down, provide traits for the community.
  • A large portion of core Rust features relies on generic traits.
  • Flexible functional-style programming in a largely imperative language.

a few useful traits...

  • Conversion Traits: Auto-convert types.

  • Encoder Traits: Auto-encode structs.

  • Error Traits: Handle errors unobtrusively.

  • Dereference Traits: Utilize the auto-dereferencing system.

  • Iterator Traits: Iterate naturally over custom structs.

  • I/O Traits: Perform I/O ops over custom structs.

  • ...and many, many more.

Conversion Traits

Rust makes it easy to auto-convert types.
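A minimal sketch of how `From` and `Into` make this work. `Value` is a hypothetical, cut-down stand-in for the driver's `Bson` enum, and `pair` is an illustrative helper.

```rust
// Hypothetical, cut-down value type standing in for `Bson`.
#[derive(Debug, PartialEq)]
enum Value {
    Str(String),
    Int(i32),
    Bool(bool),
}

impl<'a> From<&'a str> for Value {
    fn from(s: &'a str) -> Value {
        Value::Str(s.to_owned())
    }
}

impl From<i32> for Value {
    fn from(n: i32) -> Value {
        Value::Int(n)
    }
}

impl From<bool> for Value {
    fn from(b: bool) -> Value {
        Value::Bool(b)
    }
}

// Accepts anything convertible into a Value; implementing From gives Into for free.
fn pair<V: Into<Value>>(key: &str, value: V) -> (String, Value) {
    (key.to_owned(), value.into())
}
```

This is the same mechanism the `bson!` macro relies on when it calls `::std::convert::From::from($val)`: the caller writes a plain literal and the trait picks the right variant.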

Encoder Traits

Encoding traits allow arbitrary structs to be converted.

BSON can be decoded into structs and vice versa without any matching involved.

Error Traits

Recoverable errors in Rust are handled by Result<T, E>.

 

 

Defining your own Error type is easy!

  • Conversion traits automatically coerce different types of errors.
  • Error traits provide easy access to coerced error information.

Unobtrusive error handling and control paths.
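A small sketch of a custom error type, with hypothetical names (`AppError`, `parse_port`). It uses the `?` operator, which plays the role the `try!` macro played when this talk was given: on an error it converts the value through `From` and returns early.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

#[derive(Debug)]
enum AppError {
    Parse(ParseIntError),
    OutOfRange(u32),
}

impl fmt::Display for AppError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AppError::Parse(e) => write!(f, "invalid port: {}", e),
            AppError::OutOfRange(n) => write!(f, "port {} is out of range", n),
        }
    }
}

// The Error trait gives callers a uniform way to inspect any error.
impl Error for AppError {}

// The conversion trait lets `?` turn a ParseIntError into an AppError.
impl From<ParseIntError> for AppError {
    fn from(e: ParseIntError) -> AppError {
        AppError::Parse(e)
    }
}

fn parse_port(s: &str) -> Result<u16, AppError> {
    let n: u32 = s.parse()?; // converted by the From impl above on failure
    if n > 65535 {
        return Err(AppError::OutOfRange(n));
    }
    Ok(n as u16)
}
```

The success path reads as straight-line code, and no explicit `match` on the parse result is needed.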

Dereferencing Traits

To provide concurrency and safety, Rust relies on a number of wrapping structs: Arc, Box, MutexGuard, Ref, etc. 

 

An auto-dereferencing system makes this as transparent and user-friendly as possible.
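A brief sketch of auto-dereferencing at work, including a custom wrapper. `Wrapper` and `lengths` are hypothetical names.

```rust
use std::ops::Deref;
use std::sync::{Arc, Mutex};

// A custom wrapper opts in to the same behavior by implementing Deref.
struct Wrapper(Vec<i32>);

impl Deref for Wrapper {
    type Target = Vec<i32>;

    fn deref(&self) -> &Vec<i32> {
        &self.0
    }
}

fn lengths() -> (usize, usize, usize) {
    let boxed = Box::new(vec![1, 2, 3]);
    let shared = Arc::new(Mutex::new(vec![1, 2]));
    let guard = shared.lock().unwrap(); // `lock` is reached through the Arc
    let custom = Wrapper(vec![1]);
    // `len` is a method on Vec; each wrapper is dereferenced automatically.
    (boxed.len(), guard.len(), custom.len())
}
```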

Iterator Traits

Iterate with control.
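As an illustration, here is a hypothetical cursor that walks through pre-fetched batches. A real cursor would fetch each batch from the server; `BatchCursor` and its fields are invented for this sketch.

```rust
use std::collections::VecDeque;

struct BatchCursor {
    pending: VecDeque<Vec<String>>, // batches not yet consumed
    buffer: VecDeque<String>,       // documents of the current batch
}

impl BatchCursor {
    fn new(batches: Vec<Vec<&str>>) -> BatchCursor {
        let pending = batches
            .into_iter()
            .map(|batch| batch.into_iter().map(str::to_owned).collect())
            .collect();
        BatchCursor { pending, buffer: VecDeque::new() }
    }
}

// Implementing `next` is all it takes to get `for` loops, `enumerate`,
// `take`, `collect` and the rest of the iterator adapters.
impl Iterator for BatchCursor {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        loop {
            if let Some(doc) = self.buffer.pop_front() {
                return Some(doc);
            }
            // Current batch exhausted: move on to the next one, or stop.
            match self.pending.pop_front() {
                Some(batch) => self.buffer = VecDeque::from(batch),
                None => return None,
            }
        }
    }
}
```

Because iteration is lazy, a caller can stop early (for example with `take`) and later batches are never touched.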

I/O Traits

Read and write anything.
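A sketch of the idea with a hypothetical in-memory stream. Code written against `Read + Write` works the same way on a `TcpStream` or on this mock, which is handy for testing protocol code without a server.

```rust
use std::io::{self, Read, Write};

// Hypothetical mock: replays canned input and records what is written.
struct MockStream {
    input: io::Cursor<Vec<u8>>,
    written: Vec<u8>,
}

impl Read for MockStream {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.input.read(buf)
    }
}

impl Write for MockStream {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.written.extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

// Generic over any stream: sends a request, then reads the whole reply.
fn ping<S: Read + Write>(stream: &mut S) -> io::Result<String> {
    stream.write_all(b"ping")?;
    let mut reply = String::new();
    stream.read_to_string(&mut reply)?;
    Ok(reply)
}
```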

The Three Core Tenets of Rust

...and speed!

FFI and Unsafety

When the Rust compiler isn't enough.

ObjectId

How do I get the machine id?

How do I save a static generated id?

Driver Status

Donesies ✓

  • bson library

    • OID generation

    • stable macro

  • connection strings
  • wire protocol

  • CRUD, commands, and cursors

  • bulk writes

  • connection pooling

  • error handling

  • automated testing suites

Up Next

  • replica set failover

  • server discovery and monitoring (SDAM)

  • SCRAM-SHA-1 auth

  • shard tagging, indexes, and other server commands

Learn more...

...and join us!

Thanks!!!

@kyeahokay || @saghm