Protocol Evolution

Updating all nodes of a distributed system simultaneously is hard. First, it requires synchronizing releases across teams. Second, all nodes are unavailable until the update finishes, which can be highly unpleasant for users.

An alternative approach is to update nodes one by one, without any locks between them. However, nodes running different versions can become incompatible with each other because the communication protocol has changed. This is where protocol evolution comes in: if a message can be sent to another node, it must be changed with more care.

Communication is based on the msgpack format. It's close to JSON, but binary-encoded and more feature-rich: it supports arbitrary map keys, special float values (NaN, ±inf), and binary data. However, its ability to evolve is the same as JSON's, so the rules of evolution are also similar.

It's worth noting that the compatibility of nodes can be checked automatically on a per-message basis, because it's known where a message is used on the sender and receiver sides. However, this check is not implemented yet.

The common principle

Messages are defined in so-called protocol crates. They shouldn't contain any logic, only message definitions and convenient constructors, getters, and setters for them. To understand the common principle of evolution, we need to distinguish nodes that send a message from those that receive it. During an update, senders and receivers can be compiled with different versions of the same protocol crate.

The common principle states: senders should send a more specific version of a message than receivers accept.

Thus, if a message in the protocol becomes more specific (e.g., it gets more fields), senders must be updated first. Conversely, if a message becomes less specific (e.g., it loses fields), receivers must be updated first. Some changes allow any update order; see the next section for details.

When nodes are downgraded, the update order is reversed.
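The ordering rule can be sketched in code. This is only an illustration of the principle as stated above: the names `Change`, `Role`, and `first_to_update` are hypothetical and are not part of any protocol crate.

```rust
/// How a message changes between two versions of a protocol crate.
/// (Hypothetical classification, for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Change {
    /// E.g. the message gets a new required field.
    MoreSpecific,
    /// E.g. the message loses a required field.
    LessSpecific,
    /// E.g. the message gets a new optional field.
    Neutral,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    Sender,
    Receiver,
}

/// Returns the side that must be updated first,
/// or `None` if any update order is acceptable.
pub fn first_to_update(change: Change, downgrade: bool) -> Option<Role> {
    let upgrade_first = match change {
        Change::MoreSpecific => Role::Sender,
        Change::LessSpecific => Role::Receiver,
        Change::Neutral => return None,
    };
    if !downgrade {
        return Some(upgrade_first);
    }
    // A downgrade undoes the change, so the order is the opposite.
    Some(match upgrade_first {
        Role::Sender => Role::Receiver,
        Role::Receiver => Role::Sender,
    })
}
```

For instance, `first_to_update(Change::MoreSpecific, false)` yields `Some(Role::Sender)`, while the same change rolled back as a downgrade yields `Some(Role::Receiver)`.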

Let's consider an example of evolution:

This is a simple example in which only required fields are added, but it helps to understand the common principle. A sender providing more fields than a receiver expects is always acceptable because the receiver can skip the extra fields during deserialization. If a receiver is updated first, it expects more fields than old senders provide, and deserialization fails.
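The asymmetry between the two update orders can be reproduced with a small model. The sketch below makes simplifying assumptions: a message on the wire is represented as a plain map from field names to values (the `Wire` alias, a stand-in for a msgpack map), and the `encode_*` and `decode_*` functions are hypothetical helpers written by hand. Real messages would be declared with `#[message]` and handled by the serialization layer.

```rust
use std::collections::BTreeMap;

/// A message on the wire, modeled as field name -> value.
/// (Hypothetical stand-in for a msgpack map.)
pub type Wire = BTreeMap<&'static str, u32>;

/// The message as defined in the current version of the protocol.
#[derive(Debug, PartialEq)]
pub struct SampleV1 {
    pub a: u32,
}

/// The message in the next version: one more required field.
#[derive(Debug, PartialEq)]
pub struct SampleV2 {
    pub a: u32,
    pub b: u32,
}

pub fn encode_v1(m: &SampleV1) -> Wire {
    BTreeMap::from([("a", m.a)])
}

pub fn encode_v2(m: &SampleV2) -> Wire {
    BTreeMap::from([("a", m.a), ("b", m.b)])
}

/// Looks up a required field; a missing field is an error.
fn required(wire: &Wire, name: &str) -> Result<u32, String> {
    wire.get(name)
        .copied()
        .ok_or_else(|| format!("missing field `{name}`"))
}

/// An old receiver: reads `a` and skips everything else.
pub fn decode_v1(wire: &Wire) -> Result<SampleV1, String> {
    Ok(SampleV1 { a: required(wire, "a")? })
}

/// A new receiver: requires both `a` and `b`.
pub fn decode_v2(wire: &Wire) -> Result<SampleV2, String> {
    Ok(SampleV2 {
        a: required(wire, "a")?,
        b: required(wire, "b")?,
    })
}
```

In this model, `decode_v1(&encode_v2(..))` succeeds (new sender, old receiver), while `decode_v2(&encode_v1(..))` returns an error (old sender, new receiver), which is why senders go first when a required field is added.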

This principle extends to a sequence of updates as well:

At step 2, the message becomes more specific, so senders are updated first. At step 4, the message becomes less specific, so receivers are updated first.
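Planning several updates in a row can be sketched in the same spirit. The model below is an assumption made for illustration: each version of a message is reduced to its set of required field names, and `Order`, `order`, and `plan` are hypothetical names, not an existing tool.

```rust
use std::collections::BTreeSet;

/// Required field names of one version of a message (simplified model).
pub type Fields = BTreeSet<&'static str>;

#[derive(Debug, PartialEq, Eq)]
pub enum Order {
    SenderFirst,
    ReceiverFirst,
    Any,
    /// Fields are added and removed at once: in this model
    /// the change has to be split into two separate updates.
    NeitherIsSafe,
}

/// Decides the update order for a single step.
pub fn order(current: &Fields, next: &Fields) -> Order {
    if current == next {
        Order::Any
    } else if next.is_superset(current) {
        // More required fields: the message becomes more specific.
        Order::SenderFirst
    } else if next.is_subset(current) {
        // Fewer required fields: the message becomes less specific.
        Order::ReceiverFirst
    } else {
        Order::NeitherIsSafe
    }
}

/// Decides the update order for every step in a sequence of versions.
pub fn plan(versions: &[Fields]) -> Vec<Order> {
    versions.windows(2).map(|w| order(&w[0], &w[1])).collect()
}
```

For the versions `{a}`, `{a, b}`, `{a}`, the plan is sender first for the first step and receiver first for the second one.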

Examples

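Each example below lists the action, the current version of the message, the next version, and the update ordering. The side listed first must be updated first; "Any" means that either order works.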

Adding a new required field

Current version:

#[message]
struct Sample {
    a: u32,
}

Next version:

#[message]
struct Sample {
    a: u32,
    b: u32,
}

Update ordering: Sender, Receiver

Adding a new optional field

Current version:

#[message]
struct Sample {
    a: u32,
}

Next version:

#[message]
struct Sample {
    a: u32,
    #[serde(default)]
    b: u32,
    c: Option<u32>,
}

Update ordering: Any

Renaming a field

Current version:

#[message]
struct Sample {
    a: u32,
}

Next version:

#[message]
struct Sample {
    #[serde(alias = "a")]
    b: u32,
}

Update ordering: Receiver, Sender

Removing a required field

Current version:

#[message]
struct Sample {
    a: u32,
    b: u32,
}

Next version:

#[message]
struct Sample {
    a: u32,
}

Update ordering: Receiver, Sender

Promotion of numbers [1]

Current version:

#[message]
struct Sample {
    a: u32,
}

Next version:

#[message]
struct Sample {
    a: u64,
    // Or any other numeric type:
    // a: i16
}

Update ordering: Any

(Un)wrapping into a newtype

Current version:

#[message]
struct Sample {
    a: u32,
}

Next version:

#[message]
struct Sample {
    a: U32,
}

#[message(part)]
struct U32(u32);

Update ordering: Any

Adding a new variant

Current version:

#[message]
enum Sample {
    A(u32),
}

Next version:

#[message]
enum Sample {
    A(u32),
    B(u32),
}

Update ordering: Receiver, Sender

Renaming a variant

Current version:

#[message]
enum Sample {
    A(u32),
}

Next version:

#[message]
enum Sample {
    #[serde(alias = "A")]
    B(u32),
}

Update ordering: Receiver, Sender

Removing a variant

Current version:

#[message]
enum Sample {
    A(u32),
    B(u32),
}

Next version:

#[message]
enum Sample {
    A(u32),
}

Update ordering: Sender, Receiver

Replacing a field with a union (an untagged enumeration)

Current version:

#[message]
struct Sample {
    a: u32,
}

Next version:

#[message]
struct Sample {
    a: NumOrStr,
}

#[message(part)]
#[serde(untagged)]
enum NumOrStr {
    Num(u32),
    Str(String),
}

Update ordering: Any

Enumerations with serde(other)

Current version:

#[message]
enum Sample {
    A,
    B,
}

Next version:

#[message]
enum Sample {
    A,
    #[serde(other)]
    O,
}

Update ordering: Receiver, Sender
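To see why renaming with an alias and introducing a `serde(other)` variant both start with receivers, it helps to model what an updated receiver accepts. The sketch below is a hand-written approximation, not the code that serde generates: the `Wire` map, `decode_sample`, and `decode_kind` are hypothetical, and messages are again reduced to a map of field names or a variant name.

```rust
use std::collections::BTreeMap;

/// A message on the wire, modeled as field name -> value.
/// (Hypothetical stand-in for a msgpack map.)
pub type Wire = BTreeMap<&'static str, u32>;

/// Next version of `Sample` after renaming the field `a` to `b`.
#[derive(Debug, PartialEq)]
pub struct Sample {
    pub b: u32,
}

/// Models a field `b` marked with `#[serde(alias = "a")]`:
/// the receiver accepts both the new and the old name.
pub fn decode_sample(wire: &Wire) -> Result<Sample, String> {
    let b = wire
        .get("b")
        .or_else(|| wire.get("a"))
        .copied()
        .ok_or_else(|| "missing field `b`".to_string())?;
    Ok(Sample { b })
}

/// Next version of an enumeration with a catch-all variant.
#[derive(Debug, PartialEq)]
pub enum Kind {
    A,
    O,
}

/// Models a variant marked with `#[serde(other)]`:
/// every unknown variant name is mapped to `O`.
pub fn decode_kind(variant: &str) -> Kind {
    match variant {
        "A" => Kind::A,
        _ => Kind::O,
    }
}
```

An updated receiver understands both old senders (which still send `a` or the variant `B`) and new ones, so it is safe to update receivers before senders. An old receiver, by contrast, knows neither the field `b` nor the variant `O`.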

[1] Promotion to/from i128 and u128 doesn't work due to msgpack's implementation details.