Architecture

SD-WAN networking

All SD-WAN nodes run the same software, are capable of any SD-WAN function, and can act as a client, a concentrator, or both simultaneously.

Here is an example configuration that would be typical for a provider with “east” and “west” data centers:

graph {
    rankdir=TB;
    newrank=true;
    fontname="Arial";
    splines=polyline;
    nodesep=0.5;
    ranksep=2.0;
    bgcolor="transparent";
    compound=true;

    node [
        shape=box,
        style="filled,rounded",
        fillcolor="#E9F7EF",
        color="#1E8449",
        fontname="Arial",
        fontsize=12
    ];

    edge [
        color="#424949",
        style=solid,
        penwidth=1.5,
        fontsize=8
    ];


    // Core node groups
    subgraph cluster_dc_west {
        label="Group: West";
        style="filled,rounded";
        fillcolor="#EAECEE";
        color="#B3B6B7";
        fontname="Arial";
        fontsize=12;
        penwidth=1.5;

        west_core1 [label="Node: West1"];
        west_core2 [label="Node: West2"];
    }

    subgraph cluster_dc_east {
        label="Group: East";
        style="filled,rounded";
        fillcolor="#EAECEE";
        color="#B3B6B7";
        fontname="Arial";
        fontsize=12;
        penwidth=1.5;

        east_core1 [label="Node: East1"];
        east_core2 [label="Node: East2"];
    }

    { rank=same; west_core1; east_core1; }
    { rank=same; west_core2; east_core2; }

    west_core1 -- east_core1 [ltail="cluster_dc_west", lhead="cluster_dc_east"];


    // Edge nodes
    subgraph cluster_edge_node_beta {
        label="Node: Beta";
        style="filled,rounded";
        fillcolor="#E9F7EF";
        color="#1E8449";
        fontname="Arial";
        fontsize=12;
        penwidth=1.5;

        beta_vrf_red [label="VRF Red", fillcolor="#FFE0E0", color="#C0392B"];
    }

    subgraph cluster_edge_node_alpha {
        label="Node: Alpha";
        style="filled,rounded";
        fillcolor="#E9F7EF";
        color="#1E8449";
        fontname="Arial";
        fontsize=12;
        penwidth=1.5;

        alpha_vrf_red [label="VRF Red", fillcolor="#FFE0E0", color="#C0392B"];
        // Use an invisible anchor in the middle to make the connections line up better
        alpha_wan_anchor [shape=point, style=invis, width=0.01, height=0.01, label=""];
        alpha_vrf_blue [label="VRF Blue", fillcolor="#D3E4F4", color="#2E86C1"];
    }

    subgraph cluster_edge_node_gamma {
        label="Node: Gamma";
        style="filled,rounded";
        fillcolor="#E9F7EF";
        color="#1E8449";
        fontname="Arial";
        fontsize=12;
        penwidth=1.5;

        gamma_vrf_blue [label="VRF Blue", fillcolor="#D3E4F4", color="#2E86C1"];
    }

    { rank=same; beta_vrf_red; gamma_vrf_blue; alpha_vrf_red; alpha_wan_anchor; alpha_vrf_blue;}

    west_core1 -- beta_vrf_red [ltail="cluster_dc_west", lhead="cluster_edge_node_beta"];
    east_core2 -- beta_vrf_red [ltail="cluster_dc_east", lhead="cluster_edge_node_beta"];
    west_core2 -- gamma_vrf_blue [ltail="cluster_dc_west", lhead="cluster_edge_node_gamma"];
    east_core1 -- gamma_vrf_blue [ltail="cluster_dc_east", lhead="cluster_edge_node_gamma"];
    west_core1 -- alpha_wan_anchor [ltail="cluster_dc_west", lhead="cluster_edge_node_alpha"];
    east_core1 -- alpha_wan_anchor [ltail="cluster_dc_east", lhead="cluster_edge_node_alpha"];
    beta_vrf_red -- alpha_vrf_red [ltail="cluster_edge_node_beta", lhead="cluster_edge_node_alpha"];
    gamma_vrf_blue -- alpha_vrf_blue [ltail="cluster_edge_node_gamma", lhead="cluster_edge_node_alpha"];


    // LANs
    { rank=same; lan_beta_red; lan_gamma_blue; lan_alpha_red; lan_alpha_blue; }

    lan_beta_red [label="LAN: 10.1.2.0/24", fillcolor="#FFE0E0", color="#C0392B"];
    lan_gamma_blue [label="LAN: 10.2.2.0/24", fillcolor="#D3E4F4", color="#2E86C1"];
    lan_alpha_red [label="LAN: 10.1.1.0/24", fillcolor="#FFE0E0", color="#C0392B"];
    lan_alpha_blue [label="LAN: 10.2.1.0/24", fillcolor="#D3E4F4", color="#2E86C1"];
    beta_vrf_red -- lan_beta_red;
    gamma_vrf_blue -- lan_gamma_blue;
    alpha_vrf_red -- lan_alpha_red;
    alpha_vrf_blue -- lan_alpha_blue;
}

At each data center there are two nodes for redundancy and scalability. These are commonly called “core nodes” to reflect their position in the network.

At each site there is a node that can provide service to one or more local networks. These are commonly called “edge nodes”, again reflecting their position in the network. Edge nodes can connect to multiple other nodes simultaneously; the node that actually handles traffic is determined by policy and availability.

Connections can be made across single or multiple network links. When multiple links are available, configurable policy selects which link or links are active for a connection. Thanks to packet-level redundancy, customer connections made through these links maintain state even when the underlying links are switched to different providers.

In the above example, the Beta and Gamma nodes connect directly to the Alpha node. If the direct path between those nodes is considered available and policy allows, traffic will pass directly between them. If that link fails but the links to the core nodes are maintained, the traffic will pass through those core nodes instead.
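The failover behavior described above can be sketched as priority-based link selection: the direct path wins while it is up, and traffic shifts to a core-node path when it fails. This is an illustration only; the link names and the priority model are assumptions, not the product's actual policy schema.

```python
def select_link(links):
    """Return the name of the highest-priority available link, or None
    if every link is down. Lower priority number means more preferred."""
    available = [l for l in links if l["up"]]
    if not available:
        return None
    return min(available, key=lambda l: l["priority"])["name"]

# Hypothetical links from Beta's point of view:
links = [
    {"name": "direct-to-alpha", "priority": 10, "up": True},
    {"name": "via-west-core",   "priority": 20, "up": True},
    {"name": "via-east-core",   "priority": 30, "up": True},
]
assert select_link(links) == "direct-to-alpha"

links[0]["up"] = False          # the direct Beta-Alpha link fails
assert select_link(links) == "via-west-core"
```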

Since these SD-WAN connections can carry traffic for multiple distinct VRFs, there is no need to establish redundant connections between nodes to handle separate routing instances. Traffic is kept isolated within each VRF’s routing context, so mixing between VRFs is impossible unless explicitly enabled through routing policies. You can even use the same private network addresses in different VRFs and they will not conflict with each other.
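The reason overlapping addresses do not conflict can be seen in a small sketch: every route lookup is scoped to a single VRF's table, so identical prefixes in different VRFs never meet. The VRF and next-hop names below are illustrative, not taken from a real configuration.

```python
import ipaddress

# One independent routing table per VRF. Note that 10.1.1.0/24 appears
# in both "red" and "blue" without any conflict.
tables = {
    "red":  {ipaddress.ip_network("10.1.1.0/24"): "alpha",
             ipaddress.ip_network("10.1.2.0/24"): "beta"},
    "blue": {ipaddress.ip_network("10.1.1.0/24"): "gamma"},
}

def lookup(vrf, dst):
    """Longest-prefix match restricted to the given VRF's table."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in tables[vrf] if addr in net]
    if not matches:
        return None
    return tables[vrf][max(matches, key=lambda n: n.prefixlen)]

assert lookup("red", "10.1.1.5") == "alpha"
assert lookup("blue", "10.1.1.5") == "gamma"   # same address, different VRF
```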

Management plane

The management plane is built on top of MQTT, a lightweight protocol that is popular in large IoT architectures.

graph {
    rankdir=TB;
    fontname="Arial";
    splines=polyline;
    nodesep=0.5;
    ranksep=0.7;
    bgcolor="transparent";
    compound=true;

    node [
        shape=box,
        style="filled,rounded",
        fillcolor="#FEF9E7",
        color="#7D6608",
        fontname="Arial",
        fontsize=10
    ];

    edge [
        color="#424949",
        style=solid,
        penwidth=1.5
    ];

    subgraph cluster_manager {
        label="Manager";
        style="filled,rounded";
        fillcolor="#D6EAF8";
        color="#2874A6";
        fontname="Arial";
        fontsize=12;
        penwidth=1.5;

        broker [label="MQTT Broker"];
        node_manager [label="Node Manager"];

        node_manager -- broker;
    }

    subgraph cluster_node1 {
        label="Node";
labelloc="b";
        style="filled,rounded";
        fillcolor="#E9F7EF";
        color="#1E8449";
        fontname="Arial";
        fontsize=12;
        penwidth=1.5;

        mesh1 [label="Orchestration Mesh"];
        node_broker1 [label="MQTT Broker"];
        node_services1 [label="Local Services"];

        mesh1 -- node_broker1 -- node_services1
    }
    broker -- mesh1;

    subgraph cluster_node2 {
        label="Node";
labelloc="b";
        style="filled,rounded";
        fillcolor="#E9F7EF";
        color="#1E8449";
        fontname="Arial";
        fontsize=12;
        penwidth=1.5;

        mesh2 [label="Orchestration Mesh"];
        node_broker2 [label="MQTT Broker"];
        node_services2 [label="Local Services"];

        mesh2 -- node_broker2 -- node_services2
    }
    broker -- mesh2;

    subgraph cluster_node3 {
        label="Node";
labelloc="b";
        style="filled,rounded";
        fillcolor="#E9F7EF";
        color="#1E8449";
        fontname="Arial";
        fontsize=12;
        penwidth=1.5;

        mesh3 [label="Orchestration Mesh"];
        node_broker3 [label="MQTT Broker"];
        node_services3 [label="Local Services"];

        mesh3 -- node_broker3 -- node_services3
    }
    broker -- mesh3;
}

On the manager there is a central broker that relays messages between the node manager and the nodes. MQTT is designed to handle many simultaneous connections and can also be distributed across multiple brokers if necessary.
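Routing messages between the node manager and many nodes relies on standard MQTT topic subscriptions. The matcher below implements MQTT's standard `+` (single level) and `#` (remaining levels) wildcard semantics; the topic layout in the examples is hypothetical, not this system's actual scheme.

```python
def topic_matches(filter_, topic):
    """Return True if an MQTT topic filter matches a topic.
    '+' matches exactly one level; '#' matches the rest of the topic."""
    f, t = filter_.split("/"), topic.split("/")
    for i, part in enumerate(f):
        if part == "#":           # multi-level wildcard, always last
            return True
        if i >= len(t):           # topic is shorter than the filter
            return False
        if part != "+" and part != t[i]:
            return False
    return len(f) == len(t)

# Hypothetical topics a node manager might subscribe to:
assert topic_matches("nodes/+/status", "nodes/alpha/status")
assert topic_matches("nodes/alpha/#", "nodes/alpha/telemetry/cpu")
assert not topic_matches("nodes/+/status", "nodes/alpha/config")
```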

On each node a local broker relays messages between the orchestration mesh and the services running on the node. The orchestration mesh relays messages between the local broker and the central one using any available local interface that allows connectivity to the central broker.

The orchestration mesh exists to maintain reliability. As long as any single interface can connect to the central broker, the connection will be maintained. The actual path used at any point in time is invisible to local services.
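A minimal sketch of this "any working interface" behavior, with a stand-in predicate in place of real per-interface MQTT connection attempts:

```python
def connect_via_any(interfaces, try_connect):
    """Try each local interface in order and return the first one that
    can reach the central broker, or None if all of them fail."""
    for iface in interfaces:
        if try_connect(iface):
            return iface
    return None

# Hypothetical interfaces; wan1 is down, so the mesh falls back to wan2.
reachable = {"wan1": False, "wan2": True, "lte0": True}
assert connect_via_any(["wan1", "wan2", "lte0"], reachable.get) == "wan2"
```

A real implementation would also re-run this selection when the active uplink drops, which is what keeps the path invisible to local services.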

Additionally, this architecture was specifically designed so that custom software on the manager and nodes can use the same messaging infrastructure.

Manager

The manager is responsible for managing the networks, including the nodes and their configurations.

graph {
    rankdir=TB;
    fontname="Arial";
    bgcolor="transparent";
    splines=polyline;
    nodesep=0.6;
    ranksep=0.8;
    compound=true;

    node [
        shape=box,
        style="filled,rounded",
        fillcolor="#FEF9E7",
        color="#7D6608",
        fontname="Arial",
        fontsize=10
    ];

    edge [
        color="#424949",
        style=solid,
        penwidth=1.5
    ];

    subgraph cluster_manager {
        label="Manager";
        labelloc="t";
        style="filled,rounded";
        fillcolor="#D6EAF8";
        color="#2874A6";
        fontname="Arial";
        fontsize=12;
        penwidth=1.5;

        frontend [label="Frontend"];
        api [label="API"];
        database [label="Database"];
        broker [label="MQTT Broker"];
        config_manager [label="Config Manager"];
        node_manager [label="Node Manager"];
        status_collector [label="Status Collector"];

        frontend -- api;
        api -- database;

        database -- config_manager;
        database -- node_manager;
        database -- status_collector;

        {rank=same; config_manager; node_manager; status_collector;}

        node_manager -- broker;
        status_collector -- broker;
    }
}

The manager consists of the following components:

Frontend

Provides a web-based interface to manage the system

API

Provides a programmable interface for managing the system

Database

Stores configuration, state, and historical telemetry data for nodes, groups, and other entities

Config Manager

Detects changes in the database and generates new node configurations when needed

Node Manager

Tracks the online state of nodes and sends configurations to them when necessary

Status Collector

Collects telemetry data from nodes and stores it in the database

MQTT Broker

Relays messages containing data such as configuration changes and telemetry between the manager and the nodes
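One common MQTT pattern for the kind of online-state tracking the Node Manager performs is a per-node presence topic combined with the broker's Last Will feature: each node publishes "online" on connect and registers a will of "offline" that the broker publishes on its behalf if the connection drops. Whether this system uses exactly this pattern is an assumption, and the topic names below are illustrative.

```python
class PresenceTracker:
    """Tracks node liveness from messages on 'nodes/<name>/presence'."""

    def __init__(self):
        self.state = {}

    def on_message(self, topic, payload):
        # Topic layout is hypothetical: nodes/<name>/presence
        node = topic.split("/")[1]
        self.state[node] = payload

tracker = PresenceTracker()
tracker.on_message("nodes/alpha/presence", "online")
tracker.on_message("nodes/alpha/presence", "offline")  # broker-sent Last Will
assert tracker.state["alpha"] == "offline"
```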

Node

Nodes implement the SD-WAN routing in the system. All nodes run the same software and are capable of performing all SD-WAN functionality regardless of their position in the network. They can be organized into groups for simpler configuration management.

In general, nodes that reside in data centers that accept connections from other nodes are called “core nodes” while nodes that reside in customer premises that connect to other nodes are called “edge nodes”. However, since all nodes can perform all duties, even simultaneously, this distinction is merely conceptual and not tracked or enforced by the system.

graph {
    rankdir=TB;
    fontname="Arial";
    bgcolor="transparent";
    splines=polyline;
    nodesep=0.6;
    ranksep=0.8;
    compound=true;

    node [
        shape=box,
        style="filled,rounded",
        fillcolor="#E8F8F5",
        color="#138D75",
        fontname="Arial",
        fontsize=10
    ];

    edge [
        color="#424949",
        style=solid,
        penwidth=1.5
    ];

    subgraph cluster_Node {
        label="Node";
        labelloc="t";
        style="filled,rounded";
        fillcolor="#E9F7EF";
        color="#1E8449";
        fontname="Arial";
        fontsize=12;
        penwidth=1.5;

        mesh [label="Orchestration Mesh"];
        broker [label="MQTT Broker"];
        agent [label="Agent (Control plane)"];
        cli [label="CLI"];
        xdp [label="Linux/XDP (Data plane)"];

        mesh -- broker;

        broker -- agent;
        broker -- cli;

        {rank=same; agent; cli;}

        agent -- xdp;
    }
}

The node consists of the following components:

MQTT Broker

Relays messages between local services, including custom software, on the node using the MQTT protocol

Orchestration Mesh

Relays messages between the local MQTT broker and the manager, using all available connections to maintain reliability

Agent

Implements the control plane by tracking and applying configurations from the manager, sending telemetry data, and making policy decisions based on network conditions

CLI

A local command line interface that allows for viewing the configuration and status, watching events, and manually updating configuration

Linux/XDP

Implements the data plane with Linux and XDP to handle packet processing, encapsulation/decapsulation, and forwarding