.. _managing-bonds:

===============
Managing bonds
===============

Bonds can be updated at any time, except when a bond or leg tuning job
is in progress. To update a bond while it is being tested, you need to
first `cancel the test <performance-tuning/automatic-bond-tuning.html#cancelling-a-tuning>`__.

.. contents:: :depth: 2


Adding a bond
--------------

Before installing Bonding on a device, create its configuration in the
web interface.

From the dashboard or bond list page, click Add bond.

|image0|

Provide the following information.

Details
^^^^^^^^

Details fields are used for configuration of the bonder hardware.

Name
+++++

The name of the bonder. This must be unique in the bonder's space.

The space name and bonder name are combined and used as the hostname for
the bonder.

Note
+++++

A free-form field for any relevant information.

Circuit ID
+++++++++++

The circuit ID of the bond.

Product
++++++++

The bond's product name, as sold by the partner.

Serial number
++++++++++++++

The serial number of the bonder hardware.

Asset tag
++++++++++

The asset tag given to the bonder hardware.

Space
++++++

The space to which the bond is assigned.

Username
+++++++++

Usually root, the user for management via SSH.

Password
+++++++++

The password for the user specified above.

The username and password fields are for record-keeping only. You should
change them only if you manually change the username or password on the
bonder.

Advanced options
+++++++++++++++++

The following fields are visible after you click "Show advanced".

Proof of concept
+++++++++++++++++

If checked, you will not be billed for this bond for one month. For more
details, see `proof of concept
bonds <proof-of-concept-bonds.html>`__.

Metric collection interval
+++++++++++++++++++++++++++

How frequently to query performance metrics, in seconds.

Metric reporting interval
++++++++++++++++++++++++++

How frequently to report collected metrics to the management server, in
seconds.

CPU governor
++++++++++++

Select which algorithm to use for scaling CPU frequencies.
If unset, the bonder keeps the last governor used for that CPU type, or
falls back to the system default after a reboot.
Selecting an alternate governor, particularly ``Performance``,
may result in increased throughput on certain platforms.
For a detailed explanation of each governor, see `the documentation <https://www.kernel.org/doc/Documentation/cpu-freq/governors.txt>`__.
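On Linux, the active governor is exposed through the standard cpufreq sysfs
interface. The following is a minimal inspection sketch, not part of the
product; it assumes a cpufreq-capable kernel and falls back gracefully where
the files are not present:

```python
from pathlib import Path

# Standard Linux cpufreq sysfs location for CPU 0.
CPUFREQ = Path("/sys/devices/system/cpu/cpu0/cpufreq")

def current_governor(default="unknown"):
    """Return the governor currently applied to CPU 0, if exposed."""
    try:
        return (CPUFREQ / "scaling_governor").read_text().strip()
    except OSError:
        return default

def available_governors():
    """Return the governors this kernel offers, or an empty list."""
    try:
        return (CPUFREQ / "scaling_available_governors").read_text().split()
    except OSError:
        return []

print(current_governor(), available_governors())
```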

TCP congestion control algorithm
+++++++++++++++++++++++++++++++++

The TCP congestion control algorithm defines the behaviour of TCP
connections originated by the node. There are currently 15 algorithms to
choose from when configuring a bonder on the management server. Prior to 6.4,
CUBIC was the implicit choice. This option has no effect on traffic that is
not originated on the bonder itself.

#. CUBIC
#. BIC
#. CDG
#. Datacenter TCP
#. Hamilton TCP
#. Hybla
#. Illinois
#. Low priority
#. Reno
#. Scalable TCP
#. Vegas
#. Veno
#. Westwood
#. Yeah TCP
#. BBR

For more information on this topic and the various algorithms refer
to https://en.wikipedia.org/wiki/TCP_congestion_control

Depending on the type of traffic going over a bond, some algorithms may
decrease performance while others improve it. This is most noticeable when
running speed tests.
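On a Linux bonder, the algorithm currently in effect for locally originated
connections, and the set the kernel offers, can be read from ``/proc``. The
sketch below is illustrative and assumes a Linux host; the fallback value is
an assumption, not a product default:

```python
def read_sysctl(path, default=""):
    """Read a /proc/sys value, falling back if unavailable (e.g., non-Linux)."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return default

current = read_sysctl("/proc/sys/net/ipv4/tcp_congestion_control", "cubic")
offered = read_sysctl("/proc/sys/net/ipv4/tcp_available_congestion_control").split()
print("current:", current)
print("kernel offers:", offered or "(not exposed on this system)")
```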


Conntrack table size
++++++++++++++++++++

The maximum number of connections the host can track in its internal
tables. If the number of tracked connections reaches this number, new
connections will be dropped and an entry made in the system log file.
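A simple way to reason about sizing this value is headroom: the fraction of
the table still free before new connections start being dropped. A hedged
sketch (the numbers in the example are illustrative, not product defaults):

```python
def conntrack_headroom(tracked, table_max):
    """Fraction of the conntrack table still free; new connections are
    dropped once tracked reaches table_max."""
    if table_max <= 0:
        raise ValueError("table_max must be positive")
    return max(0.0, (table_max - tracked) / table_max)

# With a 262144-entry table and 250000 tracked connections, under 5%
# headroom remains -- a sign the table size should be raised.
print(round(conntrack_headroom(250_000, 262_144), 3))  # 0.046
```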

Web server
+++++++++++

If checked, the bonder will offer a simple web service to local networks
and trusted remote networks. Hosts on a connected IP network will be
able to view basic configuration and status information.

Node debug
+++++++++++

If checked, services running on the bonder log events in much more
detail than normal. PPP clients also log in more detail. Enable debug
mode only on the advice of a technical support agent.

Manage process affinity
++++++++++++++++++++++++

If checked, the tunnel process is bound to the second CPU core on
multicore bonders. This improves performance by reducing the number of
tunnel context switches.

Tunnel affinity core
+++++++++++++++++++++

Which CPU core to try to assign the tunnel process to, with 1 being the
lowest CPU core ID. Depending on the hardware being used, this choice can
make a measurable difference to achievable throughput. When a bonder is
nearing its CPU limits, test and benchmark to decide which CPU core is best
for the tunnel process on that hardware platform. Before this option was
available, enabling "Manage process affinity" behaved as if this option
were set to 2.


.. _automatic-source-ip:

Automatic source IP
++++++++++++++++++++

If checked, traffic originating from the bonder will have a source IP
selected from one of these IPs, in the following order:

If the bond is in a PWAN-enabled space:

#. A public connected IP not included in PWAN (most preferred)
#. A private connected IP not included in PWAN and having a CPE NAT IP
#. A private connected IP included in PWAN
#. A public connected IP included in PWAN
#. Bonded tunnel IP (least preferred)

If the bond is not in a PWAN-enabled space:

#. A public connected IP (most preferred)
#. A private connected IP having a CPE NAT IP
#. Bonded tunnel IP (least preferred)

If unchecked, traffic from the bonder will use the IP on its bonded
tunnel interface and it will be NAT'ed to the IP of the bond's
aggregator.

Enabling this option allows an administrator to test network routing
from the bonder in the same way that end-users see the network. That is,
if there is a datacenter routing issue, this option makes it easier to
identify that issue from the bonder, since the bonder will use the same
network as the end-user. If the option is disabled, traffic from the
bonder does not use a connected IP, but is NAT'ed through the
aggregator, making it somewhat more difficult to test network routing
from the perspective of the end user.

This option has no effect on traffic being forwarded through the bonder;
it applies only to traffic originating from the bonder itself.

Disabled by default. You can configure this default for new bonds on
the `bond defaults <../administration/bond-defaults.html>`__ page.
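The preference orderings above can be sketched as a ranking function. This
is an illustration of the documented order, not the product's actual
implementation; the ``ConnectedIP`` record and its flags are hypothetical
names:

```python
from collections import namedtuple

# Hypothetical record describing a candidate connected IP.
ConnectedIP = namedtuple("ConnectedIP", "public in_pwan has_cpe_nat")

def source_ip_preference(ip, pwan_space):
    """Rank a candidate IP; lower is more preferred. Falling through all
    rules means the bonded tunnel IP is used (least preferred)."""
    if pwan_space:
        order = [
            ip.public and not ip.in_pwan,                         # 1
            not ip.public and not ip.in_pwan and ip.has_cpe_nat,  # 2
            not ip.public and ip.in_pwan,                         # 3
            ip.public and ip.in_pwan,                             # 4
        ]
    else:
        order = [
            ip.public,                                            # 1
            not ip.public and ip.has_cpe_nat,                     # 2
        ]
    for rank, matches in enumerate(order, start=1):
        if matches:
            return rank
    return len(order) + 1  # bonded tunnel IP

print(source_ip_preference(ConnectedIP(True, False, False), pwan_space=True))  # 1
```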

Configuration
^^^^^^^^^^^^^^

Configuration fields control the handling of customer traffic and have a
significant impact on performance and reliability. Settings changed here
are sent to both the bonder and aggregator.

Aggregator
+++++++++++

The primary aggregation server for the bond. The bonder will send
customer traffic to this aggregator, and this aggregator will act as a
router for the bond's connected IPs, CPE NAT IPs, and routes.

Secondary aggregator
+++++++++++++++++++++

The secondary aggregation server for the bond. If the primary
aggregation server fails, the bond will be moved to this aggregator. If
this field is left blank, the bond will be left on the primary
aggregator even if the aggregator fails. The secondary aggregator must
be in the same space as the primary aggregator.

Aggregator failback
++++++++++++++++++++

If checked and a secondary aggregator is selected, the bond will be
moved back to the primary aggregator automatically when it recovers. If
this field is not checked, the bond will be left on the secondary
aggregator even when the primary recovers, and it will need to be moved
back manually. This gives an administrator a chance to investigate why
the primary aggregator failed before it begins hosting bonds again.

QoS profile
++++++++++++

The QoS profile to use for the bond. If the empty option is selected, no
traffic shaping will be performed and latency may increase to
intolerable levels for applications such as VoIP.

Compression
++++++++++++

When checked, traffic through the bond is compressed. This increases the
effective bandwidth of the bond at the expense of greater CPU use. The
effectiveness of compression depends on the type of data going through the
bond—data that has already been compressed, such as video files and ZIP
archive files, cannot be compressed again.

Tunnel security
++++++++++++++++

The security method used for customer data between the bonder and
aggregator. For details, see `Tunnel security and
encryption <tunnel-security-and-encryption.html>`__.

Encryption cipher
++++++++++++++++++

The encryption cipher to use, applicable only when the tunnel security
setting is Encryption. The two options, AES 128 and AES 256, are described in
`Tunnel security and encryption <tunnel-security-and-encryption.html>`__.
Some CPUs offer hardware AES 128 acceleration, which significantly reduces the
CPU usage required for encryption.

.. warning::
    Changing the encryption cipher while the bond is in use will cause a short
    outage for customer traffic while the DTLS session renegotiates.

Encryption handshake interval
++++++++++++++++++++++++++++++

The length of time between session re-keying, in seconds. This is only
applicable when the tunnel security setting is Encryption.

Packet distribution
++++++++++++++++++++

This value is set when `automatically tuning a
bond <performance-tuning/automatic-bond-tuning.html>`__.

The packet distribution algorithm to use for sending traffic.
See `Manual bond tuning <performance-tuning/manual-bond-tuning.html>`__ for tips
on choosing the best packet distribution algorithm for a bond.

*Weighted Round Robin* is a simple algorithm that balances traffic
across legs in proportion to their configured speeds. For example, in a
bond with a 3 Mbps leg and 6 Mbps leg, one third of the traffic will be
sent on the 3 Mbps leg and two thirds of the traffic will be sent on the
6 Mbps leg. It's a good choice for bonds with legs of all the same type,
bandwidth, and latency, or where numerous people use the bond at the
same time, such as in a large office network.

Weighted Round Robin is available on nodes running SD-WAN
2014.4 or later. It can be selected when a bond's aggregator, secondary
aggregator, and bonder have all been upgraded to 2014.4 or later.
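The proportional split described above can be sketched as:

```python
def wrr_shares(leg_speeds_mbps):
    """Traffic share per leg, in proportion to configured speed."""
    total = sum(leg_speeds_mbps)
    return [speed / total for speed in leg_speeds_mbps]

# A 3 Mbps leg and a 6 Mbps leg carry 1/3 and 2/3 of the traffic.
print(wrr_shares([3, 6]))
```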

*Flowlet* is an algorithm that minimizes packet reordering by
splitting bursts of traffic in a single flow into sub-flows called
"flowlets". Each flowlet is assigned to a single leg. The flow may be
assigned to a different leg after an idle period (see "delta", below).

*Intelligent Delay Managed Packet Queuing (IDMPQ)* is an algorithm that
constantly analyzes link delay, speed, queue size, and traffic
characteristics to make packet distribution decisions that minimize
packet reordering and delay variation. It requires very accurate tuning
of leg upload and download speeds and results in somewhat higher CPU use
than Flowlet. It is an excellent choice for bonds with different types
of legs, such as a bond with a DSL and a cable Internet connection.

IDMPQ causes high CPU load on aggregators with a large number of bonds.
It is not recommended to use IDMPQ on any bond on an aggregator hosting
more than 100 legs. This issue will be resolved in a future release.

Flowlet delta
++++++++++++++

Used only when the packet distribution algorithm is Flowlet, this is the
length of time in milliseconds that must pass between bursts of traffic
in a flow before the flow can be assigned to a different leg. If the
delta value is 0, it balances traffic in the same way as weighted round
robin, but is somewhat less efficient because it keeps track of traffic
flows.

Flowlet delta should be set based on the type of traffic going through
the bond. For bonds with a small number of concurrent connections (for
example, bonds supporting a single web user, one VPN tunnel, or two
concurrent video upload streams), the delta should be set to 0. This
allows packets in a single flow to use the full bandwidth of each leg,
but can cause packet reordering at the receiving host. For bonds
supporting many concurrent connections (for example, those hosting web
browsing traffic for an office or an ISP), flowlet delta can be set
higher to give greater overall throughput. In this scenario, it should
be set to the approximate difference in one-way delay between the
highest-latency and lowest-latency legs. For example, if the
highest-latency leg has a one-way delay of 15 ms, and the lowest-latency
leg has a one-way delay of 10 ms, then delta would be 5 ms. However,
this setting would result in limited throughput for a single flow,
because the flow would be bound to a single leg for short periods.
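This sizing guidance can be sketched as follows; the few-flow cutoff below
is an illustrative assumption, not a product rule:

```python
def recommended_flowlet_delta(one_way_delays_ms, concurrent_flows):
    """Suggested Flowlet delta in ms, per the guidance above: 0 when only
    a few flows share the bond, otherwise the spread in one-way delay
    between the slowest and fastest legs."""
    if concurrent_flows <= 2:  # "small number of connections" cutoff is illustrative
        return 0
    return max(one_way_delays_ms) - min(one_way_delays_ms)

# Legs with 15 ms and 10 ms one-way delay, many concurrent flows:
print(recommended_flowlet_delta([15, 10], concurrent_flows=50))  # 5
```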

Advanced options
+++++++++++++++++

Batched leg send operations
++++++++++++++++++++++++++++

If checked, packets are sent out as a batch instead of individually, improving CPU
utilization at the cost of a small (about 1ms) increase in latency on any sent
packets. This is especially useful when trying to run on low-end devices or
achieve very high rates of throughput. This option does nothing if the bonder or
aggregator for the bond is running Debian 7 (Wheezy).

Clamp TCP
++++++++++

If checked, TCP connections are modified so their packets are small
enough to fit through the bonded tunnel interface.

If TCP clamping is disabled, users may have trouble browsing the web,
especially SSL-protected web sites. This should only be disabled for
troubleshooting.

Source address verification
++++++++++++++++++++++++++++

If checked, the bonder validates outgoing traffic to ensure it comes
from an expected source IP address. This reduces the chance that the
bond can be used as part of a network attack using spoofed source IP
addresses, as described in `IETF BCP 38 Network Ingress Filtering:
Defeating Denial of Service Attacks which employ IP Source Address
Spoofing <https://tools.ietf.org/html/bcp38>`__.

Encryption replay protection
+++++++++++++++++++++++++++++

If checked, and the tunnel security option is set to Encryption,
encryption replay attacks will be prevented. A replay attack occurs when
an adversary records encrypted communications and sends the encrypted
data to one of the encryption peers again at a later time. This could
cause corruption or duplication of data in the encrypted data stream,
but can be prevented by enabling this option.

If a leg is experiencing severe packet reordering it may be interpreted
as a replay attack and legitimate data may be dropped, causing packet
loss and slow speeds. If a leg experiences this kind of reordering, this
option may be disabled to improve performance.

Automatic ping timing
++++++++++++++++++++++

If checked, sets leg ping and fail times automatically based on the
leg's normal latency. Each leg in the bond will have its own ping and
fail timing. The ping timers will be set to three times the leg latency
and failure timers set to nine times the leg latency. Both of these
timers are subject to minimum values as described below.

If unchecked, the values in the ping and fail time fields are used for
each leg.
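The automatic timing rules can be sketched as follows; the 100 ms and
300 ms floors match the documented defaults of the regular leg ping and
fail time fields:

```python
def automatic_ping_timing(leg_latency_ms, min_ping_ms=100, min_fail_ms=300):
    """Ping interval = 3x leg latency, fail time = 9x leg latency,
    floored at the configured field minimums."""
    ping = max(3 * leg_latency_ms, min_ping_ms)
    fail = max(9 * leg_latency_ms, min_fail_ms)
    return ping, fail

print(automatic_ping_timing(10))  # (100, 300): the floors apply
print(automatic_ping_timing(50))  # (150, 450)
```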

Regular leg ping time
++++++++++++++++++++++

This field controls ping timing on regular (i.e. non-failover) legs.
Ping timing is the interval between ping requests sent between the
bonder and aggregator. This field is in milliseconds (ms).

If the automatic ping timing option is enabled, the value of this field
is used as the minimum interval for ping timing. For example, a leg with
normal latency of 10 ms would have its ping interval automatically set
to 30 ms. However, if the value of this field is 100 ms, the ping
interval would be set to 100 ms. This prevents pings from occurring at a
very high rate on legs with very low latency.

If the automatic ping timing option is disabled, the value of this field
is used as the ping interval for all regular legs in the bond. This
value can be increased from the default of 100 ms to save bandwidth on
legs with usage-based billing such as 3G wireless legs.

To calculate the monthly bandwidth used by a leg for ping packets in one
direction, use the following equation::

    b = 305,856 / t

where *t* is the leg ping time in milliseconds and *b* is the monthly
amount of data taken by the ping packets in a single direction, in
megabytes.

For example, with the default period *t* = 100 ms, ping packets take
about 3.1 gigabytes of data each direction for each leg, each month.
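The equation above can be checked directly:

```python
def ping_bandwidth_mb(leg_ping_time_ms):
    """Monthly one-direction data used by ping packets, in megabytes:
    b = 305,856 / t, where t is the ping interval in milliseconds."""
    return 305_856 / leg_ping_time_ms

# Default 100 ms interval: about 3059 MB, i.e. roughly 3.1 GB per month.
print(ping_bandwidth_mb(100))
# Doubling the interval to 200 ms halves the usage.
print(ping_bandwidth_mb(200))
```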

Regular leg fail time
++++++++++++++++++++++

This field controls fail timing on regular legs. A leg will be declared
as failed this long after receiving the last ping reply from the remote
tunnel. This field is in milliseconds.

As with the regular leg ping time field, if automatic ping timing is
enabled, this field is used as the minimum fail time. If automatic ping
timing is disabled, this value is used as the fail time for all regular
legs in the bond.

The default is 300 ms. This value must be greater than the regular leg
ping time above. The recommended value is three times the ping time to
avoid declaring a leg as down during momentary periods of increased
latency.

Failover leg ping time
+++++++++++++++++++++++

This field is the same as the regular leg ping time except it is applied
to the bond's failover leg, if applicable. The default value is 1000 ms,
or one second.

Failover leg fail time
+++++++++++++++++++++++

This field is the same as the regular leg fail time except it is applied
to the bond's failover leg, if applicable. The default value is 3000 ms,
or three seconds.

Automatic reorder max hold
+++++++++++++++++++++++++++

If checked, adapt the reordering max hold value automatically based on
real-time leg latency.

Reorder max hold
+++++++++++++++++

The length of time to buffer received packets for reordering, in
milliseconds. Like the Flowlet delta value, this should be set to the
approximate difference in one-way delay between the highest-latency and
lowest-latency legs. Larger values allow the order of packets to be
restored even when the difference in delay of the legs is large, but a
flow will suffer greater interruption when packets are lost.

Smaller values allow a flow to recover more quickly from lost packets,
but can only reorder packets when the difference in delay is smaller.

This value has no effect when automatic reorder max hold is enabled.

Packet loss detection
++++++++++++++++++++++

If enabled, which is the default, legs will determine and report the
amount of packet loss that is occurring on each leg. Legs with a loss
rate greater than their set threshold are removed from the bond unless
no better-performing legs are available. For example, in a bond with two
normal legs and one failover leg, where one of the normal legs has a
packet loss rate above its removal threshold, that leg will not be used
in the bond. If the loss rate of the other normal leg also goes above
its threshold, it too will be removed from the bond, and the failover
leg will be used. If the failover leg goes down, the two normal legs
with high loss rates will then be used in the bond again.

Packet loss is calculated by the nodes in both directions by comparing the
number of packets sent to the number of packets received. This method of
detection captures packet loss:

#. as the user experiences it, based on the actual
   packets lost in transit over the legs. Other forms of packet loss
   testing (e.g., testing using ICMP on an interval) can show no loss in
   these situations because packet loss can affect traffic differently
   depending on protocols, packet sizes, and bursts of traffic.
#. at a very granular level, including small bursts
   of packet loss that would otherwise be statistically insignificant with
   a long-running packet loss test.
#. using the exact same transport as the data packets, avoiding the
   possibility of QoS or prioritization giving incorrect results when
   testing with another transport.
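The per-direction calculation reduces to comparing counters. A sketch (the
removal threshold below is illustrative; actual thresholds are configured
per leg):

```python
def loss_rate(sent, received):
    """Per-direction loss: fraction of sent packets that never arrived."""
    if sent == 0:
        return 0.0
    return (sent - received) / sent

def leg_usable(sent, received, removal_threshold=0.01):
    """A leg is removed from the bond once its loss rate exceeds its
    threshold (1% here is an illustrative value, not a product default)."""
    return loss_rate(sent, received) <= removal_threshold

print(loss_rate(10_000, 9_950))  # 0.005, i.e. 0.5% loss
```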

When packet loss detection is enabled, you can view information about
current and historical packet loss in three locations:

#. Alerts on the legs in the summary column. If packet loss is detected
   to be greater than 0.1% in either direction, an alert is displayed.
#. The last reported packet loss values are displayed in the extended
   *Leg status* information found by clicking the right caret in the first
   column of the leg list.
#. The historical values are recorded and visible in the *Leg packet loss*
   chart in the *Performance* section of the bond.

Flap detection
+++++++++++++++

If enabled, which is the default, legs will delay becoming available if
they have recently lost connection. The amount of time they delay will
increase up to 30 seconds if the leg continues gaining and losing
connectivity. For more information, see `Flap
detection <flap-detection.html>`__.

Leg MTU detection
++++++++++++++++++

If enabled, which is the default, legs will attempt to detect the real
MTU of the link between the bonder and the aggregator. This is
particularly useful when legs are attached to CPE devices such as DSL
routers that create tunnels, reducing the MTU. It is recommended to
leave this option on.

Leg MTU detection time
+++++++++++++++++++++++

If leg MTU detection is enabled, each leg will attempt to detect the
path MTU this many hours after the last detection. The default is 24
hours.

Send jitter buffer
+++++++++++++++++++

If enabled, packet sending is briefly delayed to mitigate jitter. This
delay reduces reordering and can provide a noticeable performance benefit,
improving consistency for applications that require continuous packet
delivery, such as VoIP. The benefit is smaller for legs with a high amount
of jitter, since the buffer delays only the sending side.

Debug
++++++

If checked, the tunnel and TCP proxy (if applicable) processes for this
bond running on both the bonder and aggregator log events in much more
detail than normal. Enable debug mode only on the advice of a technical
support agent.

TCP Proxy
++++++++++

See `TCP proxy <tcp-proxy.html>`__ for a description of TCP
proxy options.

When you complete the form, click Save at the bottom of the page.

|image1|

The bond will be activated on the aggregation server and the
configuration will be available for the bonder to download.

Updating a bond
----------------

To modify a bond's configuration, navigate to its edit page. This page
is available by clicking the |image2| icon beside the bond in the bond
list or by clicking the |image3| icon in the list and then clicking
Edit at the top-right corner of the bond details page.

|image4|

Modify values in each form as necessary. You can add more legs,
connected IPs, CPE NAT IPs, and routes, or delete objects by clicking
the delete button in the "Delete" column at the right side of the
object's form.

Click Save to apply the changes and return to the bond details page. The
bonder and aggregator will immediately be updated with the new
configuration.

Removing a bond
----------------

To minimize resources used on your aggregators, ensure that bonds are
deleted when they are no longer needed.

To remove a bond, first take the bonder out of service. Then navigate to
its details page and click the |image5| icon on the Edit button, then
Delete.

|image6|

Confirm your action on the dialog that appears.


.. |image0| image:: /attachments/1179691/12877946.png
.. |image1| image:: /attachments/1179691/12877951.png
.. |image2| image:: /attachments/1179691/12877956.png
.. |image3| image:: /attachments/1179691/12877955.png
.. |image4| image:: /attachments/1179691/12877952.png
.. |image5| image:: /attachments/1179691/1933377.png
.. |image6| image:: /attachments/1179691/12877953.png
