Linux Bonding


Original: [1]
Translation: Igor Chubin (Игорь Чубин)

The Linux kernel bonding driver provides a method for aggregating multiple network interfaces into a single logical "bonded" interface. The behavior of the bonded interfaces depends on the mode. Generally speaking, modes provide either load balancing or hot standby. In addition, link integrity monitoring can be performed.

The bonding driver originally came from Donald Becker's beowulf patches for kernel 2.0. It has changed quite a bit since, and the original tools from extreme-linux and beowulf sites will not work with this version of the driver.

For new versions of the driver, updated userspace tools, and who to ask for help, please follow the links at the end of this file.


Installation

Most popular distributions ship with the bonding driver already installed as a kernel module, together with the ifenslave user-level control program. If yours does not, you need to build the bonding module from source, as described below.


Configuring and Building the Kernel with Bonding Support

The current version of the bonding driver is available in the drivers/net/bonding directory of the kernel source tree (available from kernel.org). Users who prefer to build everything themselves can take the most recent kernel from kernel.org.

Configure the kernel with "make menuconfig" (or "make xconfig", or "make config"), then select "Bonding driver support" in the "Network device support" section. It is recommended to build the driver as a module, since at present this is the only configuration in which parameters can be passed to the driver and more than one bonding device can be configured.

Build and install the new kernel and modules, then proceed to install ifenslave.

Installing ifenslave

The ifenslave user-level control program is included in the kernel source tree, in the file Documentation/networking/ifenslave.c. It is recommended to use the ifenslave that corresponds to your kernel (from the same source tree or supplied with the distro); however, older versions of ifenslave should also work, without the features of newer versions. Running an ifenslave that is newer than the kernel is not supported, although it may work.


To install ifenslave, do the following:

 # gcc -Wall -O -I/usr/src/linux/include ifenslave.c -o ifenslave
 # cp ifenslave /sbin/ifenslave

If your kernel source is not in "/usr/src/linux", then replace "/usr/src/linux/include" in the above with the location of your kernel source include directory.

You may wish to back up any existing /sbin/ifenslave, or, for testing or informal use, tag the ifenslave to the kernel version (e.g., name the ifenslave executable /sbin/ifenslave-2.6.10).

Caution: If you omit the "-I" or specify an incorrect directory, you may end up with an ifenslave that is incompatible with the kernel you're trying to build it for. Some distros (e.g., Red Hat from 7.1 onwards) do not have /usr/include/linux symbolically linked to the default kernel source include directory.

Bonding Driver Options

Options are supplied as parameters to the "bonding" driver when the module is loaded. They may be given as command line arguments to insmod, but are usually specified in one of the configuration files /etc/modules.conf or /etc/modprobe.conf, or in a distro-specific configuration file (some of which are described in the next section).

The available bonding driver parameters are listed below. If a parameter is not specified, the default value is used. When initially configuring a bond, it is recommended to run "tail -f /var/log/messages" in a separate terminal to watch for error messages.

It is critical that either the miimon or arp_interval and arp_ip_target parameters be specified, otherwise serious network degradation will occur during link failures. Very few devices do not support at least miimon, so there is really no reason not to use it.

Options with textual values will accept either the text name or, for backwards compatibility, the option value. E.g., "mode=802.3ad" and "mode=4" set the same mode.
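The two rules above (a link monitor must be configured, and textual and numeric option values are interchangeable) can be checked mechanically before an option string is handed to the module. The following is a minimal Python sketch, not part of the bonding tools: the function names are hypothetical, and the name-to-number table simply mirrors the mode list given later in this document.

```python
# Mode names and their numeric equivalents, mirroring the "mode" list in this document.
BONDING_MODES = {
    "balance-rr": 0,
    "active-backup": 1,
    "balance-xor": 2,
    "broadcast": 3,
    "802.3ad": 4,
    "balance-tlb": 5,
    "balance-alb": 6,
}


def parse_options(option_string):
    """Split a 'key=value key=value' option string into a dict of strings."""
    return dict(item.split("=", 1) for item in option_string.split())


def normalize_mode(value):
    """Return the numeric mode for either a text name or a numeric string."""
    if value in BONDING_MODES:
        return BONDING_MODES[value]
    number = int(value)  # raises ValueError for unknown text
    if number not in BONDING_MODES.values():
        raise ValueError(f"unknown bonding mode: {value}")
    return number


def has_link_monitor(options):
    """True if miimon, or arp_interval together with arp_ip_target, is set."""
    mii = int(options.get("miimon", 0)) > 0
    arp = int(options.get("arp_interval", 0)) > 0 and bool(options.get("arp_ip_target"))
    return mii or arp
```

With this sketch, normalize_mode("802.3ad") and normalize_mode("4") both return 4, and has_link_monitor() is false for an option string such as "mode=1" that names no monitoring parameters.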

The parameters are as follows:

arp_interval 
Specifies the ARP link monitoring frequency in milliseconds. If ARP monitoring is used in an etherchannel compatible mode (modes 0 and 2), the switch should be configured in a mode that evenly distributes packets across all links. If the switch is configured to distribute the packets in an XOR fashion, all replies from the ARP targets will be received on the same link which could cause the other team members to fail. ARP monitoring should not be used in conjunction with miimon. A value of 0 disables ARP monitoring. The default value is 0.
arp_ip_target 
Specifies the IP addresses to use as ARP monitoring peers when arp_interval is > 0. These are the targets of the ARP request sent to determine the health of the link to the targets. Specify these values in ddd.ddd.ddd.ddd format. Multiple IP addresses must be separated by a comma. At least one IP address must be given for ARP monitoring to function. The maximum number of targets that can be specified is 16. The default value is no IP addresses.
downdelay 
Specifies the time, in milliseconds, to wait before disabling a slave after a link failure has been detected. This option is only valid for the miimon link monitor. The downdelay value should be a multiple of the miimon value; if not, it will be rounded down to the nearest multiple. The default value is 0.
lacp_rate 
Option specifying the rate in which we'll ask our link partner to transmit LACPDU packets in 802.3ad mode. Possible values are:
  • slow or 0 
    Request partner to transmit LACPDUs every 30 seconds.
  • fast or 1 
    Request partner to transmit LACPDUs every 1 second.
The default is slow.
max_bonds 
Specifies the number of bonding devices to create for this instance of the bonding driver. E.g., if max_bonds is 3, and the bonding driver is not already loaded, then bond0, bond1 and bond2 will be created. The default value is 1.
miimon 
Specifies the MII link monitoring frequency in milliseconds. This determines how often the link state of each slave is inspected for link failures. A value of zero disables MII link monitoring. A value of 100 is a good starting point.
The use_carrier option, below, affects how the link state is determined. See the High Availability section for additional information. The default value is 0.
mode 
Specifies one of the bonding policies. The default is balance-rr (round robin).

Possible values are:

  • balance-rr or 0 
    Round-robin policy: Transmit packets in sequential order from the first available slave through the last. This mode provides load balancing and fault tolerance.
  • active-backup or 1 
    Active-backup policy: Only one slave in the bond is active. A different slave becomes active if, and only if, the active slave fails. The bond's MAC address is externally visible on only one port (network adapter) to avoid confusing the switch.
In bonding version 2.6.2 or later, when a failover occurs in active-backup mode, bonding will issue one or more gratuitous ARPs on the newly active slave. One gratuitous ARP is issued for the bonding master interface and for each VLAN interface configured above it, provided that the interface has at least one IP address configured.
Gratuitous ARPs issued for VLAN interfaces are tagged with the appropriate VLAN id. This mode provides fault tolerance. The primary option, documented below, affects the behavior of this mode.
  • balance-xor or 2 
    XOR policy: Transmit based on the selected transmit hash policy. The default policy is a simple XOR of the source and destination MAC addresses, modulo the number of slaves:
 (MAC_{src} \oplus MAC_{dst}) \bmod N_{slaves}
Alternate transmit policies may be selected via the xmit_hash_policy option.
This mode provides load balancing and fault tolerance.
  • broadcast or 3
    Broadcast policy: transmits everything on all slave interfaces. This mode provides fault tolerance.
  • 802.3ad or 4
    IEEE 802.3ad Dynamic link aggregation. Creates aggregation groups that share the same speed and duplex settings. Utilizes all slaves in the active aggregator according to the 802.3ad specification.
Slave selection for outgoing traffic is done according to the transmit hash policy, which may be changed from the default simple XOR policy via the xmit_hash_policy option, documented below. Note that not all transmit policies may be 802.3ad compliant, particularly in regards to the packet mis-ordering requirements of section 43.2.4 of the 802.3ad standard. Differing peer implementations will have varying tolerances for noncompliance.
  • Prerequisites:
    1. Ethtool support in the base drivers for retrieving the speed and duplex of each slave.
    2. A switch that supports IEEE 802.3ad Dynamic link aggregation.
Most switches will require some type of configuration to enable 802.3ad mode.
  • balance-tlb or 5
    Adaptive transmit load balancing: channel bonding that does not require any special switch support. The outgoing traffic is distributed according to the current load (computed relative to the speed) on each slave. Incoming traffic is received by the current slave. If the receiving slave fails, another slave takes over the MAC address of the failed receiving slave.
  • Prerequisite:
    1. Ethtool support in the base drivers for retrieving the speed of each slave.


  • balance-alb or 6 
    Adaptive load balancing: includes balance-tlb plus receive load balancing (rlb) for IPv4 traffic, and does not require any special switch support. The receive load balancing is achieved by ARP negotiation.
The bonding driver intercepts the ARP Replies sent by the local system on their way out and overwrites the source hardware address with the unique hardware address of one of the slaves in the bond such that different peers use different hardware addresses for the server.
Receive traffic from connections created by the server is also balanced. When the local system sends an ARP Request the bonding driver copies and saves the peer's IP information from the ARP packet.
When the ARP Reply arrives from the peer, its hardware address is retrieved and the bonding driver initiates an ARP reply to this peer assigning it to one of the slaves in the bond.
A problematic outcome of using ARP negotiation for balancing is that each time that an ARP request is broadcast it uses the hardware address of the bond. Hence, peers learn the hardware address of the bond and the balancing of receive traffic collapses to the current slave. This is handled by sending updates (ARP Replies) to all the peers with their individually assigned hardware address such that the traffic is redistributed. Receive traffic is also redistributed when a new slave is added to the bond and when an inactive slave is re-activated. The receive load is distributed sequentially (round robin) among the group of highest speed slaves in the bond.
When a link is reconnected or a new slave joins the bond, the receive traffic is redistributed among all active slaves in the bond by initiating ARP Replies with the selected MAC address to each of the clients. The updelay parameter (detailed below) must be set to a value equal to or greater than the switch's forwarding delay so that the ARP Replies sent to the peers will not be blocked by the switch.
  • Prerequisites:
    1. Ethtool support in the base drivers for retrieving the speed of each slave.
    2. Base driver support for setting the hardware address of a device while it is open. This is required so that there will always be one slave in the team using the bond hardware address (the curr_active_slave) while having a unique hardware address for each slave in the bond. If the curr_active_slave fails its hardware address is swapped with the new curr_active_slave that was chosen.
primary 
A string (eth0, eth2, etc) specifying which slave is the primary device. The specified device will always be the active slave while it is available. Only when the primary is off-line will alternate devices be used. This is useful when one slave is preferred over another, e.g., when one slave has higher throughput than another. The primary option is only valid for active-backup mode.
updelay 
Specifies the time, in milliseconds, to wait before enabling a slave after a link recovery has been detected. This option is only valid for the miimon link monitor. The updelay value should be a multiple of the miimon value; if not, it will be rounded down to the nearest multiple. The default value is 0.
use_carrier 
Specifies whether or not miimon should use MII or ETHTOOL ioctls vs. netif_carrier_ok() to determine the link status. The MII or ETHTOOL ioctls are less efficient and utilize a deprecated calling sequence within the kernel. The netif_carrier_ok() relies on the device driver to maintain its state with netif_carrier_on/off; at this writing, most, but not all, device drivers support this facility.
If bonding insists that the link is up when it should not be, it may be that your network device driver does not support netif_carrier_on/off. The default state for netif_carrier is "carrier on," so if a driver does not support netif_carrier, it will appear as if the link is always up. In this case, setting use_carrier to 0 will cause bonding to revert to the MII / ETHTOOL ioctl method to determine the link state.
A value of 1 enables the use of netif_carrier_ok(), a value of 0 will use the deprecated MII / ETHTOOL ioctls. The default value is 1.
xmit_hash_policy 
Selects the transmit hash policy to use for slave selection in balance-xor and 802.3ad modes. Possible values are:
  • layer2 
    Uses XOR of hardware MAC addresses to generate the hash. The formula is
 (MAC_{src} \oplus MAC_{dst}) \bmod N_{slaves}

This algorithm will place all traffic to a particular network peer on the same slave. This algorithm is 802.3ad compliant.

  • layer3+4
    This policy uses upper layer protocol information, when available, to generate the hash. This allows for traffic to a particular network peer to span multiple slaves, although a single connection will not span multiple slaves.

The formula for unfragmented TCP and UDP packets is

 ((port_{src} \oplus port_{dst}) \oplus (IP_{src} \oplus IP_{dst})) \bmod N_{slaves}

For fragmented TCP or UDP packets and all other IP protocol traffic, the source and destination port information is omitted. For non-IP traffic, the formula is the same as for the layer2 transmit hash policy.

This policy is intended to mimic the behavior of certain switches, notably Cisco switches with PFC2 as well as some Foundry and IBM products.

This algorithm is not fully 802.3ad compliant. A single TCP or UDP conversation containing both fragmented and unfragmented packets will see packets striped across two interfaces. This may result in out of order delivery. Most traffic types will not meet this criteria, as TCP rarely fragments traffic, and most UDP traffic is not involved in extended conversations. Other implementations of 802.3ad may or may not tolerate this noncompliance.

The default value is layer2. This option was added in bonding version 2.6.3. In earlier versions of bonding, this parameter does not exist, and the layer2 policy is the only policy.
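To make the two transmit hash policies concrete, the following Python sketch evaluates the formulas as they are written in this section. It is an illustration only: the function names are hypothetical, whole addresses are XORed as integers, and the kernel's actual implementation may differ in detail (for instance, in how much of each address it uses).

```python
import ipaddress


def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to an integer."""
    return int(mac.replace(":", ""), 16)


def layer2_hash(src_mac, dst_mac, n_slaves):
    """layer2 policy: (source MAC XOR destination MAC) modulo slave count."""
    return (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) % n_slaves


def layer34_hash(src_ip, dst_ip, n_slaves, src_port=None, dst_port=None):
    """layer3+4 policy for IPv4 traffic, as written in this section.

    Ports are omitted (pass None) for fragmented packets and for
    IP protocols other than TCP and UDP.
    """
    ip_term = int(ipaddress.IPv4Address(src_ip)) ^ int(ipaddress.IPv4Address(dst_ip))
    if src_port is None or dst_port is None:
        return ip_term % n_slaves
    return ((src_port ^ dst_port) ^ ip_term) % n_slaves


if __name__ == "__main__":
    # Slave index chosen for one peer under layer2, and for one TCP flow under layer3+4.
    print(layer2_hash("00:10:dc:9e:af:d5", "00:10:dc:9e:af:d6", 2))
    print(layer34_hash("192.168.1.1", "192.168.1.254", 2, 40000, 80))
```

Under layer2 every packet to a given peer yields the same index, whereas under layer3+4 two connections to the same peer with different ports can land on different slaves, which is the behavior described above.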

Configuring Bonding Devices

There are, essentially, two methods for configuring bonding: with support from the distro's network initialization scripts, and without. Distros generally use one of two packages for the network initialization scripts: initscripts or sysconfig. Recent versions of these packages have support for bonding, while older versions do not. /etc/net has built-in support for interface bonding.

We will first describe the options for configuring bonding for distros using versions of initscripts and sysconfig with full or partial support for bonding, then provide information on enabling bonding without support from the network initialization scripts (i.e., older versions of initscripts or sysconfig).

If you're unsure whether your distro uses sysconfig or initscripts, or don't know if it's new enough, have no fear. Determining this is fairly straightforward.

First, issue the command:

 $ rpm -qf /sbin/ifup

It will respond with a line of text starting with either "initscripts" or "sysconfig," followed by some numbers. This is the package that provides your network initialization scripts.

Next, to determine if your installation supports bonding, issue the command:

 $ grep ifenslave /sbin/ifup

If this returns any matches, then your initscripts or sysconfig has support for bonding.

Configuration with sysconfig support

This section applies to distros using a version of sysconfig with bonding support, for example, SuSE Linux Enterprise Server 9.

SuSE SLES 9's networking configuration system does support bonding; however, at this writing, the YaST system configuration frontend does not provide any means to work with bonding devices. Bonding devices can be managed by hand, as follows.

First, if they have not already been configured, configure the slave devices. On SLES 9, this is most easily done by running the yast2 sysconfig configuration utility. The goal is to create an ifcfg-id file for each slave device. The simplest way to accomplish this is to configure the devices for DHCP (this is only to get the ifcfg-id file created; see below for some issues with DHCP). The name of the configuration file for each device will be of the form:

 ifcfg-id-xx:xx:xx:xx:xx:xx

Where the "xx" portion will be replaced with the digits from the device's permanent MAC address.

Once the set of ifcfg-id-xx:xx:xx:xx:xx:xx files has been created, it is necessary to edit the configuration files for the slave devices (the MAC addresses correspond to those of the slave devices). Before editing, the file will contain multiple lines, and will look something like this:

 BOOTPROTO='dhcp'
 STARTMODE='on'
 USERCTL='no'
 UNIQUE='XNzu.WeZGOGF+4wE'
 _nm_name='bus-pci-0001:61:01.0'

Change the BOOTPROTO and STARTMODE lines to the following:

 BOOTPROTO='none'
 STARTMODE='off'

Do not alter the UNIQUE or _nm_name lines. Remove any other lines (USERCTL, etc).

Once the ifcfg-id-xx:xx:xx:xx:xx:xx files have been modified, it's time to create the configuration file for the bonding device itself. This file is named ifcfg-bondX, where X is the number of the bonding device to create, starting at 0. The first such file is ifcfg-bond0, the second is ifcfg-bond1, and so on. The sysconfig network configuration system will correctly start multiple instances of bonding.

The contents of the ifcfg-bondX file are as follows:

 BOOTPROTO="static"
 BROADCAST="10.0.2.255"
 IPADDR="10.0.2.10"
 NETMASK="255.255.255.0"
 NETWORK="10.0.2.0"
 REMOTE_IPADDR=""
 STARTMODE="onboot"
 BONDING_MASTER="yes"
 BONDING_MODULE_OPTS="mode=active-backup miimon=100"
 BONDING_SLAVE0="eth0"
 BONDING_SLAVE1="bus-pci-0000:06:08.1"

Replace the sample BROADCAST, IPADDR, NETMASK and NETWORK values with the appropriate values for your network.
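Because NETWORK and BROADCAST follow from IPADDR and NETMASK, they can be computed rather than typed by hand, which avoids mismatched values. The following is a minimal Python sketch using the standard ipaddress module; the function name is hypothetical and nothing here is part of sysconfig itself.

```python
import ipaddress


def derive_network_values(ipaddr, netmask):
    """Return the four address-related ifcfg values for one IPv4 address."""
    iface = ipaddress.ip_interface(f"{ipaddr}/{netmask}")
    return {
        "IPADDR": str(iface.ip),
        "NETMASK": str(iface.network.netmask),
        "NETWORK": str(iface.network.network_address),
        "BROADCAST": str(iface.network.broadcast_address),
    }


if __name__ == "__main__":
    # Print the values in the quoted KEY="value" form used by the ifcfg files.
    for key, value in derive_network_values("10.0.2.10", "255.255.255.0").items():
        print(f'{key}="{value}"')
```

For the address 10.0.2.10 with netmask 255.255.255.0, this yields NETWORK 10.0.2.0 and BROADCAST 10.0.2.255.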

The STARTMODE specifies when the device is brought online. The possible values are:

onboot 
The device is started at boot time. If you're not sure, this is probably what you want.
manual 
The device is started only when ifup is called manually. Bonding devices may be configured this way if you do not wish them to start automatically at boot for some reason.
hotplug 
The device is started by a hotplug event. This is not a valid choice for a bonding device.
off or ignore 
The device configuration is ignored.

The line BONDING_MASTER='yes' indicates that the device is a bonding master device. The only useful value is "yes."

The contents of BONDING_MODULE_OPTS are supplied to the instance of the bonding module for this device. Specify the options for the bonding mode, link monitoring, and so on here. Do not include the max_bonds bonding parameter; this will confuse the configuration system if you have multiple bonding devices.

Finally, supply one BONDING_SLAVEn="slave device" for each slave, where "n" is an increasing value, one for each slave. The "slave device" is either an interface name, e.g., "eth0", or a device specifier for the network device. The interface name is easier to find, but the ethN names are subject to change at boot time if, e.g., a device early in the sequence has failed. The device specifiers (bus-pci-0000:06:08.1 in the example above) specify the physical network device, and will not change unless the device's bus location changes (for example, it is moved from one PCI slot to another). The example above uses one of each type for demonstration purposes; most configurations will choose one or the other for all slave devices.

When all configuration files have been modified or created, networking must be restarted for the configuration changes to take effect. This can be accomplished via the following:

 # /etc/init.d/network restart

Note that the network control script (/sbin/ifdown) will remove the bonding module as part of the network shutdown processing, so it is not necessary to remove the module by hand if, e.g., the module parameters have changed.

Also, at this writing, YaST/YaST2 will not manage bonding devices (it does not show bonding interfaces in its list of network devices). It is necessary to edit the configuration file by hand to change the bonding configuration.

Additional general options and details of the ifcfg file format can be found in an example ifcfg template file:

 /etc/sysconfig/network/ifcfg.template

Note that the template does not document the various BONDING_ settings described above, but does describe many of the other options.

Using DHCP with sysconfig

Under sysconfig, configuring a device with BOOTPROTO='dhcp' will cause it to query DHCP for its IP address information. At this writing, this does not function for bonding devices; the scripts attempt to obtain the device address from DHCP prior to adding any of the slave devices. Without active slaves, the DHCP requests are not sent to the network.

Configuring Multiple Bonds with sysconfig

The sysconfig network initialization system is capable of handling multiple bonding devices. All that is necessary is for each bonding instance to have an appropriately configured ifcfg-bondX file (as described above). Do not specify the "max_bonds" parameter to any instance of bonding, as this will confuse sysconfig. If you require multiple bonding devices with identical parameters, create multiple ifcfg-bondX files.

Because the sysconfig scripts supply the bonding module options in the ifcfg-bondX file, it is not necessary to add them to the system /etc/modules.conf or /etc/modprobe.conf configuration file.

Configuration with initscripts support

This section applies to distros using a version of initscripts with bonding support, for example, Red Hat Linux 9 or Red Hat Enterprise Linux version 3 or 4. On these systems, the network initialization scripts have some knowledge of bonding, and can be configured to control bonding devices.

These distros will not automatically load the network adapter driver unless the ethX device is configured with an IP address. Because of this constraint, users must manually configure a network-script file for all physical adapters that will be members of a bondX link. Network script files are located in the directory:

 /etc/sysconfig/network-scripts

The file name must be prefixed with "ifcfg-eth" and suffixed with the adapter's physical adapter number. For example, the script for eth0 would be named /etc/sysconfig/network-scripts/ifcfg-eth0. Place the following text in the file:

 DEVICE=eth0
 USERCTL=no
 ONBOOT=yes
 MASTER=bond0
 SLAVE=yes
 BOOTPROTO=none

The DEVICE= line will be different for every ethX device and must correspond with the name of the file, i.e., ifcfg-eth1 must have a device line of DEVICE=eth1. The setting of the MASTER= line will also depend on the final bonding interface name chosen for your bond. As with other network devices, these typically start at 0, and go up one for each device, i.e., the first bonding instance is bond0, the second is bond1, and so on.
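Since the per-adapter files differ only in their DEVICE= and MASTER= lines, their contents can be generated. The Python sketch below only builds the file names and text shown above and writes nothing to disk; the function names are hypothetical and not part of initscripts.

```python
def render_slave_ifcfg(device, master="bond0"):
    """Return the text of an ifcfg file for one slave adapter."""
    lines = [
        f"DEVICE={device}",  # must match the file name suffix
        "USERCTL=no",
        "ONBOOT=yes",
        f"MASTER={master}",  # the bond this adapter belongs to
        "SLAVE=yes",
        "BOOTPROTO=none",
    ]
    return "\n".join(lines) + "\n"


def slave_ifcfg_files(devices, master="bond0",
                      directory="/etc/sysconfig/network-scripts"):
    """Map each target path to the file contents for that slave."""
    return {
        f"{directory}/ifcfg-{dev}": render_slave_ifcfg(dev, master)
        for dev in devices
    }


if __name__ == "__main__":
    for path, text in slave_ifcfg_files(["eth0", "eth1"]).items():
        print(path)
        print(text)
```

Keeping the rendering separate from any file writing makes it easy to review the generated text before copying it into place as root.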

Next, create a bond network script. The file name for this script will be /etc/sysconfig/network-scripts/ifcfg-bondX where X is the number of the bond. For bond0 the file is named "ifcfg-bond0", for bond1 it is named "ifcfg-bond1", and so on. Within that file, place the following text:

 DEVICE=bond0
 IPADDR=192.168.1.1
 NETMASK=255.255.255.0
 NETWORK=192.168.1.0
 BROADCAST=192.168.1.255
 ONBOOT=yes
 BOOTPROTO=none
 USERCTL=no

Be sure to change the networking specific lines (IPADDR, NETMASK, NETWORK and BROADCAST) to match your network configuration.

Finally, it is necessary to edit /etc/modules.conf (or /etc/modprobe.conf, depending upon your distro) to load the bonding module with your desired options when the bond0 interface is brought up. The following lines in /etc/modules.conf (or modprobe.conf) will load the bonding module, and select its options:

 alias bond0 bonding
 options bond0 mode=balance-alb miimon=100

Replace the sample parameters with the appropriate set of options for your configuration.

Finally, run "/etc/rc.d/init.d/network restart" as root. This will restart the networking subsystem, and your bond link should now be up and running.

Using DHCP with initscripts

Recent versions of initscripts (the version supplied with Fedora Core 3 and Red Hat Enterprise Linux 4) can assign IP addresses to bonding devices via DHCP.

To configure a bonding device for DHCP, configure it as described above, except replace the line "BOOTPROTO=none" with "BOOTPROTO=dhcp" and add the line "TYPE=Bonding". Note that the TYPE value is case sensitive.

Configuring Multiple Bonds with initscripts

At this writing, the initscripts package does not directly support loading the bonding driver multiple times, so the process for doing so is the same as described in the "Configuring Multiple Bonds Manually" section, below.

NOTE: It has been observed that some Red Hat supplied kernels are apparently unable to rename modules at load time (the "-obonding1" part). Attempts to pass that option to modprobe will produce an "Operation not permitted" error. This has been reported on some Fedora Core kernels, and has been seen on RHEL 4 as well. On kernels exhibiting this problem, it will be impossible to configure multiple bonds with differing parameters.

Configuring Bonding with /etc/net

This section applies to distros with /etc/net already integrated, or to hand-made /etc/net installations. Bonding interfaces are ordinary /etc/net interfaces; all you need to do is decide which interfaces to assign to the bond and which bond options to use. In this example we will set up a high-availability Ethernet bond from two Ethernet cards. /etc/net keeps information about interfaces in

 /etc/net/ifaces

First of all we have to create a configuration directory for each interface involved in configuration:

 # mkdir /etc/net/ifaces/primary
 # mkdir /etc/net/ifaces/backup
 # mkdir /etc/net/ifaces/failover

Then we fill in the options files for the Ethernet interfaces:

 # cat > /etc/net/ifaces/primary/options
 TYPE=eth
 MODULE=e100
 ^D
 # cat > /etc/net/ifaces/backup/options
 TYPE=eth
 MODULE=e100
 ^D
 # cat >> /etc/net/iftab
 primary mac 00:10:dc:9e:af:d5
 backup mac 00:10:dc:9e:af:d6
 ^D

We have configured two ethernet cards and fixed their names with iftab. Now it's time to configure bonding:

 # cat > /etc/net/ifaces/failover/options
 TYPE=bond
 BONDMODE=1
 HOST='primary backup'
 BONDOPTIONS='use_carrier=1 miimon=100 primary=primary'
 ^D
 # cat > /etc/net/ifaces/failover/ipv4address
 192.168.1.1/24
 ^D
 # cat > /etc/net/ifaces/failover/ipv4route
 default via 192.168.1.254
 ^D

After that the only thing we have to do is

 # ifup failover

/etc/net will automatically discover (from HOST option) the correct order of initialization. You can configure as many bonds as you need. DHCP is currently not supported for bonding interfaces in /etc/net.

Configuring Bonding Manually

This section applies to distros whose network initialization scripts (the sysconfig or initscripts package) do not have specific knowledge of bonding. One such distro is SuSE Linux Enterprise Server version 8.

The general method for these systems is to place the bonding module parameters into /etc/modules.conf or /etc/modprobe.conf (as appropriate for the installed distro), then add modprobe and/or ifenslave commands to the system's global init script. The name of the global init script differs; for sysconfig, it is /etc/init.d/boot.local and for initscripts it is /etc/rc.d/rc.local.

For example, if you wanted to make a simple bond of two e100 devices (presumed to be eth0 and eth1), and have it persist across reboots, edit the appropriate file (/etc/init.d/boot.local or /etc/rc.d/rc.local), and add the following:

 modprobe bonding mode=balance-alb miimon=100
 modprobe e100
 ifconfig bond0 192.168.1.1 netmask 255.255.255.0 up
 ifenslave bond0 eth0
 ifenslave bond0 eth1

Replace the example bonding module parameters and bond0 network configuration (IP address, netmask, etc) with the appropriate values for your configuration.

Unfortunately, this method will not provide support for the ifup and ifdown scripts on the bond devices. To reload the bonding configuration, it is necessary to run the initialization script, e.g.,

 # /etc/init.d/boot.local

or

 # /etc/rc.d/rc.local

It may be desirable in such a case to create a separate script which only initializes the bonding configuration, then call that separate script from within boot.local. This allows for bonding to be enabled without re-running the entire global init script.

To shut down the bonding devices, it is necessary to first mark the bonding device itself as being down, then remove the appropriate device driver modules. For our example above, you can do the following:

 # ifconfig bond0 down
 # rmmod bonding
 # rmmod e100

Again, for convenience, it may be desirable to create a script with these commands.

Configuring Multiple Bonds Manually

This section contains information on configuring multiple bonding devices with differing options for those systems whose network initialization scripts lack support for configuring multiple bonds.

If you require multiple bonding devices, but all with the same options, you may wish to use the "max_bonds" module parameter, documented above.

To create multiple bonding devices with differing options, it is necessary to load the bonding driver multiple times. Note that current versions of the sysconfig network initialization scripts handle this automatically; if your distro uses these scripts, no special action is needed. See the section Configuring Bonding Devices, above, if you're not sure about your network initialization scripts.

To load multiple instances of the module, it is necessary to specify a different name for each instance (the module loading system requires that every loaded module, even multiple instances of the same module, have a unique name). This is accomplished by supplying multiple sets of bonding options in /etc/modprobe.conf, for example:

 alias bond0 bonding
 options bond0 -o bond0 mode=balance-rr miimon=100
 alias bond1 bonding
 options bond1 -o bond1 mode=balance-alb miimon=50

will load the bonding module two times. The first instance is named "bond0" and creates the bond0 device in balance-rr mode with an miimon of 100. The second instance is named "bond1" and creates the bond1 device in balance-alb mode with an miimon of 50.

In some circumstances (typically with older distributions), the above does not work, and the second bonding instance never sees its options. In that case, the second options line can be substituted as follows:

 install bonding1 /sbin/modprobe bonding -obond1 mode=balance-alb miimon=50

This may be repeated any number of times, specifying a new and unique name in place of bond1 for each subsequent instance.

Querying Bonding Configuration

Bonding Configuration

Each bonding device has a read-only file in the /proc/net/bonding directory. The file contains information about the bonding configuration, its options, and the state of each slave.

For example, the contents of /proc/net/bonding/bond0 after the driver is loaded with the parameters mode=0 and miimon=1000 look as follows:

        Ethernet Channel Bonding Driver: 2.6.1 (October 29, 2004)
        Bonding Mode: load balancing (round-robin)
        Currently Active Slave: eth0
        MII Status: up
        MII Polling Interval (ms): 1000
        Up Delay (ms): 0
        Down Delay (ms): 0

        Slave Interface: eth1
        MII Status: up
        Link Failure Count: 1

        Slave Interface: eth0
        MII Status: up
        Link Failure Count: 1

The precise format and contents vary with the driver version and the bonding configuration.
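
As a rough sketch, the per-slave fields of such a file can be summarized with awk. The function name bond_slaves is hypothetical, and the parsing assumes the "Slave Interface", "MII Status" and "Link Failure Count" labels appear as in the sample above; other driver versions may need adjustments.

```shell
# bond_slaves [FILE]
# Prints one "<slave> <MII status> <link failure count>" line per slave
# found in a /proc/net/bonding/<bond> style file (bond0 by default).
bond_slaves() {
    awk -F': ' '
        /^[ \t]*Slave Interface:/ { slave = $2; next }
        # The bond-wide "MII Status" line comes before any slave section
        # and is skipped because no slave name has been seen yet.
        slave != "" && /^[ \t]*MII Status:/ { status = $2; next }
        slave != "" && /^[ \t]*Link Failure Count:/ {
            print slave, status, $2
            slave = ""
        }
    ' "${1:-/proc/net/bonding/bond0}"
}
```

Run without arguments it reads /proc/net/bonding/bond0; any other bond's file can be passed as the first argument.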

Network Configuration

The network configuration can be inspected with the ifconfig command. Bonding (master) devices have the MASTER flag set, and their slave devices have the SLAVE flag set. The ifconfig output does not show which slave belongs to which master.

In the example below, the bond0 interface is the master, and eth0 and eth1 are its slaves. Note that all slaves of bond0 have the same MAC address (HWaddr) as bond0 in every mode except TLB and ALB, which require a unique MAC address for each slave.

 # /sbin/ifconfig
 bond0     Link encap:Ethernet  HWaddr 00:C0:F0:1F:37:B4
         inet addr:XXX.XXX.XXX.YYY  Bcast:XXX.XXX.XXX.255  Mask:255.255.252.0
         UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
         RX packets:7224794 errors:0 dropped:0 overruns:0 frame:0
         TX packets:3286647 errors:1 dropped:0 overruns:1 carrier:0
         collisions:0 txqueuelen:0
 
 eth0      Link encap:Ethernet  HWaddr 00:C0:F0:1F:37:B4
         inet addr:XXX.XXX.XXX.YYY  Bcast:XXX.XXX.XXX.255  Mask:255.255.252.0
         UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
         RX packets:3573025 errors:0 dropped:0 overruns:0 frame:0
         TX packets:1643167 errors:1 dropped:0 overruns:1 carrier:0
         collisions:0 txqueuelen:100
         Interrupt:10 Base address:0x1080
 
 eth1      Link encap:Ethernet  HWaddr 00:C0:F0:1F:37:B4
         inet addr:XXX.XXX.XXX.YYY  Bcast:XXX.XXX.XXX.255  Mask:255.255.252.0
         UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
         RX packets:3651769 errors:0 dropped:0 overruns:0 frame:0
         TX packets:1643480 errors:0 dropped:0 overruns:0 carrier:0
         collisions:0 txqueuelen:100
         Interrupt:9 Base address:0x1400
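
The flags can also be picked out of the ifconfig output mechanically. The following is only a sketch: the function name bond_roles is hypothetical, and it assumes the older ifconfig layout shown above, where the interface name starts in the first column and the flags are separated by whitespace.

```shell
# bond_roles
# Reads ifconfig output on stdin and prints "<interface> MASTER" or
# "<interface> SLAVE" for every interface that carries one of those flags.
bond_roles() {
    awk '
        /^[^ \t]/ { iface = $1 }    # an interface block starts in column 1
        /(^|[ \t])MASTER([ \t]|$)/ { print iface, "MASTER" }
        /(^|[ \t])SLAVE([ \t]|$)/  { print iface, "SLAVE" }
    '
}
```

It would be used as "/sbin/ifconfig | bond_roles". Like ifconfig itself, this does not reveal which slave belongs to which master; the files in /proc/net/bonding provide that information.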

Switch Configuration

For this section, "switch" refers to whatever system the bonded devices are directly connected to (i.e., where the other end of the cable plugs into). This may be an actual dedicated switch device, or it may be another regular system (e.g., another computer running Linux).

The active-backup, balance-tlb and balance-alb modes do not require any specific configuration of the switch.

The 802.3ad mode requires that the switch have the appropriate ports configured as an 802.3ad aggregation. The precise method used to configure this varies from switch to switch, but, for example, a Cisco 3550 series switch requires that the appropriate ports first be grouped together in a single etherchannel instance, then that etherchannel is set to mode "lacp" to enable 802.3ad (instead of standard EtherChannel).

The balance-rr, balance-xor and broadcast modes generally require that the switch have the appropriate ports grouped together. The nomenclature for such a group differs between switches; it may be called an "etherchannel" (as in the Cisco example, above), a "trunk group" or some other similar variation. For these modes, each switch will also have its own configuration options for the switch's transmit policy to the bond. Typical choices include XOR of either the MAC or IP addresses. The transmit policy of the two peers does not need to match. For these three modes, the bonding mode really selects a transmit policy for an EtherChannel group; all three will interoperate with another EtherChannel group.

802.1q VLAN Support

VLAN devices can be configured over a bonding interface in the usual way, using the 8021q driver. However, only packets coming from the 8021q driver and passing through bonding will be tagged by default. Self generated packets, for example, bonding's learning packets or ARP packets generated by either ALB mode or the ARP monitor mechanism, are tagged internally by bonding itself. As a result, bonding must "learn" the VLAN IDs configured above it, and use those IDs to tag self generated packets.

For reasons of simplicity, and to support the use of adapters that can do VLAN hardware acceleration offloading, the bonding interface declares itself as fully hardware offloading capable. It receives the add_vid/kill_vid notifications to gather the necessary information, and it propagates those actions to the slaves. In case of mixed adapter types, hardware accelerated tagged packets that should go through an adapter that is not offloading capable are "un-accelerated" by the bonding driver so the VLAN tag sits in the regular location.

VLAN interfaces must be added on top of a bonding interface only after enslaving at least one slave. The bonding interface has a hardware address of 00:00:00:00:00:00 until the first slave is added. If the VLAN interface is created prior to the first enslavement, it would pick up the all-zeroes hardware address. Once the first slave is attached to the bond, the bond device itself will pick up the slave's hardware address, which is then available for the VLAN device.

Also, be aware that a similar problem can occur if all slaves are released from a bond that still has one or more VLAN interfaces on top of it. When a new slave is added, the bonding interface will obtain its hardware address from the first slave, which might not match the hardware address of the VLAN interfaces (which was ultimately copied from an earlier slave).

There are two methods to ensure that the VLAN device operates with the correct hardware address if all slaves are removed from a bond interface:

  • Remove all VLAN interfaces, then recreate them.
  • Set the bonding interface's hardware address so that it matches the hardware address of the VLAN interfaces.

Note that changing a VLAN interface's HW address would set the underlying device -- i.e. the bonding interface -- to promiscuous mode, which might not be what you want.
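
The ordering requirement can be guarded in a script by checking that the bond no longer has the all-zeroes address before a VLAN is created on it. The sketch below assumes sysfs is mounted and exposes the address in /sys/class/net/bond0/address; the function name bond_has_mac and the VLAN ID 100 in the comment are hypothetical.

```shell
# bond_has_mac [ADDRESS_FILE]
# Succeeds if the hardware address read from the file is not all zeroes,
# i.e. at least one slave has already been attached to the bond.
bond_has_mac() {
    read -r mac < "${1:-/sys/class/net/bond0/address}" || return 1
    [ -n "$mac" ] && [ "$mac" != "00:00:00:00:00:00" ]
}

# Possible use (as root), creating the VLAN only once a slave is attached:
#   if bond_has_mac; then vconfig add bond0 100; fi
```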

Link Monitoring

The bonding driver at present supports two schemes for monitoring a slave device's link state: the ARP monitor and the MII monitor.

At the present time, due to implementation restrictions in the bonding driver itself, it is not possible to enable both ARP and MII monitoring simultaneously.
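
Because the two monitors are mutually exclusive, an option string can be checked before it is handed to the driver. This is a sketch only: the function name check_monitor_opts is hypothetical, and it assumes the monitors are selected through the miimon and arp_interval module parameters, with a value of 0 meaning "disabled".

```shell
# check_monitor_opts "OPTIONS"
# Fails if the option string enables both the MII and the ARP monitor.
check_monitor_opts() {
    mii=0
    arp=0
    for opt in $1; do
        case $opt in
            miimon=0|arp_interval=0) ;;    # 0 leaves that monitor disabled
            miimon=*)       mii=1 ;;
            arp_interval=*) arp=1 ;;
        esac
    done
    [ $((mii + arp)) -lt 2 ]
}
```

For example, "mode=balance-rr miimon=100" passes the check, while an option string that sets both miimon and arp_interval to non-zero values is rejected.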

ARP Monitor Operation

The ARP monitor operates as its name suggests: it sends ARP queries to one or more designated peer systems on the network, and uses the response as an indication that the link is operating. This gives some assurance that traffic is actually flowing to and from one or more peers on the local network.

The ARP monitor relies on the device driver itself to verify that traffic is flowing. In particular, the driver must keep up to date the last receive time, dev->last_rx, and transmit start time, dev->trans_start. If these are not updated by the driver, then the ARP monitor will immediately fail any slaves using that driver, and those slaves will stay down. If network monitoring (tcpdump, etc.) shows the ARP requests and replies on the network, then it may be that your device driver is not updating last_rx and trans_start.

Configuring Multiple ARP Targets

While ARP monitoring can be done with just one target, it can be useful in a High Availability setup to have several targets to monitor. In the case of just one target, the target itself may go down or have a problem making it unresponsive to ARP requests. Having an additional target (or several) increases the reliability of the ARP monitoring.

Multiple ARP targets must be separated by commas as follows:

 # example options for ARP monitoring with three targets
 alias bond0 b