iSCSI/Boot

From ArchWiki

Arch Linux can be installed on an iSCSI target. The machine then needs only enough local mass storage to hold a boot loader, such as a USB drive, while its root partition is accessed over the network with iSCSI. Any subsequent partitions can also be accessed with iSCSI, or with some other network storage protocol, such as NFS. When combined with PXE, iBFT, or some other method of network booting, the machine needs no local mass storage at all.

Boot process

There are various ways of mounting an iSCSI target at boot time:

  • Using a standard initramfs that is configured to initialise a network connection and mount the iSCSI target, and finally boot into it. The procedure for this is outlined below.
  • Certain firmwares (BIOS or UEFI) and network adapters can initialise a network connection and mount an iSCSI target themselves before booting into the OS contained in the iSCSI target. This requires installing a boot loader in the target as well. This is common in server- and enterprise-grade hardware. Consult your hardware manual and manufacturer on how to achieve this.
  • Using a custom PXE firmware, such as iPXE, to mount the iSCSI target and boot into it.

Preparation

Target setup

Follow the procedure outlined in iSCSI to set up an iSCSI target on a remote machine.

Pre-installation

Boot into the Arch Linux installer using the latest official installation media from the download page. Follow the installation guide up until partitioning.

Note: It is desirable to have a static IP address in the network when using iSCSI. This can be achieved either by setting the address manually, or by using a static reservation in the network's DHCP server.

iSCSI initiator setup

If needed, adjust your initiator name in the iSCSI configuration.

/etc/iscsi/initiatorname.conf
InitiatorName=iqn.2011-03.example.org.istgt:arch

Discover the available iSCSI targets on the remote machine to ensure they are visible to the installer. Adjust the portal IP address to match your environment. The output will depend on your environment; ensure it matches the target you set up earlier.

# iscsiadm -m discovery -t sendtargets -p PORTAL_IP_ADDRESS
192.168.1.100:3260,-1 iqn.2011-03.example.org.istgt:arch
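
Each discovery record has the form PORTAL:PORT,TPGT TARGET_NAME. As a minimal sketch, the helper below splits such a line so that the portal and target name can be reused for the login command. The function name parse_discovery_line is hypothetical and not part of open-iscsi.

```shell
# parse_discovery_line is a hypothetical helper, not part of open-iscsi.
# Input:  one line of "iscsiadm -m discovery" output, for example
#         192.168.1.100:3260,-1 iqn.2011-03.example.org.istgt:arch
# Output: "PORTAL TARGET_NAME" with the target portal group tag removed.
parse_discovery_line() {
    line=$1
    portal=${line%%,*}      # text before the first comma: address:port
    target=${line#* }       # text after the first space: target name
    printf '%s %s\n' "$portal" "$target"
}
```

Called with the sample line above, it prints 192.168.1.100:3260 followed by iqn.2011-03.example.org.istgt:arch. These are the values that take the place of PORTAL_IP_ADDRESS and TARGET_NAME.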

Log in to the iSCSI target.

# iscsiadm -m node -T TARGET_NAME -p PORTAL_IP_ADDRESS -l

The iSCSI block device is now available. Verify the output of lsblk to ensure the block device is available. The kernel log messages may contain helpful information if accessing the target fails.
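
One way to pick out the new device is the transport column (TRAN) of lsblk, which reports iscsi for disks attached through an iSCSI session. The filter below is a small sketch of that check; the function name iscsi_disks is hypothetical.

```shell
# iscsi_disks is a hypothetical filter: it reads "NAME TRAN" pairs on stdin
# and prints the names of the disks whose transport is iscsi.
iscsi_disks() {
    awk '$2 == "iscsi" { print $1 }'
}

# Typical use on the installer (requires an active iSCSI session):
#   lsblk -dn -o NAME,TRAN | iscsi_disks
#   dmesg | grep -i iscsi        # kernel messages if nothing shows up
```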

Continue installing Arch on the iSCSI target as usual, up until you generate the initramfs and install the boot loader.

System setup

iSCSI daemon

Install the open-iscsi package in the new system.

Start/enable iscsid.service.

Note: The iSCSI daemon will detect and manage the iSCSI session originally started in the initramfs.

Fstab

The install procedure automatically generates /etc/fstab to match the new install. Ensure the root filesystem is added correctly. The iSCSI daemon handles any errors in the iSCSI kernel layer, so it is important to keep it running as long as the root filesystem - and any other iSCSI filesystems - are mounted. This can be achieved by appending x-systemd.requires=iscsid.service to the filesystem mount options. For example:

/etc/fstab
UUID=12345678-1234-1234-1234-123456789012 /     ext4   defaults,_netdev,x-systemd.requires=iscsid.service   0      1
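
The generated file can be checked before rebooting. The following sketch reports whether the root entry of an fstab file carries the x-systemd.requires=iscsid.service option. The function name is hypothetical; during installation the file is assumed to live at /mnt/etc/fstab.

```shell
# fstab_root_requires_iscsid is a hypothetical helper.
# $1: path to an fstab file (defaults to /etc/fstab).
# Succeeds only if a non-comment entry for "/" lists the iscsid requirement
# among its mount options (fourth field).
fstab_root_requires_iscsid() {
    awk '
        $1 !~ /^#/ && $2 == "/" {
            found = 1
            if ($4 ~ /(^|,)x-systemd\.requires=iscsid\.service(,|$)/) ok = 1
        }
        END { exit (found && ok) ? 0 : 1 }
    ' "${1:-/etc/fstab}"
}

# Typical use from the installer:
#   fstab_root_requires_iscsid /mnt/etc/fstab || echo "root entry needs fixing"
```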

initramfs

This guide outlines a method which uses a busybox-based init, as opposed to a systemd-based init. The procedure will be different for a systemd-based init.

Connect to the network automatically

The initcpio has to connect to the network before opening a connection to the iSCSI target. This can be achieved in various ways, but a simple way is to use the net hook with the ip= kernel parameter. This hook is designed to be used with an NFS-based root, but when the NFS-specific parameters are omitted it only configures the network adapter.

Install the mkinitcpio-nfs-utils package in the new system.

Insert the net hook in the HOOKS array:

/etc/mkinitcpio.conf
HOOKS=(base ... net block ...)

Append the ip= parameter to the kernel parameters. For example, to configure the network adapter with DHCP:

ip=dhcp
Tip: Once the actual system boots from the initramfs, the network adapter keeps the address it was given there. Ensure your network configuration respects this address, or otherwise uses the same address.
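
Since a static address is desirable for iSCSI, the ip= parameter can also carry a fixed configuration. The kernel documents the format as ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>. The sketch below assembles such a value, assuming the net hook accepts the same syntax with none as the autoconf value. The helper name and all addresses are illustrative.

```shell
# build_ip_param is a hypothetical helper that assembles a static ip= value.
# Arguments: client-ip gateway-ip netmask hostname device
# The server-ip field is left empty because no NFS server is involved, and
# the autoconf field is "none" (assumed to disable DHCP in the net hook).
build_ip_param() {
    printf 'ip=%s::%s:%s:%s:%s:none\n' "$1" "$2" "$3" "$4" "$5"
}

# Example with illustrative values:
#   build_ip_param 192.168.1.50 192.168.1.1 255.255.255.0 arch eth0
#   ip=192.168.1.50::192.168.1.1:255.255.255.0:arch:eth0:none
```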

Start the iSCSI session automatically

In order to start the iSCSI session in the initramfs, a custom hook must be created for it. The hook consists of a build hook and a runtime hook, as explained in Mkinitcpio#HOOKS.

The build hook will add the required iSCSI modules to the initramfs, and the iscsistart binary.

/etc/initcpio/install/iscsi
build () {
        map add_module iscsi_tcp iscsi_ibft libiscsi libiscsi_tcp scsi_transport_iscsi crc32c
        add_checked_modules "/drivers/net"
        add_binary iscsistart
        add_runscript
}

help () {
cat <<HELPEOF
        This hook allows you to boot from an iSCSI target.
HELPEOF
}
Warning: The iscsistart binary is specifically used to start an iSCSI session from a boot environment. It should not be used in conjunction with iscsiadm or iscsid!

The runtime hook will start the iSCSI session using a given initiator name, target name and portal. Fill in the values according to your environment.

/etc/initcpio/hooks/iscsi
run_hook () {
        msg "Mounting iSCSI target"
        iscsistart -i INITIATOR_NAME -t TARGET_NAME -g 1 -a PORTAL_IP_ADDRESS -d 1
}

Append the newly created iscsi hook after the net hook in the HOOKS array:

/etc/mkinitcpio.conf
HOOKS=(... net iscsi block ...)

Adjust the root= kernel parameter to match the iSCSI target.

Tip: Using persistent block device naming instead of node names (/dev/sda, /dev/sdb, etc.) helps prevent mismatches with other block devices.

Generate the initramfs image and finish the installation guide.
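
Before rebooting, it can be worth confirming that the custom hook actually placed its files in the image. The sketch below checks a file listing, such as the one printed by lsinitcpio, for the iscsistart binary and the iscsi_tcp module. The function name is hypothetical, and the image path in the usage comment is an assumption that depends on your kernel preset.

```shell
# check_iscsi_initramfs is a hypothetical helper: it reads an initramfs file
# listing on stdin and succeeds only if both the iscsistart binary and the
# iscsi_tcp kernel module are present.
check_iscsi_initramfs() {
    listing=$(cat)
    printf '%s\n' "$listing" | grep -q '/iscsistart$' &&
        printf '%s\n' "$listing" | grep -q '/iscsi_tcp\.ko'
}

# Typical use (image path is an assumption; adjust to your preset):
#   mkinitcpio -P
#   lsinitcpio /boot/initramfs-linux.img | check_iscsi_initramfs && echo ok
```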

Tips and tricks

Using an iBFT-compatible ROM

iscsistart can be used to configure the network and iSCSI target based on iBFT-supplied configuration values from the firmware.

Modify the runtime hook for iBFT:

/etc/initcpio/hooks/iscsi
run_hook ()
{
    modprobe iscsi_tcp
    modprobe iscsi_ibft

    echo "Network configuration based on iBFT"
    iscsistart -N || echo "Unable to configure network"

    echo "iSCSI auto connect based on iBFT"
    until iscsistart -b ; do
        sleep 3
    done
}
Note: Ensure that whatever network management system you use does not bring the interface down too early during the shutdown process.

Set custom configuration values for iscsistart

Since the iscsistart binary in the initramfs does not use any iSCSI daemon, nor does it use a configuration file, it assumes default values for all configuration settings for the daemon. Individual settings can be set with the -P command-line parameter:

iscsistart ... -P 'node.conn[0].iscsi.MaxXmitDataSegmentLength=26144'

The parameter can be given multiple times to set multiple settings. All available settings and their default values can be found in iscsid.conf.
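
As a sketch of how this fits into the runtime hook, the variant below passes three settings at once, using the timeout values discussed in #Make the iSCSI daemon resilient to network problems. Applying these particular settings in the initramfs is an illustration rather than a required configuration, and the placeholders still have to be filled in for your environment.

```shell
# /etc/initcpio/hooks/iscsi (sketch): the runtime hook with several -P
# settings. Each -P takes exactly one NAME=VALUE pair.
run_hook () {
    msg "Mounting iSCSI target"
    iscsistart -i INITIATOR_NAME -t TARGET_NAME -g 1 -a PORTAL_IP_ADDRESS -d 1 \
        -P 'node.session.timeo.replacement_timeout=86400' \
        -P 'node.conn[0].timeo.noop_out_interval=0' \
        -P 'node.conn[0].timeo.noop_out_timeout=0'
}
```

The settings are single-quoted so that the shell does not try to expand the square brackets in names such as node.conn[0].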

Make the iSCSI daemon resilient to network problems

The open-iscsi README describes optimal iSCSI daemon settings when using an iSCSI target for the root. The iSCSI timeouts should be adjusted so that the iSCSI kernel layer has several chances to try to re-establish a broken session, and so that commands are not quickly requeued to the SCSI layer. This is essentially the opposite of when using dm-multipath.

Turn off iSCSI no-ops:

/etc/iscsi/iscsid.conf
node.conn[0].timeo.noop_out_interval = 0
node.conn[0].timeo.noop_out_timeout = 0

Set the replacement timeout to a large value:

/etc/iscsi/iscsid.conf
node.session.timeo.replacement_timeout = 86400
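
The values in iscsid.conf act as defaults for node records created by later discoveries, so a record that already exists may still hold the old values. A minimal sketch of writing the three settings above into an existing record with iscsiadm's update operation follows; the function name is hypothetical, and the target name and portal are passed in by the caller.

```shell
# apply_root_timeouts is a hypothetical helper.
# $1: target name, $2: portal address.
# Writes the no-op and replacement timeout settings into the node record.
apply_root_timeouts() {
    target=$1
    portal=$2
    for setting in \
        'node.conn[0].timeo.noop_out_interval=0' \
        'node.conn[0].timeo.noop_out_timeout=0' \
        'node.session.timeo.replacement_timeout=86400'
    do
        # Split NAME=VALUE into the -n (name) and -v (value) arguments.
        iscsiadm -m node -T "$target" -p "$portal" -o update \
            -n "${setting%%=*}" -v "${setting#*=}" || return 1
    done
}

# Typical use, as root:
#   apply_root_timeouts TARGET_NAME PORTAL_IP_ADDRESS
```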

If a network problem is detected by the iSCSI initiator, running commands are failed immediately. However, the SCSI error handler in the SCSI layer prevents them from failing until the iSCSI daemon's configured replacement timeout has elapsed. iscsiadm can be used to see if the SCSI error handler is currently running:

# iscsiadm -m session -P 3
************************
Attached SCSI devices:
************************
Host Number: 0  State: running
scsi0 Channel 00 Id 0 Lun: 0
        Attached scsi disk sda          State: running

To modify the timeout of failing commands until the SCSI error handler is run, modify the device's timeout value directly:

# echo Y > /sys/block/sdX/device/timeout

where the value is in seconds, or with a udev rule:

/etc/udev/rules.d/50-iscsi.rules
ACTION=="add", SUBSYSTEM=="scsi" , ATTR{type}=="0|7|14", RUN+="/bin/sh -c 'echo Y > /sys$$DEVPATH/timeout'"
Note: The default timeout is 30 seconds.
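
To confirm that a new value took effect, the same sysfs file can be read back. The helper below is a sketch that prints the current timeout and optionally sets a new one first. The function name and the SYSFS_ROOT variable are hypothetical; the variable exists only so the function can be tried against a scratch directory.

```shell
# scsi_timeout is a hypothetical helper.
# $1: block device name (for example sda), $2: optional new timeout in seconds.
# With one argument it prints the current timeout; with two it writes the new
# value first and then prints what the file contains.
# SYSFS_ROOT defaults to the real /sys.
scsi_timeout() {
    file="${SYSFS_ROOT:-/sys}/block/$1/device/timeout"
    if [ $# -ge 2 ]; then
        printf '%s\n' "$2" > "$file"
    fi
    cat "$file"
}

# Typical use, as root:
#   scsi_timeout sdX          # print the current value
#   scsi_timeout sdX 120      # set 120 seconds and print it back
```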

Improving performance

Since low-level I/O commands go through the I/O scheduler on the remote system and are queued there, the queuing algorithm on the initiator system can be disabled by selecting the none scheduler for the iSCSI device. See Improving performance#Input/output schedulers for details and configuration.
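
The scheduler currently in use is shown in square brackets in the device's queue/scheduler file in sysfs. As a quick sketch, the hypothetical helper below extracts that entry so the setting can be checked before and after changing it.

```shell
# active_scheduler is a hypothetical helper: it reads the contents of a
# queue/scheduler file on stdin and prints the entry in square brackets,
# which is the scheduler currently in use.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Typical use (sdX is the iSCSI disk):
#   active_scheduler < /sys/block/sdX/queue/scheduler
#   echo none > /sys/block/sdX/queue/scheduler    # as root, not persistent
```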

Troubleshooting

Detected conn error (1011)

This can occur for multiple reasons:

  • The initiator, target, or network is overwhelmed by iSCSI no-ops, causing a busy connection to be considered disconnected, in which case the SCSI error handler overrides the device, which can break the working-but-busy iSCSI session. See #Make the iSCSI daemon resilient to network problems on how to disable the no-ops.
  • The network adapter's MTU is mismatched from that of the rest of the network.
  • In rare cases, when TCP window scaling is enabled. It can be disabled with sysctl with the net.ipv4.tcp_window_scaling setting.
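
The last two causes can be checked from the initiator. As a sketch, a ping with fragmentation disallowed and a payload sized to fill exactly one frame shows whether the path to the portal carries the configured MTU. The helper name is hypothetical, the commands in the comments assume iputils ping and iproute2, and eth0 is a placeholder for your interface.

```shell
# ping_payload_for_mtu is a hypothetical helper: it prints the largest ICMP
# echo payload that fits in one unfragmented IPv4 packet for a given MTU
# (20 bytes of IPv4 header plus 8 bytes of ICMP header are subtracted).
ping_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Typical use:
#   ip link show eth0                      # shows the configured MTU
#   ping -M do -c 3 -s "$(ping_payload_for_mtu 1500)" PORTAL_IP_ADDRESS
#   sysctl -w net.ipv4.tcp_window_scaling=0    # disable window scaling
```

If the ping fails at the interface's own MTU but succeeds with a smaller payload, some device on the path is using a smaller MTU.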

See also