NVMe over Fabrics

From ArchWiki

NVMe over Fabrics (NVMe-oF) allows sending NVMe commands over a network fabric such as Ethernet or Fibre Channel. It can be used for remote access to block devices, similar to iSCSI. An NVMe-oF host accesses devices exposed by an NVMe-oF controller. The exposed devices are not limited to NVMe drives: any block device, such as a ZFS zvol, can be exported.

Host configuration

To access a block device exposed by a remote NVMe-oF controller, install the nvme-cli package.

First, make sure the necessary kernel module is loaded:

# modprobe nvme-fabrics

Now you can use the CLI to connect to the device:

# nvme connect --transport=tcp --traddr=192.168.0.5 --trsvcid=4420 --nqn=nqn.2024-08.com.example:my-disk
--transport=tcp
Specifies the transport protocol. TCP is the simplest to set up.
--traddr=192.168.0.5
The address of the NVMe-oF target, i.e. the server that exposes the block device.
--trsvcid=4420
The port the target is listening on. 4420 is the recommended port (IANA assignment).
--nqn=nqn.2024-08.com.example:my-disk
The NVMe Qualified Name (NQN) of the device to connect to. The device must be configured on the controller.
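
The options above can be wrapped in a small helper that rejects obviously malformed NQNs before anything is sent to the target. The following is a minimal sketch, not part of nvme-cli: the function names nqn_looks_valid and nvme_connect_cmd are hypothetical, the pattern only checks the common nqn.YYYY-MM.reverse-domain:name shape rather than the full NQN rules, and the helper prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helpers; neither function is provided by nvme-cli.

# Succeeds if $1 has the common "nqn.YYYY-MM.<reverse-domain>[:<name>]" shape.
# This is a loose sanity check, not a complete NQN validator.
nqn_looks_valid() {
    printf '%s\n' "$1" |
        grep -Eq '^nqn\.[0-9]{4}-(0[1-9]|1[0-2])\.[a-z0-9.-]+(:.+)?$'
}

# Prints an "nvme connect" command line for a TCP target.
# Arguments: <target address> <subsystem NQN> [port, default 4420]
nvme_connect_cmd() {
    if ! nqn_looks_valid "$2"; then
        echo "malformed NQN: $2" >&2
        return 1
    fi
    printf 'nvme connect --transport=tcp --traddr=%s --trsvcid=%s --nqn=%s\n' \
        "$1" "${3:-4420}" "$2"
}
```

For example, nvme_connect_cmd 192.168.0.5 nqn.2024-08.com.example:my-disk prints the same invocation as shown above. Review the printed line, then run it as root.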

After running this command, the device will be available as a normal NVMe block device, e.g. /dev/nvme0n1, with partitions such as /dev/nvme0n1p1. Because the device number can differ between connections, it is recommended to refer to these devices by UUID.
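
As an illustration of referring to the device by UUID, the sketch below formats an fstab entry. The function name fstab_line_by_uuid is hypothetical, and the mount options are assumptions about a typical setup: _netdev marks the file system as network-dependent and nofail lets the boot continue when the target is unreachable. The device still has to be connected before the mount is attempted.

```shell
#!/bin/sh
# Hypothetical helper: prints an fstab line that mounts a file system by UUID.
# Arguments: <UUID> <mount point> <file system type>
fstab_line_by_uuid() {
    # _netdev and nofail are assumed options, adjust them to your setup.
    printf 'UUID=%s %s %s _netdev,nofail 0 0\n' "$1" "$2" "$3"
}
```

The UUID of a partition can be read with blkid, e.g. fstab_line_by_uuid "$(blkid -s UUID -o value /dev/nvme0n1p1)" /mnt/remote ext4, where /mnt/remote and ext4 are placeholders.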

Controller configuration

Install the nvmetcli package.

First, make sure the necessary kernel module is loaded:

# modprobe nvmet

Controller configuration happens through a file system located at /sys/kernel/config/nvmet. nvmetcli provides a convenient interface to modify that file system. It also provides a way to load and save the settings. nvmetcli save will save the current state to /etc/nvmet/config.json, and nvmetcli restore will load from that location. You can enable the nvmet.service unit to load the configuration from that file automatically.

Warning: nvmetcli will only save, load and edit settings that it recognizes. If you configure unsupported settings (such as DHCHAP), they will be silently dropped. You will have to write them manually to the file system described above.

Inside nvmetcli, you can navigate using cd and show the config tree using ls. There are three top-level directories.

hosts
contains the NQNs of hosts that are referenced in other parts of the tree, e.g. for access control. You can add an NQN here and then make the host use that NQN by passing --hostnqn=... to nvme connect.
ports
configures the protocols used to access the devices, e.g. a TCP port.
subsystems
configures the individual accessible devices.

Adding a device

First, create a subsystem for the device you want to expose:

/> cd subsystems
/subsystems> create nqn.2024-08.com.example:my-device
/subsystems> cd nqn.2024-08.com.example:my-device
/subsystems/n...ple:my-device>

Configure access for the device:

/subsystems/n...ple:my-device> set attr allow_any_host=1
Warning: This will allow any host with network access to this controller to access the block device. There is no further access control by default.

Configure a namespace for the device and set the backing block device:

/subsystems/n...ple:my-device> cd namespaces
/subsystems/n...ce/namespaces> create 1
/subsystems/n...ce/namespaces> cd 1
/subsystems/n.../namespaces/1> set device path=/dev/path/to/block/device
/subsystems/n.../namespaces/1> enable

Configuring the port

Next, create a port. Navigate back to the top using cd /.

/> cd ports
/ports> create 1
/ports> cd 1
/ports/1> set addr trtype=tcp traddr=192.168.0.5 trsvcid=4420
trtype
The transport protocol
traddr
The listener address
trsvcid
The listener TCP port ("service ID")

Add the device you created above to the network port:

/ports/1> cd subsystems
/ports/1/subsystems> create nqn.2024-08.com.example:my-device

That's it! Now the device is accessible to the host using the above instructions.
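
Since nvmetcli only edits the file system under /sys/kernel/config/nvmet, the same setup can be scripted by writing to that file system directly. The following is a minimal sketch under several assumptions: the function name nvmet_export and the NVMET_ROOT override are hypothetical, the listener address is assumed to be IPv4, and namespace 1 and port 1 are hard-coded to mirror the nvmetcli session above. Like that session, it sets allow_any_host.

```shell
#!/bin/sh
# Sketch: create a subsystem, one namespace and a TCP port through configfs.
# NVMET_ROOT is a hypothetical override for trying the function in a scratch
# directory; by default the real nvmet configfs directory is used.
# Arguments: <subsystem NQN> <backing block device> <listen address> [port]
nvmet_export() {
    root=${NVMET_ROOT:-/sys/kernel/config/nvmet}
    subsys="$root/subsystems/$1"
    ns="$subsys/namespaces/1"
    port="$root/ports/1"

    # Creating the directories instantiates the subsystem, namespace and port.
    mkdir -p "$ns" "$port/subsystems" &&
    # Same effect as "set attr allow_any_host=1": no access control.
    echo 1 > "$subsys/attr_allow_any_host" &&
    printf '%s' "$2" > "$ns/device_path" &&
    echo 1 > "$ns/enable" &&
    echo tcp > "$port/addr_trtype" &&
    echo ipv4 > "$port/addr_adrfam" &&   # assumption: IPv4 listener
    echo "$3" > "$port/addr_traddr" &&
    echo "${4:-4420}" > "$port/addr_trsvcid" &&
    # Linking the subsystem into the port makes it reachable on that port.
    ln -s "$subsys" "$port/subsystems/$1"
}
```

Run as root on the controller after loading the nvmet module, for example nvmet_export nqn.2024-08.com.example:my-device /dev/path/to/block/device 192.168.0.5.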

Authentication using DHCHAP

It is prudent not to let just any host access the exposed block device. To authenticate the host, NVMe-oF offers a DHCHAP handshake using a secret shared between host and controller. Unfortunately, as of writing, nvmetcli does not support this out of the box, so you have to modify the configuration file system manually. The nvme command on the host does support DHCHAP.

First, use nvmetcli to limit access to the device to only a single host, identified by NQN.

/> cd hosts
/hosts> create nqn.2024-08.com.example.host
/hosts> cd /subsystems/nqn.2024-08.com.example:my-device
/subsystems/n...ple:my-device> set attr allow_any_host=0
/subsystems/n...ple:my-device> cd allowed_hosts
/subsystems/n...allowed_hosts> create nqn.2024-08.com.example.host

The NVMe controller now knows that only the host with the NQN nqn.2024-08.com.example.host may access the device. You can test this (without DHCHAP, for now) using nvme connect --hostnqn=nqn.2024-08.com.example.host. With the wrong NQN, the connection should fail; with the right NQN, it should succeed.
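
The same restriction can be applied without nvmetcli by editing the configuration file system directly, which is convenient when the DHCHAP key has to be written there anyway. This is a minimal sketch: the function name nvmet_allow_host and the NVMET_ROOT override are hypothetical, and the subsystem is assumed to exist already.

```shell
#!/bin/sh
# Sketch: restrict a subsystem to a single host NQN through configfs.
# NVMET_ROOT is a hypothetical override for trying this in a scratch directory.
# Arguments: <subsystem NQN> <host NQN>
nvmet_allow_host() {
    root=${NVMET_ROOT:-/sys/kernel/config/nvmet}
    subsys="$root/subsystems/$1"
    # The subsystem must already exist; this function does not create it.
    if [ ! -d "$subsys/allowed_hosts" ]; then
        echo "no such subsystem: $1" >&2
        return 1
    fi
    # Register the host NQN, disable open access, then allow that one host.
    mkdir -p "$root/hosts/$2" &&
    echo 0 > "$subsys/attr_allow_any_host" &&
    ln -s "$root/hosts/$2" "$subsys/allowed_hosts/$2"
}
```

For the example above, this would be called as nvmet_allow_host nqn.2024-08.com.example:my-device nqn.2024-08.com.example.host.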

The next step is to configure DHCHAP. First you need to generate a secret like so:

$ nvme gen-dhchap-key --nqn=nqn.2024-08.com.example.host
DHHC-1:00:znDcb37R200FNlZkIOkv37idpu/notvalid!!si1VQ09KhKv2g:

Because nvmetcli does not support DHCHAP, you must configure it manually:

# echo 'DHHC-1:00:znDcb37R200FNlZkIOkv37idpu/notvalid!!si1VQ09KhKv2g:' > /sys/kernel/config/nvmet/hosts/nqn.2024-08.com.example.host/dhchap_key
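
Because the key is written by hand, a copy and paste mistake is easy to make. The sketch below checks the overall shape of the secret before writing it. The function names and the NVMET_ROOT override are hypothetical, and the pattern is an assumption based on the key format shown above (a DHHC-1 prefix, a two-digit field, a base64 payload and a trailing colon); it does not verify the key length or checksum.

```shell
#!/bin/sh
# Succeeds if $1 has the "DHHC-1:NN:<base64>:" shape. Loose check only.
dhchap_key_looks_valid() {
    printf '%s\n' "$1" |
        grep -Eq '^DHHC-1:[0-9]{2}:[A-Za-z0-9+/]+={0,2}:$'
}

# Writes the key for an existing host entry into configfs.
# NVMET_ROOT is a hypothetical override for trying this in a scratch directory.
# Arguments: <host NQN> <DHCHAP secret>
nvmet_set_dhchap_key() {
    if ! dhchap_key_looks_valid "$2"; then
        echo "refusing to write malformed DHCHAP key" >&2
        return 1
    fi
    printf '%s\n' "$2" \
        > "${NVMET_ROOT:-/sys/kernel/config/nvmet}/hosts/$1/dhchap_key"
}
```

Note that the placeholder key printed in this article fails the check on purpose, since it contains characters that are not valid base64. Pass the secret in single quotes so that the shell does not interpret any of its characters.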

Now, you can connect using the DHCHAP secret:

# nvme connect --transport=tcp --traddr=192.168.0.5 --trsvcid=4420 --nqn=nqn.2024-08.com.example:my-device --hostnqn=nqn.2024-08.com.example.host --dhchap-secret='DHHC-1:00:znDcb37R200FNlZkIOkv37idpu/notvalid!!si1VQ09KhKv2g:'
Warning: DHCHAP will authenticate the host but not encrypt the block device data on the network.

Booting from NVMe-oF

It is possible to connect to an NVMe-oF device during the boot process, from within the initramfs, and use the device for the root file system. The procedure is very similar to that for iSCSI. Only the NVMe-specific parts are listed here for now.

The initcpio hook is different for NVMe-oF. The install script adds different modules, and adds the nvme binary instead of iscsistart:

/etc/initcpio/install/nvme-of
build () {
	map add_module nvme-fabrics nvme-tcp nvme-keyring
	add_checked_modules "/drivers/net"
	add_binary nvme
	add_runscript
}
help () {
	cat <<HELPEOF
	This hook allows you to boot from an NVMe-oF target.
HELPEOF
}

The actual hook uses the CLI to connect to the device:

/etc/initcpio/hooks/nvme-of
run_hook () {
	msg "Connecting to NVMe-oF target"
	nvme connect --transport=tcp --nqn=nqn.2024-08.com.example:my-device --traddr=192.168.0.5 --trsvcid=4420
}
Tip: If you configured DHCHAP, make sure to add the required arguments here.

Modify the mkinitcpio configuration:

/etc/mkinitcpio.conf
MODULES=(... nvme-fabrics)
HOOKS=(... net nvme-of block ...)