NixOS Setup and Configuration
A brief overview (read: instructions) on setting up a new NixOS system with LVM on LUKS on md. We go through drive preparation, basic NixOS installation instructions, and slight modifications to those instructions for installing a new system from an existing configuration.
NixOS
Previously, we introduced the concepts and ideas behind NixOS and, by extension, nix, the package manager. We will therefore not reiterate that discussion here.
System Installation and Configuration
Installing NixOS is fairly straightforward. However, that is a relative term: my experience is with Arch Linux and, more recently, Gentoo, neither of which is known for a forgiving installation.
I don't want to replace the manual; I only want to supplement it with the steps where I deviated from its path and add information about my personal configuration and preferences.
That said, we will focus on disk preparation and partitioning as that is the most complicated portion of our installation.
We will walk through the installation of two machines: first, my current laptop with two SSDs; second, my main desktop with six hard drives. Since we are doing two setups, we will also have a chance to cover both BIOS and UEFI partitioning schemes.
I am assuming the use of the NixOS live-installation medium.
Laptop
Disk Preparation
Since this will be an encrypted everything (sans /boot), we will need to securely erase all the drives. For each drive, ${device}, perform the following:
Lines beginning with # are commands to be executed as the root user.
# cryptsetup open --type plain --key-file=/dev/urandom ${device} wipe-me
# dd if=/dev/zero of=/dev/mapper/wipe-me status=progress
# cryptsetup close wipe-me
For large hard drives, this step can take a considerable amount of time. It can be done in parallel by using different identifiers than wipe-me. This probably cannot be parallelized if using the more paranoid random source /dev/random instead of /dev/urandom, as there will likely not be enough entropy for more than one device.
Concretely, this may look like:
# cryptsetup open --type plain --key-file=/dev/urandom /dev/sda wipe-me
# dd if=/dev/zero of=/dev/mapper/wipe-me status=progress
# cryptsetup close wipe-me
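To wipe two drives in parallel, as mentioned above, simply use distinct identifiers; a sketch, with wipe-a and wipe-b as arbitrary mapper names:

# cryptsetup open --type plain --key-file=/dev/urandom /dev/sda wipe-a
# cryptsetup open --type plain --key-file=/dev/urandom /dev/sdb wipe-b
# dd if=/dev/zero of=/dev/mapper/wipe-a status=progress &
# dd if=/dev/zero of=/dev/mapper/wipe-b status=progress &
# wait
# cryptsetup close wipe-a
# cryptsetup close wipe-b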
After securely erasing each hard drive to be used, we will next set up the various partitions on each drive. Since we will be using LVM on a LUKS container, residing on a RAID 1 pair of hard drives, our partitioning scheme will be pretty simple.
Since NixOS, by default, uses GRUB 2, we will need to create a 2 MB first partition for BIOS systems.
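A gdisk session for this layout might look like the following sketch; any GPT-aware partitioner works just as well. EF02 is the BIOS boot partition, 8300 a Linux filesystem, and FD00 Linux RAID:

# gdisk /dev/sda
Command (? for help): n
Partition number (1-128, default 1): 1
First sector:
Last sector: +2M
Hex code or GUID: EF02
Command (? for help): n
First sector:
Last sector: +512M
Hex code or GUID: 8300
Command (? for help): n
First sector:
Last sector:
Hex code or GUID: FD00
Command (? for help): w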
After partitioning the disk, the partition table should look similar to the following:
Device     Start    End         Sectors     Size  Type
/dev/sda1  2048     6143        4096        2M    BIOS boot
/dev/sda2  6144     1054719     1048576     512M  Linux filesystem
/dev/sda3  1054720  1953525134  1952470415  931G  Linux RAID
Perform the same partitioning on, or replicate the partition table to, the second disk. Afterwards, we will begin configuring the mirrors.
Certainly, it's possible to securely erase one disk, partition it, then copy it to the other disk via dd if=/dev/sda of=/dev/sdb status=progress.
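Alternatively, sgdisk can replicate just the partition table; in this sketch, -R copies the table from /dev/sda onto /dev/sdb and -G randomizes the new disk's GUIDs afterwards so the two tables remain distinct:

# sgdisk -R=/dev/sdb /dev/sda
# sgdisk -G /dev/sdb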
We will create two mirrors for this configuration, one for the /boot partition and another for the LUKS container:
# mdadm --create /dev/md1 --level=mirror --raid-devices=2 /dev/sda2 /dev/sdb2
# mdadm --create /dev/md2 --level=mirror --raid-devices=2 /dev/sda3 /dev/sdb3
After creating the mirrors, we need to create the LUKS container and format the /boot partition.
Boot Partition:
# mkfs.ext4 -L boot /dev/md1
LUKS Container:
When configuring encrypted containers, there are a lot of different options and parameters to choose from; for example, there are various cryptography schemes and modes.
AES-XTS-PLAIN64 is a solid choice since most CPUs have extensions for doing AES, increasing the throughput. I, personally, have been looking into the other AES finalists, such as Twofish and Serpent.
# cryptsetup -v \
    --type luks \
    --cipher twofish-xts-plain64 \
    --key-size 512 \
    --hash sha512 \
    --iter-time 5000 \
    --use-random \
    --verify-passphrase \
    luksFormat \
    /dev/md2
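Optionally, the resulting header can be inspected to confirm the chosen cipher and parameters took effect:

# cryptsetup luksDump /dev/md2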
Once the LUKS container is created, open it:
# cryptsetup open /dev/md2 cryptroot
Now, we can begin creating the LVM volumes:
# pvcreate /dev/mapper/cryptroot
# vgcreate vg0 /dev/mapper/cryptroot
# lvcreate -L 1G vg0 -n root
# lvcreate -L 10G vg0 -n var
# lvcreate -L 20G vg0 -n opt
# lvcreate -L 32G vg0 -n swap
# lvcreate -L 100G vg0 -n nix
# lvcreate -L 100G vg0 -n home
# lvcreate -L 100G vg0 -n docker
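A quick way to confirm the physical volume, volume group, and logical volumes came out as intended:

# pvs
# vgs
# lvs vg0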
Notice, there is no /usr in our LVM configuration. Furthermore, notice / is particularly small. NixOS is quite different from most distributions when it comes to the filesystem hierarchy. Notably, there is a large portion of the volume set aside for /nix; the majority of the "system" will live in this directory.
Now we need to format the volumes:
# mkfs.ext4 -L root /dev/mapper/vg0-root
# mkfs.ext4 -L var /dev/mapper/vg0-var
# mkfs.ext4 -L opt /dev/mapper/vg0-opt
# mkswap /dev/mapper/vg0-swap
# mkfs.xfs -L nix /dev/mapper/vg0-nix
# mkfs.xfs -L home /dev/mapper/vg0-home
# mkfs.btrfs -L docker /dev/mapper/vg0-docker
Most volumes will be formatted with the ext4 filesystem, typical for standard GNU/Linux systems. However, we will use XFS for /nix and /home, as XFS is particularly well suited to the purposes of these directories. Furthermore, since Docker is an (unfortunate) necessity, creating a proper COW filesystem using Btrfs gives us better management of Docker images.
Next, we will mount these volumes into various folders to begin the installation, creating the folder trees as necessary to mount:
# mount /dev/mapper/vg0-root /mnt/
# mkdir -p /mnt/{var,nix,home,boot,opt}
# mount /dev/md1 /mnt/boot
# mount /dev/mapper/vg0-opt /mnt/opt
# mount /dev/mapper/vg0-var /mnt/var
# mount /dev/mapper/vg0-home /mnt/home
# mount /dev/mapper/vg0-nix /mnt/nix
# mkdir -p /mnt/var/lib/docker
# mount /dev/mapper/vg0-docker /mnt/var/lib/docker
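Before moving on, it's worth a quick sanity check that everything is mounted where expected; the exact output will vary:

# findmnt -R /mnt
# lsblk -o NAME,SIZE,FSTYPE,MOUNTPOINT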
Desktop
The desktop preparation and configuration are very similar to the laptop's. However, as noted above, the complication comes from having three pairs of drives instead of a single pair. Everything else is essentially the same.
Disk Preparation
We first start by securely erasing the devices:
# cryptsetup open --type plain --key-file /dev/urandom /dev/nvme0n1 wipe-me
# dd if=/dev/zero of=/dev/mapper/wipe-me
# cryptsetup close wipe-me
Remember, we don't have to securely erase every device, since we will be mirroring pairs of them together and the initial synchronization will copy the random data across. This does require that the drives in each pair are identical. If they are not identical, it is likely safer to erase every drive.
Next, we will begin by partitioning each of the devices:
# gdisk /dev/nvme0n1
Command (? for help): n
Partition number (1-128, default 1): 1
First sector:
Last sector: +512M
Hex code or GUID: EF00
Command (? for help): n
First sector:
Last sector:
Hex code or GUID: FD00
Command (? for help): w
This will create the boot EFI system partition and the first encrypted container partition.
We do essentially the same thing for each of the pairs. However, the next two pairs only need a single partition for the md container.
Unlike the secure erasing above, we do need to create the partition tables for each device.
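For those drives, the session is even shorter; a sketch, with /dev/sda standing in for one of the remaining drives:

# gdisk /dev/sda
Command (? for help): n
Partition number (1-128, default 1): 1
First sector:
Last sector:
Hex code or GUID: FD00
Command (? for help): w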
After partitioning the drives, we will construct the mirrors:
# mdadm --create /dev/md1 --level=mirror --raid-devices=2 --metadata 1.0 /dev/nvme0n1p1 /dev/nvme1n1p1
# mdadm --create /dev/md2 --level=mirror --raid-devices=2 /dev/nvme0n1p2 /dev/nvme1n1p2
# mdadm --create /dev/md3 --level=mirror --raid-devices=2 /dev/sda1 /dev/sdb1
# mdadm --create /dev/md4 --level=mirror --raid-devices=2 /dev/sdd1 /dev/sde1
We need to create the /boot mirror with metadata 1.0 so that the superblocks are put at the end of the RAID and the UEFI firmware does not get confused when attempting to boot the system. Otherwise, we use the default for all other mirrors.
To monitor the progress of the mirror synchronization, use the following command:
# watch cat /proc/mdstat
It's not vitally important that the mirrors are synchronized before continuing, though, from a reliability perspective, it is "safer".
It's also possible to specify the second device as missing in each of the above commands. This way, the synchronization process can effectively be deferred until the end.
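For example, md2 could be created degraded, with the second half attached later; the second command is run whenever you're ready to kick off the synchronization:

# mdadm --create /dev/md2 --level=mirror --raid-devices=2 /dev/nvme0n1p2 missing
# mdadm /dev/md2 --add /dev/nvme1n1p2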
After creating each of the mirrors, we need to format the /boot EFI system partition. This is a UEFI system; therefore, we will be using vfat for the filesystem.
# mkfs.vfat -n boot /dev/md1
Now, we must create the various LUKS containers:
# cryptsetup -v \
    --type luks \
    --cipher twofish-xts-plain64 \
    --key-size 512 \
    --hash sha512 \
    --iter-time 5000 \
    --use-random \
    --verify-passphrase \
    luksFormat \
    /dev/md2
# cryptsetup -v \
    --type luks \
    --cipher twofish-xts-plain64 \
    --key-size 512 \
    --hash sha512 \
    --iter-time 5000 \
    --use-random \
    --verify-passphrase \
    luksFormat \
    /dev/md3
# cryptsetup -v \
    --type luks \
    --cipher twofish-xts-plain64 \
    --key-size 512 \
    --hash sha512 \
    --iter-time 5000 \
    --use-random \
    --verify-passphrase \
    luksFormat \
    /dev/md4
Next, we will open the containers and start creating our LVM volumes:
# cryptsetup open /dev/md2 cvg0
# cryptsetup open /dev/md3 cvg1
# cryptsetup open /dev/md4 cvg2
Now the LVM setup:
# pvcreate /dev/mapper/cvg0
# vgcreate vg0 /dev/mapper/cvg0
# pvcreate /dev/mapper/cvg1
# vgcreate vg1 /dev/mapper/cvg1
# pvcreate /dev/mapper/cvg2
# vgcreate vg2 /dev/mapper/cvg2
Now that the volume groups are created, we will start creating the actual logical volumes:
# lvcreate -L 1G -n root vg0
# lvcreate -L 100G -n nix vg0
# lvcreate -L 15G -n opt vg0
# lvcreate -L 20G -n var vg1
# lvcreate -L 100G -n docker vg1
# lvcreate -L 64G -n swap vg1
# lvcreate -L 1T -n home vg2
Finally, we can format each of the volumes:
# mkfs.ext4 -L root /dev/mapper/vg0-root
# mkfs.ext4 -L opt /dev/mapper/vg0-opt
# mkfs.xfs -L nix /dev/mapper/vg0-nix
# mkfs.ext4 -L var /dev/mapper/vg1-var
# mkfs.btrfs -L docker /dev/mapper/vg1-docker
# mkfs.xfs -L home /dev/mapper/vg2-home
# mkswap /dev/mapper/vg1-swap
Before moving on to the next step, we first need to mount each of the volumes at the desired path:
# mount /dev/mapper/vg0-root /mnt
# mkdir -p /mnt/{boot,home,nix,var,opt}
# mount /dev/md1 /mnt/boot
# mount /dev/mapper/vg0-nix /mnt/nix
# mount /dev/mapper/vg0-opt /mnt/opt
# mount /dev/mapper/vg1-var /mnt/var
# mkdir -p /mnt/var/lib/docker
# mount /dev/mapper/vg1-docker /mnt/var/lib/docker
# mount /dev/mapper/vg2-home /mnt/home
NixOS Configuration and Installation
Once the disk preparation is complete, we can follow the steps from the NixOS Manual to create the initial configuration:
# nixos-generate-config --root /mnt
After this is done, we can move on to configuring the system the way we want. However, this is where we will deviate slightly from the manual. First, we will need to install git so we can pull down our configuration.
The following steps are very personal. You're free to use my configuration if you do not have your own, or if you would like to try it out. However, you will likely want different things from your system. Change the following steps as necessary.
# nix-env -i git
# cd /mnt/etc/
# mv nixos nixos.bak
# git clone git://git.devnulllabs.io/cfg.nix.git nixos
# cd nixos
# cp ../nixos.bak/hardware-configuration.nix .
My set of Nix configurations includes subfolders for each machine. To set up a new machine, I soft link ("symlink") the machine's configuration.nix into the [/mnt]/etc/nixos folder. If this is a new machine or a rebuild, I typically merge the differences between the hardware-configuration.nix files.
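For illustration, the symlink step might look like the following; the machines/ subfolder and the laptop name are hypothetical, standing in for however your configuration repository is laid out:

# cd /mnt/etc/nixos
# ln -s machines/laptop/configuration.nix configuration.nix

After which, I perform the regular installation: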
# nixos-install --no-root-passwd
Once this finishes, the installation and configuration are done. Reboot the machine, remove the installation/live media, and use the freshly installed machine as if it were always there.
UEFI Notes
Aside from learning that the mdadm metadata placement is an issue for UEFI systems to boot, I also played around with the various settings to get GRUB to install correctly without errors and warnings.
Here's the full GRUB configuration:
boot.loader.systemd-boot = {
  enable = true;
  editor = false;
};
boot.loader.efi = {
  canTouchEfiVariables = false;
};
boot.loader.grub = {
  enable = true;
  copyKernels = true;
  efiInstallAsRemovable = true;
  efiSupport = true;
  fsIdentifier = "uuid";
  splashMode = "stretch";
  version = 2;
  device = "nodev";
  extraEntries = ''
    menuentry "Reboot" {
      reboot
    }
    menuentry "Poweroff" {
      halt
    }
  '';
};
Of particular importance are the following variables:
boot.loader.systemd-boot.enable
boot.loader.efi.canTouchEfiVariables
boot.loader.grub.efiInstallAsRemovable
boot.loader.grub.device
Ideally, boot.loader.grub.efiSupport would be sufficient to tell GRUB to install the UEFI payload instead. However, as it turns out, there are a few more settings required to ensure proper booting in UEFI environments, particularly when using RAID.
According to the manual, it's required to set boot.loader.systemd-boot.enable to true. Setting boot.loader.grub.device or boot.loader.grub.devices to anything other than "nodev" or [ "nodev" ] disables boot.loader.grub.efiSupport. Moreover, with boot.loader.efi.canTouchEfiVariables, the installation/build process attempts to run efibootmgr to modify the NVRAM of the motherboard and set the boot targets; this fails when used with boot.loader.grub.device = "nodev".
Therefore, it is required to set boot.loader.efi.canTouchEfiVariables = false and boot.loader.grub.efiInstallAsRemovable such that the installation process simply places the GRUB UEFI payload in the "default" search location for the motherboard, which is consulted before the NVRAM settings.
Docker, nftables, and NixOS Notes
In developing the system configuration, I came across some issues with respect to Docker and nftables. The nftables project became standard in the Linux kernel in version 3.13 and replaces the myriad of existing {ip,ip6,arp,eb}_tables tools and (kernel) code. Specifically, on any Linux kernel above 3.13, iptables and friends are now simply a user-space front-end to the nftables kernel backend. However, Docker still does not support nftables directly; there's an issue open since 2016.
With some digging and work, there's a way to get nftables and Docker to work nicely with each other. Specifically, we configure Docker to not modify the iptables rules using the --iptables=false configuration flag for the daemon. In this configuration, we can tightly control the firewall with whatever tool we wish, in this case, nftables. This comes with the added benefit that bound ports are not automatically opened to the world.
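In NixOS terms, this might look like the following sketch; virtualisation.docker.extraOptions, networking.firewall.enable, and networking.nftables are the relevant options, while the ruleset itself is purely illustrative:

virtualisation.docker = {
  enable = true;
  extraOptions = "--iptables=false";
};

networking.firewall.enable = false;
networking.nftables = {
  enable = true;
  ruleset = ''
    table inet filter {
      chain input {
        type filter hook input priority 0; policy drop;
        ct state { established, related } accept
        iifname "lo" accept
        tcp dport 22 accept
      }
    }
  '';
};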
However, when using NixOS, any modification to the nftables ruleset will require a reload. With Docker loaded as well, this reload process can actually bring down the firewall completely, since Docker (even with --iptables=false) will attempt to load the iptables kernel module, blocking the resulting nftables module load. When using a system such as Gentoo, this was never an issue, since the configuration completely ignored the iptables subsystem (it was compiled out of the kernel). In NixOS, there's a bit more of a dance involved for the time being.
This is really a minor annoyance, as the firewall rules are only seldom changed.