Debugging Distribution Boot Problems

I use Debian, but this method can easily be adapted to other distributions.

After a couple of decades or so of running Debian Unstable (and occasionally Testing), I have realized that the most time-consuming bugs to analyze are those caused by packages breaking at boot time.

These include problems with:

  • bootloaders,

  • the initramfs,

  • the init system.

To submit an upstream report quickly, I need to tell whether the issue is due to my machine’s unique setup or whether it is reproducible on a vanilla install. I have thus settled on a quick method for testing: creating a VM and saving its boot logs to a file via the serial console.

The steps are roughly:

  • install Vagrant and Ansible (apt install vagrant vagrant-libvirt ansible).

  • create an empty folder, for instance ~/Projects/boot-problems.

  • create a logs folder inside it; the boot logs will be dumped here.

  • create the following files in the topmost folder:

    requirements.yml
    # SPDX-FileCopyrightText: 2024 Matteo Settenvini <matteo.settenvini@montecristosoftware.eu>
    # SPDX-License-Identifier: CC0-1.0
    
    collections:
      - name: stackhpc.linux
    
    Vagrantfile
    # SPDX-FileCopyrightText: 2023 Matteo Settenvini <matteo.settenvini@montecristosoftware.eu>
    # SPDX-License-Identifier: CC-BY-SA-4.0
    
    Vagrant.configure("2") do |config|
      config.vm.box = "generic/debian12"
    
      config.vm.provider :libvirt do |provider|
        provider.serial :type => :file, :source => {
          :path => File.join(__dir__, "logs", "console.log")
        }
      end
    
      config.vm.provision "ansible" do |ansible|
        ansible.become = true
        ansible.compatibility_mode = "2.0"
        ansible.playbook = "playbook.yml"
        ansible.galaxy_role_file = "requirements.yml"
        ansible.verbose = true
      end
    end
    
    playbook.yml
    # SPDX-FileCopyrightText: 2023 Matteo Settenvini <matteo.settenvini@montecristosoftware.eu>
    # SPDX-License-Identifier: CC-BY-SA-4.0
    ---
    - name: Enable serial logging while booting
      hosts: all
      vars:
        kernel_cmdline:
          - console=ttyS0,115200
          - console=tty0
      tasks:
        - name: Change kernel commandline in grub
          ansible.builtin.include_role:
            name: stackhpc.linux.grubcmdline
    
      handlers:
        - name: reboot
          ansible.builtin.reboot:
    

Now it’s time to bring up the VM with vagrant up. Once the VM has rebooted, vagrant ssh will give you a shell.
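As a rough sketch of the steps so far, the shell snippet below prepares the folder layout and defines a small check that the serial console made it onto the kernel command line. The function names prepare_project and has_serial_console are made up for illustration, and the commented commands assume the three files above already sit in the project folder.

```shell
# Hypothetical helpers for illustration; not part of Vagrant or Debian.

# Create the project folder together with the logs folder
# that the Vagrantfile points the serial console at.
prepare_project() {
    mkdir -p "$1/logs"
}

# Succeed if the given kernel command line contains console=ttyS0.
has_serial_console() {
    case " $1 " in
        *" console=ttyS0"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Typical use (assumes Vagrantfile, playbook.yml and requirements.yml are in place):
#   prepare_project ~/Projects/boot-problems
#   cd ~/Projects/boot-problems
#   vagrant up --provider=libvirt
#   has_serial_console "$(vagrant ssh -c 'cat /proc/cmdline')" && echo "serial console enabled"
```

If the check fails after provisioning, the kernel command line was probably not updated, and the console log will likely miss most of the boot messages.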

Once it is running, you can trigger the update you suspect broke your system and get the full boot logs, which are dumped under logs.
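One possible way to compare boots is sketched below: keep a copy of the console log before the update and another one after it, then diff the two. snapshot_log is a hypothetical helper, and I am assuming that the log may be overwritten when the VM is started again, which is why the copy is taken first. The package name in the usage comment is only a placeholder.

```shell
# Hypothetical helper: copy <logs_dir>/console.log to
# <logs_dir>/console-<label>.log and print the path of the copy.
snapshot_log() {
    logs_dir="$1"
    label="$2"
    cp "$logs_dir/console.log" "$logs_dir/console-$label.log" || return 1
    printf '%s\n' "$logs_dir/console-$label.log"
}

# Typical use, from the project folder:
#   snapshot_log logs before
#   vagrant ssh -c 'sudo apt-get update && sudo apt-get -y install <suspect-package>'
#   vagrant reload
#   snapshot_log logs after
#   diff -u logs/console-before.log logs/console-after.log
```

Kernel timestamps differ between boots, so expect some noise in the diff; the interesting part is usually the first message that appears in only one of the two files.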

This is invaluable if you need to attach something reliable to Debian’s reportbug.