I Broke My Proxmox Home Lab with a GPU Passthrough - Here's How I Fixed It
A developer recounts how a GPU passthrough attempt on their Proxmox home lab led to an infinite crash loop. The lab runs on a Mini PC with Proxmox, hosting LXC containers and KVM VMs for Jellyfin, Pi-hole, k3s Kubernetes cluster, Cloudflare tunnels, and other self-hosted services. The goal was to pass through the host's AMD GPU to VM 104 (k3s-vm-worker) for GPU-accelerated workloads. They enabled IOMMU in BIOS, added kernel parameters (amd_iommu=on, iommu=pt) in GRUB, set machine type to q35, and added the PCI device as hostpci0. After rebooting, the host entered an infinite crash loop. The fix involved booting into recovery mode, editing GRUB to add 'nomodeset' and 'radeon.modeset=0' kernel parameters to disable the GPU driver on the host, then rebooting successfully. They then removed the GPU passthrough from the VM config and reverted the GRUB changes. The article emphasizes the importance of testing GPU passthrough in a non-production environment and having recovery tools like a live USB or recovery mode.
Shows how GPU passthrough can crash a hypervisor and a GRUB-level recovery method.