DEV Community · Sunday · May 17, 2026 · FREE

I squeezed my iGPU dry, then added an eGPU — a GPU buying guide for AI on mini PCs

llm · egpu · mini-pc · inference

The article recounts a developer's experience optimizing a Minisforum AI X1 Pro (Ryzen AI 9 HX 370, 96GB RAM) for local LLM inference with LM Studio, running models such as Gemma 4 E4B and Peach 2.0. Because the Radeon 890M iGPU shares system memory, its bandwidth tops out at ~120 GB/s versus 448 GB/s on a dedicated GPU, which caused problems with long contexts (32K+) and with keeping several models loaded at once. Software optimizations (multi-model loading, continuous batching, KV cache quantization) helped but couldn't overcome the physical bottleneck. The solution was an OCuLink eGPU dock (PCIe 4.0 x4, costing ¥200–400 / ~$30–60), which the author says offers 2x the bandwidth of USB4. The chosen GPU was an RTX 5060 Ti 16GB; the author notes that for LLM inference the PCIe 4.0 x4 link costs under 5% in performance, since model loading is brief and inference is compute-bound. The post includes real pricing and brand teardown data, and serves as a decision log rather than a review.
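The bandwidth figures in the summary can be put side by side with some back-of-envelope arithmetic. The Python sketch below is an illustration, not the author's method. It assumes the standard PCIe 4.0 signalling rate (16 GT/s per lane, 128b/130b encoding) and a common rule of thumb that token generation cannot run faster than memory bandwidth divided by the bytes of weights read per token. The 8 GB model size and all function names are hypothetical placeholders, not values from the article.

```python
# Back-of-envelope numbers only. MODEL_GB is a hypothetical placeholder,
# and the decode ceiling is a rule-of-thumb assumption, not a measurement.

def pcie_gbytes_per_s(gt_per_s: float, lanes: int) -> float:
    """Theoretical one-direction PCIe throughput in GB/s (128b/130b encoding)."""
    return gt_per_s * lanes * (128 / 130) / 8

def load_seconds(model_gb: float, link_gb_s: float) -> float:
    """Time to copy the weights across the link once, at the theoretical rate."""
    return model_gb / link_gb_s

def decode_ceiling_tok_s(mem_gb_s: float, model_gb: float) -> float:
    """Upper bound on tokens/s if each token reads all weights from memory once."""
    return mem_gb_s / model_gb

MODEL_GB = 8.0  # hypothetical quantized model size, not from the article

oculink = pcie_gbytes_per_s(16.0, 4)  # PCIe 4.0 runs at 16 GT/s per lane
print(f"OCuLink PCIe 4.0 x4: {oculink:.2f} GB/s")
print(f"one-time load of {MODEL_GB:.0f} GB: {load_seconds(MODEL_GB, oculink):.1f} s")
print(f"iGPU ceiling (~120 GB/s): {decode_ceiling_tok_s(120, MODEL_GB):.0f} tok/s")
print(f"dGPU ceiling (448 GB/s): {decode_ceiling_tok_s(448, MODEL_GB):.0f} tok/s")
```

Under these assumptions the link moves roughly 7.9 GB/s, so the weights cross it once in about a second, while the per-token ceiling is set by the memory the weights sit in (shared system RAM on the iGPU, VRAM on the eGPU). Real throughput will be lower than either ceiling.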

// why it matters

Enables affordable dedicated GPU acceleration for local AI on mini PCs.

Sources

Primary · DEV Community

▸ Read original at dev.to
