Show HN: KV-psi, using Linux PSI to trim an LLM KV cache
By infiniteregrets · 2026-06-27 · 2 points · 0 comments
https://github.com/infiniteregrets/kv-psi
I thought it'd be interesting to have an LLM runtime use Linux PSI (Pressure Stall Information) as the signal for trimming its KV cache. I think this is mainly useful on edge devices with unified memory, like the Jetson Orin Nano Super Developer Kit, where the CPU and GPU share the same RAM. I haven't benchmarked it much yet, but plan to do more over time…
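For readers who haven't used PSI: the kernel exposes memory pressure at `/proc/pressure/memory` as two lines (`some` and `full`), each with `avg10`, `avg60`, `avg300` percentages and a cumulative `total` in microseconds. The snippet below is a minimal sketch of how a runtime might turn that into a trim decision. It is not taken from the kv-psi repo; `kv_keep_fraction`, its thresholds (`soft`, `hard`, `min_keep`), and the sample values are all hypothetical, chosen only for illustration.

```python
from pathlib import Path

# Real kernel interface (needs Linux 4.20+ with PSI enabled).
PSI_MEMORY_PATH = Path("/proc/pressure/memory")


def parse_psi(text):
    """Parse PSI text into {"some": {...}, "full": {...}} with float values."""
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        result[kind] = {
            key: float(value)
            for key, value in (field.split("=", 1) for field in fields)
        }
    return result


def kv_keep_fraction(some_avg10, soft=5.0, hard=25.0, min_keep=0.25):
    """Hypothetical policy: map memory pressure to the share of KV cache to keep.

    At or below `soft` percent stall time nothing is trimmed; at or above
    `hard` only `min_keep` of the cache is kept; in between the fraction
    falls off linearly. The thresholds are assumptions, not measured values.
    """
    if some_avg10 <= soft:
        return 1.0
    if some_avg10 >= hard:
        return min_keep
    t = (some_avg10 - soft) / (hard - soft)
    return 1.0 - t * (1.0 - min_keep)


# Made-up sample in the kernel's format, so the sketch also runs off-device.
SAMPLE = (
    "some avg10=15.00 avg60=3.20 avg300=0.80 total=123456\n"
    "full avg10=2.00 avg60=0.50 avg300=0.10 total=4567\n"
)

if __name__ == "__main__":
    text = PSI_MEMORY_PATH.read_text() if PSI_MEMORY_PATH.exists() else SAMPLE
    pressure = parse_psi(text)
    print(kv_keep_fraction(pressure["some"]["avg10"]))
```

Polling `avg10` like this is the simplest approach. The kernel also supports PSI triggers, where a process writes a threshold and time window to the pressure file and waits on it with `poll()`, which avoids a polling loop.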