Decoupling Compute and Memory for Async GPUs
By yiyingzhang · 2026-06-25 · 5 points · 1 comment
Cool open-source project that introduces a new programming model for decoupling compute and memory on NVIDIA GPUs that support asynchronous memory operations (e.g., Hopper). It reports a 12% perf improvement over SOTA with 67% less kernel code. Paper: "VDCores: Resource Decoupled Programmin…