Wednesday, May 7, 2025

Show HN: Kevin-32B – how to do multi-turn RL on writing CUDA kernels https://ift.tt/kXTPjpD

Hey, we just published a blog post about Kevin-32B = K(ernel D)evin. To our knowledge, it's the first open-source model trained with RL to write CUDA kernels. Our goal was to demonstrate multi-turn RL using GRPO. We used 180 Python->CUDA conversion tasks from the KernelBench dataset. The results were surprisingly strong! We were able to outperform top reasoning models like o3 & o4-mini. We're sharing our training setup and learnings in the blog post. The model is also on HuggingFace: https://ift.tt/aqge680 https://ift.tt/8m2LWQ0 May 7, 2025 at 01:18AM
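For readers unfamiliar with GRPO, here is a minimal sketch of the group-relative advantage step the method is named for: several candidate outputs are sampled for the same task, and each one's reward is normalized against the mean and standard deviation of its own group. This is an illustration only, not Kevin-32B's training code; the function name, the `eps` parameter, and the reward values are hypothetical, and the post does not describe how its rewards are computed.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its own sampling group (GRPO-style).

    rewards: scores for several candidates sampled for the SAME task.
    eps: hypothetical small constant that avoids division by zero
         when every candidate in the group gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation over the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four candidate kernels sampled for one task.
rewards = [0.0, 0.3, 0.3, 1.4]
advantages = group_relative_advantages(rewards)
```

Candidates that beat their group's average get a positive advantage and are reinforced; below-average candidates get a negative one. If every candidate scores the same, all advantages are zero and that group contributes no learning signal.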

Show HN: Pocket2Linkding – Migrate from Mozilla Pocket to Linkding https://ift.tt/IwYJfju

With the Mozilla Pocket shutdown coming up in about two weeks, I thought ...