Saturday, June 6, 2026

Show HN: I nerfed our coding agents on purpose https://ift.tt/QYc7o41

Show HN: I nerfed our coding agents on purpose Tl;dr: I trained a classifier that routes each request to the least expensive model and reasoning depth that can complete it. Coupled with additional automated token-efficiency techniques, this has yielded 3x the usage for the same spend. For anyone interested in trying it themselves: https://nerfguard.com
Several teammates and I recently switched from Claude Code to Codex. We still bounce between the tools, but Codex's speed and steerability, coupled with its performance gains, were hard to ignore. One downside was that per-token pricing kicked in much sooner. This is happening across the board, but we felt it more acutely in Codex.
We're a startup filled with people who work around the clock and are obsessed with building, so naturally our daily bill alone was striking. Luckily we're going after a big mission, and speed matters significantly more than marginal token spend at the edges. Still, it got us thinking about how ludicrous it was that, while our product has a side effect of decreasing token spend and speeding up agentic workflows by many orders of magnitude, we were using these top-tier models for all types of internal coding tasks without any of those optimizations. The waste felt pretty ridiculous. The most glaring culprit was that we were seemingly using the max-intelligence model on max reasoning for every task, even when the task clearly didn't require it. As a company that spends a lot of time on cached intelligence, we could also easily see plenty of other low-hanging fruit.
So, on a recent weekend, I quickly built a tool to optimize our usage. At its core is a very fast classifier that maps each request to the lowest level of intelligence the task requires, with some nice token optimizations on top. The result is roughly the same quality for a fraction of the token spend.
Even more exciting for us, the properly bin-packed intelligence and reasoning levels meant our speed also went up considerably. This wasn't negligible. We've observed up to 3x savings, plus hours per day per person that we would otherwise have spent waiting on tool turns and coding agent responses. For us, that means improved engineering velocity and significantly higher usage for the same spend. It also means more usage before getting throttled.
As I told friends about this, they also wanted to start using it to maximize the usage they could get out of their coding agent plans. There are now engineers across many of the most cutting-edge AI companies using this tool to optimize their token utilization in this way, not just to save money but to maximize output. Turns out that the best way to avoid getting nerfed by Claude is to intentionally nerf yourself selectively.
We decided to release it for the rest of the builder community to use as well. You can now turn on Nerfguard for yourself and start getting more usage today. June 6, 2026 at 04:49AM
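To make the routing idea concrete, here is a minimal Python sketch of picking the cheapest tier that fits a request. It is not Nerfguard's implementation: the post describes a trained classifier, while this stand-in uses keyword heuristics, and the tier names, relative costs, and hint lists are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    model: str            # placeholder name, not a real model
    reasoning: str        # placeholder reasoning depth
    relative_cost: float  # made-up relative cost, for illustration only


# Ordered cheapest first.
TIERS = [
    Tier("small-model", "low", 1.0),
    Tier("mid-model", "medium", 4.0),
    Tier("large-model", "high", 15.0),
]

# Hypothetical keyword hints standing in for a trained classifier.
HARD_HINTS = ("refactor", "architecture", "race condition", "design", "migrate")
EASY_HINTS = ("rename", "typo", "format", "comment", "bump version")


def classify(request: str) -> int:
    """Return an index into TIERS: 0 = easy, 1 = default, 2 = hard."""
    text = request.lower()
    if any(hint in text for hint in HARD_HINTS):
        return 2
    if any(hint in text for hint in EASY_HINTS) and len(text) < 200:
        return 0
    return 1


def route(request: str) -> Tier:
    """Pick the cheapest tier the classifier considers sufficient."""
    return TIERS[classify(request)]


for req in ("Fix typo in README", "Add pagination to the users endpoint"):
    tier = route(req)
    print(f"{req!r} -> {tier.model} ({tier.reasoning})")
```

A real router would presumably also need a fallback that escalates to a higher tier when the cheap model fails; that is omitted here.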

Show HN: OWASP VulnerableApp Modern Extensible and Scalable vulnerable app https://ift.tt/VP09O8J

Show HN: OWASP VulnerableApp Modern Extensible and Scalable vulnerable app https://ift.tt/sdChPX5 June 6, 2026 at 01:49AM

Show HN: I rebuilt a tiny old volleyball game I loved https://ift.tt/tE6apCl

Show HN: I rebuilt a tiny old volleyball game I loved https://volleyhop.com/ June 6, 2026 at 01:42AM

Show HN: Bash Runtime for AWS Lambda https://ift.tt/MotXugz

Show HN: Bash Runtime for AWS Lambda Hi HN, I built a Bash runtime for AWS Lambda to make writing glue code simpler and faster. Sometimes, all you need is a bit of `sed`, `awk`, maybe a loop and a few HTTP API calls, and this runtime gives you all the tools to do that. It comes bundled with `jq` and `curl` so you can handle JSON payloads and string together HTTP API calls right out of the box, including calling AWS services with `curl --aws-sigv4`. In keeping with the theme, the Lambda handler contract is also made as simple as practical: read from stdin, write to stdout, exit with 0 for success and non-zero for error. You can run shell scripts, call binaries (either what's available in `provided.al2023` or your own static binaries packaged with your handler), or a combination of both. If you remember nodding along to Adam Drake's post about how bash and coreutils can be faster than a Hadoop cluster, I hope you give this a whirl and find it useful. The runtime is packaged as a Lambda layer, so it should drop right into your normal AWS infrastructure. https://ift.tt/lEcHeyn June 6, 2026 at 12:42AM
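The handler contract described above (event on stdin, result on stdout, exit status for success or failure) can be tried locally without Lambda. The Python sketch below is only a local simulation of that contract, not the runtime's own code: the `invoke` helper and the two sample handlers are hypothetical, and it assumes `sh` and `tr` are available on the machine.

```python
import json
import subprocess


def invoke(handler_cmd: list[str], event: dict) -> dict:
    """Run a handler command: event JSON on stdin, result on stdout,
    exit status 0 means success and anything else means error."""
    proc = subprocess.run(
        handler_cmd,
        input=json.dumps(event),
        capture_output=True,
        text=True,
    )
    if proc.returncode != 0:
        return {"ok": False, "status": proc.returncode, "error": proc.stderr.strip()}
    return {"ok": True, "result": proc.stdout}


# A trivial shell handler that upper-cases whatever arrives on stdin.
upper = ["sh", "-c", "tr a-z A-Z"]
# A handler that fails: message on stderr, non-zero exit status.
failing = ["sh", "-c", "echo boom >&2; exit 3"]

print(invoke(upper, {"name": "hn"}))
print(invoke(failing, {}))
```

Keeping the contract this narrow means any program that reads stdin and sets an exit status can act as a handler, which is what lets scripts and static binaries be mixed freely.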

Friday, June 5, 2026

Show HN: Bot or Not – Spot AI-generated randomness https://ift.tt/MUswigI

Show HN: Bot or Not – Spot AI-generated randomness https://play-bot-or-not.vercel.app/ June 5, 2026 at 01:26AM

Show HN: Cost.dev (YC W21) – making agents cost-aware and cheaper to call https://ift.tt/Omtavou

Show HN: Cost.dev (YC W21) – making agents cost-aware and cheaper to call We launched Infracost on HN five years ago ( https://ift.tt/63xg7p0 ) where our CLI generated cost estimates for infra-as-code, e.g. "this Terraform PR adds $400/mo". The idea was to shift cloud costs (FinOps) left, so engineers get visibility of costs before deployment and make better decisions.
Earlier this year we started seeing agent traffic in our logs and it looked like coding agents were calling our CLI. But that CLI wasn't designed with coding agents in mind. We went down a philosophical rabbit hole to see if a CLI is even needed anymore given that Claude, Copilot et al. already follow best practices. Ultimately we decided to create a new CLI from the ground up with coding agents in mind for two reasons:
1. We optimized the CLI for agent callers and cut Claude's output token usage by up to 79% and API cost by up to 67% versus a bare-Claude baseline. We wrote a blog documenting our lessons on optimizing user token usage when designing a CLI, e.g. using predicate flags so the agent doesn't compose jq | python | wc pipelines, and an output format that strips JSON's redundant field names. The blog is here: https://ift.tt/NV0lH67...
2. With cloud costs, precision matters. Telling a coding agent "make this Terraform cost-optimized" can be expensive and lossy. You burn tokens loading code and policy context into every conversation. Your agent could make up a price and you wouldn't know because it's difficult to verify that across the ~10M price points that AWS, Azure and Google have. The CLI runs static analysis on the code, uses the latest prices from cloud vendors, and passes that context to the coding agent.
So that's what we're launching today - Cost.dev: https://cost.dev/ .
- It runs locally. Your code never leaves your machine, you get a fast feedback loop, and you're not burning API calls per character when you want to fetch prices.
- The CLI does the deterministic work: fetching price points, scanning the code, validating fixes. The coding agent does the natural-language part. You don't have to trust the LLM to remember the rules, and can verify it called the right CLI command.
- It provides a consistent rule layer across every tool you use. Get cost estimates in your IDE and your coding agent with a single install. We support Claude Code, GitHub Copilot, Cursor, Windsurf, OpenAI Codex, Gemini CLI, as well as IDEs like VS Code and JetBrains.
Before we keep building more in that direction, I want to sanity-check with HN: is "agents writing IaC in prod" actually a thing yet, or am I betting on a future that's still a year out? I know software developers are using coding agents heavily, but are platform/infra folks doing that for prod too?
Also, if you have any feedback on Cost.dev, I'd love to hear it! https://cost.dev/ June 4, 2026 at 05:00PM
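One of the ideas mentioned above, an output format that drops JSON's repeated field names, can be illustrated with a short Python sketch. This is not Cost.dev's actual output format: the `to_compact` helper, the tab-separated layout, and the sample resources and prices are all made up for illustration.

```python
import json


def to_compact(rows: list[dict]) -> str:
    """Emit field names once as a header line, then one tab-separated line per row."""
    if not rows:
        return ""
    fields = list(rows[0].keys())
    lines = ["\t".join(fields)]
    for row in rows:
        lines.append("\t".join(str(row.get(field, "")) for field in fields))
    return "\n".join(lines)


# Made-up resources and prices, for illustration only.
rows = [
    {"resource": "aws_instance.web", "monthly_usd": 70.08},
    {"resource": "aws_db_instance.main", "monthly_usd": 329.92},
]

as_json = json.dumps(rows)
compact = to_compact(rows)

print(compact)
print(f"JSON: {len(as_json)} chars, compact: {len(compact)} chars")
```

The saving grows with the number of rows, since JSON repeats every key per object while the header form states each key once. Character count is only a rough proxy for tokens here.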

Thursday, June 4, 2026

Show HN: Fork of Rsync https://ift.tt/KSHcWhG

Show HN: Fork of Rsync Hello. After hearing of the problematic LLM commits in rsync, I made a fork of rsync. I decided to fork it off release 3.4.1, since I heard that's the last release without the LLM code. https://ift.tt/soL9pir June 4, 2026 at 03:50AM
