Thursday, May 14, 2026

Show HN: Petri – Drop-in Postgres image that forks a DB per test https://ift.tt/APIWNEf

Show HN: Petri – Drop-in Postgres image that forks a DB per test I'm rolling it out at work to parallelize 4,257 tests across 5 services. It fixes two pain points for us: tests that had to run in-band (serially) and DB mocking in API tests. It's a drop-in Postgres image with a Golang proxy: :5432 is a passthrough, while :5433 forks the DB per connection (CREATE DATABASE … TEMPLATE …, dropped on disconnect). If you use it, let me know what you like or don't like, so I can make it better. Cheers! https://ift.tt/z1RdQH7 May 14, 2026 at 05:02AM
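To make the workflow concrete, here is a minimal sketch of how a test suite might use the forking port. This is not from the project itself; the host, database name, and credentials are assumptions, with psycopg2 and pytest standing in for whatever client and test runner you use:

    import psycopg2
    import pytest

    @pytest.fixture
    def db():
        # Each new connection to the :5433 proxy port gets its own copy of
        # the template database (CREATE DATABASE ... TEMPLATE ...), which
        # the proxy drops again when the connection closes.
        conn = psycopg2.connect(host="localhost", port=5433, dbname="app",
                                user="postgres", password="postgres")
        yield conn
        conn.close()  # disconnect -> the forked database is dropped

    def test_writes_are_isolated(db):
        # Safe to run in parallel with other tests: each one sees its own fork.
        with db.cursor() as cur:
            cur.execute("CREATE TABLE t (id int)")
            cur.execute("INSERT INTO t VALUES (1)")
            cur.execute("SELECT count(*) FROM t")
            assert cur.fetchone()[0] == 1

Each test then runs against its own database copy, so nothing needs to be truncated, rolled back, or mocked between tests.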

Reserve Parking in a City-Owned Garage: SFMTA Makes Parking Easier for Everyone

Reserve Parking in a City-Owned Garage: SFMTA Makes Parking Easier for Everyone
By Pamela Johnson

Parking in San Francisco can be challenging, but we’re here to help make it less stressful. We recently announced two new mobile apps that let you pay for parking by phone and even get alerts when your meter is winding down. Now, our teams are making it easier to find parking, too! Our online parking reservation system is expanding to cover the majority of SFMTA-owned garages across the city. That means you can reserve parking at garages located conveniently near the ballet, opera, symphony, theaters and shopping districts. The best part: rates at SFMTA garages are typically 30–40% less than...

Published 2026-05-13T00:00:00Z
https://ift.tt/SA2mXi0

Show HN: Neural window manager, neural network moving windows from mouse actions https://ift.tt/yRUXYcA

Show HN: Neural window manager, neural network moving windows from mouse actions I'd been mulling over this crazy idea for a while. Can programs be generated? Inspired by recent advances in world models, I wondered if we could do away with source code and generate pixels directly and interactively. As an experiment to answer this, I set out to create a neural window manager, training a neural network to predict what the screen would look like next. Basically, the idea was to generate the next frame from the last two frames and the mouse position. That's it: moving windows without programming an event system, just a simple convolutional neural network guessing pixels.

To implement the experiment, I used Pygame to simulate a turquoise desktop background, a gray window with a navy blue title bar, and a white cursor, four colors in total. Then a bot randomly dragged the window and I recorded everything, processing the frames as color-index matrices (not RGB, to avoid complications) and the mouse delta (dx, dy, click) that caused each transition. 8,000 frames, a few minutes in Colab.

The model is a U-Net. The encoder compresses the stacked frames, the decoder reconstructs the next one, and the mouse vector is projected with a linear layer to fit the spatial size of the bottleneck, where it is concatenated before decoding, so motion information feeds into each skip connection.

And it works! Which still surprises me a little. You can drag, and the window follows you; when you release, it stops. There's no internal state, no (x, y) coordinates anywhere. The model infers the position from what it sees, which works until it doesn't: after a couple of seconds of erratic movement, the window starts to distort. This would probably improve with more training compute and more examples, but to narrow the scope of the experiment and test it in a web browser, I decided to abandon the rendering aspect and have the model predict primitives instead of pixels, turning just the motion engine into a neural network.

Basically, I trained a small MLP to take (distance to the title bar, distance to the resize corner, click) and produce (dx, dy, dw, dh), with two separate heads: one for moving and one for resizing. The trick is that they share nothing except the click signal, so the model can't confuse dragging with resizing. I then exported it to ONNX, and now everything runs in the browser, without a server: just a canvas element and two small neural networks talking to each other.

With this new approach, the renderer remains deterministic, with rectangles drawn in JavaScript, but the window's behavior (where it moves, how it resizes) is learned from examples. It feels like a peculiar middle ground between traditional and neural, and you can feel the space the network has learned by interacting with it: dragging near the title bar moves the window, but grabbing near the corner resizes it. There are no conditionals and no hitbox code; the network simply learned where those areas are from examples. Sometimes it gets confused near the edges, which, frankly, is more interesting than if it worked perfectly; you can perceive how the probability changes. This makes sense when you think about it, because no (x, y) coordinates are stored in these models; the position is implicit in the activations. It works well for short sequences, but fails when asked to maintain state over time.
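For readers who want to see the conditioning trick in code, here is a minimal sketch of the idea as described above, not the author's model: a plain encoder-decoder (the U-Net skip connections are omitted for brevity) where the (dx, dy, click) vector is projected with a linear layer, reshaped to the bottleneck's spatial size, and concatenated before decoding. All sizes are illustrative.

    import torch
    import torch.nn as nn

    class TinyFramePredictor(nn.Module):
        def __init__(self, n_colors=4, h=64, w=64):
            super().__init__()
            self.h, self.w = h, w
            # Encoder: the last two frames, stacked as channels, down to h/4 x w/4
            self.enc = nn.Sequential(
                nn.Conv2d(2, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            )
            # Project (dx, dy, click) to one value per bottleneck cell
            self.mouse_proj = nn.Linear(3, (h // 4) * (w // 4))
            # Decoder: bottleneck plus the mouse channel back to full resolution,
            # emitting per-pixel logits over the 4-color palette
            self.dec = nn.Sequential(
                nn.ConvTranspose2d(33, 16, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(16, n_colors, 4, stride=2, padding=1),
            )

        def forward(self, frames, mouse):
            # frames: (B, 2, H, W) color-index matrices scaled to [0, 1]
            # mouse:  (B, 3) = (dx, dy, click)
            z = self.enc(frames)
            m = self.mouse_proj(mouse).view(-1, 1, self.h // 4, self.w // 4)
            z = torch.cat([z, m], dim=1)  # motion information joins the bottleneck
            return self.dec(z)            # (B, n_colors, H, W) next-frame logits

    model = TinyFramePredictor()
    logits = model(torch.rand(1, 2, 64, 64), torch.tensor([[3.0, -1.0, 1.0]]))
    next_frame = logits.argmax(dim=1)     # predicted color index per pixel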
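And a similarly hedged sketch of the primitives version: two independent heads that share only the click signal, exported to ONNX for in-browser inference. The feature names, layer sizes, and tensor names are guesses from the description above, not the author's code.

    import torch
    import torch.nn as nn

    def head():
        # One tiny MLP per behavior; the two heads share no weights
        return nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))

    class WindowBehavior(nn.Module):
        def __init__(self):
            super().__init__()
            self.move = head()    # (dist_to_title_bar, click)     -> (dx, dy)
            self.resize = head()  # (dist_to_resize_corner, click) -> (dw, dh)

        def forward(self, dist_title, dist_corner, click):
            dxdy = self.move(torch.cat([dist_title, click], dim=-1))
            dwdh = self.resize(torch.cat([dist_corner, click], dim=-1))
            return dxdy, dwdh

    model = WindowBehavior().eval()
    # Export for the browser (e.g. onnxruntime-web); names are illustrative
    torch.onnx.export(
        model,
        (torch.zeros(1, 1), torch.zeros(1, 1), torch.ones(1, 1)),
        "window_behavior.onnx",
        input_names=["dist_title", "dist_corner", "click"],
        output_names=["dxdy", "dwdh"],
    )

Because the two heads never see each other's distance feature, a click near the title bar can only produce a move delta and a click near the corner can only produce a resize delta, which is exactly the drag/resize separation described above.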
Update: A few weeks later, Meta published the Neural Computers paper (2604.06425, worth reading). The premise is the same, but they go much further: CLIs, UIs, real programs. Their failure modes are practically identical to the ones I found with the pure-pixel version: "challenges persist with routine reuse, controlled updates, and symbolic stability," which is a fancy way of saying the window blurs after a few seconds (that was the reason for choosing deterministic rendering). https://lusob.github.io/neural-os/ May 13, 2026 at 11:16PM

Wednesday, May 13, 2026

Show HN: Statewright – Visual state machines that make AI agents reliable https://ift.tt/dxfnmvp

Show HN: Statewright – Visual state machines that make AI agents reliable Agentic problem solving in its current state is very brittle. I fell in love with it, but it creates as many problems as it solves. I'm Ben Cochran; I spent 20+ years in the trenches with full-stack engineering, DevOps, high-performance computing, and ML, with stints at NVIDIA, AMD, and various other organizations, most recently as a Distinguished Engineer.

For agents to work reliably, you either need massive parameter counts or massive context windows to keep the solution spaces workable. Most people are brute-forcing reliability with bigger models and longer prompts. What if I made the problem smaller instead of making the model bigger?

I took a different approach, using smaller models in the 13-20B parameter range, and set them the task of solving real SWE-bench problems. I constrained the tool and solution spaces using formal state machines. Each state in the machine defines which tools the model can access, how many iterations it gets, and which transitions are valid. A planning state gets read-only tools. An implementation state gets edit tools (scoped to prevent mega-edits) and write-friendly bash tools. The testing state gets bash, but only for testing commands. The model cannot physically skip steps or use the wrong tool at the wrong time. This is enforced via protocol, not via prompts.

The results were more promising than I expected. The improvements were consistent across multiple model families, irrespective of age (qwen-coder, gpt-oss, gemma4), above a 13B parameter inflection point. Below that, models can navigate the state machine but can't retain enough context to produce accurate edits. More on the research bit: https://ift.tt/4kc1Y2H

Surprisingly, this yielded improvements in frontier models as well. Haiku and Sonnet start to punch above their weight, and Opus solves more reliably with fewer tokens and fewer death spirals. Fine-tuning did not yield these kinds of functional improvements for me. The takeaway, it seems, is that context window utilization matters more than raw context size: a tightly scoped working context at each step outperforms a model given carte blanche over everything. Constraining non-idempotent LLMs with deterministic code is a pattern nobody is currently talking about.

So I built Statewright. Its core is a Rust engine that evaluates state machine definitions: states, transitions, guards, and tool restrictions. Its orchestration doesn't use an LLM; it just enforces the state machine. On top of that is a plugin layer that integrates with Claude Code (and soon Codex, Cursor, and others) via MCP. When you activate a workflow, hooks enforce the guardrails for each state automatically. The model sees 5 tools available instead of dozens, gets clear instructions for the current phase, and transitions when conditions are met. Importantly, it tells the model when it's attempting something out of scope or incorrect, or when it needs to try something else after getting stuck. You can use your agent via MCP to build a state machine for you to solve a problem in your current context.

The visual editor at statewright.ai lets you tweak these workflows in a graph view... You can clearly see the failure paths, the retry loops, and the approval gates. State machines aren't DAGs; they loop and retry, which is what agentic work actually needs.
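As a rough illustration of the enforcement model (this is not Statewright's engine, which is Rust; every name below is made up for the sketch), a per-state tool allowlist with iteration budgets and legal transitions can be enforced with plain deterministic code:

    from dataclasses import dataclass, field

    @dataclass
    class State:
        name: str
        tools: set[str]                              # only tools callable here
        transitions: set[str] = field(default_factory=set)
        max_iterations: int = 10

    class Workflow:
        def __init__(self, states, start):
            self.states, self.current, self.iterations = states, start, 0

        def call_tool(self, tool):
            state = self.states[self.current]
            if tool not in state.tools:
                # Rejected by protocol, not by prompt: the model is told why
                raise PermissionError(f"{tool!r} is not available in state {state.name!r}")
            self.iterations += 1
            if self.iterations > state.max_iterations:
                raise RuntimeError(f"iteration budget exhausted in {state.name!r}")

        def transition(self, to):
            if to not in self.states[self.current].transitions:
                raise ValueError(f"invalid transition {self.current!r} -> {to!r}")
            self.current, self.iterations = to, 0

    bugfix = Workflow(
        states={
            "plan": State("plan", {"read_file", "grep"}, {"implement"}),
            "implement": State("implement", {"edit_file", "bash_write"}, {"test", "plan"}),
            "test": State("test", {"bash_test"}, {"implement", "done"}),  # loops allowed
            "done": State("done", set()),
        },
        start="plan",
    )

    bugfix.call_tool("read_file")    # fine: the planning state is read-only
    bugfix.transition("implement")   # valid edge in the machine
    # bugfix.call_tool("bash_test")  # would raise PermissionError: wrong state

The point of the sketch is that the rejection path is ordinary code: no prompt asks the model to behave, the out-of-scope call simply fails and the model is told why.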
Statewright is currently live with a free tier. Try it out in Claude Code by running the following:

/plugin marketplace add statewright/statewright
/plugin install statewright
/reload-plugins

Then say "start the bugfix workflow" or run /statewright start bugfix. You'll need to paste your API key when prompted. The latest versions of Claude may complain -- paste the API key again and say you really mean it; Claude is just being cautious here. Feedback is welcome on the workflow editor and the plugin experience, and tell me what workflows you'd want to build first. Agents are suggestions, states are laws. https://ift.tt/NZf7wQm May 12, 2026 at 07:54PM

Tuesday, May 12, 2026

New Parking Payment Options: More Flexibility and Helpful Reminders

New Parking Payment Options: More Flexibility and Helpful Reminders
By Pamela Johnson

Learn how our new parking payment options offer more flexibility so you can get where you need to go. Paying for metered parking in San Francisco has never been easier or more flexible thanks to two new mobile payment options we’re offering. With HotSpot and ParkMobile, you can pay for metered parking directly from your phone. By offering more choices, we are making it more convenient than ever to pay for parking. With both HotSpot and ParkMobile, you can:
- Pay for parking at any SFMTA parking meter using your phone
- Get reminders before your session expires, helping you avoid citations
- Extend...

Published 2026-05-11T00:00:00Z
https://ift.tt/qH9K0wc

Show HN: Mimik – open-source local-first alternative to Scribe and Tango https://ift.tt/WNTRrpS

Show HN: Mimik – open-source local-first alternative to Scribe and Tango https://ift.tt/Z7lqKiy May 11, 2026 at 11:18PM

Show HN: SyncBank – Self-hosted bank sync for EU banks https://ift.tt/QBrmnDl

Show HN: SyncBank – Self-hosted bank sync for EU banks https://syncbank.app/ May 11, 2026 at 11:32PM
