What we learned building a sandbox for a document agent

Cross-posted from the Raycaster blog. I’ve spent the last several months building this, and here is my take. 2025 brought us a new idiom for building AI, beginning with Manus and Claude Code: give it tools to operate a computer. This is a break from the previous default approach represented by ChatGPT, which is an LLM plus a menu of bespoke API connections that plug into various systems of record. Our first attempt at a document agent was to ingest documents, parse them into plaintext pages, expose search/read/write tools, and let the LLM operate over virtual directories of artifacts and pages backed by a SQL database. ...

May 18, 2026 · 5 min

Test-Driven Development with an LLM for Fun and Profit

Welcome to the very first post on a new blog! Here I will discuss software development, SRE work, and other fun stuff. Sometimes an idea is just too good to pass up. I hope this blog will motivate me to turn sparks and scattered pieces into general knowledge by writing them down. The other day I was discussing Tabby with a coworker. We talked about whether we should consider AI-autocompleted code harmful and ditch everyone’s newfound habit, given LLMs’ inherent unreliability and their tendency toward spaghetti code that throws traditional software engineering principles like DRY out the window. I disagreed: what if we could have a framework that integrates AI development tooling while also making everything better and more reliable? This instantly reminded me of Test-Driven Development, or TDD, which I think works well when combined with a Large Language Model. ...

January 16, 2025 · 7 min