The Agent is The Loop
The llm-loop-plugin gives Simon Willison's LLM CLI the ability to loop and iterate autonomously. Instead of being a bottleneck feeding prompts one by one, you can set a goal and watch it work file by file until complete. The magic isn't in the AI model—it's in the loop.
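To make "the magic is in the loop" concrete, here is a minimal sketch of such a loop. It is not the llm-loop-plugin's actual code: `run_loop`, `fake_step`, and the `TASK_COMPLETE` marker are hypothetical names chosen for illustration, and the model call is replaced by a stub so the example runs on its own.

```python
from typing import Callable, List


def run_loop(goal: str,
             step: Callable[[str, List[str]], str],
             is_done: Callable[[str], bool],
             max_turns: int = 10) -> List[str]:
    """Call `step` repeatedly, feeding back prior outputs,
    until `is_done` says stop or the turn budget runs out."""
    history: List[str] = []
    for _ in range(max_turns):
        output = step(goal, history)   # one model/tool turn (stubbed below)
        history.append(output)
        if is_done(output):            # the loop, not the model, decides when to stop
            break
    return history


def fake_step(goal: str, history: List[str]) -> str:
    """Hypothetical stand-in for a model call: 'finishes' on the third turn."""
    turn = len(history) + 1
    if turn == 3:
        return f"turn {turn}: TASK_COMPLETE"
    return f"turn {turn}: working on {goal}"


history = run_loop("add tests", fake_step, lambda out: "TASK_COMPLETE" in out)
capped = run_loop("never ends", lambda g, h: "still going", lambda out: False, max_turns=4)
```

The turn cap matters as much as the completion check: without it, a step that never signals completion would run forever.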
The Day the Skeptic Blinked
Kenton Varda, a Cloudflare engineer who was skeptical of AI, tested Claude by building an OAuth library. The code was surprisingly good, leading him to realize the power isn't in AI replacing humans, but in the combination of AI speed and human expertise.
Hear me out: “Adversarial Pair Coding with AI Agents” — feels nice, keeps me in the flow and — velocity is immense!
+----------------------------+
| Coder Agent                |
| - Generates Code           |
| - Learns patterns          |
| - Optimizes logic          |
+----------------------------+
              |
+----------------------------+
| Shared Understanding       |
| - Language rules           |
| - Functional goals         |
| - Iterative improvement    |
+----------------------------+
              |
+----------------------------+
| Adversary Agent            |
| - Finds bugs               |
| - Suggests attacks         |
| - Tests edge cases         |
+----------------------------+
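One way to picture the coder/adversary exchange is as a small feedback loop. The sketch below is only an illustration under made-up assumptions: `coder`, `adversary`, and `pair_session` are hypothetical functions, the "agents" are plain Python stubs rather than model calls, and the task (a safe divide) is invented for the example.

```python
from typing import Callable, List, Tuple

Impl = Callable[[int, int], float]
Case = Tuple[int, int]


def coder(feedback: List[Case]) -> Impl:
    """Hypothetical coder: ships a naive divide first,
    then a guarded version once it has seen failing cases."""
    if not feedback:
        return lambda a, b: a / b
    return lambda a, b: a / b if b != 0 else 0.0


def adversary(impl: Impl) -> List[Case]:
    """Hypothetical adversary: probes edge cases and
    reports the inputs that make the implementation crash."""
    failures: List[Case] = []
    for case in [(1, 1), (0, 5), (5, 0)]:
        try:
            impl(*case)
        except Exception:
            failures.append(case)
    return failures


def pair_session(max_rounds: int = 5) -> Tuple[int, List[Case]]:
    """Alternate coder and adversary until no attack succeeds."""
    feedback: List[Case] = []
    for round_no in range(1, max_rounds + 1):
        impl = coder(feedback)
        failures = adversary(impl)
        if not failures:
            return round_no, feedback
        feedback.extend(failures)      # shared understanding grows each round
    return max_rounds, feedback


rounds, found = pair_session()
```

Here the shared `feedback` list plays the role of the middle box in the diagram: it is the only thing both sides read and write.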
AI Agents Dashboard
A web UI for deploying and managing AI agents in containers
Simplify AI operations with AI Agents Dashboard—a single web interface that combines container-use, Coder AgentAPI, and Claude. Launch a primary agent instance from the dashboard, which then spins up additional isolated agent environments in containers. Monitor resource usage, health, and logs in real time, and start, stop, or scale any agent without using the command line.
“Orchestrate AI at scale, one container at a time.”
Target market: DevOps teams, AI researchers, and software engineers who need an easy way to deploy, observe, and control multiple Claude agents within containerized workflows.
When AI Does Research: An End-to-End Experiment
AI agents can now handle end-to-end research workflows, from conceiving studies to final publication. This experiment revealed that SOTA models excel at research thinking, full reproducibility becomes trivial, and human time can finally be redistributed to the most valuable parts: thinking and doing better.
The Amplification of Bottlenecks
AI doesn't just make work faster; it amplifies hidden constraints. At Anthropic, eliminating coding bottlenecks revealed decision-making, integration, and context as the real limitations. Every breakthrough follows this pattern: solve one constraint, amplify the next.
Personal Website CMS
I’ve toyed with creating a personal website for years, something beyond just a blog. I’ve started many before but never had the discipline to maintain them. Now, finally, I have something close to my mental ideal. In addition to this Astro-based site, I built a lightweight Git-based CMS for managing “collections” (think: posts, etc.) with Groq integration to write frontmatter. This was one-shotted with Claude 4 Sonnet (prompt design) and v0 (prototype), then fixed with Claude Code to make it work.
Current Focus
Still reading “AI Engineering” by @chipro. It’s exceptionally well written and thoroughly researched, and I warmly recommend it. But, as always, I have several books around the house in various stages of reading: Stephen King’s On Writing, some Lee Child book (fascinated by the sheer flow of words), and Superagency by Reid Hoffman on my Kindle.
Daily Routine
It is really hard to hit 10k steps. Last night I walked for 76 minutes just to hit the goal, barely. The day was slow.
How AI Agents Are Reshaping Creation
Today's AI agents excel at computer operation and research, maintain coherence for hours, favor curious problem-solvers over technical experts, and are democratizing software creation while challenging traditional employment models.
What Sourcegraph learned building AI coding agents
AI coding agents work best with inversion of control, curated context over comprehensive, usage-based pricing for real work, emergent behaviors over engineered features, rich feedback loops, and agent-native workflows. The revolution is here: adapt or be displaced.
Mastering Claude Code: Boris Cherny's Guide & Cheatsheet
A practical guide to Claude Code, including setup, codebase Q&A, tool usage, context best practices, scripting, and power user tips, distilled from Boris Cherny's talk.
Why Senior Engineers Overlook Small AI Wins
Senior engineers often dismiss small AI coding power-ups (like smarter autocomplete or better error messages), not realizing these tweaks can totally change how users feel about a product.
The only way you’re going to figure this out is by getting your hands dirty and seeing what works.

Claude 4 Sonnet loves complex dashboard visualisations. I have been playing with my Garmin data to better understand the agentic future of data science research.
AI Coding Agent Pricing
Current AI coding agents have misaligned pricing—users pay for agent inefficiencies and over-iteration. Credit burn rates are unpredictable and scale with agent behavior, not user value. Solutions include fair-use models, temporal arbitrage, outcome-based pricing, and hybrid local/remote approaches.
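A toy calculation shows why usage-based billing tracks agent behavior rather than user value. Every number and name below is made up for illustration (the credit rate, the token counts, the flat outcome price); none of it reflects any real product's pricing.

```python
def usage_credits(iterations: int,
                  tokens_per_iteration: int,
                  credits_per_1k_tokens: int) -> int:
    """Credits billed under a pure usage model: every iteration is charged,
    whether or not it moved the task forward."""
    return iterations * tokens_per_iteration // 1000 * credits_per_1k_tokens


# Same fix delivered twice; only the agent's path differs (hypothetical numbers).
efficient = usage_credits(iterations=3, tokens_per_iteration=4000, credits_per_1k_tokens=2)
wandering = usage_credits(iterations=12, tokens_per_iteration=4000, credits_per_1k_tokens=2)

# Under a hypothetical outcome-based model the user pays per delivered fix,
# so both runs would cost the same flat amount.
OUTCOME_PRICE = 40
outcome_efficient = OUTCOME_PRICE
outcome_wandering = OUTCOME_PRICE
```

In this sketch the user gets one fix either way, yet the wandering run bills four times the credits of the efficient one, while the outcome-based price stays flat.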
It is wild watching an AI agent pursue dependency chains with robotic determination, burning computational resources chasing “just one more fix.” It’s just what happens when you engage with complex systems, whether you’re carbon-based or running on silicon. The yak always needs shaving, apparently.

Wrote a short research paper with help from Cursor, based on the survey I made with v0 and distributed during my O’Reilly talk.
Human requests are binary: fix this thing, answer this question. But agents operate in probabilistic space, spawning subprocess after subprocess, each one justified by some internal logic tree I never asked for. The billing model assumes perfect alignment between what I want and what the machine thinks I need. Spoiler: there isn’t any.

Claude 4 as image critic.
Blink, and the entire AI landscape could shift
The AI developer tooling market is moving faster than ever, with big players acquiring startups and releasing powerful coding agents. Interfaces are becoming commoditized, token economics will drive cost efficiency, spec-driven workflows prevail, memory persistence is key, and incumbents' flywheel grows stronger.