
Field Notes - Dec 05, '25
Executive Signals
- Agents are the new interns: schedule tiny merges that steadily raise baseline quality
- Ensembles beat experts: diverse reviewers catch distinct defects before merge
- Split planes, ship faster: local speed, cloud breadth prevent tool thrash
- Noisy alerts : alert-to-PR pipelines convert detection into concrete fixes
- Guardrails enable velocity: fenced automations move fast without compliance surprises
- Time budgets over taste: workflows handle sleeps, retries, and durability
CEO
Adopt an Agent Framework Before the Glue Hardens
If you’re adding memory, guardrails, tools, and an operator studio, you’re rebuilding a framework. Pick one intentionally before bespoke adapters and scripts calcify. Favor a thin vendor adapter so protocol churn doesn’t cascade across your stack. Pilot in a sandbox with PMs and one backend service; expand only when review/merge times fall meaningfully.
- Select frameworks with memory, guardrails, tool calling, simple RAG, workflows, HITL, and an operator studio
- Keep MCP and similar protocols behind a thin adapter to absorb churn
- Gate rollout on measurable cuts to review/merge time
When to Use Specialized AI Code Review Platforms
General models with redundancy carry you far. Bring in a specialized, org‑aware reviewer when throughput or variance overwhelms human cycles: think >30 PRs/week, >36 hours median time‑to‑merge, or reviewers as the bottleneck. Encode standards once; let the platform enforce, learn exceptions, and thread fixes. Keep human ownership on security and schema changes.
- Add org‑aware reviewers when scale or variance degrades merge velocity
- Encode standards centrally; use feedback to teach exceptions
- Require human owners for security and schema diffs
Engineering
Automate One‑a‑Day PRs for Compound Gains
Use agentic “find‑and‑fix” loops to ship tiny, daily improvements: doc truthiness, missing tests, and slow queries. Maintain a repo map so agents chase cross‑service causes, not just noisy surfaces. Add a “do nothing if clean” rule so humans review and merge in minutes.
- Schedule daily fixes per repo for docs, tests, and slowest queries
- Maintain a cross‑service repo map to trace upstream causes
- Enforce no‑op if clean; keep review effort measured in minutes
Run a Multi‑Model Code Review Gauntlet
Different reviewers catch different error classes. Put 2–3 distinct code reviewers (models/tools) on every PR, auto‑apply fixes, and re‑review until the signal drops. Gate on “no unresolved P1s,” CI green, and typecheck clean. Cap auto cycles at two, then a human ships or kills with a “ready to merge” pass.
- Use architecture, program analysis/runtime, and UI/Docs reviewers in parallel
- Gate merges on P1=0, CI green, and typecheck clean
- Cap at two auto cycles; require a final “ready to merge” pass
Split Local IDE from Cloud Agents
Keep local dev fast and isolated; keep cloud automations cross‑repo and scheduled. Treat them as different planes until the market converges. Local gets agentic IDEs with per‑workspace worktrees so agents don’t collide. Cloud runners trigger from schedules and alerts and open PRs across repos. Hide model vendors behind a thin adapter you can swap in 4–6 months.
- Isolate agent work in per‑workspace worktrees locally
- Run cross‑repo automations on cloud runners via schedules/alerts
- Abstract vendors behind a thin adapter for swap‑ability
Wire Observability Directly to PRs
Turn alerts into action. Pipe error rates and latency SLOs into an agent that files tickets, proposes diffs, and opens PRs. Review becomes the constraint, not detection. Include a nightly “optimize slowest query” job that preserves output shape while improving runtime. Rate‑limit fixes and require owner review on hot paths.
- Build alert → ticket → PR as one automated path
- Schedule nightly “optimize slowest query” with output‑shape guarantees
- Rate‑limit agent fixes; require owner review on hot paths
Guardrails for Agentic Dev in Compliance Environments
Let agents move fast, but fence blast radius. Enforce tiered scanning: branches get lint/typecheck/quick scans; main and pre‑prod get full SAST/DAST/secret scans; prod deploys run policy checks with human sign‑off for sensitive paths. Protect config directories with mandatory human owners. Block agent edits to guardrail files, allow adjacent fixes.
- Enforce tiered scans by environment and path sensitivity
- Protect configs (linters, auth, encryption, migrations) with human ownership
- Block agent edits to guardrail files; allow adjacent changes only
Pick Workflow Runners by Time Budget, Not Taste
Use workflow runners when tasks exceed an interactive session or need timers/checkpoints. Optimize for mental overhead per unit of throughput. If a job runs >15 minutes or relies on sleeps/retries, move it to workflows. If >60 minutes, chunk and persist state. Keep interactive tweaks local; reserve workflows for durable automations.
- Move jobs >15 minutes or with timers into workflows; chunk >60 minutes
- Prefer hosted runners with sleeps, retries, and HITL primitives
- Keep interactive tweaks local; automate durable jobs in workflows