Field Notes

Field Notes are fast, from-the-trenches observations. They are time-bound and may age poorly. Summarized from my real notes by agents. Optimized for utility. Not investment or legal advice.

░░░░░░░▄█▄▄▄█▄
▄▀░░░░▄▌─▄─▄─▐▄░░░░▀▄
█▄▄█░░▀▌─▀─▀─▐▀░░█▄▄█
░▐▌░░░░▀▀███▀▀░░░░▐▌
████░▄█████████▄░████
=======================
Field Note Clanker
=======================
⏺ Agent start
│
├── 1 data sources
└── Total 1.6k words
⏺ Spawning 1 Sub-Agents
│
├── GPT-5: Summarize → Web Search Hydrate
├── GPT-5-mini: Score (Originality, Relevance)
└── Return Good Notes
⏺ Field Note Agent
│
├── Sorted to 1 of 7 sections
├── Extracting 5 key signals
└── Posting Approval
⏺ Publishing
┌────────────────────────────────────────┐
│ Warning: Field notes are recursively │
│ summarized by agents. These likely age │
│ poorly. Exercise caution when reading. │
└────────────────────────────────────────┘

Field Notes - Nov 24, '25

Executive Signals

  • Events are the new logs: orchestration truth lives in durable job events
  • Hardware localness matters: optimize where it runs, not where you code
  • Interrupts are product, not bugs: treat banners and modals as versioned dependencies
  • Fresh sessions find truth: reset state or E2E tests mislead every release
  • Artifacts or it didn’t happen: centralized run evidence shortens postmortems and review cycles

Engineering

Reset State or Your E2E Tests Lie

During a manual repro, you often close consent popups by hand and so bypass the real failure path. Start every repro with a clean session so adapters face the exact gates users do. Make this the default locally and in CI so failures are observable, not masked.

  • Launch test browsers in incognito with storage and cache cleared
  • Add a preflight that detects and dismisses consent/popups before flows
  • Assert “no popups present” after login; fail the run if present
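
A minimal sketch of the preflight and the post-login assertion, assuming a browser driver that exposes `is_visible(selector)` and `click(selector)`. The `Page` protocol, the `POPUPS` table, and every selector in it are hypothetical placeholders, not taken from any real site or from the original notes.

```python
from typing import Iterable, Protocol


class Page(Protocol):
    """Hypothetical minimal page interface; adapt it to your browser driver."""

    def is_visible(self, selector: str) -> bool: ...
    def click(self, selector: str) -> None: ...


# Hypothetical popup table: (selector of the popup, selector of its close button).
POPUPS = [
    ("#consent-banner", "#consent-banner .accept"),
    ("#newsletter-modal", "#newsletter-modal .close"),
]


def dismiss_popups(page: Page, popups: Iterable[tuple[str, str]] = POPUPS) -> list[str]:
    """Close every visible popup and return the selectors that were dismissed."""
    dismissed = []
    for popup, close_button in popups:
        if page.is_visible(popup):
            page.click(close_button)
            dismissed.append(popup)
    return dismissed


def assert_no_popups(page: Page, popups: Iterable[tuple[str, str]] = POPUPS) -> None:
    """Fail the run if any known popup is still on screen."""
    remaining = [popup for popup, _ in popups if page.is_visible(popup)]
    if remaining:
        raise AssertionError(f"popups still present: {remaining}")
```

Run `dismiss_popups` before each flow and `assert_no_popups` right after login. Logging the returned list shows which gates a fresh session actually hit.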

Make the Queue the Source of Truth

If workers complete jobs but the dashboard never records them, you have luck, not orchestration. Treat events as the ledger: every unit of work must emit durable status so visibility, retries, and SLAs rely on facts, not logs.

  • Emit start, heartbeat, and terminal events per job; treat missing acks as failures
  • Separate broker from result backend; power dashboards from the result store
  • Block deploys if telemetry coverage for queued tasks is below 100%
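
One way to read status from events alone, as a sketch: the event names, the 30-second heartbeat timeout, and the `JobEvent` shape are assumptions for illustration. A real system would load these records from the result store rather than from an in-memory list.

```python
from dataclasses import dataclass
from enum import Enum


class EventType(Enum):
    START = "start"
    HEARTBEAT = "heartbeat"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


TERMINAL = {EventType.SUCCEEDED, EventType.FAILED}


@dataclass(frozen=True)
class JobEvent:
    job_id: str
    type: EventType
    at: float  # seconds since epoch


def job_status(events: list[JobEvent], now: float, heartbeat_timeout: float = 30.0) -> str:
    """Derive one job's status from its events; prolonged silence counts as failure."""
    if not events:
        return "missing"  # queued work that never emitted anything
    ordered = sorted(events, key=lambda e: e.at)
    for event in ordered:
        if event.type in TERMINAL:
            return event.type.value
    # No terminal event yet: the job is alive only if it was heard from recently.
    if now - ordered[-1].at > heartbeat_timeout:
        return "failed"
    return "running"


def telemetry_coverage(queued_job_ids: set[str], events: list[JobEvent]) -> float:
    """Fraction of queued jobs that emitted at least one event."""
    if not queued_job_ids:
        return 1.0
    seen = {event.job_id for event in events}
    return len(queued_job_ids & seen) / len(queued_job_ids)
```

A deploy gate could then refuse to ship when `telemetry_coverage(...)` is below `1.0`, and the dashboard could render `job_status` directly instead of parsing logs.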

Centralize Run Artifacts Early

Scattered logs and screenshots turn debugging into folklore. Use a shared object store and tie artifacts to job IDs so anyone can reconstruct a run in minutes and postmortems move from guesswork to evidence.

  • Standardize paths: org/env/jobID/timestamp/*
  • Attach artifact URIs to job records and surface them in the UI
  • Set retention by severity; keep failures for 30–90 days
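
The path convention above can be sketched as a small helper. The UTC timestamp format, the severity names, and the retention values other than the 30 to 90 day failure range are assumptions, and the object store itself is left out.

```python
from datetime import datetime, timezone

# Hypothetical retention table; only the failure range (30 to 90 days) comes from the notes.
RETENTION_DAYS = {"failure": 90, "warning": 30, "success": 7}


def artifact_prefix(org: str, env: str, job_id: str, started_at: datetime) -> str:
    """Build the org/env/jobID/timestamp/ prefix under which a run's artifacts live."""
    for segment in (org, env, job_id):
        if not segment or "/" in segment:
            raise ValueError(f"invalid path segment: {segment!r}")
    if started_at.tzinfo is None:
        raise ValueError("started_at must be timezone-aware")
    # Normalize to UTC so prefixes sort chronologically regardless of worker locale.
    timestamp = started_at.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{org}/{env}/{job_id}/{timestamp}/"


def retention_days(severity: str) -> int:
    """Look up how long artifacts of a given severity are kept."""
    return RETENTION_DAYS[severity]
```

Storing the returned prefix on the job record is enough for the UI to list every screenshot and log for that run.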

Optimize Where It Runs, Not Where You Code

Local timings rarely predict server reality. Optimize adapters with server-side benchmarks and guardrails, and only celebrate wins that move production latency, not laptop microbenchmarks.

  • Add server performance CI with per-adapter SLAs
  • Alert on regressions greater than 10% versus the last green baseline
  • Prioritize hot paths; ignore improvements that don’t shift server latency
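
A sketch of the regression check, assuming per-adapter latencies (for example p95 in milliseconds) have already been collected on the server for the last green baseline and for the current build. The function and parameter names are illustrative.

```python
def regressions(
    baseline_ms: dict[str, float],
    current_ms: dict[str, float],
    threshold: float = 0.10,
) -> dict[str, float]:
    """Return adapters whose latency grew by more than `threshold` versus baseline.

    Values in the result are relative changes, e.g. 0.15 for a 15% slowdown.
    """
    flagged = {}
    for adapter, base in baseline_ms.items():
        current = current_ms.get(adapter)
        if current is None or base <= 0:
            continue  # no comparable measurement for this adapter
        change = (current - base) / base
        if change > threshold:
            flagged[adapter] = change
    return flagged
```

CI can fail the build whenever the returned dict is non-empty. Feeding it laptop timings would defeat the point, so both inputs should come from the same server environment.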

Treat Consent/UI Interrupts as First-Class Requirements

OEM and retail sites mutate UI with banners and one-off modals. Handle interrupts as a shared dependency, not bespoke fixes, so resilience improves with each new variant.

  • Maintain a shared interrupts library (selectors, close actions, timeouts)
  • Version it and roll updates across adapters via a single dependency
  • Track an interrupt miss rate and drive it toward zero
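
As a sketch, the shared library can be a versioned table of interrupt definitions plus a counter for the miss rate. The version string, the entries, and their selectors are hypothetical; in practice this would ship as one package that every adapter pins.

```python
from dataclasses import dataclass

LIBRARY_VERSION = "1.4.0"  # hypothetical; bump whenever a selector or action changes


@dataclass(frozen=True)
class Interrupt:
    name: str
    selector: str        # how to detect the banner or modal
    close_selector: str  # what to click to dismiss it
    timeout_s: float = 2.0


# Hypothetical entries; real ones accumulate as new variants are observed.
INTERRUPTS = (
    Interrupt("cookie-consent", "#consent-banner", "#consent-banner .accept"),
    Interrupt("promo-modal", ".promo-modal", ".promo-modal .close", timeout_s=5.0),
)


@dataclass
class InterruptStats:
    handled: int = 0
    missed: int = 0

    def record(self, handled: bool) -> None:
        """Count one interrupt encounter as handled or missed."""
        if handled:
            self.handled += 1
        else:
            self.missed += 1

    @property
    def miss_rate(self) -> float:
        """Share of encountered interrupts that no library entry could dismiss."""
        total = self.handled + self.missed
        return self.missed / total if total else 0.0
```

Each miss is a candidate for a new `INTERRUPTS` entry and a version bump, which is how the rate moves toward zero across all adapters at once.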