Field Notes

Field Notes are fast, from-the-trenches observations. They are time-bound and may age poorly. Summarized from my real notes by agents. Optimized for utility. Not investment or legal advice.

░░░░░░░▄█▄▄▄█▄
▄▀░░░░▄▌─▄─▄─▐▄░░░░▀▄
█▄▄█░░▀▌─▀─▀─▐▀░░█▄▄█
░▐▌░░░░▀▀███▀▀░░░░▐▌
████░▄█████████▄░████
=======================
Field Note Clanker
=======================
⏺ Agent start
│
├── 1 data sources
└── Total 4.6k words
⏺ Spawning 1 Sub-Agents
│
├── GPT-5: Summarize → Web Search Hydrate
├── GPT-5-mini: Score (Originality, Relevance)
└── Return Good Notes
⏺ Field Note Agent
│
├── Sorted to 2 of 7 sections
├── Extracting 5 key signals
└── Posting Approval
⏺ Publishing
┌────────────────────────────────────────┐
│ Warning: Field notes are recursively │
│ summarized by agents. These likely age │
│ poorly. Exercise caution when reading. │
└────────────────────────────────────────┘

Field Notes - Jan 8, '26

Executive Signals

  • Batch beats bravado: deployment freezes during runs protect the money hour
  • Stateful beats stateless: retries as one job, logs and idempotency preserve truth
  • Calendar is capacity planning: capacity mirrors calendar spikes, not comforting long-run averages
  • Portals before parsing: source status where truth lives, email only when forced
  • AI for the long tail: triage evidence fast, humans own ambiguous outcomes

Customer Success

Make Upstream Gaps First-Class Issues

Many “failures” are upstream data gaps, not system faults. Classify them as automation-failed, attach a human-readable note, and avoid silent fallbacks. When a portal URL does not match the CRM record, fail the job and escalate to the issuer instead of selecting “Other,” which risks contaminating defaults. Fix the data once at the source, then re-run cleanly.

  • Standardize a fault taxonomy; auto-write CRM notes with the exact blocker
  • Auto-template issuer escalations with evidence and expected URL; track SLAs
  • Provide a one-click “Return to New” action to re-run cleanly after the data fix
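
A minimal sketch of the "fail and escalate" rule above, assuming the check is a comparison between the URL stored in the CRM and the URL the portal actually resolved to. The `Fault` enum, the `Outcome` fields, and the note wording are hypothetical names for illustration, not the real taxonomy.

```python
from dataclasses import dataclass
from enum import Enum
from urllib.parse import urlsplit


class Fault(Enum):
    # Hypothetical taxonomy; the real one would have more entries.
    NONE = "none"
    UPSTREAM_DATA_GAP = "automation-failed: upstream data gap"
    SYSTEM_FAULT = "automation-failed: system fault"


@dataclass
class Outcome:
    fault: Fault
    crm_note: str    # human-readable note to write back to the CRM
    escalate: bool   # True when the issuer must fix the data at the source


def _normalize(url: str) -> tuple:
    """Compare on host and path only: ignore scheme, host case, trailing slash."""
    parts = urlsplit(url.strip())
    return parts.netloc.lower(), parts.path.rstrip("/")


def check_portal_url(crm_url: str, observed_url: str) -> Outcome:
    """Fail loudly on a mismatch instead of falling back to a default option."""
    if _normalize(crm_url) == _normalize(observed_url):
        return Outcome(Fault.NONE, "Portal URL matches CRM record.", False)
    note = (
        f"Blocked: portal URL mismatch. CRM has {crm_url!r}, "
        f"portal resolved to {observed_url!r}. Escalated to issuer; "
        "no fallback option selected."
    )
    return Outcome(Fault.UPSTREAM_DATA_GAP, note, True)
```

The point of returning an `Outcome` rather than raising is that the blocker text travels with the job, so the CRM note and the issuer escalation can be generated from the same record.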

Status Tracking: Portal First, Email Last

Pull status from issuer portals wherever available; resort to inbound email parsing only when the issuer mandates it. Standardize on one inbound provider and one schema so retries, poison queues, and audits are consistent. Store raw messages and parsed artifacts, and publish a visible status SLO.

  • Pick a single inbound provider; define schema, retries, and a poison queue
  • Target 95% status updates reflected within 15 minutes of change
  • Archive raw messages and parsed artifacts for auditability
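
One way to make the 95%-within-15-minutes target measurable is sketched below. It assumes each status change is logged as a pair of timestamps (when the issuer changed the status, when our system reflected it); the function names and the event shape are hypothetical.

```python
from datetime import datetime, timedelta

SLO_WINDOW = timedelta(minutes=15)   # from the target above
SLO_TARGET = 0.95                    # from the target above


def slo_compliance(events):
    """Fraction of status changes reflected within SLO_WINDOW.

    events: iterable of (changed_at, reflected_at) datetime pairs.
    reflected_at is None when the update has not landed yet, which
    counts as a miss.
    """
    events = list(events)
    if not events:
        return 1.0  # nothing changed, nothing missed
    on_time = sum(
        1
        for changed_at, reflected_at in events
        if reflected_at is not None and reflected_at - changed_at <= SLO_WINDOW
    )
    return on_time / len(events)


def meets_slo(events):
    return slo_compliance(events) >= SLO_TARGET
```

Counting unreflected updates as misses is a deliberate choice in this sketch: a status that never arrives should hurt the number, not disappear from it.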

Engineering

Operate the Plant: Batch Windows and Calendar Bursts

Mid-run redeploys multiplied failures; freeze deploys during active batch windows, breaking glass only for genuine infra emergencies. Load spikes concentrate around the 5th, so plan capacity and processes around the calendar, not the average. Provide a scheduler kill switch plus pause/resume so operators can fix issues without compounding failures.

  • Enforce no-deploy windows around peaks with tooling and alerts
  • Scale workers 3–5x from the 3rd–7th; watch queue age and tail latency
  • Add scheduler pause/kill and post-run smoke tests before resuming
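
The calendar rules above could be encoded roughly as follows. This sketch assumes "the 3rd–7th" means days of the calendar month; the baseline worker count and the choice of 4x (inside the 3–5x range) are placeholder values, and `deploy_allowed` is a hypothetical gate a CI pipeline might call.

```python
from datetime import date

PEAK_DAYS = range(3, 8)   # the 3rd through the 7th, inclusive (assumed day-of-month)
BASE_WORKERS = 4          # hypothetical baseline
PEAK_MULTIPLIER = 4       # placeholder inside the 3-5x range


def is_peak(day: date) -> bool:
    return day.day in PEAK_DAYS


def target_workers(day: date) -> int:
    """Scale by the calendar, not by the long-run average."""
    return BASE_WORKERS * PEAK_MULTIPLIER if is_peak(day) else BASE_WORKERS


def deploy_allowed(day: date, batch_running: bool, break_glass: bool = False) -> bool:
    """Block deploys on peak days or during an active batch run.

    break_glass is the explicit operator override for genuine
    infra emergencies.
    """
    if break_glass:
        return True
    return not (batch_running or is_peak(day))
```

Keeping the override as an explicit argument means every break-glass deploy leaves a trace at the call site, which is easier to alert on than a disabled check.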

Stateful Retries and Atomic Deletes

Chained retries hid context and dropped arguments. Collapse them into a single self-restarting job that preserves its inputs and appends per-attempt logs; retries re-queue at the tail of the queue to reduce contention. Deletion should be per-job and atomic: soft-delete with a tombstone and reason, prevent cross-attempt cascades, and allow a short undelete window for operator error.

  • Persist attempt count, timestamps, stdout/stderr; backoff with a hard cap
  • Keep stable job IDs for idempotency; block orphaned children on delete
  • Implement soft-delete with audit trail and a short undelete window
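
A rough sketch of a single stateful job record covering the points above: one stable ID, inputs preserved across attempts, an append-only attempt log, capped exponential backoff, and a soft-delete with an undelete window. All constants and field names are hypothetical, and queueing (re-queue at the tail) is left out.

```python
import time
from dataclasses import dataclass, field
from typing import Optional

MAX_ATTEMPTS = 5          # hard cap on attempts (placeholder)
BASE_DELAY_S = 30         # placeholder
MAX_DELAY_S = 600         # placeholder cap on a single delay
UNDELETE_WINDOW_S = 900   # placeholder "short" undelete window


def backoff_delay(attempt: int) -> int:
    """Exponential backoff: 30s, 60s, 120s, ... capped at MAX_DELAY_S."""
    return min(BASE_DELAY_S * 2 ** (attempt - 1), MAX_DELAY_S)


@dataclass
class Job:
    job_id: str                                    # stable across attempts; idempotency key
    inputs: dict                                   # never rewritten between attempts
    attempts: list = field(default_factory=list)   # append-only per-attempt log
    audit: list = field(default_factory=list)      # delete/undelete trail
    deleted_at: Optional[float] = None             # tombstone timestamp

    def record_attempt(self, ok, stdout="", stderr="", now=None):
        """Log this attempt; return seconds until the next retry, or None if done."""
        now = time.time() if now is None else now
        self.attempts.append(
            {"n": len(self.attempts) + 1, "at": now, "ok": ok,
             "stdout": stdout, "stderr": stderr}
        )
        if ok or len(self.attempts) >= MAX_ATTEMPTS:
            return None
        return backoff_delay(len(self.attempts))

    def soft_delete(self, reason, now=None):
        now = time.time() if now is None else now
        self.deleted_at = now
        self.audit.append(("deleted", now, reason))

    def undelete(self, now=None):
        """Restore only within the window; return whether it succeeded."""
        now = time.time() if now is None else now
        if self.deleted_at is None or now - self.deleted_at > UNDELETE_WINDOW_S:
            return False
        self.deleted_at = None
        self.audit.append(("undeleted", now, ""))
        return True
```

Because every attempt lives on the same record, a delete touches exactly one job and there are no chained children to orphan.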

AI-Assisted QA for Evidence Review

Bundling queue JSON and screenshot URLs into an LLM prompt produced fast pass/fail labels and reconciled results against the tracking sheet. Use it to clear the long tail while maintaining human oversight. Treat model output as triage, not ground truth, especially where stakes are high.

  • Automate export and store prompt, inputs, and outputs with each job
  • Human spot-check 5–10% and all ambiguous cases
  • Escalate deviations; never treat the model’s label as final truth
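
The review routing above might look something like this. It assumes the model returns a plain label string and that the tracking sheet holds a comparable "pass"/"fail" status; both are assumptions, as is the hash-based sampling, which is one way to get a stable spot-check sample rather than the method actually used.

```python
import hashlib

SPOT_CHECK_RATE = 0.10   # upper end of the 5-10% range


def needs_human_review(job_id: str, label: str, sheet_status: str) -> bool:
    """Decide whether a human must look at this job.

    Routes to a human when the model label is ambiguous, when it
    deviates from the tracking sheet, or when the job falls into
    the spot-check sample.
    """
    if label not in ("pass", "fail"):
        return True   # ambiguous or malformed model output
    if label != sheet_status:
        return True   # deviation from the tracking sheet: escalate
    # Hash-based sampling: deterministic per job ID, so re-runs
    # select the same jobs for spot-checking.
    digest = hashlib.sha256(job_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < SPOT_CHECK_RATE
```

Sampling on a hash of the job ID instead of `random` keeps the spot-check set reproducible, which helps when auditing why a given job was or was not reviewed.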