
Field Notes

Field Notes are fast, from-the-trenches observations. Time-bound and may age poorly. Summarized from my real notes by . Optimized for utility. Not investment or legal advice.

Notebook background
░░░░░░░▄█▄▄▄█▄
▄▀░░░░▄▌─▄─▄─▐▄░░░░▀▄
█▄▄█░░▀▌─▀─▀─▐▀░░█▄▄█
░▐▌░░░░▀▀███▀▀░░░░▐▌
████░▄█████████▄░████
=======================
Field Note Clanker
=======================
⏺ Agent start
│
├── 1 data source
└── Total 3.7k words
⏺ Spawning 1 Sub-Agents
│
├── GPT-5: Summarize → Web Search Hydrate
├── GPT-5-mini: Score (Originality, Relevance)
└── Return Good Notes
⏺ Field Note Agent
│
├── Sorted to 2 of 7 sections
├── Extracting 4 key signals
└── Posting Approval
⏺ Publishing
┌────────────────────────────────────────┐
│ Warning: Field notes are recursively │
│ summarized by agents. These likely age │
│ poorly. Exercise caution when reading. │
└────────────────────────────────────────┘

Field Notes - Nov 05, '25

Executive Signals

  • Checkpoints beat long context: reboot model memory before cliff to cut fabrications
  • Dry-runs are production: simulate side effects to de-risk partner dependencies
  • Provenance before dashboards: tag automation at write, sample and alert
  • Windows are ops, not engineering: decouple calendar from correctness with daily dry-runs

Sales

Tag Automation In The CRM, Not In Notes

Provenance should live on the record, not buried in logs or personal docs. Add an immutable Automation Source field (Human | System) set only by the integration user. Route early System records to a review queue, then keep a lightweight sampling rhythm after burn-in. Alert when humans touch system-written records within 24 hours so operators can catch drift and misuse.

  • Lock the field from manual edits; populate via the integration profile
  • Build a “System” review queue; sample ~5% after stabilization
  • Alert on human edits to system-written records within 24 hours
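The drift alert above can be sketched as a scheduled check. This is a minimal sketch, assuming a hypothetical record shape where each CRM record carries the immutable `automation_source` field, plus `last_modified_by` / `last_modified_at` audit fields, and the integration writes as `integration_user`:

```python
from datetime import datetime, timedelta

def find_drift(records, now=None, window_hours=24):
    """Flag system-written records that a human edited within the window.

    Assumes each record is a dict with: automation_source ("Human" | "System"),
    last_modified_by (username), last_modified_at (datetime), and id.
    """
    now = now or datetime.utcnow()
    flagged = []
    for r in records:
        if (
            r["automation_source"] == "System"
            and r["last_modified_by"] != "integration_user"  # a human touched it
            and now - r["last_modified_at"] <= timedelta(hours=window_hours)
        ):
            flagged.append(r["id"])
    return flagged
```

Run it on a cron alongside the ~5% sampling queue; anything it returns goes straight to an operator channel.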

Engineering

Stop End‑of‑Context Hallucinations With Handoff Resets

LLMs start “finishing the test” near the 70-80% context cliff. Force a structured reset: ask the model to write a long, explicit handoff for “a junior dev with amnesia,” then start a fresh thread seeded only with that handoff. This reboots working memory and cuts fabricate-to-finish behavior. Treat each reset like a checkpoint and diff outputs before writes.

  • At 70-80% token budget, request an exhaustive handoff
  • New thread: paste handoff, restate task, require stepwise diffs
  • Compare outputs before committing writes
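The checkpoint loop can be sketched as a small wrapper. This is an illustrative sketch, not any vendor's API: `complete(thread, prompt)` and `start_thread(seed)` are hypothetical adapters around whatever LLM client is in use, and the 70% threshold matches the cliff described above:

```python
HANDOFF_PROMPT = (
    "Write an exhaustive handoff for a junior dev with amnesia: "
    "goal, decisions made, files touched, open questions, next steps."
)

def maybe_reset(thread, used_tokens, budget, complete, start_thread):
    """Checkpoint the conversation once token usage crosses the 70% cliff.

    complete(thread, prompt) -> str and start_thread(seed) -> new thread
    are assumed wrappers around the underlying LLM API.
    """
    if used_tokens / budget < 0.7:
        return thread  # still below the cliff; keep going
    handoff = complete(thread, HANDOFF_PROMPT)
    # Fresh thread seeded ONLY with the handoff, never the old history.
    return start_thread(seed=handoff)
```

Call it between steps; when it returns a new thread, restate the task and require stepwise diffs before any write lands.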

De‑risk External APIs With A Dry‑Run Track

Partner integrations shouldn’t gate progress. Ship a flag-gated “no‑submit” mode that renders artifacts but suppresses side effects. Run real data through this dry‑run while credentials and objects are provisioned. Hold the integration owner to a tight SLA; parallelize, escalate, or add a temporary manual bridge if they slip.

  • Ship “no‑submit” mode that produces artifacts for review
  • Run 5–10 representative one‑offs per week with cancel/ignore paths
  • Set a 48–72h SLA for setup; escalate or add a manual bridge if breached
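The flag-gated "no-submit" mode is a small branch at the side-effect boundary. A minimal sketch, assuming a hypothetical partner `client` with a `post` method and an illustrative `/v1/orders` endpoint:

```python
def submit_order(payload, client, dry_run=True):
    """Render the submission artifact; suppress the partner call when dry_run.

    The artifact is everything that WOULD be sent, so reviewers can inspect
    real data end-to-end while credentials are still being provisioned.
    """
    artifact = {"endpoint": "/v1/orders", "body": payload}
    if dry_run:
        return {"status": "dry-run", "artifact": artifact}
    return client.post(artifact["endpoint"], artifact["body"])
```

Default `dry_run=True` so the safe path is the one you get by accident; flipping it off is an explicit act once the partner's SLA is met.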

Decouple Build From The Business Window

Submission windows are an ops constraint, not an engineering one. Do daily dry‑runs and selective real submissions with a pathfinder cohort 10–14 days ahead so the “at‑scale” day is procedural, not a first run. Maintain kill switches and cancel protocols per brand or tenant. By window day, limit risk to volume and queuing, not correctness.

  • Start daily dry‑runs 10–14 days pre‑window with a pathfinder cohort
  • Maintain per‑tenant kill switches and cancel protocols
  • Reserve window‑day risk to throughput and queuing, not correctness
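The per-tenant kill switch reduces to a lookup checked before anything side-effecting runs. A sketch under assumed names (`KILL_SWITCHES`, `dispatch`, `run_job` are all hypothetical):

```python
KILL_SWITCHES = {}  # tenant_id -> bool; flipped by ops, not by deploys

def dispatch(tenant_id, jobs, run_job):
    """Run a tenant's queued jobs unless its kill switch is on.

    Returning an empty list (rather than raising) keeps window-day risk
    to throughput and queuing: halted tenants just drain nothing.
    """
    if KILL_SWITCHES.get(tenant_id, False):
        return []
    return [run_job(job) for job in jobs]
```

The key design choice is that the switch lives in ops-writable state, so cancelling a brand mid-window needs no engineering intervention.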