Field Notes - Nov 24, '25

Executive Signals

Events are the new logs: orchestration truth lives in durable job events
Hardware locality matters: optimize where it runs, not where you code
Interrupts are product, not bugs: treat banners and modals as versioned dependencies
Fresh sessions find truth: reset state or E2E tests mislead every release
Artifacts or it didn’t happen: centralized run evidence shortens postmortems and review cycles

Engineering

Reset State or Your E2E Tests Lie

Manual repros often dismiss consent popups by hand and so bypass the real failure path. Start every repro with a clean session so adapters face the exact gates users do. Make this the default locally and in CI so failures are observable, not masked.

Launch test browsers in incognito with storage and cache cleared
Add a preflight that detects and dismisses consent/popups before flows
Assert “no popups present” after login; fail the run if present
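
The preflight and the post-login assertion above can be sketched as follows. This is a minimal illustration, not a real driver integration: the `Page` interface, the selectors in `POPUPS`, and the `FakePage` stand-in are all hypothetical, and a real suite would map `is_visible` and `click` onto whatever browser automation tool it already uses.

```python
from dataclasses import dataclass, field
from typing import List, Protocol, Set


class Page(Protocol):
    """Hypothetical minimal page interface; adapt it to your browser driver."""

    def is_visible(self, selector: str) -> bool: ...
    def click(self, selector: str) -> None: ...


# Hypothetical popup selectors mapped to the selector of their close control.
POPUPS = {
    "#consent-banner": "#consent-accept",
    "#promo-modal": "#promo-close",
}


def dismiss_popups(page: Page) -> List[str]:
    """Preflight: close every known popup that is visible; return what was closed."""
    closed = []
    for popup, close_button in POPUPS.items():
        if page.is_visible(popup):
            page.click(close_button)
            closed.append(popup)
    return closed


def assert_no_popups(page: Page) -> None:
    """Post-login gate: fail the run if any known popup is still on screen."""
    remaining = [popup for popup in POPUPS if page.is_visible(popup)]
    if remaining:
        raise AssertionError(f"popups still present: {remaining}")


@dataclass
class FakePage:
    """In-memory stand-in for a fresh session, used only for this demo."""

    visible: Set[str] = field(default_factory=lambda: set(POPUPS))

    def is_visible(self, selector: str) -> bool:
        return selector in self.visible

    def click(self, selector: str) -> None:
        # Clicking a close control hides the popup it belongs to.
        for popup, close_button in POPUPS.items():
            if close_button == selector:
                self.visible.discard(popup)
```

A flow would call `dismiss_popups(page)` before its first step and `assert_no_popups(page)` right after login, so a new, unhandled popup fails the run loudly instead of being clicked away by hand.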

Make the Queue the Source of Truth

If workers complete jobs but the dashboard never records them, you have luck, not orchestration. Treat events as the ledger: every unit of work must emit durable status so visibility, retries, and SLAs rely on facts, not logs.

Emit start, heartbeat, and terminal events per job; treat missing acks as failures
Separate broker from result backend; power dashboards from the result store
Block deploys if telemetry coverage for queued tasks is not 100%
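
One way to picture the event ledger is the sketch below. It assumes an in-memory `JobLedger` standing in for a durable result store, a hypothetical 30-second heartbeat timeout, and made-up event names (`start`, `heartbeat`, `succeeded`, `failed`); none of these come from a specific queue library.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical names for terminal events.
TERMINAL = {"succeeded", "failed"}


@dataclass
class JobLedger:
    """In-memory stand-in for a durable result store (assumption for the demo)."""

    heartbeat_timeout: float = 30.0
    events: Dict[str, List[Tuple[str, float]]] = field(default_factory=dict)

    def emit(self, job_id: str, kind: str, now: Optional[float] = None) -> None:
        """Record a start, heartbeat, or terminal event for a job."""
        ts = time.time() if now is None else now
        self.events.setdefault(job_id, []).append((kind, ts))

    def status(self, job_id: str, now: Optional[float] = None) -> str:
        """Derive status from events only; silence past the timeout is a failure."""
        ts = time.time() if now is None else now
        history = self.events.get(job_id, [])
        if not history:
            return "unknown"
        last_kind, last_ts = history[-1]
        if last_kind in TERMINAL:
            return last_kind
        if ts - last_ts > self.heartbeat_timeout:
            return "failed"  # missing ack: no heartbeat or terminal event in time
        return "running"

    def coverage(self, queued_job_ids: List[str]) -> float:
        """Fraction of queued jobs that emitted at least one event."""
        if not queued_job_ids:
            return 1.0
        seen = sum(1 for job_id in queued_job_ids if job_id in self.events)
        return seen / len(queued_job_ids)
```

A deploy gate could then compare `ledger.coverage(queued_ids)` against `1.0` and block the release when any queued task never reported in.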

Centralize Run Artifacts Early

Scattered logs and screenshots turn debugging into folklore. Use a shared object store and tie artifacts to job IDs so anyone can reconstruct a run in minutes and postmortems move from guesswork to evidence.

Standardize paths: org/env/jobID/timestamp/*
Attach artifact URIs to job records and surface them in the UI
Set retention by severity; keep failures for 30–90 days
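
The path convention and retention rule above might look like this in code. Treat it as a sketch: the `s3://` scheme, the bucket name, the timestamp format, and the retention values for non-failure runs are assumptions chosen for illustration; only the org/env/jobID/timestamp layout and the 30–90 day window for failures come from the notes.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention policy in days; failures sit at the top of the 30-90 range.
RETENTION_DAYS = {"failure": 90, "warning": 30, "success": 7}


def artifact_prefix(org: str, env: str, job_id: str, started_at: datetime) -> str:
    """Build the org/env/jobID/timestamp/ prefix for one run (UTC timestamp)."""
    stamp = started_at.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{org}/{env}/{job_id}/{stamp}/"


def artifact_uri(bucket: str, prefix: str, name: str) -> str:
    """Full URI to store on the job record; the s3:// scheme is an assumption."""
    return f"s3://{bucket}/{prefix}{name}"


def expires_at(started_at: datetime, severity: str) -> datetime:
    """Deletion date for a run's artifacts, driven by the run's severity."""
    return started_at + timedelta(days=RETENTION_DAYS[severity])
```

Because the prefix is derived only from fields already on the job record, the UI can rebuild or display artifact links without a separate lookup table.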

Optimize Where It Runs, Not Where You Code

Local timings rarely predict server reality. Optimize adapters with server-side benchmarks and guardrails, and only celebrate wins that move production latency, not laptop microbenchmarks.

Add server performance CI with per-adapter SLAs
Alert on regressions greater than 10% versus the last green baseline
Prioritize hot paths; ignore improvements that don’t shift server latency
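
The 10% regression alert reduces to a small comparison against the last green baseline. The sketch below assumes per-adapter latencies in milliseconds, measured on the server, and uses hypothetical adapter names in its usage example; how the baseline is stored and which percentile is measured are left open.

```python
from typing import Dict


def regression_ratio(baseline_ms: float, current_ms: float) -> float:
    """Relative change versus baseline; 0.10 means 10% slower."""
    if baseline_ms <= 0:
        raise ValueError("baseline must be positive")
    return (current_ms - baseline_ms) / baseline_ms


def find_regressions(
    baseline: Dict[str, float],
    current: Dict[str, float],
    threshold: float = 0.10,
) -> Dict[str, float]:
    """Return adapters whose latency regressed by more than the threshold."""
    regressions = {}
    for adapter, base_ms in baseline.items():
        if adapter not in current:
            continue  # no fresh measurement for this adapter; nothing to compare
        ratio = regression_ratio(base_ms, current[adapter])
        if ratio > threshold:
            regressions[adapter] = ratio
    return regressions
```

For example, `find_regressions({"dealer_a": 100.0, "dealer_b": 200.0}, {"dealer_a": 111.0, "dealer_b": 210.0})` flags only `dealer_a`, since `dealer_b` moved by 5%.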

Treat Consent/UI Interrupts as First-Class Requirements

OEM and retail sites mutate UI with banners and one-off modals. Handle interrupts as a shared dependency, not bespoke fixes, so resilience improves with each new variant.

Maintain a shared interrupts library (selectors, close actions, timeouts)
Version it and roll updates across adapters via a single dependency
Track an interrupt miss rate and drive it toward zero
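
A shared interrupts library can be as small as a versioned registry plus one metric. In the sketch below, the version string, the `Interrupt` fields, and every selector are hypothetical placeholders; the point is that adapters import one registry instead of carrying their own fixes.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Hypothetical version; bump it whenever a selector or close action changes.
INTERRUPTS_VERSION = "1.2.0"


@dataclass(frozen=True)
class Interrupt:
    name: str
    selector: str        # how to detect the banner or modal
    close_selector: str  # what to click to dismiss it
    timeout_s: float     # how long to wait for it to disappear


# Hypothetical entries; real selectors come from the sites being automated.
REGISTRY = (
    Interrupt("cookie-consent", "#consent-banner", "#consent-accept", 2.0),
    Interrupt("newsletter-modal", ".newsletter-modal", ".newsletter-modal .close", 1.0),
)


def match_interrupt(visible_selectors: Iterable[str]) -> Optional[Interrupt]:
    """Return the first registered interrupt whose selector is visible, if any."""
    visible = set(visible_selectors)
    for interrupt in REGISTRY:
        if interrupt.selector in visible:
            return interrupt
    return None


def miss_rate(handled: int, missed: int) -> float:
    """Share of encountered interrupts that no registry entry handled."""
    total = handled + missed
    return missed / total if total else 0.0
```

Each miss is a candidate for a new registry entry, so the miss rate doubles as a backlog signal for the next library release.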
