Beacon’s testing strategy is built on three principles: test behavior not implementation, explicitly test concurrency, and treat every data integrity invariant as a first-class test case.

Framework: Vitest + @cloudflare/vitest-pool-workers (real workerd runtime for all integration and E2E tests).

See testing for local test setup and CI configuration.


Coverage Model

LayerStatus
Pure functions (parsing, hashing, transforms)Covered
AI pipeline quota + cache invalidationCovered — quota cache, bypass round-trip, setMediaQuotaBypass invalidation
Error recovery (isRecoverableErrorMessage, retry counter)Covered — 43 parametrized unit tests
Deduplication (intra-batch + concurrent)Covered — seenInBatch set, INSERT OR IGNORE idempotency
Pillar classifier (keyword + AI)Covered
Queue consumer (all message types)Covered — backfill_concept_pillars, regenerate_weekly, media_analysis, standard export
Cron scheduled handlerCovered — dual-schedule routing, stuck export detection, parallel sub-tasks
Public API cursor paginationCovered — no-overlap pages, null cursor on final page, offset vs cursor response shapes
Destructive operationsCovered — /clear scope, /replay/all missing R2 skip, /replay/export 4xx
Data integrity (multi-source, media dedup)Covered — DI-1/3/4/5/6/8; DI-2 (days_processed ≠ digests) not yet written
Failure scenariosCovered — F2/F4/F6/F8; F1 (quota mid-batch partial days) not yet written
Performance pathsGap — staging only; not in CI

1. Unit Tests

Pure functions, transformations, and state machines with no I/O.

Priority gaps:

FunctionFileWhat to assert
assignPillarsKeywordOnly(graph)lib/concepts/pillar_classifier.tsEach node gets keyword pillar or null; no AI called
assignPillarsToGraph()sameNodes updated; keyword fallback used when AI throws
keywordFallback(label)sameAll 6 pillars matched by representative keywords
isRecoverableErrorMessage(msg)lib/retry/self-healing.tsTrue for quota/rate/timeout/D1_ERROR; false for auth/schema
canAutoRetry(msg, max)sameRespects MAX_ATTEMPTS=3; false on 4th attempt
nextAutoRetryError(msg)sameCounter increments correctly
canMakeAICall(db, 'daily')lib/ai/quota.tsFalse at limit; true with bypass; respects neuron cap
computeContentHash(content)lib/whatsapp/dedupe.tsDeterministic; NFKC-normalized input produces same hash
invalidateMediaConfigCache()lib/media/quota.tsNext read re-fetches from D1
setMediaQuotaBypass()lib/media/quota.tsWrites to D1; immediately invalidates in-memory cache
Cursor pagination next_cursorapps/pulse-public/src/index.tsEquals last week_start when results.length == limit; null otherwise

2. Integration Tests

Multi-component flows using real D1/R2/Queue bindings via cloudflare:test.

Queue Consumer — All Message Types

Message typeKey assertions
backfill_concept_pillarsLoads graph from D1; runs AI pillar assignment; saves; no-op if graph missing
regenerate_weeklyCorrect source_id in weekly_summaries_public; quota exhaustion calls message.retry()
regenerate_weekly_allAll sources processed; parallelizes up to SOURCE_CONCURRENCY=3
media_analysisOnly processes pending_analysis; skips other statuses; sets analyzed_at
media_deferred_batchResumes up to quota limit; correct status transitions
Standard exportFull path: parse → dedup → media artifacts → daily digests → status=completed

Cron Scheduled Handler

Two separate schedules must route correctly:

CronTasks invokedTasks NOT invoked
*/5 * * * *finalizeCompleted, retryFailed, retryMedia, resetDeferred, requeueStaleenqueueDeferredBatches, cleanupRetained
*/15 * * * *enqueueDeferredBatches, cleanupRetainedAll five high-priority tasks

Additional assertions:

  • Export with status=processing, days_processed < days_total, updated_at > 10 min ago is re-enqueued by */5
  • Export updated 9 min ago is NOT re-enqueued (threshold respected)
  • If one sub-task throws, the scheduled handler still returns without propagating

Quota Round-Trip

  • Fill daily_calls to limit → next canMakeAICall returns false
  • Set bypass=true → calls proceed regardless of limit
  • setMediaQuotaBypass(db, true) then getMediaQuotaStatus(db) in same invocation → bypassEnabled=true (cache invalidation)

3. End-to-End Tests

Complete admin workflows using SELF.fetch() against the real Worker.

E2E-1: Full Export Lifecycle Upload → queue → assert status transitions (queued → processing → weekly_processing → completed) → assert daily_digests and weekly_summaries_public created → GET /pulse/history.json returns the week.

E2E-2: Quota Exhaustion + Recovery Fill quota → upload → assert re-enqueued with delay → POST /quota/bypass → GET /quota/status returns bypass_enabled=true immediately → process again → assert completed.

E2E-3: Replay After Failure Create export with status=failed → POST /replay/export → assert re-enqueued → consumer → assert status=completed.

E2E-4: Clear + Verify Scope Create digests for source A and source B on same date → POST /clear with source_id=A → assert source A deleted, source B untouched, overlapping weeks deleted for A only.

E2E-5: Cursor Pagination Insert 50 weekly_summaries_public rows → paginate with limit=20 and cursor → assert no overlap, no gap across all pages, final page returns next_cursor=null.


4. Concurrency & Parallelism Tests

The highest-risk gap. All parallel execution paths must have explicit idempotency assertions.

C1. Concurrent Deduplication Two queue messages for same community/source with overlapping messages processed via Promise.all. Assert: exactly N rows in message_hashes, not 2N. INSERT OR IGNORE must be idempotent.

C2. Parallel Image Stage 1 Atomicity reserveMediaUsage(db, 'imageStage1', 2) called once before Promise.all([classifier, detector]). Assert: exactly 2 units reserved; if quota allows 1 but not 2, the whole Stage 1 is rejected (no partial reservation).

C3. Video Frame Phase 1/Phase 2 Ordering Phase 1 (all frame extraction + Stage 1) runs in parallel. Phase 2 (Stage 2 caption decisions) is sequential. With 3 frames and maxStage2=2: assert exactly 2 Stage 2 calls (cap honored despite parallel Phase 1); stage2Used counter correct.

C4. Weekly Regen SOURCE_CONCURRENCY=3 5-source community → regenerateWeeklySummariesForCommunity. Assert: all 5 sources processed; no cross-source collision (source A summary doesn’t overwrite source B); weeks within each source sequential.

C5. Parallel Cron Sub-Tasks Spy all 5 high-priority tasks. Run scheduled({ cron: '*/5 * * * *' }). Assert: all 5 invoked; start times overlap (not sequential); one throwing doesn’t prevent others from completing.

C6. Deferred Media Batch Limit 20 exports with deferred media → enqueueDeferredMediaBatches() → assert exactly 12 media_deferred_batch messages enqueued (LIMIT=12 respected).


5. Failure Scenarios

F1. Quota Exhaustion Mid-Batch Export with 10 days; quota allows only 3. Assert: days 1–3 processed, days_processed=3, export re-enqueued with delay secondsUntilReset + 300, status=processing (not failed).

F2. AI Schema Validation Failure Mock AI returns malformed JSON. Assert: JSON repair attempted (guards.ts); fallback neutral digest returned on repair failure; no crash; export continues to next day.

F3. R2 Object Missing on Replay Delete R2 object; POST /replay/export. Assert: HTTP 4xx returned; export status unchanged.

F4. D1 too-many-SQL-variables Mock D1 to throw on INSERT batch. Assert: retried with 30s × 2^attempt delay (capped 120s); attempt counter incremented in error_message.

F5. Error Classification Parametrized test across error types: quota/rate/D1_ERROR/network → isRecoverableErrorMessage()=true → retry. Auth/schema → false → status=failed.

F6. Max Retry Exhaustion Force recoverable error 4 times (> AUTO_RETRY_MAX_ATTEMPTS=3). Assert: 4th failure → status=failed permanently; retryRecoverableFailedExports() skips at max.

F7. Backfill Pillars — Graph Not Found Enqueue backfill_concept_pillars for non-existent week. Assert: warning logged; returns; no D1 write.


6. Data Integrity Tests

Invariants that, if violated, cause silent data corruption.

DI-1. Multi-Source Collision Prevention Insert digest for (community, sourceA, 2026-01-01) and (community, sourceB, 2026-01-01). Assert: both rows exist. The PK (community_id, source_id, day_date) added in fix_daily_digests_weekly_summaries_pk.sql must hold.

DI-2. days_processed ≠ digests exist Set days_processed = days_total; delete daily_digests rows for 2 days. Assert: pipeline re-processes the 2 missing days (queries daily_digests directly, not just the counter).

DI-3. Media Dedup Cross-Community Prevention Analyzed media with content_hash X in community A. New export in community B with same content_hash. Assert: batchFindExistingAnalysisByContentHash does NOT return community A’s result; community B enqueues its own analysis job.

DI-4. Intra-Batch Message Dedup Export with 5 duplicate messages on same day. Assert: message_hashes contains exactly 1 row; seenInBatch set prevents double-INSERT.

DI-5. Weekly Summary Thresholds < 3 days OR < 20 messages → no weekly_summaries_public row created; existing row deleted if re-processing drops below threshold.

DI-6. Raw Export Cleanup — Dual Deletion Run cleanupRetainedRawExports(). Assert: R2 object deleted AND exports.raw_deleted_at set; exports.status unchanged.

DI-7. Async Pillar Backfill — Null Handled Gracefully Public API queried before backfill_concept_pillars runs. Assert: response has pillar fields (keyword-assigned or null); no crash. After backfill: pillars are AI-quality.

DI-8. COALESCE NULL source_id Insert with source_id=null twice. Assert: ON CONFLICT updates (doesn’t create duplicate row). COALESCE(source_id, '') = COALESCE(?, '') queries correct across all tables.


7. Destructive Action Tests

DA-1. /clear Scope

ScenarioAssert
source_id=AOnly source A records deleted; source B untouched
source_id=nullOnly community-wide records deleted
clear_hashes=falsemessage_hashes untouched
clear_hashes=truemessage_hashes deleted in date range
start_date=end_dateOnly that exact date cleared
Missing community_id400 error; no deletion

DA-2. /replay/all — Missing R2 5 exports; 2 have R2 objects deleted. Assert: 3 enqueued; 2 skipped; response includes skipped_missing_raw_exports: 2.

DA-3. /replay/stuck + Cron Idempotency Both trigger concurrently. Assert: export enqueued at most twice; second processing run produces same final state (idempotent).

DA-4. Community DELETE Cascade DELETE /communities/{id}. Assert: all D1 tables zero rows for that community; R2 objects deleted.


8. Performance Baselines

Run in staging or against large test fixtures. Purpose: catch regressions automatically in CI.

TestMetricBaseline
Parser throughput10,000-message export< 500ms
Dedup INSERT scale1,000 messages × 7 days< 2,000ms per 1,000 messages
Media dedup batch100 artifacts, CHUNK_SIZE=24≤ 5 D1 queries; < 100ms
Cursor vs offset (deep page)offset=980 vs cursor equivalentCursor ≥ 50% faster
Parallel cron sub-tasks7 tasks × 200ms eachTotal < 400ms
Parallel image Stage 1Classifier + detector≤ 60% of sequential time
Weekly regen (5 sources × 10 weeks)SOURCE_CONCURRENCY=3≤ 40% of serial time

9. Test Infrastructure

D1 Isolation

Apply all migrations at test start via a shared applyMigrations(db) helper (packages/db/migrations/ in order). Each test file gets a fresh D1 instance from the workerd runtime.

Seed Factories

New helpers needed in test/helpers/:

  • buildExport(overrides), buildDailyDigest(overrides), buildMediaArtifact(overrides)
  • buildConceptGraph(nodeCount), buildWhatsAppZip(days)

AI Mock

  • Unit tests: lib/ai/mock.ts (existing)
  • Integration tests: vi.spyOn(env.AI, 'run') with configurable responses
  • Quota tests: mock AI to return neuron counts after N calls, then throw

Queue Mock

const enqueued: unknown[] = [];
vi.spyOn(env.beacon_pulse_uploads, 'send').mockImplementation(async (msg) => {
  enqueued.push(msg);
});

Cron Testing

// vitest-pool-workers exposes worker.scheduled()
await worker.scheduled({ cron: '*/5 * * * *' }, env, ctx);
await worker.scheduled({ cron: '*/15 * * * *' }, env, ctx);

10. Known Gaps

GapWhy HardMitigation
Cache API (caches.default)Behaves differently in workerd test runtimeSpy on cache calls; validate in staging
True D1 concurrency racesworkerd serializes D1 in test modeUse Promise.all to verify idempotency; document production behavior
KV eventual consistency60s TTL hard to test deterministicallyvi.useFakeTimers() + mock KV
Real AI calls cost neuronsCan’t run live AI in CIAI mock for all tests; staging validation only
Media frame extractionMEDIA binding unavailable in test runtimeMock env.MEDIA; validate video in staging
Queue DLQCF Queues has no configurable DLQTest retry exhaustion path; document that cron compensates

11. Implemented Test Files

All tests use Vitest + @cloudflare/vitest-pool-workers. 109 tests across 11 files as of the last update. Pre-existing failures in export-status.test.ts and several pulse-public tests are unrelated to this work.

pulse-ingest

FileTestsWhat it covers
test/unit/pillar-classifier.test.ts17assignPillarsKeywordOnly, assignPillarsToGraph, classifyConceptPillar keyword fallback, AI cache dedup
test/unit/quota-cache.test.ts9setMediaQuotaBypass invalidates cache; isMediaQuotaBypassEnabled null/bool handling
test/unit/error-recovery.test.ts43isRecoverableErrorMessage (all recoverable + terminal types), canAutoRetry boundary conditions, nextAutoRetryError counter, formatAutoRetryError
test/integration/cron.test.ts7Dual-schedule routing, stuck export detection SQL, re-enqueue threshold, error isolation
test/integration/queue-message-types.test.ts5backfill_concept_pillars (load/AI/save, no-op), regenerate_weekly quota retry, mixed-type batch
test/integration/data-integrity.test.ts7DI-1 multi-source PK, DI-3 media dedup community scope, DI-4 intra-batch dedup, DI-5 weekly threshold, DI-6 cleanup guard + raw_deleted_at, DI-8 COALESCE
test/integration/destructive.test.ts7/clear 400 validation, source scoping, clear_hashes flag; /replay/all missing R2 skip + count; /replay/export 4xx
test/integration/concurrency.test.ts6C1 seenInBatch dedup, C2 atomic reserveMediaUsage(2), C4 SOURCE_CONCURRENCY=3 no cross-source collision, C5 parallel cron tasks, C6 deferred batch LIMIT=12
test/integration/failure-scenarios.test.ts8F2 schema failure no crash, F4 too-many-SQL-variables exponential backoff, F6 max retry exhaustion (skip vs re-enqueue), F8 missing R2 on media_analysis

pulse-public

FileTestsWhat it covers
test/integration/pagination.test.ts7Cursor next_cursor correctness, no page overlap, null on final page, offset vs cursor response shape, 404 on empty