Testing Strategy

Beacon’s testing strategy is built on three principles: test behavior not implementation, explicitly test concurrency, and treat every data integrity invariant as a first-class test case.

Framework: Vitest + @cloudflare/vitest-pool-workers (real workerd runtime for all integration and E2E tests).

See testing for local test setup and CI configuration.

Coverage Model

Layer	Status
Pure functions (parsing, hashing, transforms)	Covered
AI pipeline quota + cache invalidation	Covered — quota cache, bypass round-trip, `setMediaQuotaBypass` invalidation
Error recovery (`isRecoverableErrorMessage`, retry counter)	Covered — 43 parametrized unit tests
Deduplication (intra-batch + concurrent)	Covered — `seenInBatch` set, `INSERT OR IGNORE` idempotency
Pillar classifier (keyword + AI)	Covered
Queue consumer (all message types)	Covered — `backfill_concept_pillars`, `regenerate_weekly`, `media_analysis`, standard export
Cron scheduled handler	Covered — dual-schedule routing, stuck export detection, parallel sub-tasks
Public API cursor pagination	Covered — no-overlap pages, null cursor on final page, offset vs cursor response shapes
Destructive operations	Covered — `/clear` scope, `/replay/all` missing R2 skip, `/replay/export` 4xx
Data integrity (multi-source, media dedup)	Covered — DI-1/3/4/5/6/8; DI-2 (days_processed ≠ digests) not yet written
Failure scenarios	Covered — F2/F4/F6/F8; F1 (quota mid-batch partial days) not yet written
Performance paths	Gap — staging only; not in CI

1. Unit Tests

Pure functions, transformations, and state machines with no I/O.

Priority gaps:

Function	File	What to assert
`assignPillarsKeywordOnly(graph)`	`lib/concepts/pillar_classifier.ts`	Each node gets keyword pillar or null; no AI called
`assignPillarsToGraph()`	same	Nodes updated; keyword fallback used when AI throws
`keywordFallback(label)`	same	All 6 pillars matched by representative keywords
`isRecoverableErrorMessage(msg)`	`lib/retry/self-healing.ts`	True for quota/rate/timeout/D1_ERROR; false for auth/schema
`canAutoRetry(msg, max)`	same	Respects MAX_ATTEMPTS=3; false on 4th attempt
`nextAutoRetryError(msg)`	same	Counter increments correctly
`canMakeAICall(db, 'daily')`	`lib/ai/quota.ts`	False at limit; true with bypass; respects neuron cap
`computeContentHash(content)`	`lib/whatsapp/dedupe.ts`	Deterministic; NFKC-normalized input produces same hash
`invalidateMediaConfigCache()`	`lib/media/quota.ts`	Next read re-fetches from D1
`setMediaQuotaBypass()`	`lib/media/quota.ts`	Writes to D1; immediately invalidates in-memory cache
Cursor pagination `next_cursor`	`apps/pulse-public/src/index.ts`	Equals last `week_start` when `results.length == limit`; null otherwise

2. Integration Tests

Multi-component flows using real D1/R2/Queue bindings via cloudflare:test.

Queue Consumer — All Message Types

Message type	Key assertions
`backfill_concept_pillars`	Loads graph from D1; runs AI pillar assignment; saves; no-op if graph missing
`regenerate_weekly`	Correct source_id in `weekly_summaries_public`; quota exhaustion calls `message.retry()`
`regenerate_weekly_all`	All sources processed; parallelizes up to SOURCE_CONCURRENCY=3
`media_analysis`	Only processes `pending_analysis`; skips other statuses; sets `analyzed_at`
`media_deferred_batch`	Resumes up to quota limit; correct status transitions
Standard export	Full path: parse → dedup → media artifacts → daily digests → status=completed

Cron Scheduled Handler

Two separate schedules must route correctly:

Cron	Tasks invoked	Tasks NOT invoked
`/5 * * *`	finalizeCompleted, retryFailed, retryMedia, resetDeferred, requeueStale	enqueueDeferredBatches, cleanupRetained
`/15 * * *`	enqueueDeferredBatches, cleanupRetained	All five high-priority tasks

Additional assertions:

Export with status=processing, days_processed < days_total, updated_at > 10 min ago is re-enqueued by */5
Export updated 9 min ago is NOT re-enqueued (threshold respected)
If one sub-task throws, the scheduled handler still returns without propagating

Quota Round-Trip

Fill daily_calls to limit → next canMakeAICall returns false
Set bypass=true → calls proceed regardless of limit
setMediaQuotaBypass(db, true) then getMediaQuotaStatus(db) in same invocation → bypassEnabled=true (cache invalidation)

3. End-to-End Tests

Complete admin workflows using SELF.fetch() against the real Worker.

E2E-1: Full Export Lifecycle Upload → queue → assert status transitions (queued → processing → weekly_processing → completed) → assert daily_digests and weekly_summaries_public created → GET /pulse/history.json returns the week.

E2E-2: Quota Exhaustion + Recovery Fill quota → upload → assert re-enqueued with delay → POST /quota/bypass → GET /quota/status returns bypass_enabled=true immediately → process again → assert completed.

E2E-3: Replay After Failure Create export with status=failed → POST /replay/export → assert re-enqueued → consumer → assert status=completed.

E2E-4: Clear + Verify Scope Create digests for source A and source B on same date → POST /clear with source_id=A → assert source A deleted, source B untouched, overlapping weeks deleted for A only.

E2E-5: Cursor Pagination Insert 50 weekly_summaries_public rows → paginate with limit=20 and cursor → assert no overlap, no gap across all pages, final page returns next_cursor=null.

4. Concurrency & Parallelism Tests

The highest-risk gap. All parallel execution paths must have explicit idempotency assertions.

C1. Concurrent Deduplication Two queue messages for same community/source with overlapping messages processed via Promise.all. Assert: exactly N rows in message_hashes, not 2N. INSERT OR IGNORE must be idempotent.

C2. Parallel Image Stage 1 Atomicity reserveMediaUsage(db, 'imageStage1', 2) called once before Promise.all([classifier, detector]). Assert: exactly 2 units reserved; if quota allows 1 but not 2, the whole Stage 1 is rejected (no partial reservation).

C3. Video Frame Phase 1/Phase 2 Ordering Phase 1 (all frame extraction + Stage 1) runs in parallel. Phase 2 (Stage 2 caption decisions) is sequential. With 3 frames and maxStage2=2: assert exactly 2 Stage 2 calls (cap honored despite parallel Phase 1); stage2Used counter correct.

C4. Weekly Regen SOURCE_CONCURRENCY=3 5-source community → regenerateWeeklySummariesForCommunity. Assert: all 5 sources processed; no cross-source collision (source A summary doesn’t overwrite source B); weeks within each source sequential.

C5. Parallel Cron Sub-Tasks Spy all 5 high-priority tasks. Run scheduled({ cron: '*/5 * * * *' }). Assert: all 5 invoked; start times overlap (not sequential); one throwing doesn’t prevent others from completing.

C6. Deferred Media Batch Limit 20 exports with deferred media → enqueueDeferredMediaBatches() → assert exactly 12 media_deferred_batch messages enqueued (LIMIT=12 respected).

5. Failure Scenarios

F1. Quota Exhaustion Mid-Batch Export with 10 days; quota allows only 3. Assert: days 1–3 processed, days_processed=3, export re-enqueued with delay secondsUntilReset + 300, status=processing (not failed).

F2. AI Schema Validation Failure Mock AI returns malformed JSON. Assert: JSON repair attempted (guards.ts); fallback neutral digest returned on repair failure; no crash; export continues to next day.

F3. R2 Object Missing on Replay Delete R2 object; POST /replay/export. Assert: HTTP 4xx returned; export status unchanged.

F4. D1 too-many-SQL-variables Mock D1 to throw on INSERT batch. Assert: retried with 30s × 2^attempt delay (capped 120s); attempt counter incremented in error_message.

F5. Error Classification Parametrized test across error types: quota/rate/D1_ERROR/network → isRecoverableErrorMessage()=true → retry. Auth/schema → false → status=failed.

F6. Max Retry Exhaustion Force recoverable error 4 times (> AUTO_RETRY_MAX_ATTEMPTS=3). Assert: 4th failure → status=failed permanently; retryRecoverableFailedExports() skips at max.

F7. Backfill Pillars — Graph Not Found Enqueue backfill_concept_pillars for non-existent week. Assert: warning logged; returns; no D1 write.

6. Data Integrity Tests

Invariants that, if violated, cause silent data corruption.

DI-1. Multi-Source Collision Prevention Insert digest for (community, sourceA, 2026-01-01) and (community, sourceB, 2026-01-01). Assert: both rows exist. The PK (community_id, source_id, day_date) added in fix_daily_digests_weekly_summaries_pk.sql must hold.

DI-2. days_processed ≠ digests exist Set days_processed = days_total; delete daily_digests rows for 2 days. Assert: pipeline re-processes the 2 missing days (queries daily_digests directly, not just the counter).

DI-3. Media Dedup Cross-Community Prevention Analyzed media with content_hash X in community A. New export in community B with same content_hash. Assert: batchFindExistingAnalysisByContentHash does NOT return community A’s result; community B enqueues its own analysis job.

DI-4. Intra-Batch Message Dedup Export with 5 duplicate messages on same day. Assert: message_hashes contains exactly 1 row; seenInBatch set prevents double-INSERT.

DI-5. Weekly Summary Thresholds < 3 days OR < 20 messages → no weekly_summaries_public row created; existing row deleted if re-processing drops below threshold.

DI-6. Raw Export Cleanup — Dual Deletion Run cleanupRetainedRawExports(). Assert: R2 object deleted AND exports.raw_deleted_at set; exports.status unchanged.

DI-7. Async Pillar Backfill — Null Handled Gracefully Public API queried before backfill_concept_pillars runs. Assert: response has pillar fields (keyword-assigned or null); no crash. After backfill: pillars are AI-quality.

DI-8. COALESCE NULL source_id Insert with source_id=null twice. Assert: ON CONFLICT updates (doesn’t create duplicate row). COALESCE(source_id, '') = COALESCE(?, '') queries correct across all tables.

7. Destructive Action Tests

DA-1. /clear Scope

Scenario	Assert
source_id=A	Only source A records deleted; source B untouched
source_id=null	Only community-wide records deleted
`clear_hashes=false`	message_hashes untouched
`clear_hashes=true`	message_hashes deleted in date range
start_date=end_date	Only that exact date cleared
Missing community_id	400 error; no deletion

DA-2. /replay/all — Missing R2 5 exports; 2 have R2 objects deleted. Assert: 3 enqueued; 2 skipped; response includes skipped_missing_raw_exports: 2.

DA-3. /replay/stuck + Cron Idempotency Both trigger concurrently. Assert: export enqueued at most twice; second processing run produces same final state (idempotent).

DA-4. Community DELETE Cascade DELETE /communities/{id}. Assert: all D1 tables zero rows for that community; R2 objects deleted.

8. Performance Baselines

Run in staging or against large test fixtures. Purpose: catch regressions automatically in CI.

Test	Metric	Baseline
Parser throughput	10,000-message export	< 500ms
Dedup INSERT scale	1,000 messages × 7 days	< 2,000ms per 1,000 messages
Media dedup batch	100 artifacts, CHUNK_SIZE=24	≤ 5 D1 queries; < 100ms
Cursor vs offset (deep page)	offset=980 vs cursor equivalent	Cursor ≥ 50% faster
Parallel cron sub-tasks	7 tasks × 200ms each	Total < 400ms
Parallel image Stage 1	Classifier + detector	≤ 60% of sequential time
Weekly regen (5 sources × 10 weeks)	SOURCE_CONCURRENCY=3	≤ 40% of serial time

9. Test Infrastructure

D1 Isolation

Apply all migrations at test start via a shared applyMigrations(db) helper (packages/db/migrations/ in order). Each test file gets a fresh D1 instance from the workerd runtime.

Seed Factories

New helpers needed in test/helpers/:

buildExport(overrides), buildDailyDigest(overrides), buildMediaArtifact(overrides)
buildConceptGraph(nodeCount), buildWhatsAppZip(days)

AI Mock

Unit tests: lib/ai/mock.ts (existing)
Integration tests: vi.spyOn(env.AI, 'run') with configurable responses
Quota tests: mock AI to return neuron counts after N calls, then throw

Queue Mock

const enqueued: unknown[] = [];
vi.spyOn(env.beacon_pulse_uploads, 'send').mockImplementation(async (msg) => {
  enqueued.push(msg);
});

Cron Testing

// vitest-pool-workers exposes worker.scheduled()
await worker.scheduled({ cron: '*/5 * * * *' }, env, ctx);
await worker.scheduled({ cron: '*/15 * * * *' }, env, ctx);

10. Known Gaps

Gap	Why Hard	Mitigation
Cache API (`caches.default`)	Behaves differently in workerd test runtime	Spy on cache calls; validate in staging
True D1 concurrency races	workerd serializes D1 in test mode	Use `Promise.all` to verify idempotency; document production behavior
KV eventual consistency	60s TTL hard to test deterministically	`vi.useFakeTimers()` + mock KV
Real AI calls cost neurons	Can’t run live AI in CI	AI mock for all tests; staging validation only
Media frame extraction	`MEDIA` binding unavailable in test runtime	Mock `env.MEDIA`; validate video in staging
Queue DLQ	CF Queues has no configurable DLQ	Test retry exhaustion path; document that cron compensates

11. Implemented Test Files

All tests use Vitest + @cloudflare/vitest-pool-workers. 109 tests across 11 files as of the last update. Pre-existing failures in export-status.test.ts and several pulse-public tests are unrelated to this work.

pulse-ingest

File	Tests	What it covers
`test/unit/pillar-classifier.test.ts`	17	`assignPillarsKeywordOnly`, `assignPillarsToGraph`, `classifyConceptPillar` keyword fallback, AI cache dedup
`test/unit/quota-cache.test.ts`	9	`setMediaQuotaBypass` invalidates cache; `isMediaQuotaBypassEnabled` null/bool handling
`test/unit/error-recovery.test.ts`	43	`isRecoverableErrorMessage` (all recoverable + terminal types), `canAutoRetry` boundary conditions, `nextAutoRetryError` counter, `formatAutoRetryError`
`test/integration/cron.test.ts`	7	Dual-schedule routing, stuck export detection SQL, re-enqueue threshold, error isolation
`test/integration/queue-message-types.test.ts`	5	`backfill_concept_pillars` (load/AI/save, no-op), `regenerate_weekly` quota retry, mixed-type batch
`test/integration/data-integrity.test.ts`	7	DI-1 multi-source PK, DI-3 media dedup community scope, DI-4 intra-batch dedup, DI-5 weekly threshold, DI-6 cleanup guard + `raw_deleted_at`, DI-8 COALESCE
`test/integration/destructive.test.ts`	7	`/clear` 400 validation, source scoping, `clear_hashes` flag; `/replay/all` missing R2 skip + count; `/replay/export` 4xx
`test/integration/concurrency.test.ts`	6	C1 `seenInBatch` dedup, C2 atomic `reserveMediaUsage(2)`, C4 SOURCE_CONCURRENCY=3 no cross-source collision, C5 parallel cron tasks, C6 deferred batch LIMIT=12
`test/integration/failure-scenarios.test.ts`	8	F2 schema failure no crash, F4 too-many-SQL-variables exponential backoff, F6 max retry exhaustion (skip vs re-enqueue), F8 missing R2 on media_analysis

pulse-public

File	Tests	What it covers
`test/integration/pagination.test.ts`	7	Cursor `next_cursor` correctness, no page overlap, null on final page, offset vs cursor response shape, 404 on empty

Beacon Platform Docs

Explorer

Testing Strategy

Coverage Model

1. Unit Tests

2. Integration Tests

Queue Consumer — All Message Types

Cron Scheduled Handler

Quota Round-Trip

3. End-to-End Tests

4. Concurrency & Parallelism Tests

5. Failure Scenarios

6. Data Integrity Tests

7. Destructive Action Tests

8. Performance Baselines

9. Test Infrastructure

D1 Isolation

Seed Factories

AI Mock

Queue Mock

Cron Testing

10. Known Gaps

11. Implemented Test Files

pulse-ingest

pulse-public

Graph View

Table of Contents

Backlinks

Beacon Platform Docs

Explorer

Testing Strategy

Coverage Model

1. Unit Tests

2. Integration Tests

Queue Consumer — All Message Types

Cron Scheduled Handler

Quota Round-Trip

3. End-to-End Tests

4. Concurrency & Parallelism Tests

5. Failure Scenarios

6. Data Integrity Tests

7. Destructive Action Tests

8. Performance Baselines

9. Test Infrastructure

D1 Isolation

Seed Factories

AI Mock

Queue Mock

Cron Testing

10. Known Gaps

11. Implemented Test Files

pulse-ingest

pulse-public

Related

Graph View

Table of Contents

Backlinks