Beacon’s testing strategy is built on three principles: test behavior not implementation, explicitly test concurrency, and treat every data integrity invariant as a first-class test case.
Framework: Vitest + @cloudflare/vitest-pool-workers (real workerd runtime for all integration and E2E tests).
See testing for local test setup and CI configuration.
Coverage Model
| Layer | Status |
|---|---|
| Pure functions (parsing, hashing, transforms) | Covered |
| AI pipeline quota + cache invalidation | Covered — quota cache, bypass round-trip, setMediaQuotaBypass invalidation |
Error recovery (isRecoverableErrorMessage, retry counter) | Covered — 43 parametrized unit tests |
| Deduplication (intra-batch + concurrent) | Covered — seenInBatch set, INSERT OR IGNORE idempotency |
| Pillar classifier (keyword + AI) | Covered |
| Queue consumer (all message types) | Covered — backfill_concept_pillars, regenerate_weekly, media_analysis, standard export |
| Cron scheduled handler | Covered — dual-schedule routing, stuck export detection, parallel sub-tasks |
| Public API cursor pagination | Covered — no-overlap pages, null cursor on final page, offset vs cursor response shapes |
| Destructive operations | Covered — /clear scope, /replay/all missing R2 skip, /replay/export 4xx |
| Data integrity (multi-source, media dedup) | Covered — DI-1/3/4/5/6/8; DI-2 (days_processed ≠ digests) not yet written |
| Failure scenarios | Covered — F2/F4/F6/F8; F1 (quota mid-batch partial days) not yet written |
| Performance paths | Gap — staging only; not in CI |
1. Unit Tests
Pure functions, transformations, and state machines with no I/O.
Priority gaps:
| Function | File | What to assert |
|---|---|---|
assignPillarsKeywordOnly(graph) | lib/concepts/pillar_classifier.ts | Each node gets keyword pillar or null; no AI called |
assignPillarsToGraph() | same | Nodes updated; keyword fallback used when AI throws |
keywordFallback(label) | same | All 6 pillars matched by representative keywords |
isRecoverableErrorMessage(msg) | lib/retry/self-healing.ts | True for quota/rate/timeout/D1_ERROR; false for auth/schema |
canAutoRetry(msg, max) | same | Respects MAX_ATTEMPTS=3; false on 4th attempt |
nextAutoRetryError(msg) | same | Counter increments correctly |
canMakeAICall(db, 'daily') | lib/ai/quota.ts | False at limit; true with bypass; respects neuron cap |
computeContentHash(content) | lib/whatsapp/dedupe.ts | Deterministic; NFKC-normalized input produces same hash |
invalidateMediaConfigCache() | lib/media/quota.ts | Next read re-fetches from D1 |
setMediaQuotaBypass() | lib/media/quota.ts | Writes to D1; immediately invalidates in-memory cache |
Cursor pagination next_cursor | apps/pulse-public/src/index.ts | Equals last week_start when results.length == limit; null otherwise |
2. Integration Tests
Multi-component flows using real D1/R2/Queue bindings via cloudflare:test.
Queue Consumer — All Message Types
| Message type | Key assertions |
|---|---|
backfill_concept_pillars | Loads graph from D1; runs AI pillar assignment; saves; no-op if graph missing |
regenerate_weekly | Correct source_id in weekly_summaries_public; quota exhaustion calls message.retry() |
regenerate_weekly_all | All sources processed; parallelizes up to SOURCE_CONCURRENCY=3 |
media_analysis | Only processes pending_analysis; skips other statuses; sets analyzed_at |
media_deferred_batch | Resumes up to quota limit; correct status transitions |
| Standard export | Full path: parse → dedup → media artifacts → daily digests → status=completed |
Cron Scheduled Handler
Two separate schedules must route correctly:
| Cron | Tasks invoked | Tasks NOT invoked |
|---|---|---|
*/5 * * * * | finalizeCompleted, retryFailed, retryMedia, resetDeferred, requeueStale | enqueueDeferredBatches, cleanupRetained |
*/15 * * * * | enqueueDeferredBatches, cleanupRetained | All five high-priority tasks |
Additional assertions:
- Export with
status=processing,days_processed < days_total,updated_at > 10 min agois re-enqueued by*/5 - Export updated 9 min ago is NOT re-enqueued (threshold respected)
- If one sub-task throws, the scheduled handler still returns without propagating
Quota Round-Trip
- Fill
daily_callsto limit → nextcanMakeAICallreturns false - Set bypass=true → calls proceed regardless of limit
setMediaQuotaBypass(db, true)thengetMediaQuotaStatus(db)in same invocation →bypassEnabled=true(cache invalidation)
3. End-to-End Tests
Complete admin workflows using SELF.fetch() against the real Worker.
E2E-1: Full Export Lifecycle Upload → queue → assert status transitions (queued → processing → weekly_processing → completed) → assert daily_digests and weekly_summaries_public created → GET /pulse/history.json returns the week.
E2E-2: Quota Exhaustion + Recovery Fill quota → upload → assert re-enqueued with delay → POST /quota/bypass → GET /quota/status returns bypass_enabled=true immediately → process again → assert completed.
E2E-3: Replay After Failure Create export with status=failed → POST /replay/export → assert re-enqueued → consumer → assert status=completed.
E2E-4: Clear + Verify Scope Create digests for source A and source B on same date → POST /clear with source_id=A → assert source A deleted, source B untouched, overlapping weeks deleted for A only.
E2E-5: Cursor Pagination
Insert 50 weekly_summaries_public rows → paginate with limit=20 and cursor → assert no overlap, no gap across all pages, final page returns next_cursor=null.
4. Concurrency & Parallelism Tests
The highest-risk gap. All parallel execution paths must have explicit idempotency assertions.
C1. Concurrent Deduplication
Two queue messages for same community/source with overlapping messages processed via Promise.all. Assert: exactly N rows in message_hashes, not 2N. INSERT OR IGNORE must be idempotent.
C2. Parallel Image Stage 1 Atomicity
reserveMediaUsage(db, 'imageStage1', 2) called once before Promise.all([classifier, detector]). Assert: exactly 2 units reserved; if quota allows 1 but not 2, the whole Stage 1 is rejected (no partial reservation).
C3. Video Frame Phase 1/Phase 2 Ordering
Phase 1 (all frame extraction + Stage 1) runs in parallel. Phase 2 (Stage 2 caption decisions) is sequential. With 3 frames and maxStage2=2: assert exactly 2 Stage 2 calls (cap honored despite parallel Phase 1); stage2Used counter correct.
C4. Weekly Regen SOURCE_CONCURRENCY=3
5-source community → regenerateWeeklySummariesForCommunity. Assert: all 5 sources processed; no cross-source collision (source A summary doesn’t overwrite source B); weeks within each source sequential.
C5. Parallel Cron Sub-Tasks
Spy all 5 high-priority tasks. Run scheduled({ cron: '*/5 * * * *' }). Assert: all 5 invoked; start times overlap (not sequential); one throwing doesn’t prevent others from completing.
C6. Deferred Media Batch Limit
20 exports with deferred media → enqueueDeferredMediaBatches() → assert exactly 12 media_deferred_batch messages enqueued (LIMIT=12 respected).
5. Failure Scenarios
F1. Quota Exhaustion Mid-Batch
Export with 10 days; quota allows only 3. Assert: days 1–3 processed, days_processed=3, export re-enqueued with delay secondsUntilReset + 300, status=processing (not failed).
F2. AI Schema Validation Failure
Mock AI returns malformed JSON. Assert: JSON repair attempted (guards.ts); fallback neutral digest returned on repair failure; no crash; export continues to next day.
F3. R2 Object Missing on Replay Delete R2 object; POST /replay/export. Assert: HTTP 4xx returned; export status unchanged.
F4. D1 too-many-SQL-variables
Mock D1 to throw on INSERT batch. Assert: retried with 30s × 2^attempt delay (capped 120s); attempt counter incremented in error_message.
F5. Error Classification
Parametrized test across error types: quota/rate/D1_ERROR/network → isRecoverableErrorMessage()=true → retry. Auth/schema → false → status=failed.
F6. Max Retry Exhaustion
Force recoverable error 4 times (> AUTO_RETRY_MAX_ATTEMPTS=3). Assert: 4th failure → status=failed permanently; retryRecoverableFailedExports() skips at max.
F7. Backfill Pillars — Graph Not Found
Enqueue backfill_concept_pillars for non-existent week. Assert: warning logged; returns; no D1 write.
6. Data Integrity Tests
Invariants that, if violated, cause silent data corruption.
DI-1. Multi-Source Collision Prevention
Insert digest for (community, sourceA, 2026-01-01) and (community, sourceB, 2026-01-01). Assert: both rows exist. The PK (community_id, source_id, day_date) added in fix_daily_digests_weekly_summaries_pk.sql must hold.
DI-2. days_processed ≠ digests exist
Set days_processed = days_total; delete daily_digests rows for 2 days. Assert: pipeline re-processes the 2 missing days (queries daily_digests directly, not just the counter).
DI-3. Media Dedup Cross-Community Prevention
Analyzed media with content_hash X in community A. New export in community B with same content_hash. Assert: batchFindExistingAnalysisByContentHash does NOT return community A’s result; community B enqueues its own analysis job.
DI-4. Intra-Batch Message Dedup
Export with 5 duplicate messages on same day. Assert: message_hashes contains exactly 1 row; seenInBatch set prevents double-INSERT.
DI-5. Weekly Summary Thresholds
< 3 days OR < 20 messages → no weekly_summaries_public row created; existing row deleted if re-processing drops below threshold.
DI-6. Raw Export Cleanup — Dual Deletion
Run cleanupRetainedRawExports(). Assert: R2 object deleted AND exports.raw_deleted_at set; exports.status unchanged.
DI-7. Async Pillar Backfill — Null Handled Gracefully
Public API queried before backfill_concept_pillars runs. Assert: response has pillar fields (keyword-assigned or null); no crash. After backfill: pillars are AI-quality.
DI-8. COALESCE NULL source_id
Insert with source_id=null twice. Assert: ON CONFLICT updates (doesn’t create duplicate row). COALESCE(source_id, '') = COALESCE(?, '') queries correct across all tables.
7. Destructive Action Tests
DA-1. /clear Scope
| Scenario | Assert |
|---|---|
| source_id=A | Only source A records deleted; source B untouched |
| source_id=null | Only community-wide records deleted |
clear_hashes=false | message_hashes untouched |
clear_hashes=true | message_hashes deleted in date range |
| start_date=end_date | Only that exact date cleared |
| Missing community_id | 400 error; no deletion |
DA-2. /replay/all — Missing R2
5 exports; 2 have R2 objects deleted. Assert: 3 enqueued; 2 skipped; response includes skipped_missing_raw_exports: 2.
DA-3. /replay/stuck + Cron Idempotency
Both trigger concurrently. Assert: export enqueued at most twice; second processing run produces same final state (idempotent).
DA-4. Community DELETE Cascade
DELETE /communities/{id}. Assert: all D1 tables zero rows for that community; R2 objects deleted.
8. Performance Baselines
Run in staging or against large test fixtures. Purpose: catch regressions automatically in CI.
| Test | Metric | Baseline |
|---|---|---|
| Parser throughput | 10,000-message export | < 500ms |
| Dedup INSERT scale | 1,000 messages × 7 days | < 2,000ms per 1,000 messages |
| Media dedup batch | 100 artifacts, CHUNK_SIZE=24 | ≤ 5 D1 queries; < 100ms |
| Cursor vs offset (deep page) | offset=980 vs cursor equivalent | Cursor ≥ 50% faster |
| Parallel cron sub-tasks | 7 tasks × 200ms each | Total < 400ms |
| Parallel image Stage 1 | Classifier + detector | ≤ 60% of sequential time |
| Weekly regen (5 sources × 10 weeks) | SOURCE_CONCURRENCY=3 | ≤ 40% of serial time |
9. Test Infrastructure
D1 Isolation
Apply all migrations at test start via a shared applyMigrations(db) helper (packages/db/migrations/ in order). Each test file gets a fresh D1 instance from the workerd runtime.
Seed Factories
New helpers needed in test/helpers/:
buildExport(overrides),buildDailyDigest(overrides),buildMediaArtifact(overrides)buildConceptGraph(nodeCount),buildWhatsAppZip(days)
AI Mock
- Unit tests:
lib/ai/mock.ts(existing) - Integration tests:
vi.spyOn(env.AI, 'run')with configurable responses - Quota tests: mock AI to return neuron counts after N calls, then throw
Queue Mock
const enqueued: unknown[] = [];
vi.spyOn(env.beacon_pulse_uploads, 'send').mockImplementation(async (msg) => {
enqueued.push(msg);
});Cron Testing
// vitest-pool-workers exposes worker.scheduled()
await worker.scheduled({ cron: '*/5 * * * *' }, env, ctx);
await worker.scheduled({ cron: '*/15 * * * *' }, env, ctx);10. Known Gaps
| Gap | Why Hard | Mitigation |
|---|---|---|
Cache API (caches.default) | Behaves differently in workerd test runtime | Spy on cache calls; validate in staging |
| True D1 concurrency races | workerd serializes D1 in test mode | Use Promise.all to verify idempotency; document production behavior |
| KV eventual consistency | 60s TTL hard to test deterministically | vi.useFakeTimers() + mock KV |
| Real AI calls cost neurons | Can’t run live AI in CI | AI mock for all tests; staging validation only |
| Media frame extraction | MEDIA binding unavailable in test runtime | Mock env.MEDIA; validate video in staging |
| Queue DLQ | CF Queues has no configurable DLQ | Test retry exhaustion path; document that cron compensates |
11. Implemented Test Files
All tests use Vitest + @cloudflare/vitest-pool-workers. 109 tests across 11 files as of the last update. Pre-existing failures in export-status.test.ts and several pulse-public tests are unrelated to this work.
pulse-ingest
| File | Tests | What it covers |
|---|---|---|
test/unit/pillar-classifier.test.ts | 17 | assignPillarsKeywordOnly, assignPillarsToGraph, classifyConceptPillar keyword fallback, AI cache dedup |
test/unit/quota-cache.test.ts | 9 | setMediaQuotaBypass invalidates cache; isMediaQuotaBypassEnabled null/bool handling |
test/unit/error-recovery.test.ts | 43 | isRecoverableErrorMessage (all recoverable + terminal types), canAutoRetry boundary conditions, nextAutoRetryError counter, formatAutoRetryError |
test/integration/cron.test.ts | 7 | Dual-schedule routing, stuck export detection SQL, re-enqueue threshold, error isolation |
test/integration/queue-message-types.test.ts | 5 | backfill_concept_pillars (load/AI/save, no-op), regenerate_weekly quota retry, mixed-type batch |
test/integration/data-integrity.test.ts | 7 | DI-1 multi-source PK, DI-3 media dedup community scope, DI-4 intra-batch dedup, DI-5 weekly threshold, DI-6 cleanup guard + raw_deleted_at, DI-8 COALESCE |
test/integration/destructive.test.ts | 7 | /clear 400 validation, source scoping, clear_hashes flag; /replay/all missing R2 skip + count; /replay/export 4xx |
test/integration/concurrency.test.ts | 6 | C1 seenInBatch dedup, C2 atomic reserveMediaUsage(2), C4 SOURCE_CONCURRENCY=3 no cross-source collision, C5 parallel cron tasks, C6 deferred batch LIMIT=12 |
test/integration/failure-scenarios.test.ts | 8 | F2 schema failure no crash, F4 too-many-SQL-variables exponential backoff, F6 max retry exhaustion (skip vs re-enqueue), F8 missing R2 on media_analysis |
pulse-public
| File | Tests | What it covers |
|---|---|---|
test/integration/pagination.test.ts | 7 | Cursor next_cursor correctness, no page overlap, null on final page, offset vs cursor response shape, 404 on empty |