Beacon enforces per-day AI quota limits to stay within Cloudflare Workers AI free-tier neuron budgets. Two separate quota systems exist: one for text (daily/weekly digest AI) and one for media analysis.
Tables
| Table | Purpose |
|---|---|
| ai_quota_usage | Daily call counts and neuron totals for text AI (daily and weekly digest models) |
| ai_usage_log | Per-call log with model type, neurons consumed, and token counts |
| media_ai_quota_usage | Daily counters for media-specific usage: image stage1/2, audio minutes, video frame calls |
| quota_settings | Configuration overrides for all limits, pipeline mode, bypass flags |
Text Quota (Daily & Weekly Digests)
Limits (configurable via quota_settings)
| Setting key | Default | Purpose |
|---|---|---|
| daily_model_limit | 800 calls/day | Max analysis+narrative calls for daily digests |
| weekly_model_limit | 50 calls/day | Max calls for weekly summaries |
| daily_neuron_limit | 8,000 neurons/day | Hard cap on Cloudflare billing units |
| weekly_neuron_limit | 2,000 neurons/day | Cap for weekly model (heavier tokens) |
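As a rough illustration of how these limits might gate a call, the sketch below checks both the call count and the neuron total. The type names, the function name canMakeAICallSketch, and the rule that both counters must be strictly below their limits are assumptions for illustration; the real canMakeAICall() may combine them differently.

```typescript
// Hypothetical shapes; the real ai_quota_usage row and settings may differ.
interface TextQuotaUsage { calls: number; neurons: number; }
interface TextQuotaLimits { modelLimit: number; neuronLimit: number; }

// Defaults copied from the limits table.
const DAILY_DEFAULTS: TextQuotaLimits = { modelLimit: 800, neuronLimit: 8000 };
const WEEKLY_DEFAULTS: TextQuotaLimits = { modelLimit: 50, neuronLimit: 2000 };

// Assumption: a call is allowed only while BOTH the call count and the
// neuron total are strictly below their limits. Bypass short-circuits the check.
function canMakeAICallSketch(
  usage: TextQuotaUsage,
  limits: TextQuotaLimits,
  bypassEnabled: boolean = false,
): boolean {
  if (bypassEnabled) return true;
  return usage.calls < limits.modelLimit && usage.neurons < limits.neuronLimit;
}
```

For example, canMakeAICallSketch({ calls: 800, neurons: 100 }, DAILY_DEFAULTS) is false because the call limit has been reached, even though the neuron budget has room.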
Reset timing
Both daily counters reset at midnight UTC. The ai_quota_usage table stores one row per date (usage_date = 'YYYY-MM-DD'). There is no API to manually advance the reset — wait for midnight or activate bypass.
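A minimal sketch of deriving the usage_date key, assuming the key is simply the UTC calendar date. The helper name utcUsageDate is hypothetical.

```typescript
// Returns the UTC calendar date as 'YYYY-MM-DD', the format used for usage_date.
// Date.prototype.toISOString() always reports UTC, so the local time zone of
// the runtime does not affect the result.
function utcUsageDate(now: Date = new Date()): string {
  return now.toISOString().slice(0, 10);
}
```

Because the key changes at 00:00:00 UTC, the first call after midnight lands on a new row and therefore starts from zero usage.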
Exhaustion behavior
When canMakeAICall() returns false:
- The queue consumer catches the quota error while handling the message.
- The export is updated with an error message recording the retry attempt.
- The message is re-enqueued with a delay until the quota resets at midnight UTC.
- Subsequent queue ticks: retryRecoverableFailedExports() (cron) re-enqueues exports once quota recovers.
This applies to both the new-messages path and the duplicate-only path (where all messages are already in message_hashes but digests haven’t been generated yet).
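The re-enqueue delay described above could be computed along these lines. This is a sketch, not the project's code: the function name and the optional maxDelaySeconds ceiling are assumptions, the latter added because queue platforms commonly cap per-message delays.

```typescript
// Seconds from `now` until the next 00:00:00 UTC, rounded up.
// maxDelaySeconds is a hypothetical ceiling for platforms that cap delays.
function secondsUntilUtcMidnight(
  now: Date = new Date(),
  maxDelaySeconds: number = Infinity,
): number {
  // Date.UTC normalizes day overflow (e.g. day 32 rolls into the next month).
  const nextMidnightMs = Date.UTC(
    now.getUTCFullYear(),
    now.getUTCMonth(),
    now.getUTCDate() + 1,
  );
  const seconds = Math.ceil((nextMidnightMs - now.getTime()) / 1000);
  return Math.min(seconds, maxDelaySeconds);
}
```

If the queue caps delays below a full day, passing that ceiling means the message wakes early, hits the quota check again, and is delayed once more until the reset.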
Media Quota
Limits (configurable via quota_settings)
| Setting key | Default | Purpose |
|---|---|---|
| media_max_images_per_export | 64 | Max images analyzed per export |
| media_max_stage2_per_export | varies | Max Stage 2 (vision caption) calls per export |
| media_max_audio_minutes_per_export | 60 | Max audio transcription minutes per export |
| media_max_video_minutes_per_export | 30 | Max video processing minutes per export |
| media_max_frames_per_video | 3 | Frames sampled per video |
| media_max_video_frame_stage2_per_export | 5 | Max Stage 2 calls on video frames per export |
Reservation pattern
reserveMediaUsage(db, metric, amount) uses an atomic conditional UPDATE:
UPDATE media_ai_quota_usage
SET column = column + ?
WHERE usage_date = ? AND column + ? <= limit
If changes = 0, the quota is exhausted and an error is thrown. This prevents over-quota usage even under concurrent processing.
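The following sketch shows one way the conditional UPDATE could be assembled and its result interpreted. The column names, helper names, and error class are hypothetical; only the table name and the shape of the statement come from the description above.

```typescript
// Hypothetical column names; the real media_ai_quota_usage schema may differ.
const MEDIA_COLUMNS = [
  "image_stage1_calls",
  "image_stage2_calls",
  "audio_minutes",
  "video_frame_calls",
] as const;
type MediaColumn = (typeof MEDIA_COLUMNS)[number];

interface ReserveStatement { sql: string; params: (string | number)[]; }

// Builds the conditional UPDATE. The column name is interpolated into the SQL,
// so it is restricted to a fixed whitelist; every value is a bound parameter.
function buildReserveStatement(
  column: MediaColumn,
  amount: number,
  limit: number,
  usageDate: string,
): ReserveStatement {
  if (MEDIA_COLUMNS.indexOf(column) === -1) {
    throw new Error(`Unknown media metric: ${column}`);
  }
  if (!Number.isFinite(amount) || amount <= 0) {
    throw new Error("amount must be a positive number");
  }
  const sql =
    `UPDATE media_ai_quota_usage SET ${column} = ${column} + ? ` +
    `WHERE usage_date = ? AND ${column} + ? <= ?`;
  return { sql, params: [amount, usageDate, amount, limit] };
}

class MediaQuotaExhaustedError extends Error {}

// Zero changed rows means the WHERE clause rejected the increment.
function assertReserved(changes: number, column: MediaColumn): void {
  if (changes === 0) {
    throw new MediaQuotaExhaustedError(`Media quota exhausted for ${column}`);
  }
}
```

With D1, the statement would typically be run as db.prepare(sql).bind(...params).run(), passing result.meta.changes to assertReserved. Note that zero changes also results when no row exists for the date, so this sketch assumes the day's row has already been created.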
Deferred processing
When media quota is exhausted mid-export, remaining items are set to status='deferred_quota'. The */15 cron calls enqueueDeferredMediaBatches(), selecting up to 12 exports with deferred items and enqueuing a media_deferred_batch message for each. The queue consumer then calls processDeferredMediaBatch() which re-reads the original ZIP from R2 and resumes analysis.
Deferred drain throughput
Each media_deferred_batch invocation processes up to 10 items per export (capped to avoid Worker CPU/network timeout on large ZIPs). With up to 12 exports queued per */15 tick:
| Condition | Items/tick | Items/hour |
|---|---|---|
| Normal quota, 1 export with deferred items | 10 | 40 |
| Normal quota, 12 exports | up to 120 | up to 480 |
| Bypass enabled | same 10/tick limit applies | 40–480 |
Bypass affects quota enforcement, not extraction throughput. With media_bypass_enabled=true, computeDeferredMediaBudgets returns { image: 200, audio: 200, video: 200 } so no items are blocked by per-export caps. But the 10-item batch ceiling still applies per invocation — the limiter is ZIP extraction time, not quota.
Example: 1,328 deferred items on 1 export, bypass enabled:
- 10 items/tick × 4 ticks/hour = 40 items/hour
- Estimated drain time: 1,328 / 40 = ~33 hours
To drain faster, increase the selectDeferredMediaCandidates max-total argument in processDeferredMediaBatch (index.ts). The current value of 10 is conservative for ZIPs where bytes actually need to be read and written to R2.
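The drain arithmetic above can be parameterized to see how a larger batch size would change the estimate. This is a back-of-the-envelope helper with a hypothetical name; it assumes a single export and that every tick processes a full batch.

```typescript
// Estimated hours to drain `deferredItems` from one export.
// Defaults mirror the documented values: 10 items per tick, 4 ticks per hour (*/15 cron).
function estimateDrainHours(
  deferredItems: number,
  itemsPerTick: number = 10,
  ticksPerHour: number = 4,
): number {
  const ticks = Math.ceil(deferredItems / itemsPerTick);
  return ticks / ticksPerHour;
}

console.log(estimateDrainHours(1328));     // 33.25
console.log(estimateDrainHours(1328, 25)); // 13.5
```

The first result matches the ~33 hours quoted above once the final partial batch is rounded up to a full tick.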
Missing-attachment drain path
Some deferred items can never be analyzed because the media file was referenced in the WhatsApp chat message but not included in the ZIP export. These are identified when processDeferredMediaBatch re-reads the ZIP and the filename lookup returns nothing.
Drain cycle for these items:
- loadDeferredMediaCandidates selects up to 10 items with status='deferred_quota'.
- extractZipAttachmentsByFilenameFromR2 reads the ZIP central directory and looks up the 10 filenames; it returns 0 matches.
- Each item is updated to status='missing_attachment' and exits the deferred queue.
- remaining (count of deferred_quota + deferred_processing + skipped_quota) drops by 10.
- The */15 cron re-selects the next 10 on the next tick.
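The classification step in this cycle might look like the sketch below. The item shape and the function name classifyBatch are hypothetical, and it assumes the ZIP's central directory has already been read into a set of filenames.

```typescript
type MediaStatus = "deferred_quota" | "missing_attachment";

// Hypothetical row shape; the real export_media columns may differ.
interface DeferredItem { id: number; filename: string; status: MediaStatus; }

// Splits a batch into items present in the ZIP and items to mark as missing.
function classifyBatch(
  batch: DeferredItem[],
  zipFilenames: Set<string>,
): { extractable: DeferredItem[]; missing: DeferredItem[] } {
  const extractable: DeferredItem[] = [];
  const missing: DeferredItem[] = [];
  for (const item of batch) {
    if (zipFilenames.has(item.filename)) {
      extractable.push(item);
    } else {
      // Not in the ZIP: the item can never be analyzed, so it leaves the queue.
      missing.push({ ...item, status: "missing_attachment" });
    }
  }
  return { extractable, missing };
}
```

When the ZIP contains none of the requested files, extractable is empty and the whole batch moves to missing, which corresponds to a tick with extracted=0.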
The drain log shows status=empty, reason=no_extracted_assets, selected=10, extracted=0, remaining=N for each such tick — this is expected and not an error. The export’s “waiting for quota/cap reset” count in the admin UI will decrease by 10 per tick while “missing” increases by the same amount.
A large block of missing-attachment items indicates the ZIP was exported without media (e.g. “Export Without Media” was selected in WhatsApp, or attachments were unavailable at export time). The analysis that did complete was for items extracted during the initial ingest before the byte budget was hit.
The cleanupRetainedRawExports cron will not delete the ZIP while any deferred_quota items exist — the cleanup query includes AND NOT EXISTS (SELECT 1 FROM export_media WHERE export_id = e.export_id AND status IN ('pending_analysis', 'deferred_quota', 'deferred_processing', 'skipped_quota')). So the ZIP remains available throughout the drain cycle.
Bypass Flags
Two independent bypass flags disable quota enforcement for testing or recovery:
| Setting key | Scope |
|---|---|
| bypass_enabled | Text quota (daily/weekly digest AI) |
| media_bypass_enabled | Media analysis quota |
When bypass is active:
- canMakeAICall() always returns true (text).
- reserveMediaUsage() calls recordMediaUsage() directly without checking limits (media).
- A warning is logged on every queue invocation: ⚠️ QUOTA BYPASS IS ENABLED
Set via POST /quota/bypass with { scope: 'text' | 'media' | 'all', enabled: true }.
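A request to this endpoint could be assembled as in the sketch below. The base URL is a placeholder, the helper name is hypothetical, and any authentication the endpoint requires is not described here and is therefore omitted.

```typescript
type BypassScope = "text" | "media" | "all";

interface BypassRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the request for POST /quota/bypass.
// baseUrl is a placeholder for wherever the Worker is deployed.
function buildBypassRequest(
  baseUrl: string,
  scope: BypassScope,
  enabled: boolean,
): BypassRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/quota/bypass`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ scope, enabled }),
  };
}
```

The result can be sent with fetch(req.url, req). Remember to send the same request with enabled set to false once testing or recovery is finished.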
KV caching of bypass flag (when provisioned)
When the beacon_pulse_cache KV namespace is provisioned:
- Writing bypass via /quota/bypass also stores media_bypass_enabled in KV with a 60-second TTL.
- Queue-consumer invocations check KV before D1, reducing DB reads during media analysis.
- A 30-second module-level cache additionally eliminates redundant reads within a single invocation.
See wrangler.jsonc for the commented-out KV namespace block to activate.
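The module-level layer of this lookup could be as small as the single-value TTL cache sketched here. The class name and the injectable clock are assumptions made for illustration and testing; the KV and D1 reads that sit behind it are not shown.

```typescript
// Single-value cache that forgets its value after ttlMs milliseconds.
class TtlCache<T> {
  private entry: { value: T; expiresAt: number } | null = null;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value, or undefined if it is absent or expired.
  get(): T | undefined {
    if (this.entry !== null && this.now() < this.entry.expiresAt) {
      return this.entry.value;
    }
    this.entry = null;
    return undefined;
  }

  set(value: T): void {
    this.entry = { value, expiresAt: this.now() + this.ttlMs };
  }
}

// Declared at module scope, so it lives as long as the loaded module does.
const mediaBypassCache = new TtlCache<boolean>(30000);
```

On a miss (get() returns undefined), the caller would read KV, fall back to D1, and store the answer with set(). A change to the flag may therefore take up to the cache lifetime to become visible in a running consumer.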
Neuron Billing
Cloudflare charges AI usage in neurons (billing units). Beacon tracks this per invocation:
- Formula: (inputTokens / 1M) × inputRate + (outputTokens / 1M) × outputRate
- Model rates are stored in quota_settings with fallback defaults in code.
- Estimated cost in the admin UI: neurons × $0.011 / 1000.
Daily model (~8B): roughly 7.5 neurons per digest call pair.
Weekly model (~70B): roughly 75 neurons per weekly call pair.
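The formula and cost estimate translate directly into code. In the sketch below the function and type names are hypothetical, and the rates passed in the example are made-up numbers chosen for easy arithmetic, not real model rates.

```typescript
// Rates are expressed in neurons per one million tokens.
interface ModelRates { inputRate: number; outputRate: number; }

const USD_PER_1000_NEURONS = 0.011;

// (inputTokens / 1M) × inputRate + (outputTokens / 1M) × outputRate
function estimateNeurons(
  inputTokens: number,
  outputTokens: number,
  rates: ModelRates,
): number {
  return (
    (inputTokens / 1_000_000) * rates.inputRate +
    (outputTokens / 1_000_000) * rates.outputRate
  );
}

// neurons × $0.011 / 1000
function estimateCostUsd(neurons: number): number {
  return (neurons * USD_PER_1000_NEURONS) / 1000;
}

// Illustrative rates only; real values come from quota_settings or code defaults.
const exampleRates: ModelRates = { inputRate: 4000, outputRate: 8000 };
const neurons = estimateNeurons(500_000, 250_000, exampleRates); // 2000 + 2000
```

At the default daily_neuron_limit of 8,000 neurons, estimateCostUsd gives roughly $0.088 of usage per day.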
Admin Operations
| Endpoint | Purpose |
|---|---|
| GET /quota/status | Current usage, remaining budget, model info, bypass state |
| POST /quota/set | Manually set usage counters |
| GET/POST /quota/bypass | Read or change bypass flags |
| POST /quota/config | Update quota limits |
| GET/POST /pipeline/daily-config | Read or change pipeline mode and concurrency |