Beacon enforces per-day AI quota limits to stay within Cloudflare Workers AI free-tier neuron budgets. Two separate quota systems exist: one for text (daily/weekly digest AI) and one for media analysis.


Tables

| Table | Purpose |
| --- | --- |
| ai_quota_usage | Daily call counts and neuron totals for text AI (daily and weekly digest models) |
| ai_usage_log | Per-call log with model type, neurons consumed, and token counts |
| media_ai_quota_usage | Daily counters for media-specific usage: image stage1/2, audio minutes, video frame calls |
| quota_settings | Configuration overrides for all limits, pipeline mode, bypass flags |

Text Quota (Daily & Weekly Digests)

Limits (configurable via quota_settings)

| Setting key | Default | Purpose |
| --- | --- | --- |
| daily_model_limit | 800 calls/day | Max analysis+narrative calls for daily digests |
| weekly_model_limit | 50 calls/day | Max calls for weekly summaries |
| daily_neuron_limit | 8,000 neurons/day | Hard cap on Cloudflare billing units |
| weekly_neuron_limit | 2,000 neurons/day | Cap for weekly model (heavier tokens) |
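
To make the interplay of call and neuron limits concrete, here is a minimal sketch of a check in the spirit of canMakeAICall(). The TextUsage and TextLimits shapes, the function name canMakeCallSketch, and the rule that a call is allowed only while both counters are below their caps are assumptions for illustration; the real function may be structured differently.

```typescript
// Hypothetical shapes: the real ai_quota_usage row and quota_settings
// keys may be organized differently.
interface TextUsage {
  calls: number;
  neurons: number;
}

interface TextLimits {
  callLimit: number;
  neuronLimit: number;
}

// Defaults copied from the limits table above.
const DAILY_DEFAULTS: TextLimits = { callLimit: 800, neuronLimit: 8000 };
const WEEKLY_DEFAULTS: TextLimits = { callLimit: 50, neuronLimit: 2000 };

// Assumption: a call is allowed only while BOTH counters are under their caps.
function canMakeCallSketch(
  usage: TextUsage,
  limits: TextLimits,
  bypass: boolean = false,
): boolean {
  if (bypass) return true; // bypass skips enforcement entirely
  return usage.calls < limits.callLimit && usage.neurons < limits.neuronLimit;
}
```

With these defaults, a daily model at 799 calls may make one more call, while one at 8,000 neurons is blocked regardless of its call count.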

Reset timing

Both daily counters reset at midnight UTC. The ai_quota_usage table stores one row per date (usage_date = 'YYYY-MM-DD'). There is no API to manually advance the reset — wait for midnight or activate bypass.
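
As an illustration of the one-row-per-UTC-date scheme, the sketch below derives a usage_date key and the next reset instant from a timestamp. The helper names usageDateKey and nextResetUtc are hypothetical and are not taken from the Beacon codebase.

```typescript
// Returns the 'YYYY-MM-DD' key for the UTC day containing `now`.
// toISOString() always renders in UTC, so the first 10 characters are the date.
function usageDateKey(now: Date): string {
  return now.toISOString().slice(0, 10);
}

// Returns the instant of the next midnight UTC, when the counters roll over.
// Date.UTC normalizes day overflow (for example, day 32 becomes the 1st of next month).
function nextResetUtc(now: Date): Date {
  return new Date(
    Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate() + 1),
  );
}
```

A call made at 23:59:30 UTC on 2024-03-05 is counted against the 2024-03-05 row, and the next reset falls at 2024-03-06T00:00:00Z.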

Exhaustion behavior

When canMakeAICall() returns false:

  1. The queue consumer catches the quota error while handling the message.
  2. The export is updated with an error message recording the retry attempt.
  3. The message is re-enqueued with a delay until the quota resets at midnight UTC.
  4. On subsequent cron ticks, retryRecoverableFailedExports() re-enqueues the export once quota recovers.

This applies to both the new-messages path and the duplicate-only path (where all messages are already in message_hashes but digests haven’t been generated yet).
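
The catch-and-re-enqueue steps above can be sketched as follows. Everything here is illustrative: the RetryableMessage interface only mirrors the retry({ delaySeconds }) and ack() shape of a Cloudflare Queues message, the error name QuotaExhaustedError and the helpers are hypothetical, and the 12-hour clamp assumes the maximum per-message delay documented for Cloudflare Queues. Under that assumption a single delayed retry cannot always reach midnight UTC, which is where a cron-driven retry would pick up the remainder.

```typescript
// Minimal stand-in for the parts of a queue message used here.
interface RetryableMessage {
  retry(options?: { delaySeconds?: number }): void;
  ack(): void;
}

// Assumption: per-message delay is capped at 12 hours.
const MAX_DELAY_SECONDS = 43_200;

// Hypothetical convention: quota failures are Errors named "QuotaExhaustedError".
function isQuotaError(err: unknown): boolean {
  return err instanceof Error && err.name === "QuotaExhaustedError";
}

// Seconds until the next midnight UTC, clamped to the assumed queue maximum.
function retryDelaySeconds(now: Date): number {
  const nextMidnight = Date.UTC(
    now.getUTCFullYear(),
    now.getUTCMonth(),
    now.getUTCDate() + 1,
  );
  const untilReset = Math.ceil((nextMidnight - now.getTime()) / 1000);
  return Math.min(untilReset, MAX_DELAY_SECONDS);
}

async function handleExportMessage(
  message: RetryableMessage,
  work: () => Promise<void>,
  recordError: (note: string) => Promise<void>,
  now: Date = new Date(),
): Promise<"done" | "retried"> {
  try {
    await work();
    message.ack();
    return "done";
  } catch (err) {
    if (!isQuotaError(err)) throw err; // unrelated failures propagate unchanged
    const delaySeconds = retryDelaySeconds(now);
    await recordError(`quota exhausted; retrying in ${delaySeconds}s`);
    message.retry({ delaySeconds });
    return "retried";
  }
}
```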


Media Quota

Limits (configurable via quota_settings)

| Setting key | Default | Purpose |
| --- | --- | --- |
| media_max_images_per_export | 64 | Max images analyzed per export |
| media_max_stage2_per_export | varies | Max Stage 2 (vision caption) calls per export |
| media_max_audio_minutes_per_export | 60 | Max audio transcription minutes per export |
| media_max_video_minutes_per_export | 30 | Max video processing minutes per export |
| media_max_frames_per_video | 3 | Frames sampled per video |
| media_max_video_frame_stage2_per_export | 5 | Max Stage 2 calls on video frames per export |

Reservation pattern

reserveMediaUsage(db, metric, amount) uses an atomic conditional UPDATE:

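-- "column" stands for the counter column of the requested metric,
-- "limit" for that metric's configured cap.
UPDATE media_ai_quota_usage
SET column = column + ?
WHERE usage_date = ? AND column + ? <= limit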

If changes = 0 the quota is exhausted and an error is thrown. This prevents over-quota usage even under concurrent processing.
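
A minimal sketch of this reservation pattern against a D1-style prepared-statement API is shown below. The interfaces are cut-down stand-ins for the prepare/bind/run surface, the metric-to-column map and its column names are hypothetical, and the limit and usageDate are passed explicitly here although the real reserveMediaUsage(db, metric, amount) presumably resolves them itself. The sketch also assumes a row for the current date already exists; without one the UPDATE would report zero changes even when quota is available.

```typescript
// Cut-down stand-ins for the D1 surface used by this sketch.
interface RunResult {
  meta: { changes: number };
}
interface Statement {
  bind(...values: unknown[]): Statement;
  run(): Promise<RunResult>;
}
interface Db {
  prepare(sql: string): Statement;
}

// Hypothetical column names. Only whitelisted identifiers are ever
// interpolated into the SQL text; all values go through bind().
const METRIC_COLUMNS = {
  image_stage1: "image_stage1_calls",
  image_stage2: "image_stage2_calls",
  audio_minutes: "audio_minutes",
  video_frames: "video_frame_calls",
} as const;
type Metric = keyof typeof METRIC_COLUMNS;

function buildReserveSql(metric: Metric): string {
  const column = METRIC_COLUMNS[metric];
  return (
    `UPDATE media_ai_quota_usage SET ${column} = ${column} + ? ` +
    `WHERE usage_date = ? AND ${column} + ? <= ?`
  );
}

async function reserveMediaUsageSketch(
  db: Db,
  metric: Metric,
  amount: number,
  limit: number,
  usageDate: string,
): Promise<void> {
  const result = await db
    .prepare(buildReserveSql(metric))
    .bind(amount, usageDate, amount, limit)
    .run();
  // The check and the increment happen in one statement, so two concurrent
  // callers cannot both pass the limit test against the same stale value.
  if (result.meta.changes === 0) {
    throw new Error(`media quota exhausted for ${metric}`);
  }
}
```

Folding the limit test into the WHERE clause is what makes the reservation atomic: a separate SELECT followed by an UPDATE would leave a window in which two workers both see remaining headroom.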

Deferred processing

When media quota is exhausted mid-export, remaining items are set to status='deferred_quota'. The */15 cron calls enqueueDeferredMediaBatches(), selecting up to 12 exports with deferred items and enqueuing a media_deferred_batch message for each. The queue consumer then calls processDeferredMediaBatch() which re-reads the original ZIP from R2 and resumes analysis.

Deferred drain throughput

Each media_deferred_batch invocation processes up to 10 items per export (capped to avoid Worker CPU/network timeout on large ZIPs). With up to 12 exports queued per */15 tick:

| Condition | Items/tick | Items/hour |
| --- | --- | --- |
| Normal quota, 1 export with deferred items | 10 | 40 |
| Normal quota, 12 exports | up to 120 | up to 480 |
| Bypass enabled | same 10/tick limit applies | 40–480 |

Bypass affects quota enforcement, not extraction throughput. With media_bypass_enabled=true, computeDeferredMediaBudgets returns { image: 200, audio: 200, video: 200 } so no items are blocked by per-export caps. But the 10-item batch ceiling still applies per invocation — the limiter is ZIP extraction time, not quota.

Example: 1,328 deferred items on 1 export, bypass enabled:

  • 10 items/tick × 4 ticks/hour = 40 items/hour
  • Estimated drain time: 1,328 / 40 = ~33 hours
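
The same arithmetic can be written as a small helper for estimating other backlogs. This is a sketch: the function name estimateDrain is hypothetical, the constants are the figures quoted in this section, and the multi-export case assumes deferred items are spread evenly enough that every selected export fills its 10-item batch on each tick.

```typescript
const ITEMS_PER_EXPORT_PER_TICK = 10; // batch ceiling per media_deferred_batch invocation
const MAX_EXPORTS_PER_TICK = 12; // exports selected per */15 cron tick
const TICKS_PER_HOUR = 4; // a */15 schedule fires four times an hour

function estimateDrain(
  deferredItems: number,
  exportsWithDeferred: number,
): { ticks: number; hours: number } {
  // Clamp to the range the cron can actually serve in one tick.
  const exportsServed = Math.min(
    Math.max(exportsWithDeferred, 1),
    MAX_EXPORTS_PER_TICK,
  );
  const itemsPerTick = exportsServed * ITEMS_PER_EXPORT_PER_TICK;
  const ticks = Math.ceil(deferredItems / itemsPerTick);
  return { ticks, hours: ticks / TICKS_PER_HOUR };
}
```

For the example above, estimateDrain(1328, 1) gives 133 ticks, or 33.25 hours, matching the rounded figure of roughly 33 hours.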

To drain faster, increase the selectDeferredMediaCandidates max-total argument in processDeferredMediaBatch (index.ts). The current value of 10 is conservative for ZIPs where bytes actually need to be read and written to R2.

Missing-attachment drain path

Some deferred items can never be analyzed because the media file was referenced in the WhatsApp chat message but not included in the ZIP export. These are identified when processDeferredMediaBatch re-reads the ZIP and the filename lookup returns nothing.

Drain cycle for these items:

  1. loadDeferredMediaCandidates selects up to 10 items with status='deferred_quota'
  2. extractZipAttachmentsByFilenameFromR2 reads the ZIP central directory, looks up the 10 filenames — returns 0 matches
  3. Each item is updated to status='missing_attachment' and exits the deferred queue
  4. remaining (count of deferred_quota + deferred_processing + skipped_quota) drops by 10
  5. The */15 cron re-selects the next 10 on the next tick

The drain log shows status=empty, reason=no_extracted_assets, selected=10, extracted=0, remaining=N for each such tick — this is expected and not an error. The export’s “waiting for quota/cap reset” count in the admin UI will decrease by 10 per tick while “missing” increases by the same amount.
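
The classification step in this drain cycle can be sketched as a pure function. The Candidate and DrainResult shapes, the function name classifyDeferredBatch, and the "processed" status are hypothetical; only status=empty and reason=no_extracted_assets come from the log line described above. The sketch also assumes that any candidate whose filename is absent from the ZIP is marked missing, including in batches where other files were found.

```typescript
interface Candidate {
  id: number;
  filename: string;
}

interface DrainResult {
  status: "empty" | "processed";
  reason?: "no_extracted_assets";
  selected: number;
  extracted: number;
  missingIds: number[]; // items to move to status='missing_attachment'
  toAnalyze: Candidate[]; // items whose bytes were found in the ZIP
}

function classifyDeferredBatch(
  candidates: Candidate[],
  extractedFilenames: ReadonlySet<string>,
): DrainResult {
  const toAnalyze = candidates.filter((c) => extractedFilenames.has(c.filename));
  const missingIds = candidates
    .filter((c) => !extractedFilenames.has(c.filename))
    .map((c) => c.id);
  const counts = {
    selected: candidates.length,
    extracted: toAnalyze.length,
    missingIds,
    toAnalyze,
  };
  // Nothing extracted: report the "empty" outcome described in the drain log.
  return toAnalyze.length === 0
    ? { status: "empty", reason: "no_extracted_assets", ...counts }
    : { status: "processed", ...counts };
}
```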

A large block of missing-attachment items indicates the ZIP was exported without media (e.g. “Export Without Media” was selected in WhatsApp, or attachments were unavailable at export time). The analysis that did complete was for items extracted during the initial ingest before the byte budget was hit.

The cleanupRetainedRawExports cron will not delete the ZIP while any of the export's media items is still pending or deferred. Its cleanup query includes AND NOT EXISTS (SELECT 1 FROM export_media WHERE export_id = e.export_id AND status IN ('pending_analysis', 'deferred_quota', 'deferred_processing', 'skipped_quota')), so the ZIP remains available throughout the drain cycle.


Bypass Flags

Two independent bypass flags disable quota enforcement for testing or recovery:

| Setting key | Scope |
| --- | --- |
| bypass_enabled | Text quota (daily/weekly digest AI) |
| media_bypass_enabled | Media analysis quota |

When bypass is active:

  • canMakeAICall() always returns true (text)
  • reserveMediaUsage() calls recordMediaUsage() directly without checking limits (media)
  • A warning is logged on every queue invocation: ⚠️ QUOTA BYPASS IS ENABLED

Set via POST /quota/bypass with { scope: 'text' | 'media' | 'all', enabled: true }.
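
As a sketch of what such a call might look like from a script, the helper below builds the URL and fetch options for the bypass endpoint. The helper name buildBypassRequest and the example host are hypothetical, and any authentication headers the admin API requires are not shown because this section does not describe them.

```typescript
type BypassScope = "text" | "media" | "all";

interface BypassRequest {
  url: string;
  init: {
    method: "POST";
    headers: Record<string, string>;
    body: string;
  };
}

function buildBypassRequest(
  baseUrl: string,
  scope: BypassScope,
  enabled: boolean,
): BypassRequest {
  return {
    // Strip trailing slashes so the path is not doubled.
    url: `${baseUrl.replace(/\/+$/, "")}/quota/bypass`,
    init: {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ scope, enabled }),
    },
  };
}
```

The result can be passed straight to fetch, for example `const req = buildBypassRequest("https://beacon.example.com", "media", true); await fetch(req.url, req.init);`.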

KV caching of bypass flag (when provisioned)

When the beacon_pulse_cache KV namespace is provisioned:

  • Writing bypass via /quota/bypass also stores media_bypass_enabled in KV with a 60-second TTL.
  • Queue-consumer invocations check KV before D1, reducing DB reads during media analysis.
  • A 30-second module-level cache additionally eliminates redundant reads within a single invocation.
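
The layered lookup described in these bullets could look roughly like the sketch below: module-level memo first, then KV, then D1. The KvLike interface mirrors only the get() call of a Workers KV binding, the D1 read is passed in as a callback, and storing the flag in KV as the string "true" is an assumption about the encoding.

```typescript
// Cut-down stand-in for a Workers KV binding.
interface KvLike {
  get(key: string): Promise<string | null>;
}

const MEMO_TTL_MS = 30_000; // 30-second module-level cache
let memo: { value: boolean; expiresAt: number } | null = null;

function readMemo(nowMs: number): boolean | null {
  return memo !== null && nowMs < memo.expiresAt ? memo.value : null;
}

function writeMemo(value: boolean, nowMs: number): void {
  memo = { value, expiresAt: nowMs + MEMO_TTL_MS };
}

async function isMediaBypassEnabled(
  kv: KvLike | undefined, // undefined when the namespace is not provisioned
  readFromD1: () => Promise<boolean>,
  nowMs: number = Date.now(),
): Promise<boolean> {
  const cached = readMemo(nowMs);
  if (cached !== null) return cached;

  // KV entries expire after their TTL, so a miss falls through to D1.
  const fromKv = kv ? await kv.get("media_bypass_enabled") : null;
  const value = fromKv !== null ? fromKv === "true" : await readFromD1();

  writeMemo(value, nowMs);
  return value;
}
```

Because the memo lives at module scope, a change to the flag can take up to 30 seconds to become visible in a running isolate under this sketch, on top of any KV staleness.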

See wrangler.jsonc for the commented-out KV namespace block to activate.


Neuron Billing

Cloudflare charges AI usage in neurons (billing units). Beacon tracks this per invocation:

  • Formula: (inputTokens / 1M) × inputRate + (outputTokens / 1M) × outputRate
  • Model rates are stored in quota_settings with fallback defaults in code.
  • Estimated cost in the admin UI: neurons × $0.011 / 1000.

Daily model (~8B): roughly 7.5 neurons per digest call pair.
Weekly model (~70B): roughly 75 neurons per weekly call pair.
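
The formula and the cost estimate translate directly into code. In this sketch the ModelRates shape and function names are hypothetical, rates are expressed in neurons per million tokens as the formula implies, and no actual model rates are included because they live in quota_settings.

```typescript
// Rates in neurons per 1M tokens.
interface ModelRates {
  inputRate: number;
  outputRate: number;
}

const USD_PER_1000_NEURONS = 0.011;

// (inputTokens / 1M) × inputRate + (outputTokens / 1M) × outputRate
function neuronsForCall(
  inputTokens: number,
  outputTokens: number,
  rates: ModelRates,
): number {
  return (
    (inputTokens / 1_000_000) * rates.inputRate +
    (outputTokens / 1_000_000) * rates.outputRate
  );
}

// neurons × $0.011 / 1000, as shown in the admin UI.
function estimatedCostUsd(neurons: number): number {
  return (neurons * USD_PER_1000_NEURONS) / 1000;
}
```

At this rate, the full default daily budget of 8,000 neurons corresponds to an estimated cost of about $0.088.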


Admin Operations

| Endpoint | Purpose |
| --- | --- |
| GET /quota/status | Current usage, remaining budget, model info, bypass state |
| POST /quota/set | Manually set usage counters |
| GET, POST /quota/bypass | Read or change bypass flags |
| POST /quota/config | Update quota limits |
| GET, POST /pipeline/daily-config | Read or change pipeline mode and concurrency |