AI Quota Model

Beacon enforces per-day AI quota limits to stay within Cloudflare Workers AI free-tier neuron budgets. Two separate quota systems exist: one for text (daily/weekly digest AI) and one for media analysis.

Tables

| Table | Purpose |
| --- | --- |
| `ai_quota_usage` | Daily call counts and neuron totals for text AI (daily and weekly digest models) |
| `ai_usage_log` | Per-call log with model type, neurons consumed, and token counts |
| `media_ai_quota_usage` | Daily counters for media-specific usage: image stage 1/2, audio minutes, video frame calls |
| `quota_settings` | Configuration overrides for all limits, pipeline mode, bypass flags |

Text Quota (Daily & Weekly Digests)

Limits (configurable via `quota_settings`)

| Setting key | Default | Purpose |
| --- | --- | --- |
| `daily_model_limit` | 800 calls/day | Max analysis+narrative calls for daily digests |
| `weekly_model_limit` | 50 calls/day | Max calls for weekly summaries |
| `daily_neuron_limit` | 8,000 neurons/day | Hard cap on Cloudflare billing units |
| `weekly_neuron_limit` | 2,000 neurons/day | Cap for weekly model (heavier tokens) |
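As a rough illustration of how a call cap and a neuron cap combine, the sketch below checks both before allowing another call. It is not the actual `canMakeAICall()` implementation: the `TextUsage` and `TextLimits` shapes and the `hasTextBudget` name are hypothetical, and only the default values come from the table above.

```typescript
// Hypothetical shapes: the real columns in ai_quota_usage and the keys read
// from quota_settings may differ.
interface TextUsage {
  calls: number;
  neurons: number;
}

interface TextLimits {
  callLimit: number;
  neuronLimit: number;
}

// Returns true when one more call fits under both the call cap and the neuron cap.
function hasTextBudget(usage: TextUsage, limits: TextLimits, bypassEnabled = false): boolean {
  if (bypassEnabled) return true; // bypass skips enforcement entirely
  return usage.calls < limits.callLimit && usage.neurons < limits.neuronLimit;
}

// Defaults from the table above for the daily digest model.
const dailyDefaults: TextLimits = { callLimit: 800, neuronLimit: 8000 };
console.log(hasTextBudget({ calls: 799, neurons: 7990 }, dailyDefaults));
```

Whichever cap is reached first blocks further calls, so a day with few but token-heavy calls can exhaust the neuron limit long before the call limit.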

Reset timing

All per-day counters reset at midnight UTC. The `ai_quota_usage` table stores one row per date (`usage_date = 'YYYY-MM-DD'`). There is no API to manually advance the reset; wait for midnight or activate bypass.
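Because the row key is a UTC calendar date, the reset happens implicitly when the key changes. A minimal sketch of deriving such a key, assuming the helper name `usageDateKey` (hypothetical, not taken from the codebase):

```typescript
// Formats a Date as the UTC calendar day, matching the 'YYYY-MM-DD' shape of usage_date.
function usageDateKey(now: Date): string {
  return now.toISOString().slice(0, 10); // toISOString() is always expressed in UTC
}

// Two instants one second apart can land in different quota rows.
const beforeMidnight = usageDateKey(new Date("2025-01-31T23:59:59Z")); // "2025-01-31"
const afterMidnight = usageDateKey(new Date("2025-02-01T00:00:00Z")); // "2025-02-01"
console.log(beforeMidnight, afterMidnight);
```

Using the UTC date rather than the local date keeps the reset time independent of where the code runs.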

Exhaustion behavior

When `canMakeAICall()` returns false:

1. The queue message handler catches the quota error.
2. The export is updated with an error message recording the retry attempt.
3. The message is re-enqueued with a delay until the quota resets at midnight UTC.
4. On later cron ticks, `retryRecoverableFailedExports()` re-enqueues exports once quota recovers.

This applies to both the new-messages path and the duplicate-only path (where all messages are already in `message_hashes` but digests haven’t been generated yet).
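The re-enqueue delay described above can be derived from the current time. The following is a sketch only: `secondsUntilUtcMidnight` and `retryDelaySeconds` are hypothetical names, and the `maxDelaySeconds` cap is an assumption added because queue systems commonly limit how far ahead a message may be delayed.

```typescript
// Seconds from `now` until the next 00:00:00 UTC, rounded up so a retry never lands early.
function secondsUntilUtcMidnight(now: Date): number {
  // Date.UTC normalizes day overflow, so "day + 1" also works on the last day of a month.
  const nextMidnight = Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate() + 1);
  return Math.ceil((nextMidnight - now.getTime()) / 1000);
}

// Hypothetical cap: clamp the delay to whatever maximum the queue accepts.
function retryDelaySeconds(now: Date, maxDelaySeconds: number): number {
  return Math.min(secondsUntilUtcMidnight(now), maxDelaySeconds);
}

console.log(retryDelaySeconds(new Date("2025-01-31T23:00:00Z"), 43200)); // 3600
```

If the delay has to be clamped, the message may be retried before the reset and fail the quota check again, which is where a cron-based recovery step is useful as a second path.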

Media Quota

Limits (configurable via `quota_settings`)

| Setting key | Default | Purpose |
| --- | --- | --- |
| `media_max_images_per_export` | 64 | Max images analyzed per export |
| `media_max_stage2_per_export` | varies | Max Stage 2 (vision caption) calls per export |
| `media_max_audio_minutes_per_export` | 60 | Max audio transcription minutes per export |
| `media_max_video_minutes_per_export` | 30 | Max video processing minutes per export |
| `media_max_frames_per_video` | 3 | Frames sampled per video |
| `media_max_video_frame_stage2_per_export` | 5 | Max Stage 2 calls on video frames per export |

Reservation pattern

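`reserveMediaUsage(db, metric, amount)` uses an atomic conditional UPDATE:

```sql
-- "column" and "limit" are placeholders for the metric's counter column and its configured cap
UPDATE media_ai_quota_usage
SET column = column + ?
WHERE usage_date = ? AND column + ? <= limit
```

If `changes = 0`, the quota is exhausted and an error is thrown. Because the check and the increment happen in a single statement, concurrent workers cannot both pass the check and push usage over the limit.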
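To make the `changes = 0` semantics concrete, here is an in-memory model of the conditional increment. It does not talk to a database and is not the real `reserveMediaUsage`; the `UsageRow` type, both function names, and the `image_stage1` column name are hypothetical.

```typescript
// One day's counters, keyed by a hypothetical column name.
type UsageRow = Record<string, number>;

// Mirrors "SET col = col + ? WHERE col + ? <= limit": returns the number of rows changed (0 or 1).
function conditionalIncrement(row: UsageRow, column: string, amount: number, limit: number): number {
  const current = row[column] ?? 0;
  if (current + amount > limit) return 0; // WHERE clause fails, nothing is written
  row[column] = current + amount;
  return 1;
}

// Throws when the reservation does not fit, as described for changes = 0.
function reserveOrThrow(row: UsageRow, column: string, amount: number, limit: number): void {
  if (conditionalIncrement(row, column, amount, limit) === 0) {
    throw new Error(`media quota exhausted for ${column}`);
  }
}

const today: UsageRow = { image_stage1: 63 };
reserveOrThrow(today, "image_stage1", 1, 64); // fits: the counter becomes 64
console.log(today["image_stage1"]);
```

A reservation that would overshoot the limit leaves the counter untouched, so a smaller reservation can still succeed afterwards.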

Deferred processing

When media quota is exhausted mid-export, remaining items are set to `status='deferred_quota'`. The `*/15` cron calls `enqueueDeferredMediaBatches()`, selecting up to 12 exports with deferred items and enqueuing a `media_deferred_batch` message for each. The queue consumer then calls `processDeferredMediaBatch()`, which re-reads the original ZIP from R2 and resumes analysis.
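The cron-side selection can be pictured as a filter followed by a cap of 12. This is a sketch under assumptions, not the body of `enqueueDeferredMediaBatches()`: the `ExportBacklog` input and the `exportId` field on the message are hypothetical, and only the `media_deferred_batch` type name and the limit of 12 come from the text.

```typescript
// Hypothetical input: one entry per export with its count of deferred items.
interface ExportBacklog {
  exportId: string;
  deferredItems: number;
}

// Hypothetical message shape; only the type name is taken from the text.
interface DeferredBatchMessage {
  type: "media_deferred_batch";
  exportId: string;
}

const MAX_EXPORTS_PER_TICK = 12;

// Picks at most 12 exports that still have deferred items and builds one message per export.
function buildDeferredBatchMessages(backlog: ExportBacklog[]): DeferredBatchMessage[] {
  return backlog
    .filter((entry) => entry.deferredItems > 0)
    .slice(0, MAX_EXPORTS_PER_TICK)
    .map((entry) => ({ type: "media_deferred_batch" as const, exportId: entry.exportId }));
}

console.log(buildDeferredBatchMessages([{ exportId: "exp-1", deferredItems: 3 }]));
```

Exports beyond the first 12 are simply picked up on a later tick once earlier ones have drained.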

Deferred drain throughput

Each `media_deferred_batch` invocation processes up to 10 items per export (capped to avoid Worker CPU/network timeout on large ZIPs). With up to 12 exports queued per `*/15` tick:

| Condition | Items/tick | Items/hour |
| --- | --- | --- |
| Normal quota, 1 export with deferred items | 10 | 40 |
| Normal quota, 12 exports | up to 120 | up to 480 |
| Bypass enabled | same 10/tick limit applies | 40–480 |

Bypass affects quota enforcement, not extraction throughput. With `media_bypass_enabled=true`, `computeDeferredMediaBudgets` returns `{ image: 200, audio: 200, video: 200 }`, so no items are blocked by per-export caps. The 10-item batch ceiling still applies per invocation, because the limiter is ZIP extraction time, not quota.

Example: 1,328 deferred items on 1 export, bypass enabled:

- 10 items/tick × 4 ticks/hour = 40 items/hour
- Estimated drain time: 1,328 / 40 ≈ 33 hours

To drain faster, increase the `selectDeferredMediaCandidates` max-total argument in `processDeferredMediaBatch` (`index.ts`). The current value of 10 is conservative for ZIPs where bytes actually need to be read and written to R2.
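The drain estimate generalizes to other batch sizes. The helper below is a back-of-the-envelope sketch (the name `estimateDrain` is hypothetical) that assumes a single export, one batch per 15-minute tick, and no skipped ticks.

```typescript
const TICK_MINUTES = 15; // interval of the */15 cron

// Ticks and hours needed to drain one export's backlog at a given per-invocation batch size.
function estimateDrain(deferredItems: number, batchSize = 10): { ticks: number; hours: number } {
  const ticks = Math.ceil(deferredItems / batchSize);
  return { ticks, hours: (ticks * TICK_MINUTES) / 60 };
}

console.log(estimateDrain(1328)); // { ticks: 133, hours: 33.25 }
console.log(estimateDrain(1328, 25)); // { ticks: 54, hours: 13.5 }
```

Rounding up to whole ticks gives 33.25 hours rather than the 33.2 obtained by dividing directly, since the final partial batch still needs its own tick.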

Missing-attachment drain path

Some deferred items can never be analyzed because the media file was referenced in the WhatsApp chat message but not included in the ZIP export. These are identified when `processDeferredMediaBatch` re-reads the ZIP and the filename lookup returns nothing.

Drain cycle for these items:

1. `loadDeferredMediaCandidates` selects up to 10 items with `status='deferred_quota'`.
2. `extractZipAttachmentsByFilenameFromR2` reads the ZIP central directory and looks up the 10 filenames; it returns 0 matches.
3. Each item is updated to `status='missing_attachment'` and exits the deferred queue.
4. `remaining` (count of `deferred_quota` + `deferred_processing` + `skipped_quota`) drops by 10.
5. The `*/15` cron re-selects the next 10 on the next tick.

The drain log shows `status=empty, reason=no_extracted_assets, selected=10, extracted=0, remaining=N` for each such tick. This is expected and not an error. The export’s “waiting for quota/cap reset” count in the admin UI will decrease by 10 per tick while “missing” increases by the same amount.
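The core of this drain cycle is a lookup of the selected filenames against the names present in the ZIP. A simplified sketch follows; `splitByZipPresence` is a hypothetical helper that works on plain string arrays and performs no real ZIP or R2 access.

```typescript
// Splits the selected filenames into those present in the ZIP and those that are not.
function splitByZipPresence(
  selected: string[],
  zipEntryNames: string[],
): { found: string[]; missing: string[] } {
  const found: string[] = [];
  const missing: string[] = [];
  for (const name of selected) {
    if (zipEntryNames.indexOf(name) !== -1) found.push(name);
    else missing.push(name); // candidate for status='missing_attachment'
  }
  return { found, missing };
}

// A ZIP exported without media contains only the chat transcript.
const tick = splitByZipPresence(["IMG-0001.jpg", "IMG-0002.jpg"], ["_chat.txt"]);
console.log(tick.found.length, tick.missing.length); // 0 2
```

When `found` is empty there is nothing to analyze in that tick, which corresponds to the `extracted=0` log line, and the remaining count falls by the number of names in `missing`.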

A large block of missing-attachment items indicates the ZIP was exported without media (e.g. “Export Without Media” was selected in WhatsApp, or attachments were unavailable at export time). The analysis that did complete was for items extracted during the initial ingest before the byte budget was hit.

The `cleanupRetainedRawExports` cron will not delete the ZIP while any `deferred_quota` items exist: the cleanup query includes `AND NOT EXISTS (SELECT 1 FROM export_media WHERE export_id = e.export_id AND status IN ('pending_analysis', 'deferred_quota', 'deferred_processing', 'skipped_quota'))`. The ZIP therefore remains available throughout the drain cycle.
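The same guard can be expressed as a predicate over an export's media statuses. This sketch covers only the `NOT EXISTS` clause quoted above; any other conditions in the real cleanup query are not modeled, and `hasBlockingMedia` is a hypothetical name.

```typescript
// Statuses that keep the raw ZIP alive, copied from the NOT EXISTS clause.
const BLOCKING_STATUSES = ["pending_analysis", "deferred_quota", "deferred_processing", "skipped_quota"];

// True when at least one media item still needs the ZIP.
function hasBlockingMedia(mediaStatuses: string[]): boolean {
  return mediaStatuses.some((status) => BLOCKING_STATUSES.indexOf(status) !== -1);
}

console.log(hasBlockingMedia(["missing_attachment", "deferred_quota"])); // true
console.log(hasBlockingMedia(["missing_attachment"])); // false
```

Items that reach `missing_attachment` no longer block cleanup, so the ZIP becomes eligible for deletion once the last deferred item has been resolved.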

Bypass Flags

Two independent bypass flags disable quota enforcement for testing or recovery:

| Setting key | Scope |
| --- | --- |
| `bypass_enabled` | Text quota (daily/weekly digest AI) |
| `media_bypass_enabled` | Media analysis quota |

When bypass is active:

- `canMakeAICall()` always returns true (text).
- `reserveMediaUsage()` calls `recordMediaUsage()` directly without checking limits (media).
- A warning is logged on every queue invocation: `⚠️ QUOTA BYPASS IS ENABLED`

Set via `POST /quota/bypass` with `{ scope: 'text' | 'media' | 'all', enabled: true }`.
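A request for this endpoint could be assembled as follows. Only the path and the body fields come from the text; the base URL, the `Content-Type` header, the lack of any authentication header, and the `buildBypassRequest` helper itself are assumptions for illustration.

```typescript
type BypassScope = "text" | "media" | "all";

// Builds the arguments for fetch(); the base URL and headers are assumptions.
function buildBypassRequest(baseUrl: string, scope: BypassScope, enabled: boolean) {
  return {
    url: `${baseUrl}/quota/bypass`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ scope, enabled }),
    },
  };
}

// Placeholder host; substitute the real deployment URL.
const bypassRequest = buildBypassRequest("https://beacon.example.com", "media", true);
console.log(bypassRequest.url, bypassRequest.init.body);
// To send it: await fetch(bypassRequest.url, bypassRequest.init);
```

Remember to send the same request with `enabled: false` afterwards, since bypass stays on until it is explicitly turned off.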

KV caching of bypass flag (when provisioned)

When the `beacon_pulse_cache` KV namespace is provisioned:

- Writing bypass via `/quota/bypass` also stores `media_bypass_enabled` in KV with a 60-second TTL.
- Queue-consumer invocations check KV before D1, reducing DB reads during media analysis.
- A 30-second module-level cache additionally eliminates redundant reads within a single invocation.

To activate it, see the commented-out KV namespace block in `wrangler.jsonc`.
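The module-level layer can be approximated with a small time-based cache in front of a slower lookup. In this sketch the `TtlCache` class and `readBypassFlag` function are hypothetical, and the slow lookup is synchronous for brevity, whereas real KV and D1 reads are asynchronous.

```typescript
// Single-value TTL cache with an injectable clock so expiry can be exercised without waiting.
class TtlCache<T> {
  private entry: { value: T; expiresAt: number } | undefined;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(): T | undefined {
    if (this.entry && this.now() < this.entry.expiresAt) return this.entry.value;
    this.entry = undefined; // expired or never set
    return undefined;
  }

  set(value: T): void {
    this.entry = { value, expiresAt: this.now() + this.ttlMs };
  }
}

// Checks the in-memory cache first and falls back to a slower lookup (KV, then D1, in the text above).
function readBypassFlag(cache: TtlCache<boolean>, slowLookup: () => boolean): boolean {
  const cached = cache.get();
  if (cached !== undefined) return cached;
  const fresh = slowLookup();
  cache.set(fresh);
  return fresh;
}

const bypassCache = new TtlCache<boolean>(30_000); // 30 seconds, as in the text
console.log(readBypassFlag(bypassCache, () => false));
```

The trade-off of any such cache is staleness: a flag change can take up to the TTL to become visible to code that already holds a cached value.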

Neuron Billing

Cloudflare charges AI usage in neurons (billing units). Beacon tracks this per invocation:

- Formula: `(inputTokens / 1M) × inputRate + (outputTokens / 1M) × outputRate`
- Model rates are stored in `quota_settings` with fallback defaults in code.
- Estimated cost in the admin UI: `neurons × $0.011 / 1000`.

Daily model (~8B): roughly 7.5 neurons per digest call pair.
Weekly model (~70B): roughly 75 neurons per weekly call pair.
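The neuron formula and the cost estimate translate directly into two small functions. The function names and the `ModelRates` shape are hypothetical, and the rates in the usage line are placeholders rather than real model rates.

```typescript
// Rates are expressed in neurons per one million tokens, as in the formula above.
interface ModelRates {
  inputRate: number;
  outputRate: number;
}

// (inputTokens / 1M) × inputRate + (outputTokens / 1M) × outputRate
function neuronsForCall(inputTokens: number, outputTokens: number, rates: ModelRates): number {
  return (inputTokens / 1e6) * rates.inputRate + (outputTokens / 1e6) * rates.outputRate;
}

// neurons × $0.011 / 1000
function estimatedCostUsd(neurons: number): number {
  return (neurons * 0.011) / 1000;
}

// Placeholder rates for illustration only.
const placeholderRates: ModelRates = { inputRate: 10, outputRate: 20 };
console.log(neuronsForCall(500000, 250000, placeholderRates)); // 10
console.log(estimatedCostUsd(8000)); // about 0.088 USD for 8,000 neurons
```

At this price, a full day at the default `daily_neuron_limit` of 8,000 neurons corresponds to roughly nine cents of equivalent usage.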

Admin Operations

| Endpoint | Purpose |
| --- | --- |
| `GET /quota/status` | Current usage, remaining budget, model info, bypass state |
| `POST /quota/set` | Manually set usage counters |
| `GET/POST /quota/bypass` | Read or change bypass flags |
| `POST /quota/config` | Update quota limits |
| `GET/POST /pipeline/daily-config` | Read or change pipeline mode and concurrency |
