Every uploaded export has a status field in the exports D1 table that drives all processing decisions.


State Machine

                         ┌─────────────────┐
                         │  pending_source  │ ← no source_id assigned
                         └────────┬────────┘
                                  │ source assigned (UI or /admin/exports/assign-source)
                                  ▼
upload → ─────────────────────► queued
                                  │ queue consumer picks up message
                                  ▼
                             processing
                           /             \
          quota exhausted /               \ all days processed
                         ▼               ▼
             (re-enqueued         weekly_processing
              with delay)               │ weekly summaries generated
                                        ▼
                                    completed
                                        
At any stage: ──────────────────────► failed

Status Definitions

Status              Meaning
pending_source      Uploaded but no source_id assigned; not processed.
queued              In queue or waiting to be picked up.
processing          Queue consumer is actively parsing and generating daily digests.
weekly_processing   Daily digests complete; generating weekly summaries.
completed           All digests and summaries generated successfully.
failed              Processing failed; error_message contains details and retry count.
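
The statuses and the automatic transitions drawn in the diagram can be captured as a small lookup table. The TypeScript below is a minimal sketch, not the pipeline's actual code: ExportStatus, ALLOWED_TRANSITIONS, and canTransition are hypothetical names, and reading "at any stage" as "every status except completed and failed" is an assumption.

```typescript
// Hypothetical model of the status field; names are illustrative only.
type ExportStatus =
  | "pending_source"
  | "queued"
  | "processing"
  | "weekly_processing"
  | "completed"
  | "failed";

// Automatic transitions as drawn in the state machine.
// Assumption: "at any stage -> failed" covers every status except
// completed and failed itself.
// The quota-exhausted re-enqueue is not modeled, because the document
// does not say which status it leaves the export in.
const ALLOWED_TRANSITIONS: Record<ExportStatus, readonly ExportStatus[]> = {
  pending_source: ["queued", "failed"],
  queued: ["processing", "failed"],
  processing: ["weekly_processing", "failed"],
  weekly_processing: ["completed", "failed"],
  completed: [],
  failed: ["queued"], // auto-retry of recoverable failures
};

function canTransition(from: ExportStatus, to: ExportStatus): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1;
}
```

A guard like this would reject, for example, an accidental completed → queued write, which is consistent with completed exports never being retried automatically.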

Transition Conditions

queued → processing

The queue consumer reads the raw export from R2, validates source_id, and sets status to processing.

processing → weekly_processing

All days in the export have been processed (days_processed = days_total). The pipeline then queues weekly summary generation.

weekly_processing → completed

finalizeCompletedWeeklyExports() (cron) detects that weeks_processed >= weeks_total or that weekly work is already complete and sets status = 'completed'.
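
The two counter-based conditions above can be written as one pure function. This is a sketch under assumptions: ExportProgress and nextStatusFromProgress are hypothetical names, and the "weekly work is already complete" branch is reduced to a boolean flag because this document does not say how that case is detected.

```typescript
// Hypothetical row shape; column names follow the exports table as described.
interface ExportProgress {
  status: "processing" | "weekly_processing";
  days_processed: number;
  days_total: number;
  weeks_processed: number;
  weeks_total: number;
}

// Returns the status the export should move to, or null if it should stay put.
// weeklyWorkAlreadyComplete stands in for the undocumented second check.
function nextStatusFromProgress(
  row: ExportProgress,
  weeklyWorkAlreadyComplete = false,
): "weekly_processing" | "completed" | null {
  if (row.status === "processing") {
    return row.days_processed === row.days_total ? "weekly_processing" : null;
  }
  // row.status is "weekly_processing" from here on.
  const weeksDone = row.weeks_processed >= row.weeks_total;
  return weeksDone || weeklyWorkAlreadyComplete ? "completed" : null;
}
```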

* → failed

Any unhandled error during processing sets status to failed and records the error in error_message.

failed → queued (auto-retry)

retryRecoverableFailedExports() (cron) checks failed exports every 5 minutes. If the error is recoverable (quota, transient network, etc.) and AUTO_RETRY_MAX_ATTEMPTS has not been reached, the export is reset to queued and re-enqueued.
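
The auto-retry decision can be sketched as follows. Everything here is illustrative: the document does not say how recoverable errors are recognised, so matching on error_message text is an assumption, the patterns are placeholders, and isRecoverable and shouldAutoRetry are hypothetical names. The value of AUTO_RETRY_MAX_ATTEMPTS is not given here, so it is passed in as a parameter.

```typescript
// Assumption: recoverable errors are recognised by matching error_message text.
// These patterns are placeholders, not the pipeline's real classifier.
const RECOVERABLE_PATTERNS: readonly RegExp[] = [/quota/i, /timeout/i, /network/i];

function isRecoverable(errorMessage: string): boolean {
  return RECOVERABLE_PATTERNS.some((pattern) => pattern.test(errorMessage));
}

// attempts: automatic retries already made for this export.
// maxAttempts: the configured AUTO_RETRY_MAX_ATTEMPTS.
function shouldAutoRetry(
  errorMessage: string,
  attempts: number,
  maxAttempts: number,
): boolean {
  return attempts < maxAttempts && isRecoverable(errorMessage);
}
```

When shouldAutoRetry returns true, the export would be reset to queued and re-enqueued; otherwise it stays in failed until someone intervenes manually.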


Cron Retry Logic

The cron (*/5 * * * *) uses this query to find stuck exports:

-- minutes_since_update is shorthand for the minutes elapsed since the row was last updated
SELECT * FROM exports
WHERE status IN ('processing', 'weekly_processing', 'queued')
  AND (days_processed < days_total OR weeks_processed < weeks_total OR status = 'queued')
  AND minutes_since_update >= 10;

Important: This stuck-export query only matches exports in processing, weekly_processing, or queued. Failed exports are handled separately by retryRecoverableFailedExports(). Exports with status = 'completed' are never automatically retried, even if they have zero daily digests (for example, because quota was exhausted after hashes were written but before digests were generated). To re-trigger a completed export with missing digests, reset its status to processing in D1 or call the /replay/export endpoint.
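
The same selection rule can be expressed as an in-memory predicate, which makes it easy to see why a completed export never qualifies. This is a sketch, not the cron's code: ExportRow, isStuck, and the updated_at_ms field (last update time in epoch milliseconds) are hypothetical.

```typescript
interface ExportRow {
  status: string;
  days_processed: number;
  days_total: number;
  weeks_processed: number;
  weeks_total: number;
  updated_at_ms: number; // hypothetical: last update time, epoch milliseconds
}

const STUCK_STATUSES = ["processing", "weekly_processing", "queued"];

// Mirrors the WHERE clause of the stuck-export query.
function isStuck(row: ExportRow, nowMs: number, minAgeMinutes = 10): boolean {
  if (STUCK_STATUSES.indexOf(row.status) === -1) return false;
  const hasPendingWork =
    row.days_processed < row.days_total ||
    row.weeks_processed < row.weeks_total ||
    row.status === "queued";
  const minutesSinceUpdate = (nowMs - row.updated_at_ms) / 60000;
  return hasPendingWork && minutesSinceUpdate >= minAgeMinutes;
}
```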


Manual Recovery

Re-queue a stuck export

POST /replay/export
{ "export_id": "...", "community_id": "..." }

Reads the raw export from R2 and re-enqueues it. Fails if the raw R2 object has already been deleted.

Re-queue all exports for a community

POST /replay/all
{ "community_id": "...", "clear_first": false }

Skips exports whose raw R2 object is missing and reports how many were skipped.

Re-queue stuck exports manually

POST /replay/stuck
{ "community_id": "...", "max_age_minutes": 10 }

If max_age_minutes is omitted, the API layer defaults to a 30-minute staleness threshold (the cron uses 10 minutes).
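
The three replay endpoints take small JSON bodies, so a typed request builder helps avoid mixing up their fields. The sketch below is illustrative: buildReplayCall and ReplayRequest are hypothetical names, the base URL is a placeholder, the Content-Type header is an assumption, and authentication is left out because this document does not describe it.

```typescript
// Request bodies as shown in the examples above.
type ReplayRequest =
  | { path: "/replay/export"; body: { export_id: string; community_id: string } }
  | { path: "/replay/all"; body: { community_id: string; clear_first: boolean } }
  | { path: "/replay/stuck"; body: { community_id: string; max_age_minutes?: number } };

interface ReplayCall {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// baseUrl is a placeholder for wherever the worker is deployed.
function buildReplayCall(baseUrl: string, req: ReplayRequest): ReplayCall {
  return {
    url: baseUrl.replace(/\/+$/, "") + req.path, // avoid a double slash
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(req.body),
    },
  };
}
```

The result can be passed straight to fetch(call.url, call.init). Omitting max_age_minutes from a /replay/stuck body leaves the threshold to the API layer's default.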


Raw Export Retention

Raw exports stored in R2 are subject to automatic cleanup:

Scenario                                  Default retention
Completed processing, no pending media    72 hours after completion
Failed processing                         168 hours after failure
Pending media still attached              Held until media is complete or quota clears

Cleanup runs in batches of 75 per cron tick (raised from 25 in April 2026). Once the raw object is deleted, replay from R2 is no longer possible.
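
The retention rules and batch limit can be sketched as a selection function. This is an illustration under assumptions: RetentionCandidate, finished_at_ms (completion or failure time in epoch milliseconds), and has_pending_media are hypothetical, and applying the pending-media hold to failed exports as well as completed ones is an assumption.

```typescript
const HOUR_MS = 60 * 60 * 1000;
const COMPLETED_RETENTION_HOURS = 72;
const FAILED_RETENTION_HOURS = 168;
const CLEANUP_BATCH_SIZE = 75;

interface RetentionCandidate {
  export_id: string;
  status: "completed" | "failed";
  finished_at_ms: number; // hypothetical: completion or failure time
  has_pending_media: boolean;
}

function isEligibleForCleanup(c: RetentionCandidate, nowMs: number): boolean {
  // Held until media is complete or quota clears.
  if (c.has_pending_media) return false;
  const retentionHours =
    c.status === "completed" ? COMPLETED_RETENTION_HOURS : FAILED_RETENTION_HOURS;
  return nowMs - c.finished_at_ms >= retentionHours * HOUR_MS;
}

// Picks at most one batch of raw exports to delete in a single cron tick.
function selectCleanupBatch(
  candidates: RetentionCandidate[],
  nowMs: number,
): RetentionCandidate[] {
  return candidates
    .filter((c) => isEligibleForCleanup(c, nowMs))
    .slice(0, CLEANUP_BATCH_SIZE);
}
```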


Source Assignment

Exports uploaded without a source_id are immediately set to pending_source and are not processed until a source is assigned. If the community has exactly one source configured in chat_sources, the worker auto-assigns it at upload time. Otherwise, use the admin UI or /admin/exports/assign-source.
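
The upload-time decision described above can be sketched as a pure function. The names resolveInitialAssignment and InitialAssignment are hypothetical, and communitySourceIds stands in for whatever the worker reads from chat_sources for the community.

```typescript
interface InitialAssignment {
  status: "queued" | "pending_source";
  source_id: string | null;
}

// uploadedSourceId: the source_id supplied with the upload, or null if none.
// communitySourceIds: ids of the sources configured for the community.
function resolveInitialAssignment(
  uploadedSourceId: string | null,
  communitySourceIds: string[],
): InitialAssignment {
  if (uploadedSourceId !== null) {
    return { status: "queued", source_id: uploadedSourceId };
  }
  if (communitySourceIds.length === 1) {
    // Exactly one configured source: auto-assign it at upload time.
    return { status: "queued", source_id: communitySourceIds[0] };
  }
  // Zero or several sources: wait for a manual assignment.
  return { status: "pending_source", source_id: null };
}
```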


Progress Tracking

Column            Meaning
days_total        Number of unique dates in the export file
days_processed    Days for which daily digests have been written
weeks_total       Number of weeks containing daily data
weeks_processed   Weeks for which weekly summaries have been written

Note that days_processed = days_total does not guarantee that all daily digests exist; it only means the processing loop ran to completion. To determine which days still need work, the pipeline queries daily_digests directly. See ingest for the duplicate-only path.
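
Working out which days still need digests amounts to a set difference between the dates in the export and the dates already present in daily_digests. The sketch below assumes both are available as arrays of date strings; findDaysMissingDigests is a hypothetical name, and how the pipeline actually loads these lists is not covered here.

```typescript
// exportDates: dates found in the export file (may contain repeats).
// digestDates: dates that already have a row in daily_digests.
// Returns each export date that has no digest yet, once, in first-seen order.
function findDaysMissingDigests(
  exportDates: string[],
  digestDates: string[],
): string[] {
  return exportDates.filter(
    (date, index) =>
      exportDates.indexOf(date) === index && // drop repeated dates
      digestDates.indexOf(date) === -1, // keep only dates without a digest
  );
}
```

An empty result means every day has a digest, regardless of what the days_processed counter says.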