Key decisions
- Edge-first architecture on Cloudflare Workers with D1, R2, Queues, and Workers AI.
- Privacy boundary: raw exports and message hashes are private, and daily digests are private unless PUBLIC_DAILY_DIGESTS is enabled; weekly summaries are public.
- Daily digests are generated from raw messages; weekly summaries are generated from daily digests only.
- The week window runs Sunday through Saturday (UTC), matching SQLite's strftime('%w') convention, in which Sunday is weekday 0.
- Weekly summaries are generated only when a week has >=3 days or >=20 messages.
- Deduplication is keyed by (community_id, source_id, day_date, content_hash).
- Phone numbers, email addresses, and URLs are redacted from AI inputs; AI outputs are validated against the privacy rules and the expected schema.
- /pulse/daily.json can be public only if PUBLIC_DAILY_DIGESTS is enabled; otherwise it should be admin-only.
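The weekly rules in the list above (the Sunday-Saturday UTC window, the >=3 days or >=20 messages threshold, and the composite dedup key) can be sketched in TypeScript as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the function names `weekWindowUtc`, `isWeekEligible`, and `dedupKey`, the YYYY-MM-DD date format, the reading of "days" as days that have a daily digest, and the JSON-array encoding of the key are all hypothetical.

```typescript
// Hypothetical helpers; names, date format, and key encoding are assumptions.

const DAY_MS = 24 * 60 * 60 * 1000;

/** Format a Date as YYYY-MM-DD in UTC. */
function toIsoDate(d: Date): string {
  return d.toISOString().slice(0, 10);
}

/**
 * Return the Sunday-Saturday (UTC) week that contains `date`.
 * getUTCDay() returns 0 for Sunday, the same convention as SQLite's strftime('%w').
 */
function weekWindowUtc(date: Date): { start: string; end: string } {
  const midnight = Date.UTC(
    date.getUTCFullYear(),
    date.getUTCMonth(),
    date.getUTCDate()
  );
  const start = new Date(midnight - date.getUTCDay() * DAY_MS);
  const end = new Date(start.getTime() + 6 * DAY_MS);
  return { start: toIsoDate(start), end: toIsoDate(end) };
}

/**
 * Weekly summary threshold: at least 3 days OR at least 20 messages.
 * "Days" is assumed here to mean days that have a daily digest.
 */
function isWeekEligible(daysWithDigests: number, messageCount: number): boolean {
  return daysWithDigests >= 3 || messageCount >= 20;
}

/**
 * Composite dedup key over (community_id, source_id, day_date, content_hash).
 * Encoding the tuple as a JSON array avoids ambiguity when an ID contains
 * a separator character.
 */
function dedupKey(
  communityId: string,
  sourceId: string,
  dayDate: string,
  contentHash: string
): string {
  return JSON.stringify([communityId, sourceId, dayDate, contentHash]);
}
```

For example, `weekWindowUtc(new Date("2024-01-10T15:30:00Z"))` (a Wednesday) yields the window 2024-01-07 through 2024-01-13. In D1 the same dedup rule would more likely be enforced with a unique index over the four columns; the string key is only a convenient in-memory equivalent.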
Sources: beacon-platform/docs/architecture.md, beacon-platform/docs/privacy.md, beacon-platform/docs/operations.md, beacon-platform/AUDIT_REPORT.md
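The redaction of AI inputs listed among the key decisions could look roughly like the sketch below. It is an assumption-laden illustration only: the function name `redactForAi`, the placeholder tokens, and the regular expressions are hypothetical, and the patterns are deliberately simple (the phone pattern, for instance, will also catch date-like digit runs such as 2024-01-10, which errs on the side of over-redaction).

```typescript
// Hypothetical redaction pass; the patterns and placeholders are assumptions,
// not the project's actual rules.

const URL_RE = /\bhttps?:\/\/[^\s<>"')]+/gi;
const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
// Loose phone pattern: optional "+", then at least 9 digits/separators,
// starting and ending on a digit. It over-matches on purpose.
const PHONE_RE = /\+?\d[\d\s().-]{7,}\d/g;

/**
 * Replace URLs, email addresses, and phone numbers with placeholders.
 * URLs are handled first so that digits or "@" inside a URL are not
 * partially rewritten by the later patterns.
 */
function redactForAi(text: string): string {
  return text
    .replace(URL_RE, "[URL]")
    .replace(EMAIL_RE, "[EMAIL]")
    .replace(PHONE_RE, "[PHONE]");
}
```

A pass like this only reduces what reaches the model; it does not replace the output-side privacy and schema validation mentioned above.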
Tradeoffs and constraints
- Retention is not enforced in code; it depends on R2 lifecycle rules and D1 cleanup jobs being configured separately.
- Admin access relies on Cloudflare Access or ADMIN_TOKEN/ADMIN_SECRET configuration.
- Public summaries sacrifice granularity for privacy; message and participant counts are not exposed in public weekly outputs.
Sources: beacon-platform/docs/privacy.md, beacon-platform/AUDIT_REPORT.md, beacon-platform/REMEDIATION_PLAN.md
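Because admin access depends on configuration, an access check for /pulse/daily.json should fail closed when nothing is configured. The sketch below shows one way to express that as a pure function. Everything beyond the names PUBLIC_DAILY_DIGESTS and ADMIN_TOKEN is an assumption: the `Env` shape, the string value "true" for the flag, the `Bearer` header format, and the function names are hypothetical, and Cloudflare Access JWT validation is not modeled here.

```typescript
// Hypothetical bindings; only the two variable names come from the notes above.
interface Env {
  PUBLIC_DAILY_DIGESTS?: string; // assumed to be the string "true" when enabled
  ADMIN_TOKEN?: string;
}

/**
 * Compare two strings without stopping at the first mismatch.
 * The length check still reveals whether the lengths differ.
 */
function safeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}

/** Decide whether a request may read the daily digest endpoint. */
function canReadDailyDigest(env: Env, authorizationHeader: string | null): boolean {
  // Public mode: anyone may read.
  if (env.PUBLIC_DAILY_DIGESTS === "true") return true;

  // Fail closed: with no token configured, nobody gets admin access.
  if (!env.ADMIN_TOKEN) return false;
  if (authorizationHeader === null) return false;

  const prefix = "Bearer ";
  if (!authorizationHeader.startsWith(prefix)) return false;
  return safeEqual(authorizationHeader.slice(prefix.length), env.ADMIN_TOKEN);
}
```

In a Worker's fetch handler this would be called with `request.headers.get("Authorization")`, returning a 401 or 403 response when it yields false.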
Open questions to confirm
- Is ADMIN_TOKEN/ADMIN_SECRET or a Cloudflare Access JWT enforced in production?
- Is R2 lifecycle retention configured for the exports bucket?
- Should /pulse/daily.json be public in production?
- Who is the data controller and DSAR contact?
- Are minors or special-category data expected in exports?
- What are the Cloudflare log retention policies for this deployment?
Sources: beacon-platform/AUDIT_REPORT.md