Privacy protection is built into every layer: storage, processing, and output.


Overview

Privacy protection is built into storage, processing, and output. Public data is always aggregate-only.

Sources: beacon-platform/docs/privacy.md, beacon-platform/apps/pulse-public/src/index.ts


What we never show publicly

  • Names or phone numbers.
  • Direct message quotes.
  • Specific timestamps (e.g. “3:45 PM”).
  • Group names or identifiers.
  • Any information that could identify individuals.

Sources: beacon-platform/apps/pulse-public/src/index.ts


What we do show publicly

  • Weekly aggregate summaries.
  • Sentiment scores and themes.
  • Message and participant counts (totals only).
  • Date ranges (day-level or week-level).

Sources: beacon-platform/apps/pulse-public/src/index.ts
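
One way to keep the public output limited to the items above is to build it from an explicit allowlist of fields. The sketch below is illustrative only: the type names, field names, and `toPublicSummary` are assumptions and are not taken from index.ts.

```typescript
// Hypothetical shapes: field names are illustrative, not taken from the codebase.
interface InternalWeeklySummary {
  weekStart: string;        // ISO date or datetime
  weekEnd: string;
  sentimentScore: number;   // expected in [-1, 1]
  themes: string[];
  messageCount: number;     // total only
  participantCount: number; // total only
  sourceGroupId: string;    // private: identifies a group
  generatedAt: string;      // private: precise timestamp
}

interface PublicWeeklySummary {
  weekStart: string;
  weekEnd: string;
  sentimentScore: number;
  themes: string[];
  messageCount: number;
  participantCount: number;
}

// Copy fields one by one (an allowlist) so that a private field added
// later is never exposed by accident.
function toPublicSummary(s: InternalWeeklySummary): PublicWeeklySummary {
  return {
    weekStart: s.weekStart.slice(0, 10), // keep the date, drop any time part
    weekEnd: s.weekEnd.slice(0, 10),
    sentimentScore: s.sentimentScore,
    themes: [...s.themes],
    messageCount: s.messageCount,
    participantCount: s.participantCount,
  };
}
```

Copying fields explicitly, rather than deleting private ones from a clone, means the public shape fails closed when the internal shape grows.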


Privacy layers

1) Data separation

Private zone (never public):

  • Raw chat exports in R2.
  • Message hashes for deduplication.
  • Daily digests.

Public zone:

  • Weekly summaries only.
  • Aggregated metrics.
  • Sanitized themes.

Sources: beacon-platform/docs/privacy.md, beacon-platform/apps/pulse-public/src/index.ts


Public vs private data boundary

flowchart LR
  subgraph PrivateZone
    RawExports[R2 exports]
    MessageHashes
    DailyDigests
  end
  subgraph PublicZone
    WeeklyPublic
    PulseApi["/pulse.json + /pulse/history.json"]
  end
  RawExports --> MessageHashes --> DailyDigests --> WeeklyPublic --> PulseApi

Sources: beacon-platform/docs/privacy.md, beacon-platform/apps/pulse-public/src/index.ts

2) Input protection

Before AI processing:

  • Phone numbers redacted.
  • Email addresses redacted.
  • URLs redacted.
  • Mentions optionally filtered.

Sources: beacon-platform/docs/privacy.md, beacon-platform/apps/pulse-public/src/index.ts
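
The redaction step above could be sketched roughly as follows. The regular expressions, the `[redacted]` placeholder, and the `redact` function are assumptions for illustration; the patterns are heuristics and will not catch every phone number, address, or link format.

```typescript
// Assumed placeholder text; the real pipeline may use a different marker.
const PLACEHOLDER = "[redacted]";

// Heuristic patterns (assumptions, not the project's actual expressions).
const URL_RE = /\b(?:https?:\/\/|www\.)\S+/gi;
const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const PHONE_RE = /\+?\d[\d\s().-]{7,}\d/g;
const MENTION_RE = /@\w+/g;

// Redact URLs first (they can contain digits and "@"), then emails,
// then phone numbers. Mentions are filtered only when requested.
function redact(text: string, filterMentions = false): string {
  let out = text
    .replace(URL_RE, PLACEHOLDER)
    .replace(EMAIL_RE, PLACEHOLDER)
    .replace(PHONE_RE, PLACEHOLDER);
  if (filterMentions) {
    out = out.replace(MENTION_RE, PLACEHOLDER);
  }
  return out;
}

const sample =
  "Call +1 (555) 123-4567 or mail jo@example.com, see https://example.com/x?y=1 @sam";
console.log(redact(sample));
// Call [redacted] or mail [redacted], see [redacted] @sam
console.log(redact(sample, true));
// Call [redacted] or mail [redacted], see [redacted] [redacted]
```

The order matters: running the email pattern before the mention pattern prevents the `@domain` part of an address from being treated as a mention.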

3) AI guardrails

Prompt instructions enforce:

  • No direct quotes.
  • No names or identifiers.
  • No timestamps.
  • Aggregate analysis only.

Sources: beacon-platform/docs/privacy.md, beacon-platform/apps/pulse-public/src/index.ts
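
As an illustration of how such instructions might be assembled, here is a minimal sketch. The rule wording, `PRIVACY_RULES`, and `buildSystemPrompt` are hypothetical; the actual prompt text in the codebase may differ.

```typescript
// Hypothetical wording: mirrors the four rules listed above.
const PRIVACY_RULES: readonly string[] = [
  "Do not quote any message directly.",
  "Do not mention names, group names, or other identifiers.",
  "Do not mention timestamps or times of day.",
  "Describe only aggregate patterns across the whole conversation.",
];

// Build the system prompt. The `strict` flag appends an extra reminder,
// intended for a retry after a failed validation.
function buildSystemPrompt(strict = false): string {
  const lines = ["You summarize community chat activity.", "Rules:"];
  PRIVACY_RULES.forEach((rule, i) => lines.push(`${i + 1}. ${rule}`));
  if (strict) {
    lines.push(
      "A previous answer broke these rules. Respond with JSON only and omit any detail you are unsure about."
    );
  }
  return lines.join("\n");
}
```

Prompt instructions reduce the chance of a leak but do not guarantee compliance, which is why the output is still validated afterwards.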

4) Output validation

Every AI output is checked for:

  • Valid JSON schema.
  • No PII patterns (regex).
  • Score within range [-1, 1].
  • Sanitized themes (at most 5).

If validation fails:

  1. Retry once with stricter instructions.
  2. Fall back to a neutral summary.

Sources: beacon-platform/docs/privacy.md, beacon-platform/apps/pulse-public/src/index.ts
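
The checks above could look roughly like the sketch below. The `Summary` shape, the PII patterns, and `validateOutput` are assumptions for illustration. In particular, this sketch truncates the theme list to five entries rather than rejecting longer lists; the source does not say which behavior is used.

```typescript
// Hypothetical output shape; field names are illustrative.
interface Summary {
  score: number;
  themes: string[];
  narrative: string;
}

type ValidationResult =
  | { ok: true; summary: Summary }
  | { ok: false; reason: string };

// Heuristic PII patterns (assumed): emails, phone-like digit runs, clock times.
const PII_PATTERNS: RegExp[] = [
  /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/,
  /\+?\d[\d\s().-]{7,}\d/,
  /\b\d{1,2}:\d{2}\s?(?:AM|PM)?\b/i,
];

function validateOutput(raw: string): ValidationResult {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { ok: false, reason: "invalid JSON" };
  }
  if (typeof data !== "object" || data === null) {
    return { ok: false, reason: "not an object" };
  }
  const { score, themes, narrative } = data as Record<string, unknown>;
  if (typeof score !== "number" || !Number.isFinite(score) || score < -1 || score > 1) {
    return { ok: false, reason: "score out of range" };
  }
  if (!Array.isArray(themes) || !themes.every((t) => typeof t === "string")) {
    return { ok: false, reason: "themes must be strings" };
  }
  if (typeof narrative !== "string") {
    return { ok: false, reason: "narrative must be a string" };
  }
  // Sanitize themes: trim, drop empties, keep at most five.
  const cleanThemes: string[] = (themes as string[])
    .map((t) => t.trim())
    .filter((t) => t.length > 0)
    .slice(0, 5);
  const text = [narrative, ...cleanThemes].join("\n");
  if (PII_PATTERNS.some((re) => re.test(text))) {
    return { ok: false, reason: "PII pattern found" };
  }
  return { ok: true, summary: { score, themes: cleanThemes, narrative } };
}
```

Returning a reason string alongside the failure makes it easier to log why an output was rejected before the retry or fallback runs.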


AI guardrails flow

flowchart TD
  Raw[Raw messages] --> Redact[Redact phones/emails/URLs]
  Redact --> LLM[AI analysis + narrative]
  LLM --> Validate{Schema + privacy checks}
  Validate -- pass --> Store[Store summary]
  Validate -- fail --> Retry[Retry once with stricter prompt]
  Retry --> Revalidate{Schema + privacy checks}
  Revalidate -- pass --> Store
  Revalidate -- fail --> Fallback[Neutral fallback summary]

Sources: beacon-platform/apps/pulse-public/src/index.ts
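
The validate, retry once, then fall back control flow can be sketched generically as below. The names `Generate`, `Check`, and `summarizeWithGuardrails` are hypothetical: `generate` stands in for the AI call (receiving `true` when the stricter prompt should be used), and `check` stands in for whatever schema and privacy validation is applied.

```typescript
// Produces raw model output; `strict` is true on the retry.
type Generate = (strict: boolean) => Promise<string>;

// Returns the validated value, or null when the output must be rejected.
type Check<T> = (raw: string) => T | null;

interface GuardedResult<T> {
  value: T;
  source: "first" | "retry" | "fallback";
}

async function summarizeWithGuardrails<T>(
  generate: Generate,
  check: Check<T>,
  fallback: T
): Promise<GuardedResult<T>> {
  // First attempt with the normal prompt.
  const first = check(await generate(false));
  if (first !== null) {
    return { value: first, source: "first" };
  }
  // Exactly one retry with the stricter prompt.
  const second = check(await generate(true));
  if (second !== null) {
    return { value: second, source: "retry" };
  }
  // Both attempts failed validation: use the neutral fallback.
  return { value: fallback, source: "fallback" };
}
```

Because the retry count is fixed in the code rather than expressed as a loop, a model that keeps failing validation cannot cause unbounded calls.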


Data lifecycle

Storage

  • R2: raw exports (encrypted at rest).
  • D1: hashes, digests, summaries.
  • Public access: only weekly_summaries_public.

Cleanup

  • /clear endpoint removes summaries, digests, and optionally hashes.
  • R2 deletion events trigger cleanup.
  • Retention enforced via R2 lifecycle rules or manual cleanup.

Sources: beacon-platform/docs/privacy.md, beacon-platform/infra/CLEANUP_GUIDE.md, beacon-platform/apps/pulse-public/src/index.ts
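
To make the "optionally hashes" behavior of /clear concrete, here is a minimal sketch that only plans which tables to empty. `clearStatements` and the table names `daily_digests` and `message_hashes` are assumptions; only `weekly_summaries_public` appears in the text above, and the real schema may differ.

```typescript
// Table names other than weekly_summaries_public are hypothetical.
function clearStatements(includeHashes: boolean): string[] {
  const tables = ["weekly_summaries_public", "daily_digests"];
  // Hashes are removed only on request; keeping them preserves
  // deduplication for any later re-import of the same export.
  if (includeHashes) {
    tables.push("message_hashes");
  }
  return tables.map((t) => `DELETE FROM ${t}`);
}

console.log(clearStatements(false));
console.log(clearStatements(true));
```

In a Worker with a D1 binding, statements like these could be prepared and run together as a batch so that the cleanup does not stop halfway.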


Limitations and best practices

Limitations:

  • Perfect anonymization is not guaranteed.
  • Admin endpoints currently lack authentication.
  • AI may occasionally generate unexpected output (mitigated by validation).

Best practices:

  • Protect admin endpoints with Cloudflare Access or an auth gateway.
  • Monitor AI outputs regularly.
  • Keep /pulse/daily.json internal unless explicitly intended to be public.

Sources: beacon-platform/apps/pulse-public/src/index.ts, beacon-platform/AUDIT_REPORT.md
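
Until Cloudflare Access or a gateway is in place, a default-deny route check is one simple stopgap. The sketch below uses a shared bearer token, which is a swapped-in, simpler technique and not Cloudflare Access itself. `PUBLIC_PATHS`, `HeaderReader`, `isAllowed`, and the token handling are all assumptions for illustration.

```typescript
// Paths assumed safe for anonymous access; everything else needs the token.
// Note that /pulse/daily.json is deliberately absent from this list.
const PUBLIC_PATHS: ReadonlySet<string> = new Set([
  "/pulse.json",
  "/pulse/history.json",
]);

// Minimal structural type, satisfied by the standard Headers object.
interface HeaderReader {
  get(name: string): string | null;
}

// Compare without returning early on the first mismatch, so timing
// reveals less about how much of the token was correct.
function timingSafeEqual(a: string, b: string): boolean {
  let diff = a.length ^ b.length;
  const n = Math.max(a.length, b.length);
  for (let i = 0; i < n; i++) {
    diff |= (a.charCodeAt(i) || 0) ^ (b.charCodeAt(i) || 0);
  }
  return diff === 0;
}

function isAllowed(path: string, headers: HeaderReader, adminToken: string): boolean {
  if (PUBLIC_PATHS.has(path)) {
    return true;
  }
  const auth = headers.get("Authorization") ?? "";
  const prefix = "Bearer ";
  if (!auth.startsWith(prefix)) {
    return false;
  }
  // An empty configured token never grants access.
  return adminToken.length > 0 && timingSafeEqual(auth.slice(prefix.length), adminToken);
}
```

Listing the public paths explicitly means a newly added endpoint is private by default, which matches the advice to keep /pulse/daily.json internal unless it is intentionally published.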