SmartStore AI — Governance Documentation

Per Volume 4, Chapter 11 of the bootcamp, this document records written, if informal, answers to four questions before a real customer, investor, or security review asks them as a surprise.

1. Data retention

User queries and generated answers: retained in PostgreSQL indefinitely as part of the audit trail (Section 3), but the raw text of a query is not displayed back to any user other than the one who asked it.
Session state (Redis, Phase 9): expires automatically after 1 hour (ttl_seconds=3600) — this is genuinely ephemeral, not a durable record.
Semantic cache entries (Qdrant, Phase 9): no automatic expiry yet — action item: add a TTL or catalog-update-triggered invalidation before this goes to production with real changing inventory (Volume 6, Ch.9's cache-invalidation exercise, not yet implemented).
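One possible shape for the cache-invalidation action item above, as a minimal sketch and not the Phase 9 implementation. It uses an in-memory dict with exact-match string keys in place of the Qdrant-backed semantic lookup. The names ExpiringAnswerCache, CacheEntry, and on_catalog_update are hypothetical. It combines both options mentioned: a TTL, plus a catalog version counter that turns older entries into misses after an inventory change.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CacheEntry:
    answer: str
    stored_at: float       # seconds, read from the injected clock
    catalog_version: int   # catalog version the answer was generated against


class ExpiringAnswerCache:
    """Hypothetical in-memory stand-in for the semantic cache (not Qdrant)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.catalog_version = 0
        self._entries: dict[str, CacheEntry] = {}

    def put(self, key: str, answer: str) -> None:
        self._entries[key] = CacheEntry(answer, self.clock(), self.catalog_version)

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expired = self.clock() - entry.stored_at >= self.ttl_seconds
        stale = entry.catalog_version != self.catalog_version
        if expired or stale:
            # Drop the entry so the caller regenerates the answer.
            del self._entries[key]
            return None
        return entry.answer

    def on_catalog_update(self) -> None:
        """Call when inventory changes; older entries become misses on next read."""
        self.catalog_version += 1
```

The clock is injected so that expiry can be tested without sleeping. A real version would still need to decide which catalog changes should bump the version, which this sketch leaves open.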

2. Vendor data handling

Questions and retrieved context are sent to Anthropic's API (Claude) and OpenAI's API (embeddings) to generate answers. Both are processed under each provider's API terms — action item before production launch: confirm whether the specific account tier in use has zero-data-retention or no-training-on-API-data terms, and document the answer here explicitly, rather than assuming.
No customer data is sent to any provider beyond what's necessary for that specific request (the question text and retrieved product context) — full user records (email, role) are never included in a prompt.
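The data-minimization rule above can be made structural rather than a convention. The sketch below is an illustration under assumptions, not the project's actual prompt code: build_prompt, UserRecord, and assert_no_user_email are hypothetical names. The prompt builder simply has no parameter through which a user record could enter, and a backstop check rejects outgoing text that contains the user's email.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserRecord:
    # Hypothetical shape of a full user record; never passed to build_prompt.
    user_id: str
    email: str
    role: str


def build_prompt(question: str, product_context: list[str]) -> str:
    """Build provider-bound text from the question and retrieved context only."""
    context_block = "\n".join(f"- {item}" for item in product_context)
    return f"Context:\n{context_block}\n\nQuestion: {question}"


def assert_no_user_email(prompt: str, user: UserRecord) -> None:
    """Backstop: raise if the user's email appears anywhere in outgoing text."""
    if user.email.lower() in prompt.lower():
        raise ValueError("user email found in outgoing prompt")
```

The backstop checks only the email, because a role name such as "manager" can legitimately appear in a question. It also does not catch other personal data a user types into their own question, so it complements the rule and does not replace it.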

3. Audit trail

Every /ask, /ask/agent, and /ask/image request is traced (Phase 10, OpenTelemetry) with store_id and result metadata attached to spans.
Action item: Phase 10's tracing currently exports to console (ConsoleSpanExporter) — production needs this pointed at a real backend (Grafana/Tempo, or a managed APM) with retention matching the policy stated in Section 1, and a way to look up "what did the assistant tell user X on date Y" by user ID and timestamp, not just by trace ID.
Phase 6's RLS policies and RBAC checks mean that, even for someone with full database access, the question "what could user X have seen" can be answered by replaying that user's role and store scope. This traceability is a direct benefit of Phase 6's design, not an afterthought.
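The lookup "what did the assistant tell user X on date Y" might look like the following. This is a sketch under stated assumptions: it uses an in-memory SQLite database as a stand-in for PostgreSQL, and the ask_audit table, its columns, and both function names are hypothetical, not the project's schema. It also assumes timestamps are stored as ISO 8601 UTC strings in one consistent format, so that string comparison matches time order.

```python
import sqlite3
from datetime import date, timedelta

SCHEMA = """
CREATE TABLE ask_audit (
    trace_id  TEXT PRIMARY KEY,
    user_id   TEXT NOT NULL,
    store_id  TEXT NOT NULL,
    asked_at  TEXT NOT NULL,   -- ISO 8601 UTC timestamp
    question  TEXT NOT NULL,
    answer    TEXT NOT NULL
);
CREATE INDEX ask_audit_user_time ON ask_audit (user_id, asked_at);
"""


def open_audit_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def record_ask(conn: sqlite3.Connection, trace_id: str, user_id: str,
               store_id: str, asked_at: str, question: str, answer: str) -> None:
    conn.execute(
        "INSERT INTO ask_audit VALUES (?, ?, ?, ?, ?, ?)",
        (trace_id, user_id, store_id, asked_at, question, answer),
    )


def answers_for_user_on(conn: sqlite3.Connection, user_id: str, day: str):
    """Return (asked_at, question, answer, trace_id) rows for one UTC day."""
    start = date.fromisoformat(day)
    end = start + timedelta(days=1)
    cursor = conn.execute(
        "SELECT asked_at, question, answer, trace_id FROM ask_audit "
        "WHERE user_id = ? AND asked_at >= ? AND asked_at < ? "
        "ORDER BY asked_at",
        (user_id, start.isoformat(), end.isoformat()),
    )
    return cursor.fetchall()
```

Storing trace_id alongside each row keeps the link back to the span in whichever tracing backend is chosen, while the (user_id, asked_at) index serves the lookup that trace IDs alone cannot.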

4. Usage policy / what the assistant can do

The assistant answers only from retrieved, grounded product/store data (Phase 3's system prompt) — it does not generate unsupported claims about pricing, promotions, or stock beyond what's in the database.
Any action with real side effects (Phase 7's agent tools, and any future tool with write access) requires the confirmation/idempotency guardrails from Volume 3, Chapter 11 of the bootcamp. Action item: check_store_hours and get_product_location (Phase 7) are both read-only today, so the first tool with a real side effect (e.g., "flag low stock") must not ship until this guardrail is explicitly implemented and tested, not just documented as a principle.
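A minimal sketch of what a confirmation/idempotency guardrail could look like, assuming nothing about how Volume 3, Chapter 11 actually implements it. GuardedToolRunner and ConfirmationRequired are hypothetical names, and the result store is an in-memory dict where a real deployment would need something durable. A write runs only after explicit confirmation, and a repeated call with the same idempotency key returns the stored result without running the action again.

```python
from typing import Callable


class ConfirmationRequired(Exception):
    """Raised when a write-capable tool is called without explicit confirmation."""


class GuardedToolRunner:
    """Hypothetical wrapper for tools with side effects."""

    def __init__(self) -> None:
        # Maps idempotency key to the result of the first successful run.
        self._results: dict[str, str] = {}

    def run(self, idempotency_key: str, confirmed: bool,
            action: Callable[[], str]) -> str:
        if not confirmed:
            raise ConfirmationRequired(
                f"confirmation required before running {idempotency_key!r}"
            )
        if idempotency_key in self._results:
            # Retry or duplicate request: return the earlier result, no new write.
            return self._results[idempotency_key]
        result = action()
        self._results[idempotency_key] = result
        return result
```

For example, a future "flag low stock" tool would be passed in as the action, with a key derived from the request (such as store, product, and request ID) so that an agent retry does not flag the same item twice. This sketch is not safe under concurrent calls with the same key, which a production version would have to handle.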

Standing action items (carried forward, not resolved by writing this document)

Confirm Anthropic/OpenAI account-level data retention terms — Section 2
Add semantic cache invalidation — Section 1
Point OpenTelemetry export at a real, retained backend — Section 3
Build confirmation/idempotency guardrails before any write-capable tool ships — Section 4

This document should be revisited every time a new tool, data source, or vendor is added — not written once and forgotten.