
License management

Every QRY tenant runs against a license: a GCP service-account JSON key that, on validation, returns the tenant's user cap, datasource cap, and feature flags. The license is checked on backend startup and every 6 hours thereafter, so tenants are bounded to what they paid for.

The key word is "bounded": QRY doesn't crash without a valid license; it degrades gracefully. This page covers the licensing model and the operational concerns around it.

What the license carries

For each tenant:

  • Maximum users — hard cap on accounts active at any time.
  • Maximum datasources — cap on configured datasources.
  • Feature flags — granular: rag, batch-profiling, scheduled-tasks, workspaces, domain-agents, forge, lakeflow, nexus, ml-hub, etc.
  • Expiration — when the license becomes invalid.
  • Tenant id — scoping; one license can't be reused for another tenant.
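As a rough illustration, the fields above could be modelled as a small immutable record. This is a minimal sketch, not QRY's actual schema: the class name, field names, and the is_valid_for helper are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class License:
    """Hypothetical in-memory view of a validated license (names are assumptions)."""
    tenant_id: str
    max_users: int
    max_datasources: int
    expires_at: datetime
    feature_flags: frozenset = frozenset()

    def is_valid_for(self, tenant_id: str, now: datetime) -> bool:
        # A license only applies to the tenant it was issued for,
        # and only until its expiration time.
        return self.tenant_id == tenant_id and now < self.expires_at


lic = License(
    tenant_id="acme",
    max_users=50,
    max_datasources=10,
    expires_at=datetime(2030, 1, 1, tzinfo=timezone.utc),
    feature_flags=frozenset({"rag", "workspaces"}),
)
```

Keeping the record frozen means a cached last-known-good license cannot be mutated by accident between validations.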

Validation cadence

  • On startup: the backend container won't accept queries until validation passes.
  • Every 6 hours: a background validator re-checks the license and detects revocations, expiries, and provider outages.
  • 24-hour grace period: if validation fails (e.g. GCP outage), QRY stays operational on the last-known-good license for up to 24 hours. After that, the tenant moves to degraded mode (see "What happens during a GCP outage").

The grace period is the difference between "GCP had a hiccup, no impact" and "a GCP outage took down our tenant for half a day". The 24-hour default is generous; it is configurable (license.grace_hours) for high-sensitivity tenants who would rather fail closed.
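The cadence and grace period can be read as a small state function. The sketch below is illustrative only: the state names and the license_state function are hypothetical, and it assumes the grace window is counted from the first failed validation and that a grace_hours of 0 or less means no grace at all.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class LicenseState(Enum):
    VALID = "valid"                      # last validation succeeded
    GRACE = "grace"                      # failing, still inside the grace window
    GRACE_EXHAUSTED = "grace_exhausted"  # failing for longer than the grace window


def license_state(failing_since: Optional[datetime],
                  now: datetime,
                  grace_hours: float = 24.0) -> LicenseState:
    """Classify a tenant given when validation started failing (None = healthy)."""
    if failing_since is None:
        return LicenseState.VALID
    if grace_hours <= 0:
        # Assumption: no grace configured means fail closed on the first failure.
        return LicenseState.GRACE_EXHAUSTED
    if now - failing_since <= timedelta(hours=grace_hours):
        return LicenseState.GRACE
    return LicenseState.GRACE_EXHAUSTED


outage_start = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
after_6h = license_state(outage_start, outage_start + timedelta(hours=6))
after_30h = license_state(outage_start, outage_start + timedelta(hours=30))
```

With the default window, a 6-hour outage lands in GRACE and a 30-hour outage in GRACE_EXHAUSTED.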

Usage snapshots

A daily cronjob writes a usage snapshot per tenant:

  • Active users today.
  • Active datasources.
  • Total queries.
  • Total LLM tokens.
  • Total scheduled-task executions.

Snapshots feed the licensing dashboard for plan-bumping decisions and cost attribution.
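For orientation, one snapshot row and a simple plan-bumping signal might look like the following. The record layout, field names, and the user_utilisation helper are assumptions for illustration; the actual storage format is not described on this page.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class UsageSnapshot:
    """Hypothetical shape of one daily snapshot row (field names are assumptions)."""
    tenant_id: str
    day: date
    active_users: int
    active_datasources: int
    total_queries: int
    total_llm_tokens: int
    scheduled_task_executions: int


def user_utilisation(snapshot: UsageSnapshot, max_users: int) -> float:
    """Fraction of the licensed user cap that was in use on the snapshot day."""
    if max_users <= 0:
        raise ValueError("max_users must be positive")
    return snapshot.active_users / max_users


snap = UsageSnapshot("acme", date(2025, 1, 1), 40, 6, 1200, 350_000, 18)
ratio = user_utilisation(snap, max_users=50)
```

A dashboard could flag tenants whose ratio stays near 1.0 over several days as candidates for a plan bump.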

Enforcement

  • User cap exceeded — admin can't create new users until existing ones are removed or the plan is bumped.
  • Datasource cap exceeded — admin can't add new datasources.
  • Feature flag off — the feature's API endpoints return 403; the UI hides the navigation entry.

User-facing copy on enforcement is in Admin > Branding > License messages and can be customised.
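The three enforcement rules reduce to simple checks. The sketch below shows one way to express them; the function names are hypothetical, and it assumes a cap is "exceeded" for the purpose of creating something new as soon as the current count has reached it.

```python
from http import HTTPStatus


def can_create_user(current_users: int, max_users: int) -> bool:
    # New accounts are refused once the count has reached the cap.
    return current_users < max_users


def can_add_datasource(current_datasources: int, max_datasources: int) -> bool:
    return current_datasources < max_datasources


def feature_status(flag: str, enabled_flags: frozenset) -> HTTPStatus:
    """Status a feature endpoint would return: 403 when the flag is off."""
    return HTTPStatus.OK if flag in enabled_flags else HTTPStatus.FORBIDDEN


enabled = frozenset({"rag", "workspaces"})
forge_status = feature_status("forge", enabled)
```

The same flag lookup can drive the UI, so a feature that would return 403 never appears in the navigation.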

Configuring on a tenant

When a tenant is provisioned, the license JSON key is dropped into the tenant's namespace as a Kubernetes secret (qry-license-key). The provisioning script (provision_tenant.sh) handles this — see Multi-tenant provisioning.

For an existing tenant, rotate via:

kubectl create secret generic qry-license-key -n qry-<tenant> \
--from-file=key.json=/path/to/new-key.json \
--dry-run=client -o yaml | kubectl apply -f -

kubectl rollout restart deployment/qry-backend -n qry-<tenant>

Validation re-runs on backend restart and picks up the new key.

Key rotation

Plan for a 180-day rotation. Beyond that, the GCP service-account key is considered stale.

Rotation steps:

  1. Create a new key for the tenant's service account in GCP IAM.
  2. Update the qry-license-key secret as above.
  3. Restart qry-backend.
  4. Verify validation passes (backend logs show "License validated" on startup).
  5. After 24-48 hours of healthy operation, delete the old key in GCP.

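Don't delete the old key before the secret is updated and the backend has validated with the new key. Until then the backend, including any in-flight grace period, still relies on the old key, and deleting a key in GCP invalidates it immediately.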
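To keep track of the 180-day plan, a key's age can be checked against the rotation window. This is a minimal sketch under assumptions: the helper names are hypothetical, and the creation time is expected to come from wherever you record it (for example, the key's creation timestamp in GCP IAM).

```python
from datetime import datetime, timedelta, timezone

ROTATION_DAYS = 180  # rotation window from the plan above


def rotation_due(created_at: datetime, now: datetime,
                 rotation_days: int = ROTATION_DAYS) -> bool:
    """True once the key is at least rotation_days old."""
    return now - created_at >= timedelta(days=rotation_days)


def days_until_rotation(created_at: datetime, now: datetime,
                        rotation_days: int = ROTATION_DAYS) -> int:
    """Whole days left before the key is considered stale (negative if overdue)."""
    return rotation_days - (now - created_at).days


created = datetime(2025, 1, 1, tzinfo=timezone.utc)
check_time = datetime(2025, 6, 1, tzinfo=timezone.utc)  # 151 days later
remaining = days_until_rotation(created, check_time)
```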

What happens during a GCP outage

  • Validation calls fail.
  • Within the 24h grace, the tenant operates normally on the cached last-known-good license.
  • If GCP recovers within 24h, you don't notice.
  • After 24h, the tenant moves to degraded mode: conversations are read-only; no new users, datasources, or scheduled tasks can be created; and feature gates are not re-checked. The degradation is partial and configurable.
  • After validation succeeds again, normal operation resumes.
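The degraded-mode restrictions can be pictured as a deny-list applied only while the tenant is degraded. The operation names and the is_allowed helper below are hypothetical, and since the restrictions are configurable, the blocked set shown is only the default described above.

```python
from enum import Enum


class Operation(Enum):
    READ_CONVERSATION = "read_conversation"
    POST_MESSAGE = "post_message"
    CREATE_USER = "create_user"
    ADD_DATASOURCE = "add_datasource"
    CREATE_SCHEDULED_TASK = "create_scheduled_task"


# Assumed default: everything except reading existing conversations is blocked.
BLOCKED_WHEN_DEGRADED = frozenset({
    Operation.POST_MESSAGE,
    Operation.CREATE_USER,
    Operation.ADD_DATASOURCE,
    Operation.CREATE_SCHEDULED_TASK,
})


def is_allowed(op: Operation, degraded: bool) -> bool:
    """Operations are unrestricted unless the tenant is in degraded mode."""
    return not (degraded and op in BLOCKED_WHEN_DEGRADED)
```

Once validation succeeds again, the degraded flag clears and every operation is allowed without further changes.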

Common issues

Backend startup fails with "license invalid". The service-account JSON key is malformed, expired, or for the wrong tenant. Inspect the secret with kubectl get secret qry-license-key -n qry-<tenant> -o yaml and base64-decode the key.json value.
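A quick way to rule out a malformed key is to decode the secret value and check its shape. The sketch below is a hypothetical helper, not part of QRY. It looks only for standard fields of a GCP service-account key file and cannot tell whether the key is expired or belongs to another tenant.

```python
import base64
import json

# Standard fields found in a GCP service-account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key_id", "private_key", "client_email")


def check_key_secret(b64_value: str) -> list:
    """Return a list of problems found in the base64 value of key.json."""
    try:
        key = json.loads(base64.b64decode(b64_value.strip(), validate=True))
    except ValueError as exc:  # covers bad base64, bad UTF-8, and bad JSON
        return [f"not base64-encoded JSON: {exc}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]
    if key.get("type") and key["type"] != "service_account":
        problems.append('"type" is not "service_account"')
    return problems
```

Pass it the string stored under data, key.json in the secret's YAML output; an empty list means the key at least has the expected structure.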

License says my plan covers 50 users but admin can't create the 41st. The cap most likely counts inactive accounts too: soft-deleted users are kept for retention purposes and still count toward it. Either bump the plan or hard-delete users that are past the retention window.

Feature flag is on but the UI doesn't show the feature. This is usually browser cache: the user has to hard-refresh after a feature flag is toggled. The backend reflects the change on the next request.

Daily snapshot job hasn't run. Check the Celery beat scheduler. The job is license.usage_snapshot, which runs daily at 00:00 UTC by default.

24h grace doesn't fit our compliance posture. Set license.grace_hours to a smaller value (e.g. 1). A value of 0 or less means fail closed immediately on the first validation failure.


QRY. A product of IXEN.