Compute Governance

Hard budget limits.
For every LLM call you make.

Wallet-drain attacks are the silent killer of AI apps. Thskyshield wraps your LLM calls with deterministic spend enforcement — per user, per plan, per day. If a user hits their limit, the call is cancelled before a single token is consumed.

Under 15ms pre-call check
Per-user daily limits
Per-plan tier enforcement
Fail-open by design
Two-Phase Enforcement

The financial kill-switch for every LLM call.

A two-phase wrapper around your LLM calls. Phase A checks the budget before the call — in under 15ms. Phase B records the exact cost after. Redis handles the hot path. Supabase gets the permanent record. Watch both phases run in the live demo →

01

Pre-Call Budget Check (Phase A)

Before your app calls any LLM, the SDK pings /api/governance/check. We look up that user's daily spend in Upstash Redis — scoped to their plan tier if you pass one. If they've hit their configured limit, we return allowed: false in under 15ms. The API call never happens — zero tokens consumed, zero cost.

02

Per-Plan Tier Enforcement

Pass the user's plan on check() — 'free', 'pro', 'enterprise', or any string you define. Each plan enforces a separate daily budget in isolated Redis spend buckets. A free user hitting $5/day is blocked while a pro user continues up to $50/day — enforced at the same atomic layer, with zero cross-plan leakage. Falls back to your site's flat budget if no plan is passed.

03

Post-Call Usage Log (Phase B)

After the LLM responds, the SDK sends actual token counts to /api/governance/log. We calculate the exact dollar cost using our model pricing registry, reconcile the atomic reservation in Redis, and write a permanent record to Supabase with the model, cost, plan, and outcome.

04

Dashboard Visibility

Every spend event appears in your dashboard. Per-user breakdowns, per-site totals, per-plan activity, daily charts, and a real-time governance feed — all scoped to your API key.

User sends message to your AI app
SDK calls → /api/governance/check
Resolve plan budget (plan cache → site_plans → fallback)
Redis: user spend < configured daily budget?
✓ allowed: true → LLM call proceeds
✗ allowed: false → Call cancelled. $0 cost.
SDK calls → /api/governance/log (actual tokens)
Redis reconciled + Supabase record written
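
The flow above can be sketched as a thin TypeScript wrapper. The endpoint paths and the `allowed` flag come from this page; the request and response shapes, the field names, and the `callLLM` helper are illustrative assumptions, not the shipped SDK surface.

```typescript
type CheckResponse = { allowed: boolean; reason?: string };

type LLMResult = { text: string; inputTokens: number; outputTokens: number };

async function governedCall(
  userId: string,
  plan: string,
  prompt: string,
  callLLM: (prompt: string) => Promise<LLMResult>
): Promise<string | null> {
  // Phase A: pre-call budget check on the Redis hot path
  const check: CheckResponse = await fetch("/api/governance/check", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ userId, plan }),
  }).then((r) => r.json());

  if (!check.allowed) return null; // call cancelled: zero tokens, zero cost

  const result = await callLLM(prompt);

  // Phase B: report actual token counts so exact cost is reconciled
  await fetch("/api/governance/log", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      userId,
      plan,
      inputTokens: result.inputTokens,
      outputTokens: result.outputTokens,
    }),
  });

  return result.text;
}
```

When Phase A returns `allowed: false`, the wrapper returns `null` without ever touching the provider, which is the zero-cost cancellation described in step 01.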
The Technical Moat

Atomic Lua reservation.
No race condition, no silent overrun.

Without atomic enforcement, two parallel requests read the same balance, both pass, both fire — budget exceeded silently. Every naïve implementation has this bug.

Thskyshield runs the entire reservation — read balance, compare budget, increment spend, set TTL — as a single Lua script inside Redis. No interleaving is possible. A user cannot overrun their budget regardless of request concurrency.
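
A minimal in-memory model of what that reservation does. In production the read-compare-increment runs as one Lua script inside Redis, where single-threaded execution guarantees nothing can interleave; the `atomicReserve` name and bucket layout here are illustrative, and TTL handling is omitted.

```typescript
// Model of the atomic reservation. In Redis, this whole body is one Lua
// script, so no other command can run between the read and the increment.
const spend = new Map<string, number>(); // bucket key -> dollars spent today

function atomicReserve(bucket: string, estimatedCost: number, dailyBudget: number): boolean {
  const current = spend.get(bucket) ?? 0;
  if (current + estimatedCost > dailyBudget) return false; // block: would overrun
  spend.set(bucket, current + estimatedCost); // reserve before the call fires
  return true; // allow: budget held back atomically
}
```

Because the reservation happens before the LLM call, two concurrent requests cannot both pass on the same remaining balance.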

LiteLLM and Helicone log spend after the fact. They cannot prevent the call. We block it before it starts.

Without atomic enforcement
Request A reads balance: $0.40 / $1.00 → pass
Request B reads balance: $0.40 / $1.00 → pass
Both pass — both fire at $0.40 each
Actual spend: $1.20 ✗ Budget blown
With Thskyshield — single Lua script
Request A: atomicReserve $0.40 → $0.80 ≤ $1.00 → pass
Request B: atomicReserve $0.40 → $1.20 > $1.00 → block
Budget enforced atomically ✓
Actual spend: $0.80 ✓ Limit held
What Gets Tracked

Full cost visibility. Zero guesswork.

Every compute event is logged with enough detail to audit, debug, and optimise your LLM usage.

Per-User Spend

Every call is attributed to an external user ID you provide. See exactly which users are consuming budget, and enforce per-user daily limits independently.

Per-Plan Budgets

Define separate daily budgets for each subscription tier — free, pro, enterprise, or any plan name you use. Each plan's spend is tracked in its own isolated Redis bucket. No cross-plan leakage.
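
One way the isolated buckets can be keyed, sketched in TypeScript. The exact key format and the budget table are assumptions; the principle is that the plan name is part of the Redis key, so free and pro spend can never touch. The $5/$50 figures echo the example tiers above.

```typescript
// Plan is baked into the bucket key: separate plans, separate counters.
function spendBucketKey(siteId: string, plan: string, userId: string, day: string): string {
  return `spend:${siteId}:${plan}:${userId}:${day}`;
}

// Illustrative per-plan budget table (dollars per day).
const planBudgets: Record<string, number> = { free: 5, pro: 50 };

// Fall back to the site's flat budget when no plan is passed
// (or when the plan has no configured budget -- an assumption here).
function dailyBudgetFor(plan: string | undefined, siteFlatBudget: number): number {
  return plan !== undefined && plan in planBudgets ? planBudgets[plan] : siteFlatBudget;
}
```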

Exact Token Costs

Input and output tokens are recorded separately. Cost is calculated using our model pricing registry — updated as providers change their rates.
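
The cost arithmetic itself is simple. A sketch with placeholder rates; the real registry tracks live provider pricing, and the model name and numbers below are illustrative only.

```typescript
type ModelPricing = { inputPerMTok: number; outputPerMTok: number }; // USD per 1M tokens

// Placeholder registry entry -- not real provider rates.
const pricing: Record<string, ModelPricing> = {
  "example-model": { inputPerMTok: 2.5, outputPerMTok: 10 },
};

// Input and output tokens are priced separately, then summed.
function exactCost(model: string, inputTokens: number, outputTokens: number): number {
  const p = pricing[model];
  return (inputTokens / 1_000_000) * p.inputPerMTok + (outputTokens / 1_000_000) * p.outputPerMTok;
}
```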

Model Breakdown

Track spend across GPT-4o, Claude, Gemini, or any model your app uses. Compare cost-per-call across models to inform your model selection decisions.

Prompt Replay Detection

The governance engine automatically detects when the same prompt fires repeatedly. Identical requests beyond 10× in 60 seconds are short-circuited before a token is consumed — no changes required on your end.
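
A fixed-window counter is one way to implement that rule. This sketch uses the 10× / 60-second thresholds from above; the function names, the keying, and the windowing strategy are assumptions about the implementation.

```typescript
const REPLAY_LIMIT = 10;   // identical hits allowed per window
const WINDOW_MS = 60_000;  // 60-second window

const replayCounts = new Map<string, { windowStart: number; hits: number }>();

// Returns false when the same prompt key exceeds 10 hits in 60 seconds,
// so the request is short-circuited before any tokens are spent.
function allowPrompt(promptKey: string, now: number): boolean {
  const entry = replayCounts.get(promptKey);
  if (!entry || now - entry.windowStart >= WINDOW_MS) {
    replayCounts.set(promptKey, { windowStart: now, hits: 1 }); // start a fresh window
    return true;
  }
  entry.hits += 1;
  return entry.hits <= REPLAY_LIMIT;
}
```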

Permanent Audit Trail

Every governance event — allowed, blocked, or logged — is written to Supabase with action, reason, plan, model, and exact cost. Full history, queryable, exportable.

Integration

60 seconds to governed.

Two calls wrap your existing LLM logic. No architecture changes. No infrastructure to manage. Pass a plan string and per-plan enforcement is live instantly.

Compatible with Next.js 14+ App & Pages Router
Works with OpenAI, Anthropic, Gemini, and any LLM
TypeScript-first SDK with full type safety
Per-plan budgets with a single config field
Fail-open: governance errors never break your app
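
The fail-open guarantee can be pictured as a try/catch around the check itself. A sketch, with illustrative names: an explicit block verdict is enforced, but an error in the governance layer lets the call proceed.

```typescript
async function checkWithFailOpen(
  check: () => Promise<{ allowed: boolean }>
): Promise<boolean> {
  try {
    const res = await check();
    return res.allowed; // normal path: enforce the verdict
  } catch {
    return true; // fail open: a governance outage never blocks your users
  }
}
```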
// middleware.ts
import { shield } from "@thskyshield/next";

export default shield({
  routes: ["/api/*", "/dashboard/*"],
  policy: "strict"
});