Hard budget limits.
For every LLM call you make.
Wallet-drain attacks are the silent killer of AI apps. Thskyshield wraps your LLM calls with deterministic spend enforcement — per user, per plan, per day. If a user hits their limit, the call is cancelled before a single token is consumed.
The financial kill-switch for every LLM call.
A two-phase wrapper around your LLM calls. Phase A checks the budget before the call — in under 15ms. Phase B records the exact cost after the call completes. Redis handles the hot path. Supabase gets the permanent record. Watch both phases run in the live demo →
Pre-Call Budget Check (Phase A)
Before your app calls any LLM, the SDK pings /api/governance/check. We look up that user's daily spend in Upstash Redis — scoped to their plan tier if you pass one. If they've hit their configured limit, we return allowed: false in under 15ms. The API call never happens — zero tokens consumed, zero cost.
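A minimal sketch of what Phase A looks like from the caller's side. The endpoint URL, request body, and wrapper function here are illustrative assumptions, not the documented SDK API — only the `allowed: false` response field and the `/api/governance/check` path come from the description above.

```typescript
// Illustrative Phase A wrapper: ask the governance endpoint first,
// and only run the LLM call if the budget check passes.
// URL and payload shape are assumptions for this sketch.
const GOVERNANCE_URL = "https://your-deployment.example/api/governance/check";

async function guardedCall(
  userId: string,
  runLlm: () => Promise<string>
): Promise<string> {
  const res = await fetch(GOVERNANCE_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ userId }),
  });
  const { allowed } = (await res.json()) as { allowed: boolean };
  if (!allowed) {
    // Budget exhausted: the LLM call never fires, so zero tokens are spent.
    throw new Error("Daily budget exceeded");
  }
  return runLlm();
}
```

The key property is ordering: the LLM request is only constructed after the check returns `allowed: true`, so a blocked user costs nothing.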
Per-Plan Tier Enforcement
Pass the user's plan on check() — 'free', 'pro', 'enterprise', or any string you define. Each plan enforces a separate daily budget in isolated Redis spend buckets. A free user hitting $5/day is blocked while a pro user continues up to $50/day — enforced at the same atomic layer, with zero cross-plan leakage. Falls back to your site's flat budget if no plan is passed.
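The budget resolution described above can be sketched as a simple lookup with a flat-budget fallback. The dollar amounts, key layout, and function names are example values for illustration, not product defaults:

```typescript
// Illustrative per-plan budget resolution. Amounts and the flat
// fallback are example values, not defaults of the product.
const PLAN_BUDGETS_USD: Record<string, number> = {
  free: 5,
  pro: 50,
  enterprise: 500,
};

const SITE_FLAT_BUDGET_USD = 10; // used when no plan is passed

function dailyBudgetFor(plan?: string): number {
  if (plan && plan in PLAN_BUDGETS_USD) return PLAN_BUDGETS_USD[plan];
  return SITE_FLAT_BUDGET_USD;
}

// Each (site, plan, user, day) combination gets its own isolated
// spend bucket, so plans can never leak into each other.
function spendBucketKey(siteId: string, userId: string, plan?: string): string {
  const day = new Date().toISOString().slice(0, 10); // e.g. "2024-01-15"
  return `spend:${siteId}:${plan ?? "flat"}:${userId}:${day}`;
}
```

Because the plan name is part of the bucket key, a free user and a pro user are counted against entirely separate balances.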
Post-Call Usage Log (Phase B)
After the LLM responds, the SDK sends actual token counts to /api/governance/log. We calculate the exact dollar cost using our model pricing registry, reconcile the atomic reservation in Redis, and write a permanent record to Supabase with the model, cost, plan, and outcome.
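The Phase B cost calculation reduces to token counts multiplied by per-model rates. A sketch, where the pricing entries are illustrative per-million-token rates rather than the live registry:

```typescript
// Illustrative Phase B cost calculation from actual token counts.
// These rates are examples only, not the live pricing registry.
type ModelPrice = { inputPerMTok: number; outputPerMTok: number };

const PRICING: Record<string, ModelPrice> = {
  "gpt-4o": { inputPerMTok: 2.5, outputPerMTok: 10 }, // example rates
};

function exactCostUsd(
  model: string,
  inputTokens: number,
  outputTokens: number
): number {
  const p = PRICING[model];
  if (!p) throw new Error(`unknown model: ${model}`);
  return (
    (inputTokens / 1_000_000) * p.inputPerMTok +
    (outputTokens / 1_000_000) * p.outputPerMTok
  );
}
```

Input and output tokens are priced separately because providers charge different rates for each direction.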
Dashboard Visibility
Every spend event appears in your dashboard. Per-user breakdowns, per-site totals, per-plan activity, daily charts, and a real-time governance feed — all scoped to your API key.
Atomic Lua reservation.
No race condition, no silent overrun.
Without atomic enforcement, two parallel requests read the same balance, both pass, both fire — budget exceeded silently. Every naïve implementation has this bug.
Thskyshield runs the entire reservation — read balance, compare budget, increment spend, set TTL — as a single Lua script inside Redis. No interleaving is possible. A user cannot overrun their budget regardless of request concurrency.
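The shape of such a script can be sketched as follows. The key layout, argument order, and the Lua text itself are illustrative, not the production script; the plain-TypeScript `reserve` function mirrors the same semantics for clarity (inside Redis, the whole sequence runs with no interleaving):

```typescript
// Sketch of an atomic check-and-reserve in the spirit described above.
// Key names and argument order are assumptions for illustration.
const RESERVE_LUA = `
local spent = tonumber(redis.call('GET', KEYS[1]) or '0')
local budget = tonumber(ARGV[1])
local cost = tonumber(ARGV[2])
if spent + cost > budget then
  return 0
end
redis.call('INCRBYFLOAT', KEYS[1], cost)
redis.call('EXPIRE', KEYS[1], tonumber(ARGV[3]))
return 1
`;

// Reference semantics of the script in plain TypeScript:
// read balance, compare against budget, increment — as one unit.
function reserve(
  state: Map<string, number>,
  key: string,
  budgetUsd: number,
  costUsd: number
): boolean {
  const spent = state.get(key) ?? 0;
  if (spent + costUsd > budgetUsd) return false; // would overrun: reject
  state.set(key, spent + costUsd);
  return true;
}
```

With a client like ioredis, a script of this shape would run via `redis.eval(RESERVE_LUA, 1, key, budget, cost, ttlSeconds)`; because Redis executes Lua scripts atomically, two concurrent requests can never both read the same balance and both pass.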
LiteLLM and Helicone log spend after the fact. They cannot prevent the call. We block it before it starts.
Full cost visibility. Zero guesswork.
Every compute event is logged with enough detail to audit, debug, and optimise your LLM usage.
Every call is attributed to an external user ID you provide. See exactly which users are consuming budget, and enforce per-user daily limits independently.
Define separate daily budgets for each subscription tier — free, pro, enterprise, or any plan name you use. Each plan's spend is tracked in its own isolated Redis bucket. No cross-plan leakage.
Input and output tokens are recorded separately. Cost is calculated using our model pricing registry — updated as providers change their rates.
Track spend across GPT-4o, Claude, Gemini, or any model your app uses. Compare cost-per-call across models to inform your model selection decisions.
The governance engine automatically detects when the same prompt fires repeatedly. Identical requests beyond 10× in 60 seconds are short-circuited before a token is consumed — no changes required on your end.
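One way to implement that detection is a sliding-window counter keyed by prompt hash. The mechanism below is an assumption for illustration; only the thresholds (10× in 60 seconds) come from the description above:

```typescript
// Illustrative repeat-prompt detection: short-circuit an identical
// prompt once it exceeds 10 occurrences inside a 60-second window.
// The sliding-window approach is an assumption about the mechanism.
const WINDOW_MS = 60_000;
const MAX_REPEATS = 10;

const recent = new Map<string, number[]>(); // prompt hash -> hit timestamps

function shouldShortCircuit(promptHash: string, now: number): boolean {
  // Keep only hits that still fall inside the window, then record this one.
  const hits = (recent.get(promptHash) ?? []).filter((t) => now - t < WINDOW_MS);
  hits.push(now);
  recent.set(promptHash, hits);
  return hits.length > MAX_REPEATS;
}
```

Requests that trip the threshold are rejected before any provider call is made, so a runaway retry loop stops costing money immediately.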
Every governance event — allowed, blocked, or logged — is written to Supabase with action, reason, plan, model, and exact cost. Full history, queryable, exportable.
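A record with those fields might look like the following. The field names mirror the description above (action, reason, plan, model, exact cost); the precise Supabase schema and naming are assumptions:

```typescript
// Hypothetical shape of a persisted governance event; the real
// Supabase schema may differ.
type GovernanceEvent = {
  action: "allowed" | "blocked" | "logged";
  reason: string;
  userId: string;
  plan?: string;
  model?: string;
  costUsd: number;
  createdAt: string; // ISO timestamp
};

const example: GovernanceEvent = {
  action: "blocked",
  reason: "daily_budget_exceeded",
  userId: "user_123",
  plan: "free",
  costUsd: 0, // blocked before any tokens were consumed
  createdAt: new Date().toISOString(),
};
```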
60 seconds to governed.
Two calls wrap your existing LLM logic. No architecture changes. No infrastructure to manage. Pass a plan string and per-plan enforcement is live instantly.
// middleware.ts
import { shield } from "@thskyshield/next";
export default shield({
  routes: ["/api/*", "/dashboard/*"],
  policy: "strict"
});