Hard budget limits.
For every LLM call you make.
Wallet-drain attacks are the silent killer of AI apps. Thskyshield wraps your LLM calls with deterministic spend enforcement — per user, per plan, per day. If a user hits their limit, the call is cancelled before a single token is consumed.
The financial kill-switch for every LLM call.
A two-phase wrapper around your LLM calls. Phase A checks the budget before the call — in under 15ms. Phase B records the exact cost after the call completes. Redis handles the hot path. Supabase gets the permanent record. Watch both phases run in the live demo →
Pre-Call Budget Check (Phase A)
Before your app calls any LLM, the SDK pings /api/governance/check. We look up that user's daily spend in Upstash Redis — scoped to their plan tier if you pass one. If they've hit their configured limit, we return allowed: false in under 15ms. The API call never happens — zero tokens consumed, zero cost.
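A minimal sketch of what Phase A looks like from the caller's side. The endpoint URL, request body, and wrapper function here are illustrative assumptions, not the documented SDK API — only the `allowed: false` response field and the `/api/governance/check` path come from the description above.

```typescript
// Illustrative Phase A wrapper: ask the governance endpoint first,
// and only run the LLM call if the budget check passes.
// URL and payload shape are assumptions for this sketch.
const GOVERNANCE_URL = "https://your-deployment.example/api/governance/check";

async function guardedCall(
  userId: string,
  runLlm: () => Promise<string>
): Promise<string> {
  const res = await fetch(GOVERNANCE_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ userId }),
  });
  const { allowed } = (await res.json()) as { allowed: boolean };
  if (!allowed) {
    // Budget exhausted: the LLM call never fires, so zero tokens are spent.
    throw new Error("Daily budget exceeded");
  }
  return runLlm();
}
```

The key property is ordering: the LLM request is only constructed after the check returns `allowed: true`, so a blocked user costs nothing.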
Per-Plan Tier Enforcement
Pass the user's plan on check() — 'free', 'pro', 'enterprise', or any string you define. Each plan enforces a separate daily budget in isolated Redis spend buckets. A free user hitting $5/day is blocked while a pro user continues up to $50/day — enforced at the same atomic layer, with zero cross-plan leakage. Falls back to your site's flat budget if no plan is passed.
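The budget resolution described above can be sketched as a simple lookup with a flat-budget fallback. The dollar amounts, key layout, and function names are example values for illustration, not product defaults:

```typescript
// Illustrative per-plan budget resolution. Amounts and the flat
// fallback are example values, not defaults of the product.
const PLAN_BUDGETS_USD: Record<string, number> = {
  free: 5,
  pro: 50,
  enterprise: 500,
};

const SITE_FLAT_BUDGET_USD = 10; // used when no plan is passed

function dailyBudgetFor(plan?: string): number {
  if (plan && plan in PLAN_BUDGETS_USD) return PLAN_BUDGETS_USD[plan];
  return SITE_FLAT_BUDGET_USD;
}

// Each (site, plan, user, day) combination gets its own isolated
// spend bucket, so plans can never leak into each other.
function spendBucketKey(siteId: string, userId: string, plan?: string): string {
  const day = new Date().toISOString().slice(0, 10); // e.g. "2024-01-15"
  return `spend:${siteId}:${plan ?? "flat"}:${userId}:${day}`;
}
```

Because the plan name is part of the bucket key, a free user and a pro user are counted against entirely separate balances.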
Post-Call Usage Log (Phase B)
After the LLM responds, the SDK sends actual token counts to /api/governance/log. We calculate the exact dollar cost using our model pricing registry, reconcile the atomic reservation in Redis, and write a permanent record to Supabase with the model, cost, plan, and outcome.
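The Phase B cost calculation reduces to token counts multiplied by per-model rates. A sketch, where the pricing entries are illustrative per-million-token rates rather than the live registry:

```typescript
// Illustrative Phase B cost calculation from actual token counts.
// These rates are examples only, not the live pricing registry.
type ModelPrice = { inputPerMTok: number; outputPerMTok: number };

const PRICING: Record<string, ModelPrice> = {
  "gpt-4o": { inputPerMTok: 2.5, outputPerMTok: 10 }, // example rates
};

function exactCostUsd(
  model: string,
  inputTokens: number,
  outputTokens: number
): number {
  const p = PRICING[model];
  if (!p) throw new Error(`unknown model: ${model}`);
  return (
    (inputTokens / 1_000_000) * p.inputPerMTok +
    (outputTokens / 1_000_000) * p.outputPerMTok
  );
}
```

Input and output tokens are priced separately because providers charge different rates for each direction.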
Dashboard Visibility
Every spend event appears in your dashboard. Per-user breakdowns, per-site totals, per-plan activity, daily charts, and a real-time governance feed — all scoped to your API key.
Atomic Lua reservation.
No race condition, no silent overrun.
Without atomic enforcement, two parallel requests read the same balance, both pass, both fire — budget exceeded silently. Every naïve implementation has this bug.
Thskyshield runs the entire reservation — read balance, compare budget, increment spend, set TTL — as a single Lua script inside Redis. No interleaving is possible. A user cannot overrun their budget regardless of request concurrency.
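The shape of such a script can be sketched as follows. The key layout, argument order, and the Lua text itself are illustrative, not the production script; the plain-TypeScript `reserve` function mirrors the same semantics for clarity (inside Redis, the whole sequence runs with no interleaving):

```typescript
// Sketch of an atomic check-and-reserve in the spirit described above.
// Key names and argument order are assumptions for illustration.
const RESERVE_LUA = `
local spent = tonumber(redis.call('GET', KEYS[1]) or '0')
local budget = tonumber(ARGV[1])
local cost = tonumber(ARGV[2])
if spent + cost > budget then
  return 0
end
redis.call('INCRBYFLOAT', KEYS[1], cost)
redis.call('EXPIRE', KEYS[1], tonumber(ARGV[3]))
return 1
`;

// Reference semantics of the script in plain TypeScript:
// read balance, compare against budget, increment — as one unit.
function reserve(
  state: Map<string, number>,
  key: string,
  budgetUsd: number,
  costUsd: number
): boolean {
  const spent = state.get(key) ?? 0;
  if (spent + costUsd > budgetUsd) return false; // would overrun: reject
  state.set(key, spent + costUsd);
  return true;
}
```

With a client like ioredis, a script of this shape would run via `redis.eval(RESERVE_LUA, 1, key, budget, cost, ttlSeconds)`; because Redis executes Lua scripts atomically, two concurrent requests can never both read the same balance and both pass.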
LiteLLM and Helicone log spend after the fact. They cannot prevent the call. We block it before it starts.
Full cost visibility. Zero guesswork.
Every compute event is logged with enough detail to audit, debug, and optimise your LLM usage.
Every call is attributed to an external user ID you provide. See exactly which users are consuming budget, and enforce per-user daily limits independently.
Define separate daily budgets for each subscription tier — free, pro, enterprise, or any plan name you use. Each plan's spend is tracked in its own isolated Redis bucket. No cross-plan leakage.
Input and output tokens are recorded separately. Cost is calculated using our model pricing registry — updated as providers change their rates.
Track spend across GPT-4o, Claude, Gemini, or any model your app uses. Compare cost-per-call across models to inform your model selection decisions.
The governance engine automatically detects when the same prompt fires repeatedly. Identical requests beyond 10× in 60 seconds are short-circuited before a token is consumed — no changes required on your end.
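One way to implement that detection is a sliding-window counter keyed by prompt hash. The mechanism below is an assumption for illustration; only the thresholds (10× in 60 seconds) come from the description above:

```typescript
// Illustrative repeat-prompt detection: short-circuit an identical
// prompt once it exceeds 10 occurrences inside a 60-second window.
// The sliding-window approach is an assumption about the mechanism.
const WINDOW_MS = 60_000;
const MAX_REPEATS = 10;

const recent = new Map<string, number[]>(); // prompt hash -> hit timestamps

function shouldShortCircuit(promptHash: string, now: number): boolean {
  // Keep only hits that still fall inside the window, then record this one.
  const hits = (recent.get(promptHash) ?? []).filter((t) => now - t < WINDOW_MS);
  hits.push(now);
  recent.set(promptHash, hits);
  return hits.length > MAX_REPEATS;
}
```

Requests that trip the threshold are rejected before any provider call is made, so a runaway retry loop stops costing money immediately.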
Every governance event — allowed, blocked, or logged — is written to Supabase with action, reason, plan, model, and exact cost. Full history, queryable, exportable.
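A record with those fields might look like the following. The field names mirror the description above (action, reason, plan, model, exact cost); the precise Supabase schema and naming are assumptions:

```typescript
// Hypothetical shape of a persisted governance event; the real
// Supabase schema may differ.
type GovernanceEvent = {
  action: "allowed" | "blocked" | "logged";
  reason: string;
  userId: string;
  plan?: string;
  model?: string;
  costUsd: number;
  createdAt: string; // ISO timestamp
};

const example: GovernanceEvent = {
  action: "blocked",
  reason: "daily_budget_exceeded",
  userId: "user_123",
  plan: "free",
  costUsd: 0, // blocked before any tokens were consumed
  createdAt: new Date().toISOString(),
};
```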
60 seconds to governed.
Two calls wrap your existing LLM logic. No architecture changes. No infrastructure to manage. Pass a plan string and per-plan enforcement is live instantly.
// middleware.ts
import { shield } from "@thskyshield/next";
export default shield({
  routes: ["/api/*", "/dashboard/*"],
  policy: "strict"
});