Integration Guide
Add AI compute governance to your app in two calls. Use the SDK for Next.js — or the HTTP API for any language.
✦ What's new
- HTTP API — any stack — two POST endpoints work with Python, Go, Ruby, PHP, or curl. No SDK required. Jump to HTTP API →
- Per-plan budget enforcement — pass plan to check() or estimated_tokens. Free, pro, and enterprise users each enforce their own daily budget in isolated Redis buckets.
- Atomic cost reservation — parallel requests can no longer race past your budget limit. Single Lua operation, no TOCTOU window.
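The isolated-bucket idea can be sketched with a toy key layout. This is illustrative only, not the service's actual Redis schema: the point is that spend recorded for one plan tier can never touch another tier's daily counter.

```python
from datetime import date

# Hypothetical key layout: one daily budget bucket per (site, plan, user).
# Spend written to the "free" bucket never affects the "pro" bucket.
def bucket_key(site_id: str, plan: str, user_id: str, day: date) -> str:
    return f"budget:{site_id}:{plan}:{user_id}:{day.isoformat()}"

free_key = bucket_key("site_1", "free", "user_123", date(2026, 4, 13))
pro_key = bucket_key("site_1", "pro", "user_123", date(2026, 4, 13))
# The two keys differ only in the plan segment, so the buckets are isolated.
```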
Quickstart — Next.js
Under 60 seconds. Two calls wrap your existing LLM logic. No architecture changes required.
Prerequisites
- Next.js 14+ with App Router
- Any LLM provider (OpenAI, Anthropic, Gemini, etc.)
- A Site ID and API key from your Thskyshield dashboard
Install the SDK
npm install @thsky-21/thskyshield
# or
yarn add @thsky-21/thskyshield
Add your credentials
Both values are shown once when you create a site in your dashboard. Store them in your environment file — never commit them to source code.
THSKYSHIELD_SITE_ID=your_site_id_here
THSKYSHIELD_KEY=your_api_key_here
Wrap your LLM call
Call check() before your LLM call and log() after. Everything else stays the same.
import { Thskyshield } from '@thsky-21/thskyshield'
// ↑ lowercase 's' — this is exact, not a typo
const shield = new Thskyshield({
siteId: process.env.THSKYSHIELD_SITE_ID!,
apiKey: process.env.THSKYSHIELD_KEY!,
})
export async function POST(req: Request) {
const userId = 'user_123' // your auth: look up the real user ID here
const { allowed, reason, requestId } = await shield.check({
externalUserId: userId,
model: 'gpt-4o',
estimatedTokens: { input: 500, output: 200 },
})
if (!allowed) {
return Response.json({ error: 'Request blocked.', reason }, { status: 429 })
}
const completion = await openai.chat.completions.create({ ... })
// plan is carried automatically via requestId — no need to pass it again
await shield.log({
requestId,
externalUserId: userId,
model: 'gpt-4o',
tokens: {
input: completion.usage?.prompt_tokens ?? 0,
output: completion.usage?.completion_tokens ?? 0,
},
})
return Response.json({ response: completion.choices[0].message.content })
}
Why must log() be awaited?
If log() is fire-and-forget, a fast attacker can send a second request before the first cost is written to Redis — bypassing your budget. Awaiting it guarantees the Redis counter is updated before your function returns.
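The difference is easiest to see with a toy in-process counter. This is a sketch of the principle, not the service's actual Lua script: the read, compare, and reserve happen as one indivisible step, so two concurrent requests can never both pass against the same remaining budget.

```python
import threading

class BudgetGate:
    """Toy atomic reservation: check-and-reserve as a single step."""
    def __init__(self, limit: float):
        self.limit = limit
        self.spent = 0.0
        self._lock = threading.Lock()

    def check(self, estimated_cost: float) -> bool:
        # Read + compare + write under one lock, analogous to the Lua script.
        with self._lock:
            if self.spent + estimated_cost > self.limit:
                return False
            self.spent += estimated_cost  # reserve immediately, before returning
            return True

gate = BudgetGate(limit=0.10)
results = []
threads = [threading.Thread(target=lambda: results.append(gate.check(0.06)))
           for _ in range(2)]
for t in threads: t.start()
for t in threads: t.join()
# Only one $0.06 reservation fits inside the $0.10 limit, regardless of timing.
```

Without the lock, both threads could read spent = 0.0 before either writes, and both would be allowed: that is the TOCTOU window the atomic reservation closes.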
Deploy and verify
Deploy your app. Then check your dashboard — governance events should appear within seconds of your first LLM call.
✓ Governance active
Send a chat message. Your dashboard Governance Feed should show an 'allowed' event with model, cost, and token counts within a few seconds.
✓ Budget enforcement active
Set a low daily limit in your dashboard (e.g. $0.10). Make enough calls to exceed it. Subsequent check() calls should return allowed: false with reason: BUDGET_EXCEEDED.
✓ Per-plan enforcement active
Add a plan row in your dashboard (e.g. 'free' at $0.05/day). Pass plan: 'free' to check(). Exhaust the free-plan budget — pro-plan calls should continue unblocked.
✓ Parallel protection active
Send two requests simultaneously near the budget limit. Only one should succeed — the other returns BUDGET_EXCEEDED. This is the Lua atomic reservation working.
Any stack. Two HTTP calls.
Not using Next.js? Two POST endpoints expose the same governance engine to any language or framework. Python, Go, Ruby, PHP, Java — if you can make an HTTP request, you can integrate Thskyshield. Same Redis atomic logic. Same budget enforcement. Same dashboard.
Authentication
Pass your API key in the Authorization header on every request. The site_id goes in the request body.
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json
/api/v1/check
Phase A
Call this before every LLM API call. Returns allowed: true/false. If allowed is false, do not make the LLM call.
Request body
{
"site_id": "your_site_id", // required
"external_user_id": "user_123", // required
"model": "gpt-4o", // optional
"estimated_tokens": { // optional
"input": 800,
"output": 300
},
"plan": "pro" // optional
}
Response — allowed
{
"allowed": true,
"request_id": "req_a1b2c3",
"budget_remaining": 0.34,
"budget_limit": 0.50,
"user_spend_today": 0.16
}
Response — blocked
{
"allowed": false,
"request_id": null,
"reason": "BUDGET_EXCEEDED",
"budget_remaining": 0.00,
"budget_limit": 0.50,
"user_spend_today": 0.51,
"retry_after": "2026-04-13T00:00:00.000Z"
}
HTTP status codes
| Status | Meaning |
|---|---|
| 200 | Check completed — use the allowed field to decide whether to proceed |
| 401 | Invalid API key or site_id mismatch |
| 422 | Missing or invalid required fields |
| 429 | Your own governance rate limit exceeded (not the user budget) |
| 500 | Server error — fail-open, request treated as allowed |
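One way to map those statuses in your own client wrapper (this helper is an illustration, not part of any SDK), following the fail-open rule for 500s from the table:

```python
def should_proceed(status_code: int, body: dict) -> bool:
    """Decide whether to make the LLM call, per the status table above."""
    if status_code == 200:
        return bool(body.get("allowed"))   # the allowed field is authoritative
    if status_code == 500:
        return True                        # server error: fail-open by design
    if status_code in (401, 422):
        # Config/auth problems are your bugs: surface them instead of guessing.
        raise RuntimeError(f"Thskyshield config error: HTTP {status_code}")
    if status_code == 429:
        return False                       # your governance rate limit: back off
    return False                           # unknown status: be conservative
```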
/api/v1/log
Phase B
Call this after every successful LLM call with actual token counts. Pass the request_id from /v1/check — this links both phases and reconciles the Redis spend counter atomically.
Request body
{
"site_id": "your_site_id", // required
"request_id": "req_a1b2c3", // from /v1/check
"model": "gpt-4o", // required
"actual_tokens": { // required
"input": 743,
"output": 287
},
"status": "success" // optional
}
Response
{
"logged": true,
"actual_cost": 0.008900,
"user_spend_today": 0.248900,
"budget_remaining": 0.251100
}
Integration examples
Working examples in every major language. Copy, paste, ship.
Python (FastAPI + httpx)
import httpx, os
API_KEY = os.getenv("THSKYSHIELD_API_KEY")
SITE_ID = os.getenv("THSKYSHIELD_SITE_ID")
BASE_URL = "https://thskyshield.com/api/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}
async def check_budget(user_id: str, model: str,
input_tokens: int, output_tokens: int):
async with httpx.AsyncClient() as client:
r = await client.post(f"{BASE_URL}/check", headers=HEADERS, json={
"site_id": SITE_ID,
"external_user_id": user_id,
"model": model,
"estimated_tokens": {"input": input_tokens, "output": output_tokens},
})
return r.json()
async def log_usage(request_id: str, model: str,
input_tokens: int, output_tokens: int):
async with httpx.AsyncClient() as client:
await client.post(f"{BASE_URL}/log", headers=HEADERS, json={
"site_id": SITE_ID,
"request_id": request_id,
"model": model,
"actual_tokens": {"input": input_tokens, "output": output_tokens},
"status": "success",
})
# ── Usage in your FastAPI route ──────────────────────────────────────────────
# Assumes app, ChatRequest, and openai_client exist; HTTPException is from fastapi.
@app.post("/chat")
async def chat(body: ChatRequest, user_id: str):
check = await check_budget(user_id, "gpt-4o", 800, 300)
if not check["allowed"]:
raise HTTPException(status_code=429,
detail={"error": "Budget exceeded",
"reason": check["reason"]})
completion = await openai_client.chat.completions.create(
model="gpt-4o", messages=body.messages
)
await log_usage(
check["request_id"], "gpt-4o",
completion.usage.prompt_tokens,
completion.usage.completion_tokens,
)
return {"response": completion.choices[0].message.content}
Go
package thskyshield
import (
"bytes"
"encoding/json"
"fmt"
"net/http"
"os"
)
const baseURL = "https://thskyshield.com/api/v1"
type CheckRequest struct {
SiteID string `json:"site_id"`
ExternalUserID string `json:"external_user_id"`
Model string `json:"model"`
EstimatedTokens TokenEstimate `json:"estimated_tokens"`
}
type TokenEstimate struct {
Input int `json:"input"`
Output int `json:"output"`
}
type CheckResponse struct {
Allowed bool `json:"allowed"`
RequestID string `json:"request_id"`
BudgetRemaining float64 `json:"budget_remaining"`
Reason string `json:"reason,omitempty"`
}
func Check(userID, model string, input, output int) (*CheckResponse, error) {
body, _ := json.Marshal(CheckRequest{
SiteID: os.Getenv("THSKYSHIELD_SITE_ID"),
ExternalUserID: userID,
Model: model,
EstimatedTokens: TokenEstimate{Input: input, Output: output},
})
req, _ := http.NewRequest("POST", baseURL+"/check", bytes.NewBuffer(body))
req.Header.Set("Authorization", "Bearer "+os.Getenv("THSKYSHIELD_API_KEY"))
req.Header.Set("Content-Type", "application/json")
resp, err := (&http.Client{}).Do(req)
if err != nil { return nil, err }
defer resp.Body.Close()
var result CheckResponse
if err := json.NewDecoder(resp.Body).Decode(&result); err != nil { return nil, err }
return &result, nil
}
Ruby
require 'net/http'
require 'json'
class ThskyShield
BASE_URL = 'https://thskyshield.com/api/v1'
def self.check(user_id:, model:, input_tokens:, output_tokens:)
call(:check, {
site_id: ENV['THSKYSHIELD_SITE_ID'],
external_user_id: user_id,
model: model,
estimated_tokens: { input: input_tokens, output: output_tokens }
})
end
def self.log(request_id:, model:, input_tokens:, output_tokens:)
call(:log, {
site_id: ENV['THSKYSHIELD_SITE_ID'],
request_id: request_id,
model: model,
actual_tokens: { input: input_tokens, output: output_tokens },
status: 'success'
})
end
def self.call(endpoint, payload)
uri = URI("#{BASE_URL}/#{endpoint}")
req = Net::HTTP::Post.new(uri)
req['Authorization'] = "Bearer #{ENV['THSKYSHIELD_API_KEY']}"
req['Content-Type'] = 'application/json'
req.body = payload.to_json
res = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) { |h| h.request(req) }
JSON.parse(res.body)
end
end
# ── Usage ────────────────────────────────────────────────────────────────────
result = ThskyShield.check(user_id: current_user.id,
model: 'gpt-4o',
input_tokens: 800, output_tokens: 300)
if !result['allowed']
render json: { error: 'Budget exceeded', reason: result['reason'] }, status: 429
return
end
curl (works from any language via shell)
# Step 1 — check before calling your LLM
curl -X POST https://thskyshield.com/api/v1/check \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"site_id": "YOUR_SITE_ID",
"external_user_id": "user_123",
"model": "gpt-4o",
"estimated_tokens": { "input": 800, "output": 300 }
}'
# Response (allowed):
# {
# "allowed": true,
# "request_id": "req_a1b2c3d4e5f6",
# "budget_remaining": 0.34,
# "budget_limit": 0.50,
# "user_spend_today": 0.16
# }
# Step 2 — log actual usage after your LLM responds
curl -X POST https://thskyshield.com/api/v1/log \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"site_id": "YOUR_SITE_ID",
"request_id": "req_a1b2c3d4e5f6",
"model": "gpt-4o",
"actual_tokens": { "input": 743, "output": 287 },
"status": "success"
}'
How request_id works
When /v1/check reserves a cost, it stores the context (user ID, estimated cost, plan) in Redis under the request_id key with a 5-minute TTL. When /v1/log receives that ID, it retrieves the context, runs the Lua reconciliation, then deletes the key. One-time use. If you never call /v1/log, the reservation expires naturally when the 5-minute TTL lapses — no budget leak.
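That lifecycle reads naturally as a small state store. Here is a toy in-memory version (the real store is Redis, and every name below is illustrative): reserve writes the context, reconcile retrieves it exactly once and deletes it, and anything past the TTL is treated as gone.

```python
import time

class ReservationStore:
    """Toy model of the request_id lifecycle: reserve -> reconcile -> delete."""
    def __init__(self, ttl_seconds: float = 300):
        self.ttl = ttl_seconds
        self._store = {}  # request_id -> (context, created_at)

    def reserve(self, request_id: str, context: dict) -> None:
        self._store[request_id] = (context, time.monotonic())

    def reconcile(self, request_id: str):
        entry = self._store.pop(request_id, None)  # pop enforces one-time use
        if entry is None:
            return None  # never reserved, already reconciled, or expired
        context, created_at = entry
        if time.monotonic() - created_at > self.ttl:
            return None  # reservation lapsed before /v1/log arrived
        return context

store = ReservationStore()
store.reserve("req_a1b2c3", {"user": "user_123", "estimated_cost": 0.009})
ctx = store.reconcile("req_a1b2c3")    # returns the stored context once
again = store.reconcile("req_a1b2c3")  # None: the key was deleted on first use
```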
SDK Reference
The SDK exposes two methods. That's it — no middleware, no edge config, no infrastructure to manage.
shield.check({ externalUserId, model, estimatedTokens?, promptHash?, plan? })
Phase A
Pre-call budget gate. Atomically reserves the estimated cost in Redis so parallel requests from the same user cannot race past the budget limit. Pass plan to enforce a plan-specific daily budget (resolved from your site_plans config). Returns a requestId — pass this to log() to link the two phases and carry the plan automatically. Returns allowed: false with a reason code if the call should be blocked — the LLM call must not proceed.
shield.log({ requestId, externalUserId, model, tokens })
Phase B
Post-call reconciliation. Pass the requestId returned by check() — the SDK uses it to look up the in-flight reservation and the effective plan, so you do not need to pass plan again. Releases the cost reservation and applies the actual token cost atomically in a single Lua operation. Must be awaited — not fire-and-forget — so the Redis spend counter is updated before the next request from the same user arrives.
Reason Codes
When allowed: false is returned — by the SDK or the HTTP API — the reason field tells you exactly why. Use this to show the right message to your users.
| Code | Meaning |
|---|---|
| BUDGET_EXCEEDED | User has hit their daily spend limit for their plan tier (or the site flat budget if no plan was passed). The LLM call was not made — zero cost incurred. |
| VELOCITY_EXCEEDED | More than 60 requests in 60 seconds from the same user. True sliding window per (siteId, userId). This threshold is intentionally high — it targets automation, not legitimate usage. |
| PROMPT_REPEAT_DETECTED | Identical prompt hash seen more than 10 times in 60 seconds — automation replay flagged. SDK-only: pass promptHash (SHA-256 of your prompt) to check() to enable this. |
| REQUEST_COST_EXCEEDED | Estimated cost of a single request exceeds the $0.25 per-request hard cap. Check your estimated_tokens values. |
| CIRCUIT_BREAKER_FALLBACK | Redis is degraded — circuit breaker has opened. The request is allowed to avoid blocking customers during our infra outage. Recovers automatically after 30 seconds. |
| UNAUTHORIZED | API key verification failed. Deliberately opaque — the response does not reveal whether the key or site_id was the mismatch. Check both values in your .env. |
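To opt in to PROMPT_REPEAT_DETECTED from the SDK, hash the prompt before calling check(). The table specifies SHA-256; a hex digest is one reasonable encoding (assumption: the exact encoding the API expects is not stated in this guide).

```python
import hashlib

def prompt_hash(prompt: str) -> str:
    # SHA-256 of the prompt text, hex-encoded
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

h = prompt_hash("Summarize this article.")
# Identical prompts always yield identical hashes, which is what makes
# replayed prompts detectable within the 60-second window.
```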
Need help with setup?
We're in early access — every integration is supported directly by the founder. Reach out and we'll get you unblocked.
Contact the founder