Integration Guide
Add AI compute governance to your app in two calls. Use the SDK for Next.js — or the HTTP API for any language.
✦ What's new
- HTTP API — any stack — two POST endpoints work with Python, Go, Ruby, PHP, or curl. No SDK required. Jump to HTTP API →
- Per-plan budget enforcement — pass plan to check() or estimated_tokens. Free, pro, and enterprise users each enforce their own daily budget in isolated Redis buckets.
- Atomic cost reservation — parallel requests can no longer race past your budget limit. Single Lua operation, no TOCTOU window.
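The isolated-bucket idea can be sketched with a toy key layout. This is illustrative only, not the service's actual Redis schema: the point is that spend recorded for one plan tier can never touch another tier's daily counter.

```python
from datetime import date

# Hypothetical key layout: one daily budget bucket per (site, plan, user).
# Spend written to the "free" bucket never affects the "pro" bucket.
def bucket_key(site_id: str, plan: str, user_id: str, day: date) -> str:
    return f"budget:{site_id}:{plan}:{user_id}:{day.isoformat()}"

free_key = bucket_key("site_1", "free", "user_123", date(2026, 4, 13))
pro_key = bucket_key("site_1", "pro", "user_123", date(2026, 4, 13))
# The two keys differ only in the plan segment, so the buckets are isolated.
```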
Quickstart — Next.js
Under 60 seconds. Two calls wrap your existing LLM logic. No architecture changes required.
Prerequisites
- Next.js 14+ with App Router
- Any LLM provider (OpenAI, Anthropic, Gemini, etc.)
- A Site ID and API key from your Thskyshield dashboard
Install the SDK
npm install @thsky-21/thskyshield
# or
yarn add @thsky-21/thskyshield
Add your credentials
Both values are shown once when you create a site in your dashboard. Store them in your environment file — never commit them to source code.
THSKYSHIELD_SITE_ID=your_site_id_here
THSKYSHIELD_KEY=your_api_key_here
Wrap your LLM call
Call check() before your LLM call and log() after. Everything else stays the same.
import { Thskyshield } from '@thsky-21/thskyshield'
// ↑ lowercase 's' — this is exact, not a typo
const shield = new Thskyshield({
siteId: process.env.THSKYSHIELD_SITE_ID!,
apiKey: process.env.THSKYSHIELD_KEY!,
})
export async function POST(req: Request) {
const userId = 'user_123' // your auth: look up the real user ID here
const { allowed, reason, requestId } = await shield.check({
externalUserId: userId,
model: 'gpt-4o',
estimatedTokens: { input: 500, output: 200 },
})
if (!allowed) {
return Response.json({ error: 'Request blocked.', reason }, { status: 429 })
}
const completion = await openai.chat.completions.create({ ... })
// plan is carried automatically via requestId — no need to pass it again
await shield.log({
requestId,
externalUserId: userId,
model: 'gpt-4o',
tokens: {
input: completion.usage?.prompt_tokens ?? 0,
output: completion.usage?.completion_tokens ?? 0,
},
})
return Response.json({ response: completion.choices[0].message.content })
}
Why must log() be awaited?
If log() is fire-and-forget, a fast attacker can send a second request before the first cost is written to Redis — bypassing your budget. Awaiting it guarantees the Redis counter is updated before your function returns.
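The difference is easiest to see with a toy in-process counter. This is a sketch of the principle, not the service's actual Lua script: the read, compare, and reserve happen as one indivisible step, so two concurrent requests can never both pass against the same remaining budget.

```python
import threading

class BudgetGate:
    """Toy atomic reservation: check-and-reserve as a single step."""
    def __init__(self, limit: float):
        self.limit = limit
        self.spent = 0.0
        self._lock = threading.Lock()

    def check(self, estimated_cost: float) -> bool:
        # Read + compare + write under one lock, analogous to the Lua script.
        with self._lock:
            if self.spent + estimated_cost > self.limit:
                return False
            self.spent += estimated_cost  # reserve immediately, before returning
            return True

gate = BudgetGate(limit=0.10)
results = []
threads = [threading.Thread(target=lambda: results.append(gate.check(0.06)))
           for _ in range(2)]
for t in threads: t.start()
for t in threads: t.join()
# Only one $0.06 reservation fits inside the $0.10 limit, regardless of timing.
```

Without the lock, both threads could read spent = 0.0 before either writes, and both would be allowed: that is the TOCTOU window the atomic reservation closes.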
Deploy and verify
Deploy your app. Then check your dashboard — governance events should appear within seconds of your first LLM call.
✓ Governance active
Send a chat message. Your dashboard Governance Feed should show an 'allowed' event with model, cost, and token counts within a few seconds.
✓ Budget enforcement active
Set a low daily limit in your dashboard (e.g. $0.10). Make enough calls to exceed it. Subsequent check() calls should return allowed: false with reason: BUDGET_EXCEEDED.
✓ Per-plan enforcement active
Add a plan row in your dashboard (e.g. 'free' at $0.05/day). Pass plan: 'free' to check(). Exhaust the free-plan budget — pro-plan calls should continue unblocked.
✓ Parallel protection active
Send two requests simultaneously near the budget limit. Only one should succeed — the other returns BUDGET_EXCEEDED. This is the Lua atomic reservation working.
Any stack. Two HTTP calls.
Not using Next.js? Two POST endpoints expose the same governance engine to any language or framework. Python, Go, Ruby, PHP, Java — if you can make an HTTP request, you can integrate Thskyshield. Same Redis atomic logic. Same budget enforcement. Same dashboard.
Authentication
Pass your API key in the Authorization header on every request. The site_id goes in the request body.
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json
/api/v1/check
Phase A
Call this before every LLM API call. Returns allowed: true/false. If allowed is false, do not make the LLM call.
Request body
{
"site_id": "your_site_id", // required
"external_user_id": "user_123", // required
"model": "gpt-4o", // optional
"estimated_tokens": { // optional
"input": 800,
"output": 300
},
"plan": "pro" // optional
}
Response — allowed
{
"allowed": true,
"request_id": "req_a1b2c3",
"budget_remaining": 0.34,
"budget_limit": 0.50,
"user_spend_today": 0.16
}
Response — blocked
{
"allowed": false,
"request_id": null,
"reason": "BUDGET_EXCEEDED",
"budget_remaining": 0.00,
"budget_limit": 0.50,
"user_spend_today": 0.51,
"retry_after": "2026-04-13T00:00:00.000Z"
}
HTTP status codes
| Status | Meaning |
|---|---|
| 200 | Check completed — use the allowed field to decide whether to proceed |
| 401 | Invalid API key or site_id mismatch |
| 422 | Missing or invalid required fields |
| 429 | Your own governance rate limit exceeded (not the user budget) |
| 500 | Server error — fail-open, request treated as allowed |
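One way to map those statuses in your own client wrapper (this helper is an illustration, not part of any SDK), following the fail-open rule for 500s from the table:

```python
def should_proceed(status_code: int, body: dict) -> bool:
    """Decide whether to make the LLM call, per the status table above."""
    if status_code == 200:
        return bool(body.get("allowed"))   # the allowed field is authoritative
    if status_code == 500:
        return True                        # server error: fail-open by design
    if status_code in (401, 422):
        # Config/auth problems are your bugs: surface them instead of guessing.
        raise RuntimeError(f"Thskyshield config error: HTTP {status_code}")
    if status_code == 429:
        return False                       # your governance rate limit: back off
    return False                           # unknown status: be conservative
```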
/api/v1/log
Phase B
Call this after every successful LLM call with actual token counts. Pass the request_id from /v1/check — this links both phases and reconciles the Redis spend counter atomically.
Request body
{
"site_id": "your_site_id", // required
"request_id": "req_a1b2c3", // from /v1/check
"model": "gpt-4o", // required
"actual_tokens": { // required
"input": 743,
"output": 287
},
"status": "success" // optional
}
Response
{
"logged": true,
"actual_cost": 0.008900,
"user_spend_today": 0.248900,
"budget_remaining": 0.251100
}
Integration examples
Working examples in every major language. Copy, paste, ship.
Python (FastAPI + httpx)
import httpx, os
API_KEY = os.getenv("THSKYSHIELD_API_KEY")
SITE_ID = os.getenv("THSKYSHIELD_SITE_ID")
BASE_URL = "https://thskyshield.com/api/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}
async def check_budget(user_id: str, model: str,
input_tokens: int, output_tokens: int):
async with httpx.AsyncClient() as client:
r = await client.post(f"{BASE_URL}/check", headers=HEADERS, json={
"site_id": SITE_ID,
"external_user_id": user_id,
"model": model,
"estimated_tokens": {"input": input_tokens, "output": output_tokens},
})
return r.json()
async def log_usage(request_id: str, model: str,
input_tokens: int, output_tokens: int):
async with httpx.AsyncClient() as client:
await client.post(f"{BASE_URL}/log", headers=HEADERS, json={
"site_id": SITE_ID,
"request_id": request_id,
"model": model,
"actual_tokens": {"input": input_tokens, "output": output_tokens},
"status": "success",
})
# ── Usage in your FastAPI route ──────────────────────────────────────────────
# Assumes app, ChatRequest, and openai_client exist; HTTPException is from fastapi.
@app.post("/chat")
async def chat(body: ChatRequest, user_id: str):
check = await check_budget(user_id, "gpt-4o", 800, 300)
if not check["allowed"]:
raise HTTPException(status_code=429,
detail={"error": "Budget exceeded",
"reason": check["reason"]})
completion = await openai_client.chat.completions.create(
model="gpt-4o", messages=body.messages
)
await log_usage(
check["request_id"], "gpt-4o",
completion.usage.prompt_tokens,
completion.usage.completion_tokens,
)
return {"response": completion.choices[0].message.content}
Go
package thskyshield
import (
"bytes"
"encoding/json"
"fmt"
"net/http"
"os"
)
const baseURL = "https://thskyshield.com/api/v1"
type CheckRequest struct {
SiteID string `json:"site_id"`
ExternalUserID string `json:"external_user_id"`
Model string `json:"model"`
EstimatedTokens TokenEstimate `json:"estimated_tokens"`
}
type TokenEstimate struct {
Input int `json:"input"`
Output int `json:"output"`
}
type CheckResponse struct {
Allowed bool `json:"allowed"`
RequestID string `json:"request_id"`
BudgetRemaining float64 `json:"budget_remaining"`
Reason string `json:"reason,omitempty"`
}
func Check(userID, model string, input, output int) (*CheckResponse, error) {
body, _ := json.Marshal(CheckRequest{
SiteID: os.Getenv("THSKYSHIELD_SITE_ID"),
ExternalUserID: userID,
Model: model,
EstimatedTokens: TokenEstimate{Input: input, Output: output},
})
req, _ := http.NewRequest("POST", baseURL+"/check", bytes.NewBuffer(body))
req.Header.Set("Authorization", "Bearer "+os.Getenv("THSKYSHIELD_API_KEY"))
req.Header.Set("Content-Type", "application/json")
resp, err := (&http.Client{}).Do(req)
if err != nil { return nil, err }
defer resp.Body.Close()
var result CheckResponse
if err := json.NewDecoder(resp.Body).Decode(&result); err != nil { return nil, err }
return &result, nil
}
Ruby
require 'net/http'
require 'json'
class ThskyShield
BASE_URL = 'https://thskyshield.com/api/v1'
def self.check(user_id:, model:, input_tokens:, output_tokens:)
call(:check, {
site_id: ENV['THSKYSHIELD_SITE_ID'],
external_user_id: user_id,
model: model,
estimated_tokens: { input: input_tokens, output: output_tokens }
})
end
def self.log(request_id:, model:, input_tokens:, output_tokens:)
call(:log, {
site_id: ENV['THSKYSHIELD_SITE_ID'],
request_id: request_id,
model: model,
actual_tokens: { input: input_tokens, output: output_tokens },
status: 'success'
})
end
def self.call(endpoint, payload)
uri = URI("#{BASE_URL}/#{endpoint}")
req = Net::HTTP::Post.new(uri)
req['Authorization'] = "Bearer #{ENV['THSKYSHIELD_API_KEY']}"
req['Content-Type'] = 'application/json'
req.body = payload.to_json
res = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) { |h| h.request(req) }
JSON.parse(res.body)
end
end
# ── Usage ────────────────────────────────────────────────────────────────────
result = ThskyShield.check(user_id: current_user.id,
model: 'gpt-4o',
input_tokens: 800, output_tokens: 300)
if !result['allowed']
render json: { error: 'Budget exceeded', reason: result['reason'] }, status: 429
return
end
curl (works from any language via shell)
# Step 1 — check before calling your LLM
curl -X POST https://thskyshield.com/api/v1/check \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"site_id": "YOUR_SITE_ID",
"external_user_id": "user_123",
"model": "gpt-4o",
"estimated_tokens": { "input": 800, "output": 300 }
}'
# Response (allowed):
# {
# "allowed": true,
# "request_id": "req_a1b2c3d4e5f6",
# "budget_remaining": 0.34,
# "budget_limit": 0.50,
# "user_spend_today": 0.16
# }
# Step 2 — log actual usage after your LLM responds
curl -X POST https://thskyshield.com/api/v1/log \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"site_id": "YOUR_SITE_ID",
"request_id": "req_a1b2c3d4e5f6",
"model": "gpt-4o",
"actual_tokens": { "input": 743, "output": 287 },
"status": "success"
}'
How request_id works
When /v1/check reserves a cost, it stores the context (user ID, estimated cost, plan) in Redis under the request_id key with a 5-minute TTL. When /v1/log receives that ID, it retrieves the context, runs the Lua reconciliation, then deletes the key. One-time use. If you never call /v1/log, the reservation expires naturally when the 5-minute TTL lapses — no budget leak.
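That lifecycle reads naturally as a small state store. Here is a toy in-memory version (the real store is Redis, and every name below is illustrative): reserve writes the context, reconcile retrieves it exactly once and deletes it, and anything past the TTL is treated as gone.

```python
import time

class ReservationStore:
    """Toy model of the request_id lifecycle: reserve -> reconcile -> delete."""
    def __init__(self, ttl_seconds: float = 300):
        self.ttl = ttl_seconds
        self._store = {}  # request_id -> (context, created_at)

    def reserve(self, request_id: str, context: dict) -> None:
        self._store[request_id] = (context, time.monotonic())

    def reconcile(self, request_id: str):
        entry = self._store.pop(request_id, None)  # pop enforces one-time use
        if entry is None:
            return None  # never reserved, already reconciled, or expired
        context, created_at = entry
        if time.monotonic() - created_at > self.ttl:
            return None  # reservation lapsed before /v1/log arrived
        return context

store = ReservationStore()
store.reserve("req_a1b2c3", {"user": "user_123", "estimated_cost": 0.009})
ctx = store.reconcile("req_a1b2c3")    # returns the stored context once
again = store.reconcile("req_a1b2c3")  # None: the key was deleted on first use
```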
SDK Reference
The SDK exposes two methods. That's it — no middleware, no edge config, no infrastructure to manage.
shield.check({ externalUserId, model, estimatedTokens?, promptHash?, plan? })
Phase A
Pre-call budget gate. Atomically reserves the estimated cost in Redis so parallel requests from the same user cannot race past the budget limit. Pass plan to enforce a plan-specific daily budget (resolved from your site_plans config). Returns a requestId — pass this to log() to link the two phases and carry the plan automatically. Returns allowed: false with a reason code if the call should be blocked — the LLM call must not proceed.
shield.log({ requestId, externalUserId, model, tokens })
Phase B
Post-call reconciliation. Pass the requestId returned by check() — the SDK uses it to look up the in-flight reservation and the effective plan, so you do not need to pass plan again. Releases the cost reservation and applies the actual token cost atomically in a single Lua operation. Must be awaited — not fire-and-forget — so the Redis spend counter is updated before the next request from the same user arrives.
Reason Codes
When allowed: false is returned — by the SDK or the HTTP API — the reason field tells you exactly why. Use this to show the right message to your users.
| Code | Meaning |
|---|---|
| BUDGET_EXCEEDED | User has hit their daily spend limit for their plan tier (or the site flat budget if no plan was passed). The LLM call was not made — zero cost incurred. |
| VELOCITY_EXCEEDED | More than 60 requests in 60 seconds from the same user. True sliding window per (siteId, userId). This threshold is intentionally high — it targets automation, not legitimate usage. |
| PROMPT_REPEAT_DETECTED | Identical prompt hash seen more than 10 times in 60 seconds — automation replay flagged. SDK-only: pass promptHash (SHA-256 of your prompt) to check() to enable this. |
| REQUEST_COST_EXCEEDED | Estimated cost of a single request exceeds the $0.25 per-request hard cap. Check your estimated_tokens values. |
| CIRCUIT_BREAKER_FALLBACK | Redis is degraded — circuit breaker has opened. The request is allowed to avoid blocking customers during our infra outage. Recovers automatically after 30 seconds. |
| UNAUTHORIZED | API key verification failed. Deliberately opaque — the response does not reveal whether the key or site_id was the mismatch. Check both values in your .env. |
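To opt in to PROMPT_REPEAT_DETECTED from the SDK, hash the prompt before calling check(). The table specifies SHA-256; a hex digest is one reasonable encoding (assumption: the exact encoding the API expects is not stated in this guide).

```python
import hashlib

def prompt_hash(prompt: str) -> str:
    # SHA-256 of the prompt text, hex-encoded
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

h = prompt_hash("Summarize this article.")
# Identical prompts always yield identical hashes, which is what makes
# replayed prompts detectable within the 60-second window.
```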
Need help with setup?
We're in early access — every integration is supported directly by the founder. Reach out and we'll get you unblocked.
Contact the founder