npm install @thsky-21/thskyshield · v2.3.2

Put a spending limit on your AI APIs.

Add spending limits to your OpenAI, Claude, and Gemini APIs. Thskyshield blocks AI requests that exceed your budget before the API call executes.

Works with OpenAI, Claude, Gemini, and any model. No app rewrites. Live in 10 minutes.

OpenAI has no per-user spending limits. One prompt loop or malicious user can drain thousands of dollars overnight — and you won't know until the invoice lands.

$0.08
Cost blocked in the live demo attack vs $847 unprotected
<15ms
Budget check latency — invisible to your users
3 lines
Of code to wrap any LLM call

See a wallet-drain attack get blocked in real time — no account needed.

90 seconds. Pure simulation. Governed vs unprotected side by side.

Open live demo
This Has Already Happened to Founders Like You

AI APIs have no built-in spending limits.

Your OpenAI account will happily process 10,000 requests from one user overnight. A prompt loop, a runaway agent, or a single malicious actor is all it takes. By the time you see the invoice, the damage is done.

🔁
Prompt loops

A bug in your agent retries the same call in a tight loop. 4,000 requests. $1,200. You find out when OpenAI emails you.

😈
Malicious users

One bad actor discovers your chatbot endpoint and scripts it overnight. No rate limit. No budget ceiling. $3,000 gone.

🤖
Runaway agents

Your AI agent hits an edge case and enters an infinite tool-use loop. Fully autonomous. Fully expensive. Fully your problem.

✗ Without Thskyshield
$847.23
Overnight attack cost
  • 4,200 GPT-4o requests executed
  • No limit hit. All calls processed.
  • You find out when the invoice arrives.
✓ With Thskyshield
$0.08
Actual cost incurred
  • 3 requests processed under budget
  • Budget ceiling hit. Kill-switch fires.
  • 4,197 requests blocked before execution.

Based on the live demo simulation. See it run in real time.

Open live demo
How It Works

Sits between your app and the LLM. Blocks over-budget calls before they fire.

Thskyshield wraps your existing LLM calls with two lightweight SDK calls. Before the request executes, it checks the user's budget and atomically reserves the estimated cost in Redis. If they're over the limit, the API call never happens — no tokens burned, no cost incurred.

Your User
Your App
Thskyshield ✓ budget check
LLM Provider
Phase A — Pre-call
Budget check + atomic reservation

Before your LLM call runs, the SDK checks the user's remaining budget via an edge endpoint. If they're under the limit, the estimated cost is atomically reserved in Redis. If over — blocked instantly. Under 15ms.

Phase B — Post-call
Actual cost reconciliation

After the LLM responds, the SDK logs the real token cost. Redis is reconciled. Supabase gets a permanent audit record: model, cost, user, plan, outcome.

Watch both phases run in the live demo
app/api/chat/route.ts
// 1. Check budget before the call
const { allowed, reason, requestId } = await shield.check({
  externalUserId: userId,
  model: 'gpt-4o',
  plan: user.plan, // 'free' | 'pro'
  estimatedTokens: { input: 500, output: 200 },
});
if (!allowed) return Response.json({ reason }, { status: 429 });

// 2. Call your LLM normally
// const response = await openai.chat...

// 3. Log actual cost after the call
await shield.log({
  requestId,
  externalUserId: userId,
  model: 'gpt-4o',
  tokens: { input: usage.prompt_tokens, output: usage.completion_tokens },
});
The Hidden Bug in DIY Solutions

Why simple budget checks fail.

Most developers reach for a simple database read: if the user is under budget, let the request through. It looks correct. It isn't.

Under concurrent load, two requests can read the same balance at the same time, both pass the check, and both execute the LLM call — even if either one alone would have exceeded the budget. This is a classic read-modify-write race condition, and it means your budget ceiling isn't actually a ceiling.

What actually happens

  1. Request A reads balance → $4.90 of $5.00 used → passes ✓
  2. Request B reads balance → $4.90 of $5.00 used → passes ✓
  3. Request A fires the LLM call → $0.20 cost
  4. Request B fires the LLM call → $0.20 cost
  5. Final spend: $5.30 — 6% over the limit nobody enforced.

At low traffic this is invisible. Under any real concurrent load — or a scripted attack — it compounds into serious overruns.

Naive approach — has a race condition
// ❌ Two concurrent requests can both pass
const spend = await db.getSpend(userId);
if (spend < limit) {
  // Both requests reach here simultaneously
  await callLLM(); // 💸 budget exceeded
}
Thskyshield — atomic reservation, no race
// ✅ Lua script reserves cost atomically in Redis
// ✅ Lua script reserves cost atomically in Redis
const { allowed } = await shield.check({
  externalUserId: userId,
  model: 'gpt-4o',
  estimatedTokens: { input: 500, output: 200 },
});
if (allowed) await callLLM(); // ✓ safe
⚛️ Lua script runs atomically in Redis
🔒 Cost reserved before the call fires
🚫 No two requests can race past the limit
Enforced at the edge in under 15ms
The Problem with Logging Tools

Observability tells you what happened.
Thskyshield stops it before it does.

Tools like Helicone and LangSmith log prompts, tokens, and latency. That's useful — but they only observe after the cost already happened.

Observability tools
Helicone, LangSmith, etc.
  • Request fires → tokens burn → cost occurs
  • You see what happened in a dashboard
  • Useful for debugging and analytics
  • Does not prevent runaway costs
Thskyshield
Pre-request enforcement
  • Budget check runs before the request fires
  • Over-limit calls blocked — zero tokens burned
  • Spend visible in dashboard in real time
  • Prevents runaway costs, not just records them
60-Second Setup

One npm install.
Zero app rewrites.

Thskyshield wraps your existing LLM calls — no refactoring, no new architecture. Add shield.check() before and shield.log() after. Done.

  • Works with GPT-4o, Claude, Gemini, or any model
  • Per-user and per-plan budget limits in the dashboard
  • Fail-open: if our API is down, your app stays up
  • Real-time spend dashboard + full audit log
  • Free users at $5/day, pro users at $50/day — enforced automatically
Read the full docs
// middleware.ts
import { shield } from "@thskyshield/next";

export default shield({
  routes: ["/api/*", "/dashboard/*"],
  policy: "strict"
});
Built For AI Apps

If you're shipping an AI feature, install this first.

Every product that makes LLM API calls on behalf of users is exposed. Recognize yourself below.

💬
AI Chatbots
Stop users from looping your support bot or assistant into a $500/day habit.
🤖
AI Agents
Autonomous agents can spiral. Budget limits cap the blast radius when a task goes wrong.
✍️
AI Copilots
Per-user daily caps keep your free tier free — and your pro tier profitable.
🧱
AI SaaS
Enforce per-plan tiers. Free users at $5/day, pro at $50/day. Set once, enforced always.
⚙️
AI Workflows
Automated LLM pipelines can loop silently. Budget limits are your circuit breaker.

You need this if any of these are true:

You call OpenAI, Anthropic, or Gemini APIs
Users can trigger LLM calls freely
You have a free or trial tier
You're building an AI agent or workflow
You haven't set per-user cost limits yet
You'd rather prevent a $3k bill than explain it
Under the Hood

Built on the right stack.

Upstash Redis for sub-15ms atomic reservations. Supabase for permanent audit records. Vercel Edge for global coverage.

Global Traffic
Thskyshield Edge
Your SaaS App
Free to Start

Install it before you launch. Not after.

Sign up in 30 seconds. Add your first site, grab your API key, and your first governed call is live in under 10 minutes. No credit card. No waitlist.

Free
To start
10 min
To integrate
0
App rewrites


npm install @thsky-21/thskyshield

Your next billing cycle starts soon.

Every day without a spending limit is a day one bad request — or one bad actor — can drain your OpenAI credits. Takes 60 seconds to install.

Start Free — No Credit Card

Free · No credit card · Deploy in 60 seconds