Guardrail — AI Escalation Layer

How it works

Guardrail sits between your AI and your users.

We don't generate text. We don't consume your AI tokens. We only read the response text and score it in milliseconds.

Step 1: 👤 User asks a question

Step 2: 🤖 Your AI responds

Step 3: 🛡️ Guardrail scores it (≤ 20 ms)

Outcomes: ✅ Deliver · ⚠️ Flag · 🔴 Escalate
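The three outcomes above can be handled with a small dispatcher. This is a minimal sketch, not Guardrail's own code: `routeDecision` and the handler names are hypothetical, and the lowercase `"flag"` decision string is an assumption (only `"deliver"` and `"escalate"` appear in the examples on this page).

```javascript
// Sketch only: routeDecision and the handlers are illustrative names,
// not part of the Guardrail SDK.
function routeDecision(result, handlers) {
  switch (result.decision) {
    case "deliver":
      return handlers.deliver(result);
    case "flag": // assumed spelling of the flag outcome
      return handlers.flag(result);
    case "escalate":
      return handlers.escalate(result);
    default:
      // Unknown decision: fail safe by handing off to a human.
      return handlers.escalate(result);
  }
}

const handlers = {
  deliver: (r) => `deliver (${r.confidence})`,
  flag: (r) => `flag with disclaimer (${r.confidence})`,
  escalate: (r) => `escalate to human (${r.confidence})`,
};

console.log(routeDecision({ decision: "deliver", confidence: 0.91 }, handlers));
```

Treating an unrecognised decision as an escalation is one possible design choice; it keeps an unexpected response from reaching the user unreviewed.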

⚡

Zero tokens consumed

Guardrail never calls your LLM. We only receive the response text your AI already generated — no extra API costs, ever.

🔌

Works with any AI

One line of code wraps OpenAI, Claude, Gemini, Llama, or your own model. Guardrail is model-agnostic by design.
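Because only the generated text is checked, a provider-independent wrapper might look like the sketch below. The names `withCheck`, `generate`, and `check` are hypothetical; in a real app `check` could be something like `(text) => gr.check(text)`, and `generate` would be your own model call. The stand-in functions exist only so the example runs on its own.

```javascript
// Sketch only: withCheck is an illustrative helper, not a Guardrail API.
function withCheck(generate, check) {
  return async function guardedGenerate(prompt) {
    const text = await generate(prompt); // any provider's model call
    const result = await check(text);    // scoring step, e.g. gr.check(text)
    return { text, ...result };
  };
}

// Stand-ins so the sketch is self-contained.
const fakeModel = async (prompt) => `Echo: ${prompt}`;
const fakeCheck = async (text) => ({ decision: "deliver", confidence: 0.9 });

const guarded = withCheck(fakeModel, fakeCheck);
guarded("hello").then((out) => console.log(out));
```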

🔒

Your data stays yours

We analyse response text only. No user PII, prompts, or conversation history is ever stored or accessed.

Integrations

Every way to use Guardrail

One API, six ways to plug in. Pick the one that matches your stack.

🧩

Browser · Node.js

JS SDK

Wrap any AI response with gr.check(). Works in the browser or server-side Node.js. One line added to your existing pipeline.

<!-- load once -->
<script src="/sdk/guardrail.js"></script>

const gr = new Guardrail({ apiKey: "gr_live_xxx" });
const r = await gr.check(aiText);
// → { decision, confidence }

Get snippet →

💬

Any website

Drop-in Chat Widget

One <script> tag adds a floating 🛡️ AI chat to any site. Every response shows a confidence badge. Zero config, dark/light theme.

<script
  src="/embed/widget.js"
  data-key="gr_live_xxx"
  data-context="general"
  data-theme="dark"
></script>

Copy widget code →

🤖

Claude Desktop · Cursor · Windsurf

MCP Server — One Command

Native MCP integration. Claude can call check_confidence, score_and_explain, and get_my_stats as built-in tools. Zero config beyond one line.

# Install & run instantly
npx guardrail-ai-mcp --key gr_live_xxx

// Or add to claude_desktop_config.json
{
  "mcpServers": {
    "guardrail": {
      "command": "npx",
      "args": ["guardrail-mcp", "--key", "gr_live_xxx"]
    }
  }
}

Full MCP docs →

🔌

Python · Go · curl · Any language

REST API

Plain HTTP POST to /api/check. Works from any language or tool. Pass your text and get a decision back in JSON. No SDK required.

curl -X POST \
  https://guardrail-mvp-production.up.railway.app/api/check \
  -H "Content-Type: application/json" \
  -H "X-Guardrail-Key: gr_live_xxx" \
  -d '{"text":"your AI response"}'

→ {"decision":"deliver","confidence":0.91}

Copy curl →
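The same request can be sketched in JavaScript with `fetch`. The URL, the `X-Guardrail-Key` header, and the `{ text }` body come from the curl example; the `Content-Type` header, the helper names `buildCheckRequest` and `checkText`, and the error handling are assumptions made for illustration.

```javascript
// URL and key header are taken from the curl example above.
const CHECK_URL = "https://guardrail-mvp-production.up.railway.app/api/check";

// Builds the request without sending it, so it is easy to inspect or test.
function buildCheckRequest(text, apiKey) {
  return {
    url: CHECK_URL,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json", // assumed, since the body is JSON
        "X-Guardrail-Key": apiKey,
      },
      body: JSON.stringify({ text }),
    },
  };
}

// Sends the request using the global fetch (browsers and Node 18+).
async function checkText(text, apiKey) {
  const { url, options } = buildCheckRequest(text, apiKey);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`Guardrail check failed: HTTP ${res.status}`);
  return res.json(); // expected shape per the example: { decision, confidence }
}
```

A call such as `checkText("your AI response", "gr_live_xxx")` would then resolve to the parsed JSON reply.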

🧪

No code needed

Live Chat Demo

Try Guardrail right now without writing any code. Chat with Claude via a web UI, where every response is confidence-scored and highlighted in real time. Bring your own Anthropic key.

// Nothing to install.
// Just need:
✅ A Guardrail key → sign up above
✅ An Anthropic key → console.anthropic.com

// Open the Chat Demo and start typing.
// Guardrail scores every reply.

Open Chat Demo →

🎮

No setup needed

Confidence Playground

Paste any text and score it instantly against different domain contexts — medical, legal, financial, and more. See which signals trigger flags and why. No API key needed to explore.

// Paste. Click. See results.
Input: "I think maybe this drug might help"
Context: medical

Output:
  decision: escalate
  confidence: 0.28
  reasons: ["hedged_language", "medical_risk"]

Try Playground →

Signals Monitored

Twelve layers of detection.

Guardrail runs every response through 74 pattern-matching signals across 12 categories before making a routing decision.

🧠

Model Confidence

Uncertainty language, hedges, and self-doubt signals

🌐

Domain Boundary

Detects when AI steps outside its expertise

😤

User Frustration

Conversation sentiment and repeated correction patterns

⚖️

Regulatory Flags

Medical, legal, financial, and safety content detection

🔄

Contradiction

Internal inconsistencies within a single response

🎭

Hallucination Risk

Fabricated statistics, names, and factual patterns

💪

Overconfidence

Absolute language, false certainty, and superlative claims

🔗

Fabricated URLs

Made-up links, fake emails, phone numbers, and addresses

🔓

Instruction Leakage

System prompt disclosure and prompt injection echoes

⏰

Temporal Confusion

Future predictions, stale claims, and ambiguous time references

😊

Sycophancy

Excessive flattery, empty agreement, and overconfident affirmations

🔁

Repetition & Filler

Verbose padding, repeated blocks, and preamble fluff
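To give a feel for what a pattern-matching signal can look like, here is a toy sketch. It is not Guardrail's actual rule set: the two regular expressions are invented for illustration, `hedged_language` borrows a reason name shown in the Playground example, and `overconfidence` is a hypothetical label based on the category above.

```javascript
// Toy illustration only: these patterns are made up and far simpler
// than anything a production detector would use.
const SIGNALS = [
  { name: "hedged_language", pattern: /\b(i think|maybe|might|possibly|not sure)\b/i },
  { name: "overconfidence", pattern: /\b(always|never|guaranteed|definitely)\b/i },
];

// Returns the names of all signals whose pattern matches the text.
function detectSignals(text) {
  return SIGNALS.filter((s) => s.pattern.test(text)).map((s) => s.name);
}

console.log(detectSignals("I think maybe this drug might help"));
// logs: [ 'hedged_language' ]
```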

Quick Start

Three lines of code.

Add Guardrail to any existing AI app in under 60 seconds.

index.html · Browser SDK

<!-- 1. Load the SDK -->
<script src="https://guardrail-mvp-production.up.railway.app/sdk/guardrail.js"></script>

<!-- 2. Initialize -->
<script>
  const gr = new Guardrail({
    apiKey:   'YOUR_API_KEY',
    context:  'medical',           // or 'legal', 'financial', 'general'…
    onEscalate: (result) => notifyHuman(result),
    onFlag:     (result) => showDisclaimer(result),
  });

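  // 3. Wrap your AI call (async wrapper: classic scripts have no top-level await)
  (async () => {
    const aiResponse = await openai.chat("...");   // placeholder for your own AI call
    const result     = await gr.check(aiResponse, { userId: 'u_123' });

    if (result.decision === 'deliver') {
      showToUser(aiResponse);           // e.g. confidence: 0.92
    }
  })();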
</script>

Try Interactive Playground → Open Dashboard

Know when your AI
should hand off.
