API Reference

Complete documentation for the Guardrail scoring API. Pattern-based detection with configurable thresholds — no gen-AI in the scoring path.

Authentication

All API requests require authentication via your API key, except /api/signup, /api/demo-check, /api/demo-chat, and /api/health.

Send your key via header (recommended)

X-Guardrail-Key: gr_live_your_key_here

Or via query parameter

GET /api/stats?key=gr_live_your_key_here

Admin endpoints (key management) require the master key set via GUARDRAIL_MASTER_KEY.
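
As a minimal sketch of the two options above, the helper below attaches the key either as the X-Guardrail-Key header or as the key query parameter. The function name authedRequest and its option viaQuery are hypothetical, not part of any Guardrail SDK; the host is the Base URL listed in this reference.

```javascript
// Illustrative helper only: authedRequest and viaQuery are assumed names.
const BASE_URL = "https://guardrail-mvp-production.up.railway.app";

function authedRequest(path, apiKey, { viaQuery = false } = {}) {
  const url = new URL(path, BASE_URL);
  const headers = {};
  if (viaQuery) {
    // Query-parameter form: ?key=gr_live_...
    url.searchParams.set("key", apiKey);
  } else {
    // Header form (recommended by this reference).
    headers["X-Guardrail-Key"] = apiKey;
  }
  return { url: url.toString(), headers };
}
```

The header form keeps the key out of URLs, which tend to end up in access logs and browser history.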

Base URL

https://guardrail-mvp-production.up.railway.app

Rate Limits

Endpoint	Limit	Window / Condition
`/api/demo-check`	5 requests	Per hour, per IP
`/api/demo-chat`	10 requests	Per hour, per IP
`/api/check`	Unlimited	With valid API key
`/api/signup`	Unlimited	Idempotent per email

Coming soon: configurable per-key rate limits and usage quotas.

POST /api/check

Score an AI response for confidence, detect signals, extract claims, and route to a decision. Auth required.

Request Body

Field	Type	Required	Description
`text`	string	Required	The AI-generated response text to score
`userQuery`	string	Optional	The original user question. Enables context-aware scoring: question-type detection, relevance analysis, scope checking, and refusal auditing
`context`	string	Optional	Domain context: `general`, `medical`, `legal`, `financial`, `security`, `safety`, `mental_health`, `child_safety`, `nuclear`. Default: `general`
`userId`	string	Optional	Your internal user ID for tracking

Example Request

curl -X POST https://guardrail-mvp-production.up.railway.app/api/check \
  -H "Content-Type: application/json" \
  -H "X-Guardrail-Key: gr_live_your_key" \
  -d '{
    "text": "The capital of France is Paris.",
    "userQuery": "What is the capital of France?",
    "context": "general"
  }'
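
The same request can be sketched in JavaScript. This is a hedged example, not an official client: buildCheckRequest and scoreResponse are hypothetical names, the local validation is an assumption that mirrors the documented 400 error, and the global fetch is assumed to be available (Node 18+ or a modern browser).

```javascript
// Sketch only: buildCheckRequest and scoreResponse are hypothetical helper names.
const BASE_URL = "https://guardrail-mvp-production.up.railway.app";

// Context values listed in the Request Body table.
const CONTEXTS = ["general", "medical", "legal", "financial", "security",
  "safety", "mental_health", "child_safety", "nuclear"];

function buildCheckRequest(apiKey, { text, userQuery, context = "general", userId }) {
  if (typeof text !== "string" || text.trim() === "") {
    throw new Error("text is required"); // local check mirroring the documented 400
  }
  if (!CONTEXTS.includes(context)) {
    throw new Error(`unknown context: ${context}`);
  }
  // JSON.stringify drops keys whose value is undefined, so optional
  // fields that were not supplied are omitted from the body.
  const body = JSON.stringify({ text, userQuery, context, userId });
  return {
    url: `${BASE_URL}/api/check`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json", "X-Guardrail-Key": apiKey },
      body,
    },
  };
}

// Sends the request; rejects on any non-2xx status.
async function scoreResponse(apiKey, fields) {
  const { url, init } = buildCheckRequest(apiKey, fields);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`Guardrail returned HTTP ${res.status}`);
  return res.json();
}
```

Separating the request builder from the network call keeps the payload logic easy to unit test without hitting the API.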

Response (200 OK)

{
  "id": "a1b2c3d4-...",
  "confidence": 0.76,
  "decision": "deliver",
  "reasons": ["Hedged language", "Approximate quantification"],
  "context": "general",
  "effectiveContext": "general",
  "detectedContext": null,
  "timestamp": "2026-03-29T08:30:00.000Z",
  "excerpts": [
    {
      "signal": "Approximate quantification",
      "match": "approximately",
      "impact": "-8%"
    }
  ],
  "claims": [
    {
      "text": "The capital of France is Paris",
      "type": "claim",
      "verification": "unverified"
    }
  ]
}

Response Fields

Field	Type	Description
`id`	string	Unique check ID (UUID)
`confidence`	float	Score 0.0–1.0 (higher = more confident)
`decision`	string	`deliver` (≥0.75), `flag` (0.45–0.74), or `escalate` (<0.45)
`reasons`	string[]	Human-readable signal labels that fired
`context`	string	The context used for scoring
`effectiveContext`	string	Actual context used (may be auto-elevated)
`detectedContext`	string?	Auto-detected domain (null if none detected)
`excerpts`	object[]	Exact text matches + impact percentage per signal
`claims`	object[]	Extracted statements with type + verification status
`queryAnalysis`	object?	Present when `userQuery` is provided. Contains `questionType`, `relevanceScore`, `scopeRatio`, and `signals` array
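
One way an application might act on the decision field is sketched below. The handler name handleCheckResult and the returned shape (show, notice) are assumptions for illustration; only the three decision values and the reasons array come from this reference.

```javascript
// Hypothetical handler: the returned { show, notice } shape is an assumption.
function handleCheckResult(result) {
  const reasons = (result.reasons || []).join(", ");
  switch (result.decision) {
    case "deliver":
      // High confidence: show the AI response as-is.
      return { show: true, notice: null };
    case "flag":
      // Medium confidence: show it, but attach a disclaimer.
      return { show: true, notice: `Low confidence (${reasons})` };
    case "escalate":
      // Low confidence: hold the response for a human.
      return { show: false, notice: `Sent for human review (${reasons})` };
    default:
      throw new Error(`unexpected decision: ${result.decision}`);
  }
}
```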

Claim Types

Type	Description
`claim`	Factual assertion (dates, stats, named entities)
`opinion`	Subjective statement ("I think", "probably best")
`instruction`	Imperative/directive ("Run this", "You should")
`question`	Interrogative sentence
`disclaimer`	Self-hedging ("I don't have access", "as an AI")
`filler`	Non-substantive connective text

Verification Statuses

Status	Meaning
`sourced`	Claim has URL, DOI, or "according to [source]"
`unverified`	Factual claim with no citation (−3% penalty, capped −15%)
`self_hedging`	Disclaimer/caveat — informational only
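
The unverified-claim penalty in the table above (3% per claim, capped at 15%) can be illustrated with a short calculation. This is a sketch of the stated arithmetic only; the function name unverifiedPenalty is hypothetical and the real scoring engine may apply the penalty differently.

```javascript
// Sketch of the documented arithmetic; not the engine's actual code.
function unverifiedPenalty(claims) {
  const unverified = claims.filter((c) => c.verification === "unverified").length;
  // Work in whole percentage points to avoid floating-point drift.
  const points = Math.min(unverified * 3, 15);
  return points / 100;
}
```

Under this reading, two unverified claims cost 0.06 and any count of five or more costs the full 0.15.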

POST /api/chat

Call Claude and score the response in one step. Supports page context from the chat widget for site-aware AI answers. Auth required.

Request Body

Field	Type	Required	Description
`message`	string	Required	The user's message to send to Claude
`context`	string	Optional	Domain context for confidence scoring (default: `general`)
`systemPrompt`	string	Optional	Custom system prompt for Claude
`anthropicKey`	string	Optional	Use caller's own Anthropic key (their tokens). Falls back to server key if omitted
`pageContext`	object	Optional	Scraped page data from the chat widget (see Auto Page Scraping)
`userId`	string	Optional	Your internal user ID for tracking

pageContext Object

Field	Type	Description
`title`	string	Page title from `document.title`
`url`	string	Full page URL
`description`	string	Meta description or OG description
`keywords`	string	Meta keywords
`headings`	string[]	All h1–h3 headings (e.g. `"h1: Pricing"`)
`bodyText`	string	Visible page text (max 3,000 chars)
`scrapedAt`	string	ISO timestamp of when scraping occurred

Response (200 OK)

{
  "id": "a1b2c3d4-...",
  "fullText": "Claude's full response text...",
  "confidence": 0.78,
  "decision": "deliver",
  "reasons": [...],
  "inputTokens": 842,
  "outputTokens": 256,
  "hasPageContext": true
}

Returns 503 if no Anthropic API key is available (neither caller's nor server's).
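
A hedged sketch of a caller for this endpoint follows. The names buildChatBody and sendChat are hypothetical, the injectable fetchImpl parameter is an assumption added for testability, and the local "message is required" check is not a documented server error. Only the field names and the 503 behavior come from this reference.

```javascript
// Sketch only: buildChatBody and sendChat are hypothetical helper names.
const BASE_URL = "https://guardrail-mvp-production.up.railway.app";
const CHAT_FIELDS = ["message", "context", "systemPrompt", "anthropicKey", "pageContext", "userId"];

function buildChatBody(fields) {
  if (typeof fields.message !== "string" || fields.message === "") {
    throw new Error("message is required"); // local validation, an assumption
  }
  // Copy only documented fields that were actually supplied.
  const body = {};
  for (const name of CHAT_FIELDS) {
    if (fields[name] !== undefined) body[name] = fields[name];
  }
  return body;
}

async function sendChat(apiKey, fields, fetchImpl = fetch) {
  const res = await fetchImpl(`${BASE_URL}/api/chat`, {
    method: "POST",
    headers: { "Content-Type": "application/json", "X-Guardrail-Key": apiKey },
    body: JSON.stringify(buildChatBody(fields)),
  });
  if (res.status === 503) {
    // Documented case: neither the caller nor the server has an Anthropic key.
    throw new Error("No Anthropic API key available");
  }
  if (!res.ok) throw new Error(`Guardrail returned HTTP ${res.status}`);
  return res.json();
}
```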

POST /api/demo-check

Same as /api/check but requires no authentication. Rate-limited to 5 requests per hour per IP. Response includes "demo": true and "remaining": N.

POST /api/demo-chat

Keyless chat endpoint with pre-recorded responses. No LLM tokens consumed. Rate-limited to 10 requests per hour per IP. Returns a random domain-appropriate response scored by Guardrail. Used as fallback when no Anthropic key is available.

POST /api/signup

Self-serve API key generation. Idempotent: the same email always returns the same key.

Request Body

Field	Type	Required	Description
`email`	string	Required	Valid email address

Response (200)

{
  "key": "gr_live_a3b4c5d6...",
  "email": "you@example.com",
  "message": "Your API key has been created."
}

GET /api/developer/me

Get information about your API key: request counts, decision breakdown, and recent logs. Auth required.

Response (200)

{
  "key": "gr_live_xxx...xxx",
  "email": "you@example.com",
  "created": "2026-03-28T...",
  "requests": 42,
  "decisions": { "deliver": 30, "flag": 10, "escalate": 2 },
  "recentLogs": [...]
}

GET /api/stats

Aggregate platform stats. Auth required.

Response (200)

{
  "totalChecks": 1234,
  "deliverRate": 68.5,
  "avgConfidence": 0.72,
  "recentFlags": 12,
  "recentEscalations": 3
}

GET /api/logs

Recent scoring logs (newest first). Auth required.

Query Parameters

Param	Type	Default	Description
`limit`	integer	20	Number of logs to return (max 200)
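
As a small illustration of the limit parameter, the helper below builds the request URL and keeps the value within the documented maximum of 200. The name logsUrl is hypothetical, and clamping on the client side is an assumption; the server may enforce the maximum itself.

```javascript
// Hypothetical helper; client-side clamping is an assumption.
const BASE_URL = "https://guardrail-mvp-production.up.railway.app";

function logsUrl(limit = 20) {
  // Non-numeric or zero values fall back to the documented default of 20.
  const requested = Math.trunc(Number(limit)) || 20;
  const clamped = Math.min(Math.max(requested, 1), 200);
  const url = new URL("/api/logs", BASE_URL);
  url.searchParams.set("limit", String(clamped));
  return url.toString();
}
```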

GET /api/health

Health check — no auth required.

{ "status": "ok" }

POST /api/keys

Create a new API key (admin only). Master key required.

GET /api/keys

List all API keys (admin only). Master key required.

DELETE /api/keys/:key

Revoke an API key (admin only). Master key required.

Browser SDK

Drop-in JavaScript SDK available at:

<script src="https://guardrail-mvp-production.up.railway.app/sdk/guardrail.js"></script>

const gr = new Guardrail({
  apiKey: 'gr_live_xxx',
  context: 'medical',
  onEscalate: (r) => notifyHuman(r),
  onFlag: (r) => showDisclaimer(r),
});

const result = await gr.check(aiResponse);

Chat Widget

Drop-in embeddable AI chat with confidence scoring for any website. One script tag, with no build step and no dependencies.

Basic Integration

<!-- Paste before </body> -->
<script
  src="https://guardrail-mvp-production.up.railway.app/embed/widget.js"
  data-key="gr_live_xxx"
></script>

All Options

<script
  src="https://guardrail-mvp-production.up.railway.app/embed/widget.js"
  data-key="gr_live_xxx"
  data-context="general"
  data-title="AI Assistant"
  data-theme="dark"
  data-scrape="true"
  data-system-prompt="You are Acme Corp's support agent."
  data-welcome="Hi! Ask me anything about our products."
  data-placeholder="Ask anything..."
></script>

Data Attributes

Attribute	Default	Description
`data-key`	required	Your Guardrail API key
`data-scrape`	`"true"`	Enable/disable auto page scraping
`data-context`	`"general"`	Domain context for scoring
`data-title`	`"AI Assistant"`	Widget header title
`data-theme`	`"dark"`	`"dark"` or `"light"`
`data-system-prompt`	`""`	Extra instructions appended to Claude's system prompt
`data-welcome`	`"Hi! I'm your..."`	First message shown in chat
`data-placeholder`	`"Ask anything..."`	Input placeholder text

The widget calls /api/chat with your key. If no Anthropic key is set on the server, it falls back to /api/demo-chat automatically.

Auto Page Scraping

When data-scrape="true" (default), the widget automatically extracts structured content from the host page and sends it as pageContext with every chat message. This makes the AI site-aware — it can answer questions about your website.

What Gets Scraped

Data	Source
Title	`document.title`
URL	`window.location.href`
Description	`<meta name="description">`
Keywords	`<meta name="keywords">`
OG Tags	`<meta property="og:title">`, `og:description`
Headings	All `h1`–`h3` elements (max 30)
Body Text	Visible text content (max 3,000 characters)
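
The size caps in the table above can be sketched as a pure function that turns already-collected page data into a pageContext object. This is not the widget's actual code: capPageContext and the shape of its raw input are assumptions, and the DOM extraction step itself is left out so the example runs outside a browser.

```javascript
// Caps taken from the table above; everything else is illustrative.
const MAX_HEADINGS = 30;
const MAX_BODY_CHARS = 3000;

function capPageContext(raw) {
  return {
    title: raw.title || "",
    url: raw.url || "",
    description: raw.description || "",
    keywords: raw.keywords || "",
    // Keep at most the first 30 headings.
    headings: (raw.headings || []).slice(0, MAX_HEADINGS),
    // Keep at most the first 3,000 characters of visible text.
    bodyText: (raw.bodyText || "").slice(0, MAX_BODY_CHARS),
    scrapedAt: raw.scrapedAt || new Date().toISOString(),
  };
}
```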

What is NOT Scraped

Scripts, styles, SVGs, iframes, hidden elements, the widget's own UI, user inputs, cookies, localStorage, or data from other tabs.

Privacy

Text only. One-time scrape cached on load (no continuous monitoring). Body capped at 3,000 chars. Data is sent to your server only. Set data-scrape="false" to disable entirely.

tawk.to Integration

Score tawk.to chatbot responses via webhook, or connect Guardrail as an AI Assist Custom Tool.

POST /api/tawkto/webhook

Receives tawk.to chat events. Scores each agent/AI message and logs results. Requires X-Guardrail-Key header or ?key= query param.

https://guardrail-mvp-production.up.railway.app/api/tawkto/webhook?key=gr_live_xxx

Configure in tawk.to → Administration → Settings → Webhooks. Select Chat End and Chat Transcript events.

Response

{
  "processed": 3,
  "results": [
    { "text": "Bot response...", "confidence": 0.82, "decision": "deliver", "reasons": [] }
  ]
}
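
A consumer of this response might summarize the results by decision, for example to feed a quality dashboard. The sketch below assumes only the response shape shown above; tallyDecisions is a hypothetical name.

```javascript
// Hypothetical helper that counts results per decision value.
function tallyDecisions(webhookResponse) {
  const counts = { deliver: 0, flag: 0, escalate: 0 };
  for (const r of webhookResponse.results || []) {
    // Ignore any decision value not listed in this reference.
    if (Object.prototype.hasOwnProperty.call(counts, r.decision)) {
      counts[r.decision] += 1;
    }
  }
  return counts;
}
```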

GET /api/tawkto/openapi.json

OpenAPI 3.0 schema for tawk.to AI Assist Custom Tool integration. Add this URL in tawk.to → AI Assist → Settings → Add Custom Tool (OpenAPI Server).

https://guardrail-mvp-production.up.railway.app/api/tawkto/openapi.json

MCP Server (Claude Desktop)

Guardrail has a native MCP server for use with Claude Desktop and other MCP-compatible clients. Now supports userQuery for context-aware scoring.

Install via npx (recommended)

npx guardrail-ai-mcp --key gr_live_xxx

Claude Desktop config (zero-setup)

{
  "mcpServers": {
    "guardrail": {
      "command": "npx",
      "args": ["guardrail-ai-mcp", "--key", "gr_live_xxx"]
    }
  }
}

Or download directly

Visit Developer Portal → "Download MCP Server" button, or:

curl -o guardrail-mcp.js https://guardrail-mvp-production.up.railway.app/api/mcp-download

Webhooks (Coming Soon)

Planned: receive real-time notifications when responses are flagged or escalated. You will configure a webhook URL, and Guardrail will POST events to your endpoint.

Planned event types

Event	When
`response.flagged`	Confidence 0.45–0.74 (needs review)
`response.escalated`	Confidence < 0.45 (human required)
`response.delivered`	Confidence ≥ 0.75 (optional)

Planned webhook payload

{
  "event": "response.escalated",
  "timestamp": "2026-03-29T...",
  "data": {
    "id": "check-uuid",
    "confidence": 0.31,
    "decision": "escalate",
    "reasons": ["Knowledge cutoff disclaimer", "No real-time access"],
    "context": "financial",
    "textPreview": "I don't have access to real-time..."
  }
}
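
Because this feature is still planned, the following is only a speculative sketch of how a receiver might dispatch on the event field if the payload ships as shown. The function routeWebhookEvent and the action labels are hypothetical, and the final payload format may differ.

```javascript
// Speculative: based on the planned payload above, which may change.
function routeWebhookEvent(payload) {
  switch (payload.event) {
    case "response.escalated":
      return { action: "page-human", id: payload.data.id };
    case "response.flagged":
      return { action: "queue-review", id: payload.data.id };
    case "response.delivered":
      return { action: "log-only", id: payload.data.id };
    default:
      // Unknown event types are ignored rather than treated as errors.
      return { action: "ignore", id: null };
  }
}
```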

Integrations planned: Slack, PagerDuty, custom HTTP endpoints. Request access to the private beta.

Error Codes

Status	Code	Description
400	`text is required`	Missing or empty `text` field in request body
401	`Missing API key`	No `X-Guardrail-Key` header or `?key=` param
403	`Invalid API key`	Key not found or revoked
404	`Key not found`	Attempted to delete a non-existent key
429	`Demo limit reached`	Exceeded 5 demo requests per hour
500	`Internal error`	Server-side failure (report to maintainer)
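
A client could map these statuses to a next step along the following lines. The retry choices are assumptions rather than documented guidance, and classifyError is a hypothetical name; only the status codes and their meanings come from the table above.

```javascript
// Retry policy here is an assumption, not documented behavior.
function classifyError(status) {
  switch (status) {
    case 400:
      return { retry: false, hint: "Send a non-empty text field" };
    case 401:
      return { retry: false, hint: "Add the X-Guardrail-Key header or ?key= parameter" };
    case 403:
      return { retry: false, hint: "Key not found or revoked" };
    case 404:
      return { retry: false, hint: "The key to delete does not exist" };
    case 429:
      return { retry: true, hint: "Demo limit reached; wait for the hourly window to reset" };
    case 500:
      return { retry: true, hint: "Server-side failure; retry later and report if it persists" };
    default:
      return { retry: false, hint: `Unexpected status ${status}` };
  }
}
```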

Signal Categories (60+ patterns)

Category	Patterns	Weight Range	Examples
Uncertainty	8	0.08–0.18	hedged language, epistemic distancing, partial uncertainty
Knowledge Cutoff	10	0.08–0.22	training cutoff, staleness hedge, verification nudge
Contradiction	4	0.06–0.14	self-correction, position reversal, soft pivot
Evasion	7	0.06–0.12	AI identity deflection, complexity deflection, explicit refusal
Hallucination	9	0.08–0.20	ISBN/DOI fabrication, unattributed studies, precise statistics
Frustration	5	0.06–0.14	repeated correction, abandonment signal, negative evaluation
Sycophancy	4	0.04–0.08	flattery opener, excessive agreement, overconfident affirmation
Quality Bonuses	8	+0.02–0.04	numbered lists, code blocks, URLs, specific numbers

Domain Contexts

Context	Risk Penalty	Auto-detected?
`general`	0%	Default
`medical`	−25%	✅ Yes
`legal`	−25%	✅ Yes
`financial`	−20%	✅ Yes
`security`	−25%	✅ Yes
`safety`	−30%	✅ Yes
`mental_health`	−35%	✅ Yes
`child_safety`	−35%	✅ Yes
`nuclear`	−35%	✅ Yes

When general is selected, Guardrail auto-scans for domain keywords and elevates the context automatically. The detectedContext field in the response shows what was detected.
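
The elevation rule and the penalty table can be sketched as two lookups. This is an illustration of the documented behavior, not the engine's code: resolveContext and riskPenalty are hypothetical names, and the keyword detection that produces the detected value is outside the scope of the sketch.

```javascript
// Penalties copied from the Domain Contexts table, as fractions.
const RISK_PENALTY = {
  general: 0, medical: 0.25, legal: 0.25, financial: 0.20, security: 0.25,
  safety: 0.30, mental_health: 0.35, child_safety: 0.35, nuclear: 0.35,
};

function isKnownContext(name) {
  return Object.prototype.hasOwnProperty.call(RISK_PENALTY, name);
}

// Returns the effective context: a detected domain replaces "general" only.
function resolveContext(requested = "general", detected = null) {
  if (requested === "general" && detected !== null && isKnownContext(detected)) {
    return detected;
  }
  return requested;
}

function riskPenalty(context) {
  if (!isKnownContext(context)) throw new Error(`unknown context: ${context}`);
  return RISK_PENALTY[context];
}
```

For example, a request sent with general whose text is detected as medical would be scored with the 25% medical penalty under this reading.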

Architecture Overview

Guardrail sits between your AI model and your users, scoring every response in real-time.

System Architecture

── Your Application ──

Any LLM → Your App

↓ AI response text

── Guardrail Platform ──

API Gateway → Scoring Engine (60+ signals) → Wikipedia Verification

↓ decision + confidence

✅ Deliver (≥75%)
⚠️ Flag (45–74%)
🔴 Escalate (<45%)

Integration Paths

Integration	How It Works	Best For
Browser SDK	`guardrail.js` → `/api/check`	Custom apps with any LLM
Chat Widget	`embed/widget.js` → `/api/chat` + auto page scraping	Adding AI chat to any website
Chatflow / Flowise	Custom Tool → `/api/check`	No-code chatbot builders
tawk.to	Webhook → `/api/tawkto/webhook`	Monitoring existing chatbot quality
MCP / Claude Desktop	`npx guardrail-ai-mcp` → `/api/check`	AI developers using Claude Desktop
Direct REST	`curl /api/check`	Any language, any platform

Scoring Pipeline

1. Input: AI response text + context + userQuery
2. Scan: 60+ signal patterns across 7 categories (Uncertainty · Knowledge Cutoff · Contradiction · Evasion · Hallucination · Frustration · Sycophancy)
3. Domain: Auto-detect context + apply risk penalty (−20% to −35%)
4. Quality: Assess structure (lists, code, URLs) → quality bonus (+2% to +10%)
5. Claims: Extract factual statements → count unverified (−3% each)
6. Query: If userQuery provided → relevance, scope, danger analysis
7. Verify: Cross-check claims against Wikipedia (optional)
8. Score: confidence = 0.82 + bonus − penalty → clamp to 0.0–1.0
9. Decide: deliver | flag | escalate
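
Steps 8 and 9 can be sketched as follows, using the base value 0.82 and the decision thresholds stated in this reference. How the individual signal weights combine into bonus and penalty is not specified here, so both are taken as inputs; the function names are hypothetical, and treating values between 0.74 and 0.75 as flag is an assumption.

```javascript
// Sketch of the final scoring step; bonus and penalty are supplied by the caller.
const BASE_CONFIDENCE = 0.82;

function clamp01(x) {
  return Math.min(1, Math.max(0, x));
}

function score({ bonus = 0, penalty = 0 } = {}) {
  return clamp01(BASE_CONFIDENCE + bonus - penalty);
}

function decide(confidence) {
  if (confidence >= 0.75) return "deliver";
  if (confidence >= 0.45) return "flag"; // assumption: covers everything below 0.75
  return "escalate";
}
```

With no bonus and only the 25% medical risk penalty, for instance, the result is about 0.57, which falls in the flag band.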
                    

Full Mermaid architecture diagrams available at: Customer Architecture · Internal Architecture

Need help?

Read the FAQ · Try the Interactive Playground · Check the Changelog · View the Developer Portal
