Everything you need to know about Guardrail AI scoring, integrations, privacy, and more.
Three steps: get a Guardrail API key, call gr.check(text), and handle the decision.

There are two kinds of keys:

- Guardrail key (gr_live_xxx) – identifies your account for Guardrail scoring. Free. Used for authentication and usage tracking.
- Anthropic key (sk-ant-xxx) – needed ONLY if you want to use the /api/chat endpoint to call Claude through Guardrail. Those tokens are billed by Anthropic, not by us.

For the /api/check endpoint (scoring only), you only need a Guardrail key. No LLM tokens are used.
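The flow above can be sketched in a few lines. This is a minimal illustration, not the official SDK: the server URL, request body, and response fields (`confidence`, `decision`) are assumptions based on this FAQ; only the endpoint path, header name, and decision names (deliver, flag, escalate) come from the documentation.

```javascript
// Hypothetical base URL for your Guardrail server (assumption).
const GUARDRAIL_URL = "https://your-guardrail-server.example";

// Step 2: score pre-generated AI text via /api/check.
// The X-Guardrail-Key header is documented; the body shape is assumed.
async function checkText(text, key) {
  const res = await fetch(`${GUARDRAIL_URL}/api/check`, {
    method: "POST",
    headers: { "Content-Type": "application/json", "X-Guardrail-Key": key },
    body: JSON.stringify({ text }),
  });
  return res.json(); // assumed shape: { confidence, decision }
}

// Step 3: handle the decision in your own code.
function handleDecision(result) {
  switch (result.decision) {
    case "deliver":  return { send: true,  note: null };
    case "flag":     return { send: true,  note: "Please verify this answer." };
    case "escalate": return { send: false, note: "Routed to a human." };
    default:         return { send: false, note: "Unknown decision" };
  }
}

// Optional: apply your own cutoff instead of the default 75%.
function escalateBelow(confidence, cutoff = 0.75) {
  return confidence < cutoff;
}
```

The decision handler is where your app's policy lives; Guardrail only reports, your code decides what to ship.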
Guardrail uses 60+ regex-based signal patterns across 7 categories: uncertainty, knowledge cutoff, contradiction, evasion, hallucination, frustration, and sycophancy. Each signal has a weighted penalty.
Every response starts with a base score of 82%. Signals subtract from it; quality indicators (lists, code, URLs) add to it. The final score determines the decision.
No LLM is used in the scoring path. It's purely pattern-based, so it's fast (< 50ms) and deterministic.
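The scoring path above can be illustrated with a toy version. The patterns and weights below are made-up examples for clarity; they are not Guardrail's actual 60+ signals. Only the base score of 82% and the subtract/add mechanics come from the documentation.

```javascript
// Example signals (hypothetical patterns and penalties).
const SIGNALS = [
  { name: "uncertainty", pattern: /\b(i think|not sure|maybe)\b/i, penalty: 10 },
  { name: "cutoff",      pattern: /knowledge cutoff/i,             penalty: 8 },
  { name: "evasion",     pattern: /i cannot (answer|help)/i,       penalty: 12 },
];

// Example quality indicators (hypothetical bonuses).
const QUALITY = [
  { name: "list", pattern: /^\s*[-*]\s/m, bonus: 3 },
  { name: "code", pattern: /```/,         bonus: 2 },
  { name: "url",  pattern: /https?:\/\//, bonus: 2 },
];

function score(text) {
  let confidence = 82; // documented base score
  for (const s of SIGNALS) if (s.pattern.test(text)) confidence -= s.penalty;
  for (const q of QUALITY) if (q.pattern.test(text)) confidence += q.bonus;
  return Math.max(0, Math.min(100, confidence)); // clamp to 0–100
}
```

Because every rule is a regex test, scoring a response is a single pass with no network calls, which is what keeps it under 50ms and deterministic.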
Contexts tell Guardrail what kind of content is being scored. High-stakes domains like medical, legal, and financial apply extra penalties (−20% to −35%) because errors in those areas are more dangerous.
Supported contexts: general · medical · legal · financial · security · safety · mental_health · child_safety · nuclear.
If you set general, Guardrail will auto-detect the real domain from the text and elevate automatically.
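A sketch of how a context penalty might be applied. The per-context values here are assumptions chosen within the documented −20% to −35% range; only the context names and the range itself come from this FAQ.

```javascript
// Hypothetical per-context penalties (percentage points), within the
// documented −20% to −35% range. "general" applies no penalty itself;
// per the docs, Guardrail auto-detects the real domain instead.
const CONTEXT_PENALTY = {
  general: 0,
  medical: 30, legal: 25, financial: 25,
  security: 25, safety: 25,
  mental_health: 30, child_safety: 35, nuclear: 35,
};

function applyContext(confidence, context) {
  const penalty = CONTEXT_PENALTY[context] ?? 0;
  return Math.max(0, confidence - penalty);
}
```

This shows why a perfectly reasonable answer can score much lower in a medical context than in a general one: the domain penalty is applied before your cutoffs are evaluated.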
You can also use /api/chat to generate responses via Claude – scoring that response is still pattern-based.

On /api/check, Guardrail extracts factual claims from the text and cross-checks them against Wikipedia in parallel. Claims that match get a confidence boost (+2% each); contradicted claims get a penalty (−8% each). This runs automatically – you can disable it with ?verify=false. If you pass userQuery, Guardrail enables context-aware scoring.
Guardrail returns a confidence score (0.0–1.0), and you can set your own cutoffs in your code. For example, a medical app might escalate anything below 80% instead of the default 75%.

Scoring is model-agnostic via /api/check – it works with OpenAI, Claude, Gemini, Llama, Mistral, Cohere, or any model. The /api/chat endpoint specifically uses Claude, but scoring works with any text source.

One script tag before </body> and you get a full AI chat with confidence scoring on any page:
<script src=".../embed/widget.js" data-key="gr_live_xxx"></script>
It automatically scrapes the host page (title, meta tags, headings, body text) and sends that as context, so the AI can answer questions about your website. The widget supports dark/light themes, custom titles, welcome messages, and system prompts via data-* attributes.
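For illustration, an embed tag combining several of the documented data-* attributes might look like this (the src placeholder is unchanged from the snippet above; attribute values are examples):

```html
<script src=".../embed/widget.js"
        data-key="gr_live_xxx"
        data-theme="dark"
        data-title="Your Brand"
        data-welcome="Hi! Ask me anything about this site."
        data-scrape="false"></script>
```

Here data-scrape="false" turns off the page-content scraping described below; omit it to keep the default behavior.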
When the chat widget loads, it reads your page's content – title, URL, meta description, keywords, headings (h1–h3), and first 3,000 characters of visible body text. This is cached once and sent with every chat message as pageContext.
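A sketch of what the assembled pageContext payload might look like. The field names here are assumptions; the scraped items and the 3,000-character limit come from the documentation. The function takes a plain object rather than reading the DOM directly so the shape is easy to see.

```javascript
// Hypothetical pageContext builder. `page` stands in for values the
// widget would read from document.title, meta tags, h1–h3, etc.
function buildPageContext(page) {
  return {
    title: page.title,
    url: page.url,
    description: page.metaDescription || "",
    keywords: page.metaKeywords || "",
    headings: page.headings,                 // text of h1–h3 elements
    bodyText: page.bodyText.slice(0, 3000),  // first 3,000 visible chars
  };
}
```

Because the result is cached once at load time, every subsequent chat message reuses the same snapshot rather than re-scraping the page.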
The server injects this into Claude's system prompt so the AI can answer site-specific questions like "What are your pricing plans?" or "Do you offer mobile RON?"
Disable with data-scrape="false". See the docs for full details.
Two options:
- Add /api/tawkto/webhook?key=YOUR_KEY as a webhook URL in tawk.to's settings. Every chat transcript gets scored automatically.
- Add /api/tawkto/openapi.json as a Custom Tool URL so the AI can check its own answers before sending.

Add this to your Claude Desktop config file:
{
  "mcpServers": {
    "guardrail": {
      "command": "npx",
      "args": ["guardrail-ai-mcp", "--key", "gr_live_xxx"]
    }
  }
}
Claude Desktop can then call Guardrail to check response safety as a tool.
Call /api/check with the AI response and the original question, and add a safety disclaimer if the decision is flag or escalate. See the Chatflow Integration Guide for step-by-step instructions.

The server operator's Anthropic key pays for Claude API calls. The data-key on the widget is a Guardrail key for authentication – it's NOT an LLM key.
If no Anthropic key is configured on the server, the widget automatically falls back to demo mode with pre-recorded responses (zero AI token cost).
The widget scrapes publicly visible page content only: title, URL, meta tags, headings, and body text. It does NOT collect:
Data is sent to YOUR Guardrail server, not a third party. Disable with data-scrape="false".
Each chat message with page context adds roughly 800–1,200 input tokens to the system prompt; the per-message cost follows from Claude's current per-token pricing.
Customize the widget via data-* attributes: data-theme="dark" or "light", data-title="Your Brand", data-welcome="Custom greeting", data-placeholder="Custom input text", and data-system-prompt="Custom instructions".

Yes. Clone the repo, set your environment variables, and deploy anywhere:
1. git clone https://github.com/saifsysim/guardrail-mvp
2. Set ANTHROPIC_API_KEY and GUARDRAIL_MASTER_KEY in .env
3. npm install && npm start

Works on Railway, Render, Heroku, AWS, or any Node.js host. Add DATABASE_URL for PostgreSQL persistence.
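A minimal .env for the steps above might look like this (all values are placeholders; the variable names come from the documentation):

```env
# Required: pays for Claude calls made through /api/chat
ANTHROPIC_API_KEY=sk-ant-xxx

# Required: your Guardrail master key
GUARDRAIL_MASTER_KEY=gr_live_xxx

# Optional: enable PostgreSQL persistence
DATABASE_URL=postgres://user:pass@host:5432/guardrail
```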
If you pass the anthropicKey parameter to /api/chat, that key is sent from your server to Anthropic – it never leaves the server-to-Anthropic connection.

- /api/check – Score only. You pass in pre-generated AI text. No LLM call. Fast (<50ms). Works with any model.
- /api/chat – Generate + score. Sends your message to Claude, gets a response, scores it, and returns everything in one call. Uses Anthropic tokens.

Use /api/check when you already have the AI response. Use /api/chat when you want Guardrail to handle both generation and scoring.
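The choice between the two endpoints can be expressed as a tiny helper plus a generate-and-score sketch. As before, the base URL and body/response field names are assumptions; the paths and the X-Guardrail-Key header are documented.

```javascript
// Already have the model's output? Score it. Otherwise, let Guardrail
// generate via Claude and score in one call.
function pickEndpoint(haveResponseAlready) {
  return haveResponseAlready ? "/api/check" : "/api/chat";
}

// Hypothetical generate-and-score call; `baseUrl` points at your server.
async function chatAndScore(message, guardrailKey, baseUrl) {
  const res = await fetch(`${baseUrl}/api/chat`, {
    method: "POST",
    headers: { "Content-Type": "application/json", "X-Guardrail-Key": guardrailKey },
    body: JSON.stringify({ message }),
  });
  return res.json(); // assumed shape: { response, confidence, decision }
}
```

Remember that only the /api/chat path consumes Anthropic tokens; /api/check is pure pattern scoring.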
Rate limits:

- /api/demo-check: 5 requests/hour per IP (no key needed)
- /api/demo-chat: 10 requests/hour per IP (no key needed)
- /api/check: Unlimited with a valid API key
- /api/chat: Unlimited with a valid API key

Usage is tracked via /api/events. Your Developer Portal shows per-key stats, decision breakdowns, and recent scoring logs.

If a check fails, the check() method throws an error, which your onError callback can handle. Best practice: default to deliver with a disclaimer if Guardrail is unavailable, so your users aren't blocked. You can also disable Wikipedia verification with ?verify=false for latency-sensitive applications.

Pass your key in the X-Guardrail-Key header or as a ?key= query parameter. The key must start with gr_live_. If you lost your key, sign up again with the same email – it returns your existing key.

Widget not loading? Check that the src URL in the script tag points to your running Guardrail server and that data-key is a valid Guardrail API key.

Scores seem low? You may be in a high-stakes context (medical, legal, etc.) where the base penalty is −25% to −35%. Combined with any uncertainty language, it can push confidence below 45%. Try scoring with context: "general" to see if the domain penalty is the cause.

The /api/chat endpoint requires an Anthropic API key. Either set ANTHROPIC_API_KEY in your server's .env file, or pass anthropicKey in the request body. If neither is available, use /api/demo-chat for testing.

Try the Interactive Playground · Read the API Docs · Email symehmoo@gmail.com