Get Guardrail running in Claude Desktop in 3 minutes. Every check automatically logs to your dashboard.
Go to the Developer Portal and enter your email to get a free API key (it starts with gr_live_).
On Mac, open Terminal and run:
open ~/Library/Application\ Support/Claude/claude_desktop_config.json
On Windows, open:
%APPDATA%\Claude\claude_desktop_config.json
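If you would rather locate the file from a script, the two locations above can be resolved as in this minimal sketch. The function name claude_config_path is hypothetical, and only the macOS and Windows paths given in this guide are covered.

```python
import os
import sys
from pathlib import Path
from typing import Mapping

CONFIG_NAME = "claude_desktop_config.json"


def claude_config_path(platform: str = sys.platform,
                       env: Mapping[str, str] = os.environ) -> Path:
    """Return the Claude Desktop config path for the two platforms listed above."""
    if platform == "darwin":
        # macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
        return Path.home() / "Library" / "Application Support" / "Claude" / CONFIG_NAME
    if platform.startswith("win"):
        # Windows: %APPDATA%\Claude\claude_desktop_config.json
        return Path(env["APPDATA"]) / "Claude" / CONFIG_NAME
    # This guide only documents macOS and Windows, so nothing is guessed here.
    raise NotImplementedError(f"no config path documented for platform {platform!r}")


if __name__ == "__main__":
    print(claude_config_path())
```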
Paste this into your config file. Replace gr_live_xxx with your actual API key:
{
  "mcpServers": {
    "guardrail": {
      "command": "npx",
      "args": ["guardrail-ai-mcp", "--key", "gr_live_xxx"]
    }
  }
}
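A malformed config file or a leftover placeholder key is an easy mistake at this step. The sketch below checks for both before you restart Claude Desktop. The helper name check_guardrail_entry is hypothetical; it assumes only the config shape shown above and the gr_live_ key prefix mentioned earlier.

```python
import json


def check_guardrail_entry(config_text: str) -> list:
    """Return a list of problems found in the config text (empty list means OK)."""
    try:
        config = json.loads(config_text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(config, dict):
        return ["top level must be a JSON object"]

    servers = config.get("mcpServers")
    entry = servers.get("guardrail") if isinstance(servers, dict) else None
    if not isinstance(entry, dict):
        return ['missing "mcpServers" -> "guardrail" entry']

    problems = []
    args = entry.get("args", [])
    if "--key" not in args or args.index("--key") + 1 >= len(args):
        problems.append("no API key found after --key")
    else:
        key = args[args.index("--key") + 1]
        if key == "gr_live_xxx":
            problems.append("placeholder key gr_live_xxx was not replaced")
        elif not key.startswith("gr_live_"):
            problems.append("API key does not start with gr_live_")
    return problems
```

An empty list means the entry looks structurally fine; it does not confirm that the key itself is valid.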
If your config file already has other content, such as a "preferences" section, merge them into one JSON object:
{
  "preferences": {
    "coworkScheduledTasksEnabled": true,
    "sidebarMode": "chat"
  },
  "mcpServers": {
    "guardrail": {
      "command": "npx",
      "args": ["guardrail-ai-mcp", "--key", "gr_live_xxx"]
    }
  }
}
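Merging by hand is where stray commas and duplicated braces tend to creep in. As an alternative, this minimal sketch adds the guardrail entry to whatever the file already contains and leaves other keys, such as "preferences", untouched. The function name merge_guardrail is hypothetical; the entry it writes is the one shown above.

```python
import json


def merge_guardrail(existing_text: str, api_key: str) -> str:
    """Return config JSON with the guardrail MCP server added or updated."""
    # An empty or whitespace-only file is treated as an empty config object.
    config = json.loads(existing_text) if existing_text.strip() else {}
    if not isinstance(config, dict):
        raise ValueError("config file must contain a single JSON object")

    # Keep any MCP servers that are already configured.
    servers = config.setdefault("mcpServers", {})
    servers["guardrail"] = {
        "command": "npx",
        "args": ["guardrail-ai-mcp", "--key", api_key],
    }
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    existing = '{"preferences": {"sidebarMode": "chat"}}'
    # gr_live_xxx is the placeholder from this guide; use your real key.
    print(merge_guardrail(existing, "gr_live_xxx"))
```

Write the returned text back to claude_desktop_config.json only after keeping a backup copy of the original file.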
Fully quit Claude Desktop (Cmd+Q on Mac) and reopen it. The Guardrail tools will load automatically.
Click the + button → Connectors. You should see "guardrail" with a blue toggle switched ON.
In a Claude Desktop chat, type:
Use the guardrail score_and_explain tool to score this: "Taking 500mg of aspirin daily is safe for everyone."
You should see a confidence score, decision (deliver/flag/escalate), and detected signals.
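If you act on the decision in your own code, the three values can be handled explicitly so that an unexpected value fails loudly. This is only a sketch: the decision names come from this guide, but the actions attached to them (pass through, prefix a warning, withhold) are illustrative assumptions, not documented Guardrail behavior.

```python
from enum import Enum


class Decision(Enum):
    DELIVER = "deliver"
    FLAG = "flag"
    ESCALATE = "escalate"


def handle(decision: str, text: str) -> str:
    """Apply a hypothetical policy to text based on the reported decision."""
    # Enum lookup raises ValueError for anything other than the three known values.
    d = Decision(decision.strip().lower())
    if d is Decision.DELIVER:
        return text  # assumed: safe to pass through unchanged
    if d is Decision.FLAG:
        return "[flagged for review] " + text  # assumed: show with a warning
    return "[withheld pending human review]"  # assumed: do not show the text
```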
To avoid saying "use the guardrail tool" every time, add this to a project's instructions:
Always use the guardrail score_and_explain tool to score any AI-generated text I share. Show the confidence score, decision, and detected signals. Do not answer the text's question; only score it.
Now every message in that project will automatically use Guardrail.
check_confidence | Quick score: returns confidence 0-1 and deliver/flag/escalate
score_and_explain | Detailed score with human-readable explanation of all signals
get_my_stats | Your API usage stats: total checks, decisions, recent logs
userQuery parameter: when provided, Guardrail also analyzes whether the response is relevant to the question, detects scope creep, and audits dangerous queries for missing refusals.
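For readers calling the tools from their own MCP client rather than from chat, a request that includes userQuery might be built as in the sketch below. The envelope follows the Model Context Protocol tools/call method. The argument name "text" is an assumption made for illustration; only the tool name score_and_explain and the userQuery parameter come from this guide, so check the tool's published input schema before relying on it.

```python
import json
from typing import Optional


def build_tool_call(text: str, user_query: Optional[str] = None,
                    request_id: int = 1) -> dict:
    """Build a JSON-RPC tools/call request for the score_and_explain tool."""
    arguments = {"text": text}  # "text" is a hypothetical argument name
    if user_query is not None:
        # userQuery is named in this guide; it is left out when not provided.
        arguments["userQuery"] = user_query
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "score_and_explain", "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_tool_call(
        "Taking 500mg of aspirin daily is safe for everyone.",
        user_query="Is daily aspirin safe?",
    )
    print(json.dumps(request, indent=2))
```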
API Reference · Try the Playground · npm Package · GitHub