MCP Server Setup Guide

Get Guardrail running in Claude Desktop in 3 minutes. Every check automatically logs to your dashboard.

Installation

Get your API key

Go to the Developer Portal and enter your email to get a free API key. The key starts with gr_live_.
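A quick check of the key's shape can catch copy-paste mistakes before the key goes into your config. The Python sketch below tests only the documented gr_live_ prefix. The function name is hypothetical, and passing the check does not mean the server will accept the key.

```python
def looks_like_guardrail_key(key: str) -> bool:
    """Return True if the key has the gr_live_ prefix plus at least one more character.

    This is a local shape check only; it does not contact the Guardrail API.
    """
    prefix = "gr_live_"
    key = key.strip()  # tolerate stray whitespace from copy-paste
    return key.startswith(prefix) and len(key) > len(prefix)
```

Note that the placeholder gr_live_xxx also passes this check, so you still need to replace it with your real key.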

Open your Claude Desktop config file

On Mac, open Terminal and run:

open ~/Library/Application\ Support/Claude/claude_desktop_config.json

On Windows, open:

%APPDATA%\Claude\claude_desktop_config.json
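If you would rather find the file from a script, the two locations above can be resolved as in the following Python sketch. The function name claude_config_path is hypothetical, and only the Mac and Windows paths given in this guide are covered; any other platform raises an error instead of guessing.

```python
import os
import sys
from pathlib import Path
from typing import Mapping


def claude_config_path(
    platform: str = sys.platform,
    env: Mapping[str, str] = os.environ,
) -> Path:
    """Return the Claude Desktop config path for the platforms this guide covers."""
    filename = "claude_desktop_config.json"
    if platform == "darwin":
        # ~/Library/Application Support/Claude/claude_desktop_config.json
        return Path.home() / "Library" / "Application Support" / "Claude" / filename
    if platform == "win32":
        # %APPDATA%\Claude\claude_desktop_config.json
        return Path(env["APPDATA"]) / "Claude" / filename
    raise ValueError(f"No documented config location for platform {platform!r}")
```

Calling claude_config_path() with no arguments uses the current platform and environment. The parameters exist mainly so the function can be exercised on a different machine.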

Add the Guardrail MCP server

Paste this into your config file. Replace gr_live_xxx with your actual API key:

{
  "mcpServers": {
    "guardrail": {
      "command": "npx",
      "args": ["guardrail-ai-mcp", "--key", "gr_live_xxx"]
    }
  }
}

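⚠️ Important: If your config file already contains other top-level sections, such as "preferences", do not paste a second { ... } object after the existing one. Add "mcpServers" as another key inside the existing object, so the file remains a single JSON object: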

{
  "preferences": {
    "coworkScheduledTasksEnabled": true,
    "sidebarMode": "chat"
  },
  "mcpServers": {
    "guardrail": {
      "command": "npx",
      "args": ["guardrail-ai-mcp", "--key", "gr_live_xxx"]
    }
  }
}
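The merge can also be done programmatically, which avoids misplaced commas and braces. The Python sketch below assumes the config has already been parsed into a dict. The function name merge_guardrail_server is hypothetical. It keeps every existing key, including other MCP servers, and only adds or replaces the "guardrail" entry.

```python
import json


def merge_guardrail_server(config: dict, api_key: str) -> dict:
    """Return a copy of config with the guardrail MCP server added.

    Existing top-level keys (such as "preferences") and any other
    entries under "mcpServers" are left in place.
    """
    merged = dict(config)                      # shallow copy of the top level
    servers = dict(merged.get("mcpServers", {}))  # copy so the input is not mutated
    servers["guardrail"] = {
        "command": "npx",
        "args": ["guardrail-ai-mcp", "--key", api_key],
    }
    merged["mcpServers"] = servers
    return merged


# Example: a config that so far only contains preferences.
existing = json.loads(
    '{"preferences": {"coworkScheduledTasksEnabled": true, "sidebarMode": "chat"}}'
)
updated = merge_guardrail_server(existing, "gr_live_xxx")
print(json.dumps(updated, indent=2))
```

Writing the printed result back to claude_desktop_config.json gives the same structure as the merged example above. Because json.loads rejects malformed input, a syntax error in the existing file shows up here and not as a silent failure in Claude Desktop.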

Restart Claude Desktop

Fully quit Claude Desktop (Cmd+Q on Mac) and reopen it. The Guardrail tools will load automatically.

Verify It's Working

Check the connector is loaded

Click the + button → Connectors → you should see "guardrail" with a blue toggle ON.

Test it

In a Claude Desktop chat, type:

Use the guardrail score_and_explain tool to score this:
"Taking 500mg of aspirin daily is safe for everyone."

You should see a confidence score, a decision (deliver, flag, or escalate), and the detected signals.
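If you later handle these results in your own code, it helps to validate the two core fields before acting on them. The Python sketch below assumes a result dict with "confidence" and "decision" keys. Those key names and the function name check_result are assumptions, not the documented response format. The three decision values and the 0 to 1 confidence range come from this guide.

```python
VALID_DECISIONS = ("deliver", "flag", "escalate")


def check_result(result: dict) -> tuple[float, str]:
    """Validate a (hypothetically shaped) Guardrail result and return (confidence, decision)."""
    confidence = float(result["confidence"])   # assumed key name
    decision = str(result["decision"]).lower()  # assumed key name
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    if decision not in VALID_DECISIONS:
        raise ValueError(f"unknown decision: {decision!r}")
    return confidence, decision
```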

Auto-Use (Optional)

Make Guardrail run automatically

To avoid saying "use the guardrail tool" every time:

1. Click + → Connectors → Tool access.
2. Select "Tools already loaded".
3. Create a Project (e.g. "Guardrail Testing").
4. Click + next to Instructions and add:

Always use the guardrail score_and_explain tool to score
any AI-generated text I share. Show the confidence score,
decision, and detected signals. Do not answer the text's
question — only score it.

Now every message in that project will automatically use Guardrail.

Available Tools

Three tools are available in every chat:

| Tool | Description |
| --- | --- |
| `check_confidence` | Quick score: returns a confidence from 0 to 1 and a deliver/flag/escalate decision |
| `score_and_explain` | Detailed score with a human-readable explanation of all signals |
| `get_my_stats` | Your API usage stats: total checks, decisions, recent logs |

💡 Context-aware scoring: All tools support an optional userQuery parameter. When provided, Guardrail also analyzes whether the response is relevant to the question, detects scope creep, and audits dangerous queries for missing refusals.
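To make the optional parameter concrete, the Python sketch below builds the JSON-RPC request an MCP client sends for a tool call. The envelope (method "tools/call" with "name" and "arguments") follows the MCP specification, and userQuery is the parameter described above. The "text" argument name and the build_tool_call helper are assumptions for illustration. In Claude Desktop the client builds this request for you.

```python
import json
from typing import Optional


def build_tool_call(
    tool: str,
    text: str,
    user_query: Optional[str] = None,
    request_id: int = 1,
) -> dict:
    """Build an MCP tools/call request, adding userQuery only when it is provided."""
    arguments = {"text": text}  # "text" is an assumed argument name
    if user_query is not None:
        arguments["userQuery"] = user_query  # enables context-aware scoring
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


request = build_tool_call(
    "score_and_explain",
    "Taking 500mg of aspirin daily is safe for everyone.",
    user_query="Is daily aspirin safe?",
)
print(json.dumps(request, indent=2))
```

Leaving user_query unset omits the key entirely, which matches the parameter being optional.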

📊 Dashboard: Every MCP tool call logs to your dashboard automatically. View scores, signals, and usage trends at guardrail-mvp-production.up.railway.app/dashboard.html

Need help?

API Reference · Try the Playground · npm Package · GitHub