Maintain safety, stability, and compliance during agent execution.

Overview

Guardrails are pre-deployed scanners that evaluate user inputs and model outputs to help maintain safe, responsible, and compliant AI interactions. You enable the scanners you need—no deployment required.
User Input → Input Scanners → Agent Processing → Output Scanners → Response
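
The flow above can be sketched in a few lines of Python. This is an illustrative sketch only; the scanner interface, function names, and blocked-response messages are assumptions, not the platform's actual API.

```python
from typing import Callable, List, Tuple

# A scanner takes text and returns (possibly sanitized text, is_valid).
# This interface is an assumption for illustration.
Scanner = Callable[[str], Tuple[str, bool]]

def run_pipeline(prompt: str,
                 input_scanners: List[Scanner],
                 agent: Callable[[str], str],
                 output_scanners: List[Scanner]) -> str:
    # Input scanners run before the agent ever sees the prompt.
    for scan in input_scanners:
        prompt, ok = scan(prompt)
        if not ok:
            return "Request blocked by input guardrail."
    response = agent(prompt)
    # Output scanners run on the model's response before it reaches the user.
    for scan in output_scanners:
        response, ok = scan(response)
        if not ok:
            return "Response blocked by output guardrail."
    return response

# Example: a tiny ban-topics input scanner and an echo agent (both hypothetical).
ban_topics = lambda text: (text, "religion" not in text.lower())
echo_agent = lambda text: f"Agent saw: {text}"

print(run_pipeline("Summarize this report", [ban_topics], echo_agent, []))
# prints: Agent saw: Summarize this report
```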

Available Scanners

Scanner | Description | Applies To
Regex | Validates prompts using user-defined regular expression patterns. Supports defining desirable (“good”) and undesirable (“bad”) patterns for fine-grained validation. | Input
Anonymize | Removes sensitive data from user prompts to maintain privacy and prevent exposure of personal information. | Input
Ban topics | Blocks specific topics (for example, religion) from appearing in prompts to avoid sensitive or inappropriate discussions. | Input
Prompt injection | Detects attempts to manipulate or override model behavior, protecting the LLM from malicious or crafted inputs. | Input
Toxicity | Analyzes prompts or responses for toxic or harmful language to ensure safe and respectful interactions. | Input, Output
Bias detection | Examines model outputs for potential bias to help maintain neutrality and fairness in generated responses. | Output
Deanonymize | Replaces placeholders in model outputs with actual values to restore necessary information when needed. | Output
Relevance | Measures similarity between the user’s prompt and the model’s output and provides a relevance score to ensure responses stay contextually aligned. | Output
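
As a concrete illustration of the Regex scanner's "good"/"bad" pattern idea, consider the following sketch. The exact matching semantics (search versus full match, pattern precedence) are assumptions, not the platform's documented behavior.

```python
import re

# Sketch of a regex scanner: invalid if any "bad" pattern matches;
# otherwise valid only if every "good" pattern matches.
def regex_scan(prompt: str, good: list, bad: list) -> bool:
    if any(re.search(p, prompt) for p in bad):
        return False
    return all(re.search(p, prompt) for p in good)

# Require a ticket ID; block anything mentioning a password.
print(regex_scan("Ticket ABC-123: reset my view",
                 good=[r"[A-Z]{3}-\d+"], bad=[r"password"]))
# prints: True
```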

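The Anonymize and Deanonymize scanners form a round trip: sensitive values are replaced with placeholders on input, and those placeholders are restored in the output. The sketch below illustrates the idea; the placeholder format and the email-only detection are assumptions for illustration, not the scanners' actual implementation.

```python
import re

def anonymize(text: str):
    """Replace email addresses with placeholders; keep a vault for restoring them."""
    vault = {}
    def repl(match):
        key = f"[EMAIL_{len(vault) + 1}]"  # hypothetical placeholder format
        vault[key] = match.group(0)
        return key
    return re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", repl, text), vault

def deanonymize(text: str, vault: dict) -> str:
    """Restore the original values wherever placeholders appear in the output."""
    for key, value in vault.items():
        text = text.replace(key, value)
    return text

masked, vault = anonymize("Contact jo@example.com about the invoice.")
print(masked)  # prints: Contact [EMAIL_1] about the invoice.
print(deanonymize("Reply sent to [EMAIL_1].", vault))
# prints: Reply sent to jo@example.com.
```
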
View Guardrails

To view all pre-deployed guardrails available on the platform:
  • Go to Settings > Manage guardrails.

Enable Scanners

All scanners are pre-deployed and available by default. You must enable the required scanners in each agentic app or tool where you want to use them. To enable scanners:
  1. Open the Guardrails settings for your app or tool:
    • Agentic apps: Go to Agentic apps, select your app, then go to Settings > PII & Guardrails > Guardrails.
    • Tools: Go to Tools, select the tool you want to configure, then select Guardrails.
  2. On the Guardrails page, review the Input scanners and Output scanners tabs. Turn on the toggle next to each scanner you want to apply.
  3. To configure a scanner, click it, adjust the available options, then click Save. The available options vary by scanner. For example, the Toxicity scanner includes Risk Threshold and Detection Sensitivity, while the Regex scanner includes Scanner mode (Block or Allow), pattern entry fields, and a Risk threshold slider.
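
A Risk Threshold setting like the one in step 3 typically acts as a cutoff on the scanner's measured score. The sketch below shows the idea only; the 0-to-1 score scale and the comparison direction are assumptions, and the Toxicity scanner's actual scoring model is not described here.

```python
# Hypothetical threshold gate: the prompt is treated as valid while the
# measured score stays below the configured Risk Threshold.
def passes_threshold(scanner_score: float, risk_threshold: float) -> bool:
    return scanner_score < risk_threshold

print(passes_threshold(0.15, 0.5))  # low score, permissive threshold: passes
print(passes_threshold(0.72, 0.5))  # score above threshold: blocked
```

Lowering the threshold makes the scanner stricter; raising it lets more content through.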

Test Scanners

After enabling and configuring scanners, verify they perform as expected. You can test an individual scanner or the full set, then adjust settings as needed. To test guardrails:
  1. On the Guardrails page, click Test.
  2. In the Prompt input box, enter a prompt or select Input template to choose a template.
  3. Click Test. Under Scores and Results, review the output.
    Field | Description
    Validity | Indicates whether the prompt meets the scanner’s criteria. For example, if no toxicity is detected, Validity is set to True.
    Risk Score | Indicates the prompt’s risk level, calculated as (Threshold − Scanner Score) / Threshold. For the Relevance scanner, the score is 1 if similarity falls below the threshold; otherwise 0.
    Duration | The time taken by the scanner to process the prompt.
  4. Based on the results, adjust scanner settings and retest as needed.
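
The Risk Score formula from the results table can be checked with a quick calculation. The numeric values below are made up for illustration; only the formulas come from the table.

```python
def risk_score(threshold: float, scanner_score: float) -> float:
    # Risk Score = (Threshold - Scanner Score) / Threshold, as defined above.
    return (threshold - scanner_score) / threshold

def relevance_risk(similarity: float, threshold: float) -> int:
    # Relevance scanner special case: 1 if similarity falls below the
    # threshold, otherwise 0.
    return 1 if similarity < threshold else 0

print(risk_score(0.5, 0.25))     # (0.5 - 0.25) / 0.5 = 0.5
print(relevance_risk(0.3, 0.6))  # similarity below threshold -> 1
```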