bedrock-guardrails

from adaptationio/skrillz

npx skills add https://github.com/adaptationio/skrillz --skill bedrock-guardrails

SKILL.md

Amazon Bedrock Guardrails

Overview

Amazon Bedrock Guardrails provides six safeguard policies for securing and controlling generative AI applications. It works with any foundation model (Bedrock, OpenAI, Google Gemini, self-hosted) through the ApplyGuardrail API, enabling consistent safety policies across your entire AI infrastructure.

Six Safeguard Policies

  1. Content Filtering: Block harmful content (hate, insults, sexual, violence, misconduct, prompt attacks)
  2. PII Detection & Redaction: Protect sensitive information (emails, SSNs, credit cards, names, addresses)
  3. Topic Denial: Prevent discussion of specific topics (financial advice, medical diagnosis, legal counsel)
  4. Word Filters: Block custom words, phrases, or the AWS-managed profanity list (a configuration sketch for topics and words follows this list)
  5. Contextual Grounding: Detect hallucinations by validating factual accuracy and relevance (RAG applications)
  6. Automated Reasoning: Mathematical verification against formal policy rules (up to 99% accuracy)
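
The topic and word policies (3 and 4) are not reached by the truncated example later in this document, so here is a minimal sketch of their configuration. The guardrail name, topic wording, blocked phrase, and messaging strings are illustrative placeholders, not part of the original skill.

import boto3

bedrock_client = boto3.client("bedrock", region_name="us-east-1")

response = bedrock_client.create_guardrail(
    name="topic-and-word-guardrail",
    description="Deny financial advice and filter profanity",
    # Policy 3: Topic Denial
    topicPolicyConfig={
        'topicsConfig': [
            {
                'name': 'Financial Advice',
                'definition': 'Personalized investment, tax, or retirement recommendations.',
                'examples': ['Which stocks should I buy right now?'],
                'type': 'DENY'
            }
        ]
    },
    # Policy 4: Word Filters (custom phrases plus the AWS-managed profanity list)
    wordPolicyConfig={
        'wordsConfig': [{'text': 'placeholder-blocked-phrase'}],
        'managedWordListsConfig': [{'type': 'PROFANITY'}]
    },
    # Required messages returned when the guardrail blocks an input or output
    blockedInputMessaging="This topic is outside what I can help with.",
    blockedOutputsMessaging="This topic is outside what I can help with."
)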

2025 Enhancements

  • Standard Tier: Enhanced detection, broader language support, code-related use cases (PII in code, malicious injection)
  • Code Domain Support: PII detection in code syntax, comments, string literals, variable names
  • Automated Reasoning GA: Mathematical logic validation (generally available December 2025) with up to 99% verification accuracy
  • Cross-Region Inference: The Standard tier requires opting in to cross-Region inference for its enhanced capabilities

Key Features

  • Model-Agnostic: Works with any LLM, not just Bedrock models (see the output-validation sketch after this list)
  • ApplyGuardrail API: Standalone validation without model inference
  • Multi-Stage Application: Input validation, retrieval filtering, output validation
  • Versioning: Controlled rollout and rollback capability
  • CloudWatch Integration: Metrics, logging, and alerting
  • AgentCore Integration: Real-time tool call validation for agents
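
As a hedged sketch of the model-agnostic, multi-stage usage above: the same ApplyGuardrail call can validate output produced by any model before it reaches the user. The guardrail ID, version, and the third-party model output below are placeholders; in practice you would use the guardrail created in the Quick Start.

import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

guardrail_id = "your-guardrail-id"  # placeholder; use the ID returned when creating the guardrail
third_party_output = "Response text from OpenAI, Gemini, or a self-hosted model"

response = bedrock_runtime.apply_guardrail(
    guardrailIdentifier=guardrail_id,
    guardrailVersion='1',
    source='OUTPUT',  # validate model output rather than user input
    content=[{'text': {'text': third_party_output}}]
)

if response['action'] == 'GUARDRAIL_INTERVENED':
    print("Model output blocked or redacted before reaching the user")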

Quick Start

1. Create Basic Guardrail

import boto3

bedrock_client = boto3.client("bedrock", region_name="us-east-1")

response = bedrock_client.create_guardrail(
    name="basic-safety-guardrail",
    description="Basic content filtering and PII protection",
    contentPolicyConfig={
        'filtersConfig': [
            {'type': 'HATE', 'inputStrength': 'HIGH', 'outputStrength': 'HIGH'},
            {'type': 'VIOLENCE', 'inputStrength': 'HIGH', 'outputStrength': 'HIGH'},
            # Prompt-attack filters apply to inputs only, so outputStrength is NONE
            {'type': 'PROMPT_ATTACK', 'inputStrength': 'HIGH', 'outputStrength': 'NONE'}
        ]
    },
    sensitiveInformationPolicyConfig={
        'piiEntitiesConfig': [
            {'type': 'EMAIL', 'action': 'ANONYMIZE'},
            {'type': 'PHONE', 'action': 'ANONYMIZE'},
            {'type': 'US_SOCIAL_SECURITY_NUMBER', 'action': 'BLOCK'}
        ]
    },
    # Required: messages returned when the guardrail blocks an input or a model response
    blockedInputMessaging="Sorry, I can't help with that request.",
    blockedOutputsMessaging="Sorry, I can't provide that response."
)

guardrail_id = response['guardrailId']
guardrail_version = response['version']  # a newly created guardrail starts at the working 'DRAFT' version
print(f"Created guardrail: {guardrail_id}, version: {guardrail_version}")

2. Apply Guardrail to Validate Content

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock_runtime.apply_guardrail(
    guardrailIdentifier=guardrail_id,
    guardrailVersion='1',  # a published version (see create_guardrail_version above); use 'DRAFT' while iterating
    source='INPUT',
    content=[
        {
            'text': {
                'text': 'User input to validate',
                'qualifiers': ['guard_content']
            }
        }
    ]
)

if response['action'] == 'GUARDRAIL_INTERVENED':
    print("Content blocked by guardrail")
else:
    print("Content passed validation")

Operations

Operation 1: Create Comprehensive Guardrail

Create a guardrail with all six safeguard policies configured.

Complete Example: All Policies

import boto3

REGION_NAME = "us-east-1"
bedrock_client = boto3.client("bedrock", region_name=REGION_NAME)

response = bedrock_client.create_guardrail(
    name="comprehensive-safety-guardrail",
    description="All safeguard policies: content, PII, topics, words, grounding, AR",

    # Policy 1: Content Filtering
    contentPolicyConfig={
        'filtersConfig': [
            {
                'type': 'HATE',
                'inputStrength': 'HIGH',
                'outputStrength': 'HIGH'
            },
            {
                'type': 'INSULTS',
                'inputStrength': 'HIGH',
                'outputStrength': 'HIGH'
            },
            {
                'type': 'SEXUAL',
                'inputStrength': 'HIGH',
                'outputStrength': 'HIGH'
            },
            {
                'type': 'VIOLENCE',
                'inputStrength': 'HIGH',
                'outputStrength': 'HIGH'
            },
            {
                'type': 'MISCONDUCT',
                'inputStrength': 'MEDIUM',
                'outputStrength': 'MEDIUM'
            },
            {
                'type': 'PROMPT_ATTACK',
                'inputStrength': 'HIGH',
                'outputStrength': 'NONE'  # Only check inputs for jailbreaks
            }
        ]
    },

    # Policy 2: PII De

...