bedrock-guardrails

from adaptationio/skrillz

npx skills add https://github.com/adaptationio/skrillz --skill bedrock-guardrails

SKILL.md

Amazon Bedrock Guardrails

Overview

Amazon Bedrock Guardrails provides six safeguard policies for securing and controlling generative AI applications. It works with any foundation model (Bedrock, OpenAI, Google Gemini, self-hosted) through the ApplyGuardrail API, enabling consistent safety policies across your entire AI infrastructure.

Six Safeguard Policies

  1. Content Filtering: Block harmful content (hate, insults, sexual, violence, misconduct, prompt attacks)
  2. PII Detection & Redaction: Protect sensitive information (emails, SSNs, credit cards, names, addresses)
  3. Topic Denial: Prevent discussion of specific topics (financial advice, medical diagnosis, legal counsel)
  4. Word Filters: Block custom words, phrases, or the AWS-managed profanity list (a configuration sketch for topics and words follows this list)
  5. Contextual Grounding: Detect hallucinations by validating factual accuracy and relevance (RAG applications)
  6. Automated Reasoning: Mathematical verification against formal policy rules (up to 99% accuracy)
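
The topic and word policies (3 and 4) are not reached by the truncated example later in this document, so here is a minimal sketch of their configuration. The guardrail name, topic wording, blocked phrase, and messaging strings are illustrative placeholders, not part of the original skill.

import boto3

bedrock_client = boto3.client("bedrock", region_name="us-east-1")

response = bedrock_client.create_guardrail(
    name="topic-and-word-guardrail",
    description="Deny financial advice and filter profanity",
    # Policy 3: Topic Denial
    topicPolicyConfig={
        'topicsConfig': [
            {
                'name': 'Financial Advice',
                'definition': 'Personalized investment, tax, or retirement recommendations.',
                'examples': ['Which stocks should I buy right now?'],
                'type': 'DENY'
            }
        ]
    },
    # Policy 4: Word Filters (custom phrases plus the AWS-managed profanity list)
    wordPolicyConfig={
        'wordsConfig': [{'text': 'placeholder-blocked-phrase'}],
        'managedWordListsConfig': [{'type': 'PROFANITY'}]
    },
    # Required messages returned when the guardrail blocks an input or output
    blockedInputMessaging="This topic is outside what I can help with.",
    blockedOutputsMessaging="This topic is outside what I can help with."
)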

2025 Enhancements

  • Standard Tier: Enhanced detection, broader language support, code-related use cases (PII in code, malicious injection)
  • Code Domain Support: PII detection in code syntax, comments, string literals, variable names
  • Automated Reasoning GA: Mathematical logic validation (generally available December 2025) with up to 99% verification accuracy
  • Cross-Region Inference: The Standard tier requires opting in to cross-Region inference for its enhanced capabilities

Key Features

  • Model-Agnostic: Works with any LLM, not just Bedrock models (see the output-validation sketch after this list)
  • ApplyGuardrail API: Standalone validation without model inference
  • Multi-Stage Application: Input validation, retrieval filtering, output validation
  • Versioning: Controlled rollout and rollback capability
  • CloudWatch Integration: Metrics, logging, and alerting
  • AgentCore Integration: Real-time tool call validation for agents
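
As a hedged sketch of the model-agnostic, multi-stage usage above: the same ApplyGuardrail call can validate output produced by any model before it reaches the user. The guardrail ID, version, and the third-party model output below are placeholders; in practice you would use the guardrail created in the Quick Start.

import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

guardrail_id = "your-guardrail-id"  # placeholder; use the ID returned when creating the guardrail
third_party_output = "Response text from OpenAI, Gemini, or a self-hosted model"

response = bedrock_runtime.apply_guardrail(
    guardrailIdentifier=guardrail_id,
    guardrailVersion='1',
    source='OUTPUT',  # validate model output rather than user input
    content=[{'text': {'text': third_party_output}}]
)

if response['action'] == 'GUARDRAIL_INTERVENED':
    print("Model output blocked or redacted before reaching the user")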

Quick Start

1. Create Basic Guardrail

import boto3

bedrock_client = boto3.client("bedrock", region_name="us-east-1")

response = bedrock_client.create_guardrail(
    name="basic-safety-guardrail",
    description="Basic content filtering and PII protection",
    contentPolicyConfig={
        'filtersConfig': [
            {'type': 'HATE', 'inputStrength': 'HIGH', 'outputStrength': 'HIGH'},
            {'type': 'VIOLENCE', 'inputStrength': 'HIGH', 'outputStrength': 'HIGH'},
            # Prompt-attack filters apply to inputs only, so outputStrength is NONE
            {'type': 'PROMPT_ATTACK', 'inputStrength': 'HIGH', 'outputStrength': 'NONE'}
        ]
    },
    sensitiveInformationPolicyConfig={
        'piiEntitiesConfig': [
            {'type': 'EMAIL', 'action': 'ANONYMIZE'},
            {'type': 'PHONE', 'action': 'ANONYMIZE'},
            {'type': 'US_SOCIAL_SECURITY_NUMBER', 'action': 'BLOCK'}
        ]
    },
    # Required: messages returned when the guardrail blocks an input or a model response
    blockedInputMessaging="Sorry, I can't help with that request.",
    blockedOutputsMessaging="Sorry, I can't provide that response."
)

guardrail_id = response['guardrailId']
guardrail_version = response['version']  # a newly created guardrail starts at the working 'DRAFT' version
print(f"Created guardrail: {guardrail_id}, version: {guardrail_version}")

2. Apply Guardrail to Validate Content

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock_runtime.apply_guardrail(
    guardrailIdentifier=guardrail_id,
    guardrailVersion='1',  # a published version (see create_guardrail_version above); use 'DRAFT' while iterating
    source='INPUT',
    content=[
        {
            'text': {
                'text': 'User input to validate',
                'qualifiers': ['guard_content']
            }
        }
    ]
)

if response['action'] == 'GUARDRAIL_INTERVENED':
    print("Content blocked by guardrail")
else:
    print("Content passed validation")

Operations

Operation 1: Create Comprehensive Guardrail

Create a guardrail with all six safeguard policies configured.

Complete Example: All Policies

import boto3

REGION_NAME = "us-east-1"
bedrock_client = boto3.client("bedrock", region_name=REGION_NAME)

response = bedrock_client.create_guardrail(
    name="comprehensive-safety-guardrail",
    description="All safeguard policies: content, PII, topics, words, grounding, AR",

    # Policy 1: Content Filtering
    contentPolicyConfig={
        'filtersConfig': [
            {
                'type': 'HATE',
                'inputStrength': 'HIGH',
                'outputStrength': 'HIGH'
            },
            {
                'type': 'INSULTS',
                'inputStrength': 'HIGH',
                'outputStrength': 'HIGH'
            },
            {
                'type': 'SEXUAL',
                'inputStrength': 'HIGH',
                'outputStrength': 'HIGH'
            },
            {
                'type': 'VIOLENCE',
                'inputStrength': 'HIGH',
                'outputStrength': 'HIGH'
            },
            {
                'type': 'MISCONDUCT',
                'inputStrength': 'MEDIUM',
                'outputStrength': 'MEDIUM'
            },
            {
                'type': 'PROMPT_ATTACK',
                'inputStrength': 'HIGH',
                'outputStrength': 'NONE'  # Only check inputs for jailbreaks
            }
        ]
    },

    # Policy 2: PII De

...