npx skills add https://github.com/adaptationio/skrillz --skill bedrock-guardrails
Amazon Bedrock Guardrails
Overview
Amazon Bedrock Guardrails provides six safeguard policies for securing and controlling generative AI applications. It works with any foundation model (Bedrock, OpenAI, Google Gemini, self-hosted) through the ApplyGuardrail API, enabling consistent safety policies across your entire AI infrastructure.
Six Safeguard Policies
- Content Filtering: Block harmful content (hate, insults, sexual, violence, misconduct, prompt attacks)
- PII Detection & Redaction: Protect sensitive information (emails, SSNs, credit cards, names, addresses)
- Topic Denial: Prevent discussion of specific topics (financial advice, medical diagnosis, legal counsel)
- Word Filters: Block custom words, phrases, or AWS-managed profanity lists
- Contextual Grounding: Detect hallucinations by validating factual accuracy and relevance (RAG applications)
- Automated Reasoning: Mathematical verification against formal policy rules (up to 99% accuracy)
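For contextual grounding, the text passed to ApplyGuardrail is tagged with qualifiers so the guardrail can tell the reference material, the user question, and the model answer apart. The sketch below builds that `content` list. The helper name `build_grounding_content` is hypothetical, and it assumes the guardrail already has a contextual grounding policy configured.

```python
from typing import Any, Dict, List


def build_grounding_content(source: str, query: str, answer: str) -> List[Dict[str, Any]]:
    """Build an ApplyGuardrail `content` list for a contextual grounding check.

    Hypothetical helper: only the three qualifier strings come from the API.
    """

    def block(text: str, qualifier: str) -> Dict[str, Any]:
        # Each content block wraps its text with exactly one qualifier.
        return {"text": {"text": text, "qualifiers": [qualifier]}}

    return [
        block(source, "grounding_source"),  # reference material, e.g. a retrieved passage
        block(query, "query"),              # the user's question
        block(answer, "guard_content"),     # the model answer being validated
    ]
```

The returned list can be passed as the `content` argument of `apply_guardrail`, typically with `source='OUTPUT'`, since it is the model answer that gets checked against the reference text.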
2025 Enhancements
- Standard Tier: Enhanced detection, broader language support, code-related use cases (PII in code, malicious injection)
- Code Domain Support: PII detection in code syntax, comments, string literals, variable names
- Automated Reasoning GA: Mathematical logic validation, generally available since August 2025, with up to 99% verification accuracy
- Cross-Region Inference: The Standard tier requires opting in to cross-Region inference before its enhanced capabilities can be used
Key Features
- Model-Agnostic: Works with any LLM (not just Bedrock models)
- ApplyGuardrail API: Standalone validation without model inference
- Multi-Stage Application: Input validation, retrieval filtering, output validation
- Versioning: Controlled rollout and rollback capability
- CloudWatch Integration: Metrics, logging, and alerting
- AgentCore Integration: Real-time tool call validation for agents
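Multi-stage application can be sketched as a small pipeline that checks the prompt, the retrieved passages, and the generated answer in turn. This is a minimal illustration, not the skill's own implementation: `guarded_rag_answer`, `GuardCheck`, and `REFUSAL` are hypothetical names, and the `is_allowed` callback stands in for a call to ApplyGuardrail. Treating retrieved passages as `INPUT` is an assumption.

```python
from typing import Callable, List

# Hypothetical callback type: (text, source) -> True when the guardrail lets the text through.
GuardCheck = Callable[[str, str], bool]

REFUSAL = "Sorry, I can't help with that request."  # placeholder wording


def guarded_rag_answer(
    prompt: str,
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
    is_allowed: GuardCheck,
) -> str:
    # Stage 1: validate the user input before any retrieval or inference.
    if not is_allowed(prompt, "INPUT"):
        return REFUSAL

    # Stage 2: drop retrieved passages that the guardrail rejects.
    passages = [p for p in retrieve(prompt) if is_allowed(p, "INPUT")]

    # Stage 3: validate the model output before returning it.
    answer = generate(prompt, passages)
    if not is_allowed(answer, "OUTPUT"):
        return REFUSAL
    return answer
```

Injecting the check as a callback keeps the pipeline independent of the model provider, which matches the model-agnostic use of the ApplyGuardrail API described above.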
Quick Start
1. Create Basic Guardrail
import boto3

# Control-plane client: creates and manages guardrails
bedrock_client = boto3.client("bedrock", region_name="us-east-1")

response = bedrock_client.create_guardrail(
    name="basic-safety-guardrail",
    description="Basic content filtering and PII protection",
    contentPolicyConfig={
        'filtersConfig': [
            {'type': 'HATE', 'inputStrength': 'HIGH', 'outputStrength': 'HIGH'},
            {'type': 'VIOLENCE', 'inputStrength': 'HIGH', 'outputStrength': 'HIGH'},
            {'type': 'PROMPT_ATTACK', 'inputStrength': 'HIGH', 'outputStrength': 'NONE'}
        ]
    },
    sensitiveInformationPolicyConfig={
        'piiEntitiesConfig': [
            {'type': 'EMAIL', 'action': 'ANONYMIZE'},
            {'type': 'PHONE', 'action': 'ANONYMIZE'},
            {'type': 'US_SOCIAL_SECURITY_NUMBER', 'action': 'BLOCK'}
        ]
    },
    # create_guardrail requires both messages; the wording here is a placeholder
    blockedInputMessaging="Sorry, I can't respond to that request.",
    blockedOutputsMessaging="Sorry, I can't provide that response."
)

guardrail_id = response['guardrailId']
guardrail_version = response['version']  # "DRAFT" for a newly created guardrail
print(f"Created guardrail: {guardrail_id}, version: {guardrail_version}")
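A newly created guardrail exists only as a working draft, so a numbered version has to be published before it can be referenced as `'1'`, `'2'`, and so on. The sketch below wraps the `create_guardrail_version` call. The function name `publish_guardrail_version` is hypothetical, and the client is passed in so the helper can be exercised without AWS credentials.

```python
from typing import Any


def publish_guardrail_version(client: Any, guardrail_id: str, description: str = "") -> str:
    """Publish the current draft as an immutable numbered version and return its number.

    `client` is expected to be a boto3 "bedrock" client (or any object with a
    compatible create_guardrail_version method).
    """
    kwargs = {"guardrailIdentifier": guardrail_id}
    if description:
        # Only send a description when one is given; an empty string is not sent.
        kwargs["description"] = description
    response = client.create_guardrail_version(**kwargs)
    return response["version"]
```

With a real client this would be called as `publish_guardrail_version(bedrock_client, guardrail_id, "initial release")`, and the returned string can then be used as `guardrailVersion` in later calls.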
2. Apply Guardrail to Validate Content
# Runtime client: evaluates content against an existing guardrail
bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock_runtime.apply_guardrail(
    guardrailIdentifier=guardrail_id,
    # Use the version returned in step 1 ("DRAFT"); a numbered version such as '1'
    # exists only after one has been published
    guardrailVersion=guardrail_version,
    source='INPUT',
    content=[
        {
            'text': {
                'text': 'User input to validate',
                'qualifiers': ['guard_content']
            }
        }
    ]
)

if response['action'] == 'GUARDRAIL_INTERVENED':
    print("Content blocked by guardrail")
else:
    print("Content passed validation")
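An intervention does not always mean the content was rejected outright: with `ANONYMIZE` actions the guardrail can return a masked version of the text in `outputs`. The sketch below summarizes an ApplyGuardrail response on that basis. The helper name `summarize_guardrail_response` and the shape of its return value are hypothetical. Only the `action` and `outputs` keys are read from the response.

```python
from typing import Any, Dict


def summarize_guardrail_response(response: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce an ApplyGuardrail response to an intervention flag and replacement text."""
    intervened = response.get("action") == "GUARDRAIL_INTERVENED"
    # `outputs` holds the text the guardrail returns instead of the original,
    # for example a masked string or the configured blocked message.
    texts = [item["text"] for item in response.get("outputs", []) if "text" in item]
    return {"intervened": intervened, "text": "\n".join(texts) or None}
```

A caller can then forward `text` when the guardrail intervened and fall back to the original content otherwise.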
Operations
Operation 1: Create Comprehensive Guardrail
Create a guardrail with all six safeguard policies configured.
Complete Example: All Policies
import boto3

REGION_NAME = "us-east-1"
bedrock_client = boto3.client("bedrock", region_name=REGION_NAME)

response = bedrock_client.create_guardrail(
    name="comprehensive-safety-guardrail",
    description="All safeguard policies: content, PII, topics, words, grounding, AR",
    # Policy 1: Content Filtering
    contentPolicyConfig={
        'filtersConfig': [
            {
                'type': 'HATE',
                'inputStrength': 'HIGH',
                'outputStrength': 'HIGH'
            },
            {
                'type': 'INSULTS',
                'inputStrength': 'HIGH',
                'outputStrength': 'HIGH'
            },
            {
                'type': 'SEXUAL',
                'inputStrength': 'HIGH',
                'outputStrength': 'HIGH'
            },
            {
                'type': 'VIOLENCE',
                'inputStrength': 'HIGH',
                'outputStrength': 'HIGH'
            },
            {
                'type': 'MISCONDUCT',
                'inputStrength': 'MEDIUM',
                'outputStrength': 'MEDIUM'
            },
            {
                'type': 'PROMPT_ATTACK',
                'inputStrength': 'HIGH',
                'outputStrength': 'NONE'  # Only check inputs for jailbreaks
            }
        ]
    },
# Policy 2: PII De
...