aws-finops

from opsyhq/opsy

AI DevOps Agent that won't take down your production

12 stars · 0 forks · Updated Jan 24, 2026
npx skills add https://github.com/opsyhq/opsy --skill aws-finops

SKILL.md

AWS FinOps Skill for Opsy

Step 1: Cost Explorer First

Start with Cost Explorer. It is account-wide, so each query covers all regions and services. Run three queries:

  1. Spend by service — identifies top cost drivers
  2. Spend by region — shows where resources live
  3. Daily trend — spots anomalies

Focus on services representing >5% of spend.
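The three queries and the 5% cutoff can be sketched with `aws ce get-cost-and-usage`. This is a minimal sketch, not the skill's own implementation: the function names (`ce_spend`, `ce_daily_trend`, `top_cost_drivers`) and the choice of `UnblendedCost` as the metric are assumptions made for illustration.

```sh
# Hypothetical helper: one Cost Explorer query grouped by a dimension.
# Usage: ce_spend <SERVICE|REGION> <start YYYY-MM-DD> <end YYYY-MM-DD> [MONTHLY|DAILY]
# Note: the End date is exclusive in Cost Explorer.
ce_spend() {
  aws ce get-cost-and-usage \
    --time-period "Start=$2,End=$3" \
    --granularity "${4:-MONTHLY}" \
    --metrics UnblendedCost \
    --group-by "Type=DIMENSION,Key=$1" \
    --query 'ResultsByTime[].Groups[].[Keys[0], Metrics.UnblendedCost.Amount]' \
    --output text
}

# Hypothetical helper: total cost per day, no grouping.
# Usage: ce_daily_trend <start YYYY-MM-DD> <end YYYY-MM-DD>
ce_daily_trend() {
  aws ce get-cost-and-usage \
    --time-period "Start=$1,End=$2" \
    --granularity DAILY \
    --metrics UnblendedCost \
    --query 'ResultsByTime[].[TimePeriod.Start, Total.UnblendedCost.Amount]' \
    --output text
}

# Reads "<name><TAB><amount>" rows, sums amounts per name, and prints
# only the names whose share of the total exceeds 5%.
top_cost_drivers() {
  awk -F '\t' '{ amt[$1] += $2; total += $2 }
    END { for (n in amt)
            if (total > 0 && amt[n] / total > 0.05)
              printf "%s\t%.1f%%\n", n, 100 * amt[n] / total }'
}
```

For example, `ce_spend SERVICE 2025-12-01 2026-01-01 | top_cost_drivers` would list only the services above the 5% threshold for December 2025.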

If Credits Mask Costs ($0 spend)

Check if Resource Explorer is enabled:

aws resource-explorer-2 list-indexes --region us-east-1

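If enabled, use it. A single search can return resources from every region, provided the call targets the region that holds the aggregator index. Results are paginated and capped at 1,000 per query: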

aws resource-explorer-2 search --query-string "*" --region us-east-1

If NOT enabled, use resourcegroupstaggingapi to find tagged resources. The call is regional, so repeat it for each region:

aws resourcegroupstaggingapi get-resources --region us-east-1

Then query each active region for core services: EC2, RDS, EBS, Lambda, S3, ECS, EKS, NAT Gateways, Load Balancers.
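A per-region sweep of the core services could look like the sketch below. The function names and the selected output columns are assumptions for illustration. The list of active regions is expected to come from the caller, for example from the spend-by-region query.

```sh
# Hypothetical helper: list core resources in one region.
inventory_region() {
  aws ec2 describe-instances --region "$1" \
    --query 'Reservations[].Instances[].[InstanceId,InstanceType,State.Name,LaunchTime]' --output text
  aws ec2 describe-volumes --region "$1" \
    --query 'Volumes[].[VolumeId,VolumeType,Size,State]' --output text
  aws ec2 describe-nat-gateways --region "$1" \
    --query 'NatGateways[].[NatGatewayId,State]' --output text
  aws rds describe-db-instances --region "$1" \
    --query 'DBInstances[].[DBInstanceIdentifier,DBInstanceClass,Engine,MultiAZ]' --output text
  aws lambda list-functions --region "$1" \
    --query 'Functions[].[FunctionName,Runtime,MemorySize]' --output text
  aws ecs list-clusters --region "$1" --query 'clusterArns[]' --output text
  aws eks list-clusters --region "$1" --query 'clusters[]' --output text
  aws elbv2 describe-load-balancers --region "$1" \
    --query 'LoadBalancers[].[LoadBalancerName,Type,State.Code]' --output text
}

# The S3 bucket list is global, so fetch it once rather than per region.
list_buckets() {
  aws s3api list-buckets --query 'Buckets[].Name' --output text
}

# Usage: inventory_regions us-east-1 eu-west-1
inventory_regions() {
  for r in "$@"; do
    echo "== $r =="
    inventory_region "$r"
  done
}
```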

Step 2: Deep Dive Each Resource

For every resource found, gather full details:

  • EC2: Instance type, state, launch time, CloudWatch CPU and memory (memory metrics require the CloudWatch agent)
  • RDS: Instance class, connections (14d), storage, Multi-AZ, engine
  • EBS: Attachment status, volume type, size, snapshots
  • S3: Lifecycle policies, storage class, versioning
  • Lambda: Invocations (30d), memory, runtime, provisioned concurrency
  • ECS/EKS: Task definitions, service counts, cluster utilization
  • ECR: Repositories, image count, lifecycle policies
  • Load Balancers: Request count (14d), target groups
  • NAT Gateway: Data processed
  • Elastic IPs: Association status
  • CloudWatch Logs: Retention settings
  • Secrets Manager: Secret count

Check EVERY resource for optimization opportunities. Don't skip services.
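As one example of gathering utilization evidence, the sketch below pulls daily average CPU for an EC2 instance and reduces it to a single figure. It covers CPU only. The helper names `ec2_cpu_datapoints` and `mean` are hypothetical, and the one-day period is an assumption.

```sh
# Hypothetical helper: daily average CPU datapoints for one EC2 instance.
# Usage: ec2_cpu_datapoints <instance-id> <region> <start ISO-8601> <end ISO-8601>
ec2_cpu_datapoints() {
  aws cloudwatch get-metric-statistics \
    --region "$2" \
    --namespace AWS/EC2 \
    --metric-name CPUUtilization \
    --dimensions "Name=InstanceId,Value=$1" \
    --start-time "$3" --end-time "$4" \
    --period 86400 \
    --statistics Average \
    --query 'Datapoints[].Average' \
    --output text
}

# Averages whitespace-separated numbers from stdin.
# Prints "no-data" when there is nothing to average.
mean() {
  awk '{ for (i = 1; i <= NF; i++) if ($i != "None") { sum += $i; n++ } }
       END { if (n) printf "%.2f\n", sum / n; else print "no-data" }'
}
```

Usage: `ec2_cpu_datapoints i-0abc123def456 us-east-1 2026-01-01T00:00:00Z 2026-01-15T00:00:00Z | mean`.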

Step 3: Check Commitment Coverage

  • Savings Plans utilization
  • Reserved Instance coverage gaps
  • Expiring commitments (next 30 days)
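These three checks map onto Cost Explorer, Savings Plans, and EC2 calls. The sketch below is one possible arrangement. The function names are hypothetical, and the caller is assumed to supply the cutoff date (today plus 30 days).

```sh
# Hypothetical helper: overall Savings Plans utilization and RI coverage (%).
# Usage: commitment_report <start YYYY-MM-DD> <end YYYY-MM-DD>
commitment_report() {
  aws ce get-savings-plans-utilization \
    --time-period "Start=$1,End=$2" \
    --query 'Total.Utilization.UtilizationPercentage' --output text
  aws ce get-reservation-coverage \
    --time-period "Start=$1,End=$2" \
    --query 'Total.CoverageHours.CoverageHoursPercentage' --output text
}

# Active commitments as "<id><TAB><end timestamp>" rows.
active_savings_plans() {
  aws savingsplans describe-savings-plans --states active \
    --query 'savingsPlans[].[savingsPlanId,end]' --output text
}
active_reserved_instances() {
  aws ec2 describe-reserved-instances --region "$1" \
    --filters Name=state,Values=active \
    --query 'ReservedInstances[].[ReservedInstancesId,End]' --output text
}

# Keeps rows whose end date (first 10 characters of the timestamp)
# falls on or before the cutoff. ISO dates compare correctly as strings.
# Usage: active_savings_plans | expiring_before 2026-02-23
expiring_before() {
  awk -F '\t' -v cutoff="$1" \
    'substr($2, 1, 10) <= cutoff { print $1 "\t" substr($2, 1, 10) }'
}
```

With GNU date, the cutoff could be computed as `date -d '+30 days' +%F`. Other platforms need a different invocation.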

Safety Guardrails

Report findings with evidence, suggest investigation — not direct actions:

  • "Instance i-xxx averaged 3% CPU over 30 days — rightsizing candidate"
  • "Volume vol-xxx unattached since [date] — verify before removing"
  • "RDS db-xxx had 0 connections for 14 days — confirm if still needed"

Thresholds:

  • Idle: ~0% utilization for 14+ days
  • Underutilized: <10% average for 14+ days
  • Rightsizing candidate: <30% average
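The thresholds can be expressed as a small classifier. This is a sketch under two assumptions: "~0%" is read as below 1%, and the rightsizing threshold has no minimum observation window because none is stated. The function name `classify_utilization` is hypothetical.

```sh
# Usage: classify_utilization <average utilization %> <days observed>
classify_utilization() {
  awk -v avg="$1" -v days="$2" 'BEGIN {
    # Assumption: "~0%" means below 1%.
    if (days >= 14 && avg < 1)  { print "Idle"; exit }
    if (days >= 14 && avg < 10) { print "Underutilized"; exit }
    if (avg < 30)               { print "Rightsizing candidate"; exit }
    print "Within thresholds"
  }'
}
```

For example, `classify_utilization 8 30` prints `Underutilized`, while `classify_utilization 8 7` prints `Rightsizing candidate` because the 14-day window has not been met.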

Smart Recommendation Rules

Only flag when action is possible:

| Situation | Action |
| --- | --- |
| Minimum size + in use (db.t3.micro with connections) | Skip: already right-sized |
| Minimum size + idle (db.t3.micro, 0 connections) | Flag as idle |
| Larger size + low utilization | Flag for rightsizing with specific target |
| Tagged FinOps:Skip=true | Skip |
| Dev/staging with Environment=dev | Skip low utilization (expected) |

Before flagging, verify:

  1. Is this the minimum size?
  2. Is it actually in use? (connections/invocations/requests)
  3. Is there a smaller option?
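The rules and checks above can be combined into one decision function. The sketch below is illustrative only: the name `recommend`, its argument encoding, and the precedence (the skip tag is checked first) are assumptions, and combinations the rules do not cover fall through to manual review.

```sh
# Usage: recommend <min|larger> <in_use|idle|low> <FinOps:Skip value> <Environment value>
recommend() {
  # Assumption: the skip tag takes precedence over every other rule.
  if [ "$3" = "true" ]; then
    echo "Skip: tagged FinOps:Skip=true"
    return
  fi
  case "$1:$2" in
    min:in_use) echo "Skip: already right-sized" ;;
    min:idle)   echo "Flag as idle" ;;
    larger:low)
      if [ "$4" = "dev" ]; then
        echo "Skip: low utilization expected in dev"
      else
        echo "Flag for rightsizing"
      fi ;;
    *) echo "No rule matched: review manually" ;;
  esac
}
```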

Service Checklists

EC2: Utilization, stopped instances (EBS cost), previous-gen types, On-Demand 24/7 → SP/RI

Lambda: Zero invocations (30d), memory vs duration tradeoff, provisioned concurrency

ECS/EKS: Fargate vs EC2, resource requests vs usage, Spot for fault-tolerant

ECR: Lifecycle policies, image count, total size — old images accumulate

RDS: Connection count, Multi-AZ in dev, instance class utilization, storage, previous-gen

DynamoDB: Provisioned vs On-Demand fit, auto-scaling, TTL

ElastiCache/OpenSearch: Node utilization, reserved coverage

S3: Lifecycle policies, storage class, Intelligent-Tiering, incomplete multipart uploads

EBS: Unattached volumes, gp2→gp3, snapshot retention, IOPS necessity

Networking: Cross-AZ transfer, NAT Gateway → VPC endpoints, CloudFront caching

Load Balancers: Zero requests = orphaned, Classic→ALB/NLB

Elastic IPs: Unassociated = $3.60/month each

CloudWatch: Log retention (default infinite), high-res metrics necessity

Secrets Manager: $0.40/secret/month vs free Parameter Store (standard tier)

API Gateway: HTTP API is roughly 70% cheaper per request than REST API
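A few of the checklist items translate directly into read-only CLI queries. The helpers below are a hedged sketch: the function names are hypothetical, and only a handful of checks are shown.

```sh
# Each helper takes a region as its only argument and only reads state.
stopped_instances() {
  aws ec2 describe-instances --region "$1" \
    --filters Name=instance-state-name,Values=stopped \
    --query 'Reservations[].Instances[].[InstanceId,InstanceType]' --output text
}
unattached_volumes() {
  aws ec2 describe-volumes --region "$1" \
    --filters Name=status,Values=available \
    --query 'Volumes[].[VolumeId,VolumeType,Size,CreateTime]' --output text
}
gp2_volumes() {
  aws ec2 describe-volumes --region "$1" \
    --filters Name=volume-type,Values=gp2 \
    --query 'Volumes[].[VolumeId,Size]' --output text
}
unassociated_eips() {
  aws ec2 describe-addresses --region "$1" \
    --query 'Addresses[?!AssociationId].[PublicIp,AllocationId]' --output text
}
log_groups_without_retention() {
  aws logs describe-log-groups --region "$1" \
    --query 'logGroups[?!retentionInDays].logGroupName' --output text
}

# Monthly cost of N unassociated Elastic IPs at the $3.60 figure above.
eip_monthly_cost() {
  awk -v n="$1" 'BEGIN { printf "%.2f\n", n * 3.60 }'
}
```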

Output Requirements

CSV (Required)

account_id,resource_name,status,recommendation_type,potential_savings_monthly,resource_id,region,resource_type,tags,description
123456789012,web-server-prod,Underutilized,Rightsizing to t3.small,45.00,i-0abc123def456,us-east-1,EC2 Instance,"Environment=prod,Team=platform","Avg CPU 8% over 30 days. Current: t3.large"
123456789012,,Unattached,Verify before removing,12.50,vol-0xyz789,us-east-1,EBS Volume,,"100GB gp2 volume unattached since 2024-12-01"
123456789012,raspberry,No-Lifecycle,Add ECR lifecycle policy,2.00,raspberry,us-east-1,ECR Repository,,"47 images totaling 12GB. No lifecycle policy configured"
| Column | Description |
| --- | --- |
| account_id | AWS account ID |
| resource_name | Name tag value (empty if untagged) |
| status | Idle, Underutilized, `Overs |

...
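Rows in the CSV format shown above need quoting whenever a field contains a comma, as the tags column does. The sketch below is one way to emit such rows. The helper names `csv_field` and `csv_row` are hypothetical, and fields are assumed to be single-line.

```sh
# Quote a field only when it contains a comma or a double quote,
# doubling any embedded double quotes.
csv_field() {
  case $1 in
    *,* | *'"'*) printf '"%s"' "$(printf '%s' "$1" | sed 's/"/""/g')" ;;
    *) printf '%s' "$1" ;;
  esac
}

# Join all arguments into one CSV row.
csv_row() {
  sep=''
  for field in "$@"; do
    printf '%s' "$sep"
    csv_field "$field"
    sep=','
  done
  printf '\n'
}
```

Usage: `csv_row 123456789012 "" Unattached "Verify before removing" 12.50 vol-0xyz789 us-east-1 "EBS Volume" "" "100GB gp2 volume unattached since 2024-12-01"`. Fields without commas are left unquoted, which is still valid CSV.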
