agent-media
Media processing CLI for AI agents.
- Image: generate, edit, remove-background, resize, convert, extend, crop
- Video: generate (text-to-video and image-to-video)
- Audio: extract from video, transcribe (with speaker identification)
Installation
Global
npm install -g agent-media@latest
From Source
git clone https://github.com/agntswrm/agent-media
cd agent-media
pnpm install && pnpm build && pnpm link --global
Via bunx / npx
Run directly without installing:
bunx agent-media@latest --help
npx agent-media@latest --help
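The three ways of launching the CLI above can be folded into one helper that picks whichever launcher is present. This is only a sketch: `first_available` and `agent_media_cmd` are hypothetical helper names, not part of agent-media, and the lookup order (global install, then bunx, then npx) is an assumption.

```shell
# Print the first command from the argument list that exists on PATH.
first_available() {
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Map the launcher that was found to the invocation shown in this README.
agent_media_cmd() {
  case "$(first_available agent-media bunx npx)" in
    agent-media) echo "agent-media" ;;
    bunx)        echo "bunx agent-media@latest" ;;
    npx)         echo "npx agent-media@latest" ;;
    *)           echo "no agent-media launcher found" >&2; return 1 ;;
  esac
}

# Usage (left unquoted on purpose so the command and its argument split):
#   $(agent_media_cmd) --help
```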
Skills for AI Agents
Install agent-media skills to your coding agent (Claude Code, Cursor, Codex, etc.):
npx skills add agntswrm/agent-media
This adds media processing skills that your AI agent can use automatically. Available skills:
- agent-media - Overview of all capabilities
- image-generate - Generate images from text
- image-edit - Edit images with text prompts
- image-resize - Resize images
- image-convert - Convert image formats
- image-extend - Extend image canvas with padding
- image-remove-background - Remove backgrounds
- image-crop - Crop images to specified dimensions
- audio-extract - Extract audio from video
- audio-transcribe - Transcribe audio to text
- video-generate - Generate videos from text or images
Quick Start
# generate an image
agent-media image generate --prompt "a robot" --out rob.png
# remove background
agent-media image remove-background --in rob.png --out rob_nobg.png
# edit the image
agent-media image edit --in rob_nobg.png --prompt "the robot is sitting on a bench next to a cat, in the background you can see the Eiffel Tower in Paris" --out rob_cat_paris.png
# generate a video with audio (cat meows, robot speaks!)
agent-media video generate --in rob_cat_paris.png --prompt "the cat meows and the robot says: \"Yes, me too.\"" --audio --out rob_cat_video.mp4
# extract audio from video
agent-media audio extract --in rob_cat_video.mp4 --out rob_cat_audio.mp3
# transcribe the audio
agent-media audio transcribe --in rob_cat_audio.mp3
Requirements
- Node.js >= 18.0.0
- API key from fal.ai, Replicate, Runpod, or AI Gateway for AI features
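The Node.js requirement can be checked before installing. A minimal sketch, assuming the version string has the `v<major>.<minor>.<patch>` shape that `node --version` prints; `node_version_ok` is a hypothetical helper name.

```shell
# Succeeds when the given Node.js version string (e.g. "v18.17.0") has major >= 18.
node_version_ok() {
  version=${1#v}          # drop the leading "v"
  major=${version%%.*}    # keep everything before the first dot
  [ "$major" -ge 18 ] 2>/dev/null
}

# Usage:
#   node_version_ok "$(node --version)" || echo "Node.js 18 or newer is required" >&2
```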
Local processing (no API key): resize, convert, extend, crop, audio extract, remove-background, transcribe
Cloud processing (API key required): image generate, image edit, video generate, remove-background, transcribe
Note: You may see a `mutex lock failed` error when using local remove-background or transcribe. Ignore it; the output is correct if the JSON shows `"ok": true`.
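In a script, the success check described in the note can be automated by looking for `"ok": true` in the command output instead of relying on the error message. This is a sketch under assumptions: `media_ok` is a hypothetical helper, and merging stderr into stdout is a precaution because the README does not say which stream carries the JSON.

```shell
# Reads command output on stdin; succeeds if it contains "ok": true.
media_ok() {
  grep -Eq '"ok"[[:space:]]*:[[:space:]]*true'
}

# Usage:
#   agent-media image remove-background --in rob.png --out rob_nobg.png 2>&1 | media_ok \
#     && echo "background removed"
```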
image
agent-media image resize --in <path> [options]
agent-media image convert --in <path> --format <f>
agent-media image extend --in <path> --padding <px> --color <hex>
agent-media image crop --in <path> --width <px> --height <px>
agent-media image generate --prompt <text>
agent-media image edit --in <path> --prompt <text>
agent-media image remove-background --in <path>
resize
local
agent-media image resize --in sunset-mountains.jpg --width 800
agent-media image resize --in sunset-mountains.jpg --height 600
agent-media image resize --in https://ytrzap04kkm0giml.public.blob.vercel-storage.com/sunset-mountains.jpg --width 800
| Option | Description |
|---|---|
| `--in <path>` | Input file path or URL (required) |
| `--width <px>` | Target width in pixels |
| `--height <px>` | Target height in pixels |
| `--out <path>` | Output path, filename or directory (default: ./) |
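Because `--out` accepts a directory, the resize command lends itself to a simple batch loop. The sketch below is illustrative only: `resize_all` and the `AGENT_MEDIA` override variable are hypothetical (the override exists so the loop can be dry-run with `echo`), and creating the output directory up front is a precaution, since the README does not say whether the CLI creates it.

```shell
# Resize every .jpg in a source directory to the given width.
# AGENT_MEDIA is a hypothetical override, e.g. AGENT_MEDIA=echo for a dry run.
resize_all() {
  src_dir=$1; width=$2; out_dir=$3
  mkdir -p "$out_dir"
  for file in "$src_dir"/*.jpg; do
    [ -e "$file" ] || continue   # skip when the glob matches nothing
    ${AGENT_MEDIA:-agent-media} image resize --in "$file" --width "$width" --out "$out_dir"
  done
}

# Usage:
#   resize_all ./photos 800 ./resized/
```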
convert
local
agent-media image convert --in sunset-mountains.png --format webp
agent-media image convert --in sunset-mountains.jpg --format png
agent-media image convert --in https://ytrzap04kkm0giml.public.blob.vercel-storage.com/sunset-mountains.png --format jpg --quality 90
| Option | Description |
|---|---|
| `--in <path>` | Input file path or URL (required) |
| `--format <f>` | Output format: png, jpg, webp (required) |
| `--quality <n>` | Quality 1-100 for lossy formats (default: 80) |
| `--out <path>` | Output path, filename or directory (default: ./) |
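The constraints in the table (three formats, quality from 1 to 100, default 80) can be checked in a wrapper script before the CLI is called. A minimal sketch; `valid_convert_args` is a hypothetical helper, and how the CLI itself reacts to invalid values is not documented here.

```shell
# Validate --format and --quality against the option table above.
valid_convert_args() {
  format=$1; quality=${2:-80}   # 80 mirrors the documented default
  case "$format" in
    png|jpg|webp) ;;
    *) echo "unsupported format: $format" >&2; return 1 ;;
  esac
  case "$quality" in
    ''|*[!0-9]*) echo "quality must be an integer" >&2; return 1 ;;
  esac
  [ "$quality" -ge 1 ] && [ "$quality" -le 100 ]
}

# Usage:
#   valid_convert_args webp 90 && \
#     agent-media image convert --in sunset-mountains.png --format webp --quality 90
```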
extend
local
Extend image canvas by adding padding on all sides with a solid background color.
agent-media image extend --in sunset-mountains.jpg --padding 50 --color "#E4ECF8"
agent-media image extend --in https://ytrzap04kkm0giml.public.blob.vercel-storage.com/sunset-mountains.png --padding 100 --color "#FFFFFF"
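Since padding is added on all sides, each dimension should grow by twice the padding value. The arithmetic below assumes the padding is applied equally to all four edges; `extended_size` is a hypothetical helper for estimating the output size, not a CLI feature.

```shell
# Estimate the canvas size after `image extend`: padding is added to both
# edges of each axis, so width and height each grow by 2 * padding.
extended_size() {
  width=$1; height=$2; padding=$3
  echo "$((width + 2 * padding))x$((height + 2 * padding))"
}

extended_size 800 600 50   # prints 900x700
```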
| Option | Description |
|---|---|
| `--in <path>` | Input file path or URL (required) |
| `--padding <px>` | Padding size in pixels to add on all sides (required) |
| `--color <hex>` | Background color for extended area (required). Also flattens transparency. |
| `--dpi <n>` | DPI |
...
License: Apache License 2.0
Created: Jan 16, 2026