discover-data

from rand/cc-polymath

Claude Code skills and workflows, optimized for context efficiency and skill quality. Skills range from cloud infrastructure to design to advanced maths.

61 stars · 7 forks · Updated Nov 25, 2025
npx skills add https://github.com/rand/cc-polymath --skill discover-data

SKILL.md

Data Skills Discovery

A gateway skill that gives Claude Code on-demand access to the nine skills in the Data category.

When This Skill Activates

This skill auto-activates when you're working with:

  • ETL
  • data pipelines
  • batch processing
  • stream processing
  • data validation
  • orchestration
  • Airflow
  • timely dataflow
  • differential dataflow
  • streaming aggregations
  • windowing
  • real-time analytics

Available Skills

Quick Reference

The Data category contains 9 skills:

  1. batch-processing - Orchestrating complex data pipelines with dependencies
  2. data-validation - Validating data schema before processing
  3. dataflow-coordination - Coordination patterns for distributed dataflow systems
  4. differential-dataflow - Differential computation for incremental updates and efficient joins
  5. etl-patterns - Designing data extraction from multiple sources
  6. pipeline-orchestration - Coordinating complex multi-step data workflows
  7. stream-processing - Processing real-time event streams (Kafka, Flink)
  8. streaming-aggregations - Windowing, sessionization, time-series aggregation
  9. timely-dataflow - Low-latency streaming computation with progress tracking

Load Full Category Details

For complete descriptions and workflows:

cat ~/.claude/skills/data/INDEX.md

This loads the full Data category index with:

  • Detailed skill descriptions
  • Usage triggers for each skill
  • Common workflow combinations
  • Cross-references to related skills
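
Before loading the index, it can help to check which skill files are actually installed. The following is a minimal sketch, assuming the skills live under ~/.claude/skills/data as the command above suggests. `SKILLS_DIR` and `list_data_skills` are illustrative names and not part of the skill itself.

```shell
# Hypothetical helper: list installed Data skills by file name.
# Assumes the directory layout used by the cat commands in this document.
SKILLS_DIR="${SKILLS_DIR:-$HOME/.claude/skills/data}"

list_data_skills() {
  # Print each skill name (file name without .md), skipping INDEX.md.
  for f in "$SKILLS_DIR"/*.md; do
    [ -e "$f" ] || continue            # directory is empty or missing
    name=$(basename "$f" .md)
    [ "$name" = "INDEX" ] && continue  # the index is not a skill
    printf '%s\n' "$name"
  done
}

list_data_skills
```

If the function prints nothing, the category is probably not installed at that path, and the cat commands in this document would fail as well.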

Load Specific Skills

Load individual skills as needed:

# Traditional ETL/Batch
cat ~/.claude/skills/data/batch-processing.md
cat ~/.claude/skills/data/data-validation.md
cat ~/.claude/skills/data/etl-patterns.md
cat ~/.claude/skills/data/pipeline-orchestration.md

# Stream Processing
cat ~/.claude/skills/data/stream-processing.md
cat ~/.claude/skills/data/streaming-aggregations.md

# Advanced Dataflow Systems
cat ~/.claude/skills/data/timely-dataflow.md
cat ~/.claude/skills/data/differential-dataflow.md
cat ~/.claude/skills/data/dataflow-coordination.md
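
When several skills are needed at once, the individual cat commands above can be wrapped in a small loop. This is a sketch only, assuming the same ~/.claude/skills/data layout. `load_skills` and `SKILLS_DIR` are hypothetical names introduced for illustration.

```shell
# Hypothetical helper: print several skill files in one call.
SKILLS_DIR="${SKILLS_DIR:-$HOME/.claude/skills/data}"

load_skills() {
  # Print each named skill; warn on stderr and keep going if one is missing.
  missing=0
  for name in "$@"; do
    path="$SKILLS_DIR/$name.md"
    if [ -r "$path" ]; then
      cat "$path"
    else
      printf 'skill not found: %s\n' "$path" >&2
      missing=1
    fi
  done
  return "$missing"   # non-zero if any skill could not be read
}

# Example: the traditional ETL/batch group listed above.
load_skills batch-processing data-validation etl-patterns pipeline-orchestration \
  || echo "some skills were missing" >&2
```

Returning a non-zero status for missing files lets a caller notice a partial load without losing the skills that were found.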

Common Workflow Combinations

Real-Time Analytics Pipeline

# Load these skills together:
cat ~/.claude/skills/data/stream-processing.md          # Kafka setup
cat ~/.claude/skills/data/streaming-aggregations.md     # Windowing patterns
cat ~/.claude/skills/data/dataflow-coordination.md      # Coordination

Incremental Computation System

# Load these skills together:
cat ~/.claude/skills/data/timely-dataflow.md           # Foundation
cat ~/.claude/skills/data/differential-dataflow.md     # Incremental updates
cat ~/.claude/skills/data/dataflow-coordination.md     # Distributed coordination

Hybrid Batch + Stream

# Load these skills together:
cat ~/.claude/skills/data/batch-processing.md          # Batch jobs
cat ~/.claude/skills/data/stream-processing.md         # Stream processing
cat ~/.claude/skills/data/pipeline-orchestration.md    # Overall coordination
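
The three combinations above can also be expressed as a lookup from a workflow name to its skill list. This is a sketch under stated assumptions: the workflow keys (`realtime-analytics`, `incremental`, `hybrid`) and the function name `workflow_skills` are invented here for illustration, while the skill names come from the lists above.

```shell
# Hypothetical lookup: map a workflow name to the skills it combines.
workflow_skills() {
  case "$1" in
    realtime-analytics)
      echo "stream-processing streaming-aggregations dataflow-coordination" ;;
    incremental)
      echo "timely-dataflow differential-dataflow dataflow-coordination" ;;
    hybrid)
      echo "batch-processing stream-processing pipeline-orchestration" ;;
    *)
      echo "unknown workflow: $1" >&2
      return 1 ;;
  esac
}

# Print the skills that make up the incremental computation workflow.
workflow_skills incremental
```

Each name in the output corresponds to a file of the form ~/.claude/skills/data/NAME.md, so the result can be fed to the cat commands shown above.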

Progressive Loading

This gateway skill enables progressive loading:

  • Level 1: Gateway loads automatically (you're here now)
  • Level 2: Load category INDEX.md for full overview
  • Level 3: Load specific skills as needed

Usage Instructions

  1. Auto-activation: This skill loads automatically when Claude Code detects data work
  2. Browse skills: Run cat ~/.claude/skills/data/INDEX.md for full category overview
  3. Load specific skills: Use the cat commands above to load individual skills

Next Steps: Run cat ~/.claude/skills/data/INDEX.md to see full category details.

Repository Stats

Stars: 61
Forks: 7
License: MIT License