Agent Plugin - AI-Powered Testing

The agent plugin integrates Claude Code into your test workflows, enabling AI-powered analysis, data processing, and intelligent test validation. Use it to analyze API responses, generate test data, validate complex business logic, or create intelligent test assertions.

Key Features Demonstrated

AI Analysis: Analyze API responses, logs, and test data with Claude
Multiple Output Formats: Support for JSON, text, and streaming formats
Session Management: Continue conversations across test steps
Template Integration: Use previous test results in AI prompts
Metadata Extraction: Access cost, duration, and session information
Flexible Timeouts: Configure execution timeouts for different use cases
System Prompts: Customize AI behavior with custom instructions

Prerequisites

Before using the agent plugin, you need:

Claude Code CLI installed: Install via npm install -g @anthropic-ai/claude-code or use Homebrew
Anthropic API Key: Set the ANTHROPIC_API_KEY environment variable
Authentication: Run claude login to authenticate with Claude Code

# Install Claude Code CLI
npm install -g @anthropic-ai/claude-code
# OR
brew install claude

# Set your API key
export ANTHROPIC_API_KEY=sk-ant-your-key-here

# Login to Claude Code
claude login

Basic Configuration

plugin: agent
config:
  agent: "claude-code" # Only supported agent type
  prompt: "Your prompt here" # Required: instruction for Claude
  mode: "single" # Optional: single, continue, resume
  output_format: "json" # Optional: json, text, streaming-json
  timeout: "30s" # Optional: execution timeout

Simple Example

name: "Basic Agent Analysis"
tests:
  - name: "Analyze API response"
    steps:
      - name: "Get user data"
        plugin: http
        config:
          method: GET
          url: "https://jsonplaceholder.typicode.com/users/1"
        save:
          - json_path: ".name"
            as: "user_name"
          - json_path: ".email"
            as: "user_email"

      - name: "AI analysis of user data"
        plugin: agent
        config:
          agent: "claude-code"
          prompt: |
            Analyze this user data:
            Name: {{ user_name }}
            Email: {{ user_email }}

            Is this a valid user profile? Respond with JSON:
            {"valid": true/false, "issues": ["list", "of", "issues"]}
          output_format: "json"
        save:
          - json_path: ".response"
            as: "validation_result"

      - name: "Log validation result"
        plugin: log
        config:
          message: "User validation: {{ validation_result }}"

Comprehensive Configuration Example

name: "Agent Plugin Feature Demo"
vars:
  api_endpoint: "https://jsonplaceholder.typicode.com/posts/1"
  system_instructions: "You are a technical analyst. Be concise and precise."

tests:
  - name: "Complete agent workflow"
    steps:
      # Get test data
      - name: "Fetch API data"
        plugin: http
        config:
          method: GET
          url: "{{ .vars.api_endpoint }}"
        save:
          - json_path: ".title"
            as: "post_title"
          - json_path: ".body"
            as: "post_body"

      # JSON output with system prompt
      - name: "Structured analysis"
        plugin: agent
        config:
          agent: "claude-code"
          prompt: |
            Analyze this content:
            Title: {{ post_title }}
            Body: {{ post_body }}

            Provide JSON analysis: {
              "sentiment": "positive/negative/neutral",
              "topics": ["array", "of", "topics"],
              "word_count": number,
              "summary": "brief summary"
            }
          mode: "single"
          output_format: "json"
          system_prompt: "{{ .vars.system_instructions }}"
          timeout: "45s"
          save_full_response: true
        save:
          - json_path: ".response"
            as: "structured_analysis"
          - json_path: ".cost"
            as: "analysis_cost"
          - json_path: ".session_id"
            as: "session_id"
        assertions:
          - type: "json_path"
            path: ".success"
            expected: true

      # Text output with multi-turn
      - name: "Detailed analysis"
        plugin: agent
        config:
          agent: "claude-code"
          prompt: "Expand on the analysis: {{ structured_analysis }}"
          mode: "single"
          output_format: "text"
          max_turns: 2
          timeout: "60s"
        save:
          - json_path: ".response"
            as: "detailed_analysis"

      # Optional field extraction
      - name: "Extract key insights"
        plugin: agent
        config:
          agent: "claude-code"
          prompt: "List the top 3 insights from: {{ detailed_analysis }}"
          output_format: "text"
        save:
          - json_path: ".response"
            as: "key_insights"
          - json_path: ".nonexistent_field"
            as: "optional_data"
            required: false

      # Final summary
      - name: "Log comprehensive results"
        plugin: log
        config:
          message: |
            🤖 AI Analysis Complete:
            📊 Structured: {{ structured_analysis }}
            💰 Cost: ${{ analysis_cost }}
            📝 Detailed: {{ detailed_analysis }}
            🎯 Insights: {{ key_insights }}
            ❓ Optional: {{ optional_data }}

Configuration Options

Required Fields

Field	Description	Example
`agent`	Agent type (only "claude-code" supported)	`"claude-code"`
`prompt`	Instruction for Claude (supports templates)	`"Analyze this data: {{ api_response }}"`

Optional Fields

Field	Type	Default	Description
`mode`	string	`"single"`	Execution mode: `single`, `continue`, `resume`
`output_format`	string	`"json"`	Output format: `json`, `text`, `streaming-json`
`timeout`	string	`"30s"`	Execution timeout (e.g., "30s", "2m")
`system_prompt`	string	-	Custom system instructions
`max_turns`	integer	1	Maximum conversation turns
`session_id`	string	-	Session ID for resume mode
`continue_recent`	boolean	false	Continue most recent conversation
`save_full_response`	boolean	true	Save complete response to context

Output Formats

JSON Format

Returns structured metadata with the Claude response:

output_format: "json"

Response structure:

{
  "type": "result",
  "subtype": "success",
  "result": "Claude's response here",
  "session_id": "unique-session-id",
  "cost_usd": 0.003,
  "duration_ms": 1234,
  "num_turns": 1,
  "is_error": false
}

Text Format

Returns Claude's response as plain text:

output_format: "text"

Best for simple analysis or when you want Claude's raw response.

Execution Modes

Single Mode (Default)

Execute a one-time prompt:

mode: "single"
prompt: "Analyze this data: {{ api_response }}"

Continue Mode

Continue the most recent conversation:

mode: "continue"
continue_recent: true
prompt: "Now summarize our discussion"

Resume Mode

Resume a specific conversation session:

mode: "resume"
session_id: "{{ previous_session_id }}"
prompt: "Continue from where we left off"

Save Operations

Basic Response Extraction

save:
  - json_path: ".response"
    as: "ai_analysis"

Metadata Extraction

save:
  - json_path: ".response"
    as: "analysis_content"
  - json_path: ".cost"
    as: "execution_cost"
  - json_path: ".duration"
    as: "execution_time"
  - json_path: ".session_id"
    as: "session_id"
  - json_path: ".success"
    as: "success_status"

Optional Fields

save:
  - json_path: ".response"
    as: "required_analysis"
  - json_path: ".optional_metadata"
    as: "optional_data"
    required: false # Won't fail if field doesn't exist

Assertions

Validate agent execution results:

assertions:
  - type: "json_path"
    path: ".success"
    expected: true
  - type: "json_path"
    path: ".exit_code"
    expected: 0
  - type: "json_path"
    path: ".cost"
    expected: 0 # Cost should be 0 for testing

Template Variables

Use data from previous steps in agent prompts:

# From HTTP responses
prompt: "Analyze this API data: {{ api_response }}"

# From configuration
prompt: "Process {{ .vars.user_data }} according to {{ .vars.business_rules }}"

# From previous agent results
prompt: "Based on {{ previous_analysis }}, what are the next steps?"

# Multi-line prompts with templates
prompt: |
  Previous Analysis: {{ step1_result }}
  New Data: {{ step2_data }}

  Compare these results and identify:
  1. Key differences
  2. Trending patterns
  3. Recommendations

Use Cases

API Response Analysis

- name: "Analyze API health"
  plugin: agent
  config:
    agent: "claude-code"
    prompt: |
      API Response Analysis:
      Status: {{ response_status }}
      Time: {{ response_time }}ms
      Size: {{ response_size }} bytes

      Rate this API's health (1-10) and explain issues.
    output_format: "json"

Test Data Validation

- name: "Validate business rules"
  plugin: agent
  config:
    agent: "claude-code"
    prompt: |
      User Data: {{ user_profile }}
      Business Rules: {{ .vars.validation_rules }}

      Does this user profile comply with business rules?
      Return: {"compliant": boolean, "violations": []}

Intelligent Assertions

- name: "Smart content validation"
  plugin: agent
  config:
    agent: "claude-code"
    prompt: |
      Content: {{ page_content }}

      Check for:
      - Appropriate language
      - Complete information
      - Professional tone

      Score 1-10 with reasoning.

Log Analysis

- name: "Analyze error logs"
  plugin: agent
  config:
    agent: "claude-code"
    prompt: |
      Error Logs: {{ error_logs }}

      Categorize errors and suggest fixes:
      {"categories": [], "critical_count": 0, "suggestions": []}

Running Examples

# Run basic agent example
rocketship run -af examples/agent-testing/rocketship.yaml

# Run with API key environment variable
ANTHROPIC_API_KEY=your-key rocketship run -af examples/agent-testing/rocketship.yaml

# Run comprehensive test suite
rocketship run -af examples/agent-testing/comprehensive-test.yaml

Best Practices

1. Clear, Specific Prompts

# Good: Specific instructions
prompt: |
  Analyze this JSON response for data quality:
  {{ api_response }}

  Check: completeness, format validity, business logic
  Return: {"score": 1-10, "issues": ["specific", "problems"]}

# Avoid: Vague prompts
prompt: "Check this data: {{ api_response }}"

2. Use Appropriate Output Formats

# Use JSON for structured data extraction
output_format: "json"
prompt: "Return analysis as: {\"score\": number, \"issues\": []}"

# Use text for explanations and summaries
output_format: "text"
prompt: "Explain the security implications of this configuration"

3. Set Reasonable Timeouts

# Quick analysis
timeout: "15s"

# Complex analysis
timeout: "60s"

# Multi-turn conversations
timeout: "120s"

4. Handle Optional Data

save:
  - json_path: ".response"
    as: "analysis"
  - json_path: ".metadata.extra"
    as: "extra_info"
    required: false # Won't fail test if missing

5. Use System Prompts for Consistency

vars:
  analyst_prompt: "You are a security analyst. Be thorough and highlight risks."

config:
  system_prompt: "{{ .vars.analyst_prompt }}"
  prompt: "Analyze this configuration: {{ config_data }}"

Troubleshooting

Common Issues

"claude command not found"

Install Claude Code CLI: npm install -g @anthropic-ai/claude-code
Verify installation: which claude

"ANTHROPIC_API_KEY environment variable is required"

Set your API key: export ANTHROPIC_API_KEY=sk-ant-your-key
Get an API key from https://console.anthropic.com/

"Authentication required"

Login to Claude Code: claude login

Empty responses

Check your prompt is clear and specific
Verify template variables are being substituted correctly
Increase timeout for complex analysis

JSON parsing errors

Use output_format: "text" for debugging
Check Claude's actual response format
Ensure prompts request valid JSON structure

The agent plugin enables powerful AI-driven testing workflows, making your tests more intelligent and capable of handling complex validation scenarios that would be difficult to implement with traditional assertion methods.