Agent Skill
2/7/2026

declutter-test-harness

Test harness creation skill - creates characterization tests to preserve behavior during refactoring

J
joowankim
0GitHub Stars
1Views
npx skills add joowankim/declutter

SKILL.md

Namedeclutter-test-harness
DescriptionTest harness creation skill - creates characterization tests to preserve behavior during refactoring

name: declutter-test-harness description: Test harness creation skill - creates characterization tests to preserve behavior during refactoring invocation: Use before any refactoring to establish a safety net of tests

Test Harness Creation

The Iron Law

╔═══════════════════════════════════════════════════════════════════╗
║  NO REFACTORING WITHOUT CHARACTERIZATION TESTS FIRST             ║
╚═══════════════════════════════════════════════════════════════════╝

Purpose

Create a safety net of tests that capture the current behavior of code before refactoring. These tests ensure that refactoring preserves behavior, even if that behavior has bugs.

Key Principle

"Characterization tests document what the system actually does, not what it should do." — Michael Feathers, Working Effectively with Legacy Code

Test Harness Workflow

digraph test_harness {
    rankdir=TB;
    node [shape=box];

    identify [label="1. Identify\nRefactoring Target"];
    entry [label="2. Find\nEntry Points"];
    capture [label="3. Write Tests\nCapturing Behavior"];
    run [label="4. Run Tests"];
    pass [label="All Pass?" shape=diamond];
    done [label="Harness Complete" shape=ellipse];
    debug [label="Debug Test\n(Not Code!)"];

    identify -> entry;
    entry -> capture;
    capture -> run;
    run -> pass;
    pass -> done [label="YES"];
    pass -> debug [label="NO"];
    debug -> capture;
}

Step 1: Identify Entry Points

For the code being refactored, identify all ways it can be called:

- [ ] Public methods/functions
- [ ] Event handlers
- [ ] API endpoints
- [ ] CLI commands
- [ ] Scheduled jobs
- [ ] Message queue consumers

Step 2: Capture Input/Output Pairs

For each entry point, document:

InputExpected OutputSide Effects
Typical caseCurrent resultDB writes, logs
Edge case 1Current result...
Error caseCurrent exception...
Null/emptyCurrent behavior...

Step 3: Write Characterization Tests

Python Template

import pytest
from module_under_test import target_function

class TestTargetFunctionCharacterization:
    """
    Characterization tests for target_function.
    These capture CURRENT behavior, not desired behavior.
    DO NOT modify expected values during refactoring.
    """

    def test_typical_input_returns_expected_output(self):
        # Arrange
        input_data = {"user_id": 123, "action": "purchase"}

        # Act
        result = target_function(input_data)

        # Assert - This is what it DOES, not what it SHOULD do
        assert result == {"status": "ok", "transaction_id": "expected_id"}

    def test_empty_input_returns_error(self):
        # Arrange
        input_data = {}

        # Act
        result = target_function(input_data)

        # Assert
        assert result == {"status": "error", "message": "Missing user_id"}

    def test_invalid_action_behavior(self):
        # Arrange
        input_data = {"user_id": 123, "action": "invalid"}

        # Act & Assert - Document current exception behavior
        with pytest.raises(ValueError) as exc:
            target_function(input_data)
        assert "Unknown action" in str(exc.value)

    def test_side_effects_are_captured(self, mock_database):
        # Arrange
        input_data = {"user_id": 123, "action": "purchase"}

        # Act
        target_function(input_data)

        # Assert side effects
        mock_database.insert.assert_called_once_with(
            "transactions",
            {"user_id": 123, "action": "purchase"}
        )

TypeScript Template

import { targetFunction } from './module-under-test';

describe('targetFunction characterization', () => {
  /**
   * Characterization tests - capture CURRENT behavior.
   * DO NOT modify expected values during refactoring.
   */

  it('handles typical input correctly', () => {
    const input = { userId: 123, action: 'purchase' };

    const result = targetFunction(input);

    expect(result).toEqual({ status: 'ok', transactionId: 'expected_id' });
  });

  it('handles empty input', () => {
    const input = {};

    const result = targetFunction(input);

    expect(result).toEqual({ status: 'error', message: 'Missing userId' });
  });

  it('throws on invalid action', () => {
    const input = { userId: 123, action: 'invalid' };

    expect(() => targetFunction(input)).toThrow('Unknown action');
  });

  it('logs expected messages', () => {
    const consoleSpy = jest.spyOn(console, 'log');
    const input = { userId: 123, action: 'purchase' };

    targetFunction(input);

    expect(consoleSpy).toHaveBeenCalledWith('Processing purchase for 123');
  });
});

Step 4: Handle Dependencies

Mocking Strategy

Dependency TypeStrategy
DatabaseMock at boundary
External APIMock responses
File systemUse temp directory
Time/datesFreeze time
RandomSeed or mock

Dependency Injection for Testability

If code is too coupled to dependencies:

  1. Extract dependency to parameter
  2. Provide default for production
  3. Inject mock for tests
# Before: Hard to test
def process_order(order_id):
    db = DatabaseConnection()  # Hidden dependency
    order = db.get_order(order_id)
    # ...

# After: Testable
def process_order(order_id, db=None):
    db = db or DatabaseConnection()  # Injectable
    order = db.get_order(order_id)
    # ...

Step 5: Verify Harness

Before proceeding to refactoring:

- [ ] All characterization tests pass
- [ ] Each entry point has at least one test
- [ ] Edge cases are covered
- [ ] Error paths are covered
- [ ] Side effects are verified
- [ ] Tests fail when behavior changes

Mutation Testing (Optional but Recommended)

Verify tests actually catch changes:

# Python
mutmut run --paths-to-mutate=module_under_test.py

# JavaScript
npx stryker run

Red Flags - STOP

ThoughtReality
"The existing tests are enough"Verify it. Run them. Check coverage.
"I'll add tests after refactoring"You'll miss behavior changes.
"This code is too tangled to test"That's why you need the harness. Extract seams first.
"Testing this is too slow"Slow tests > broken refactoring

Checklist

- [ ] Entry points identified
- [ ] Input/output pairs documented
- [ ] Characterization tests written
- [ ] All tests passing
- [ ] Edge cases covered
- [ ] Error paths covered
- [ ] Side effects verified
- [ ] Dependencies properly mocked

Signal

When harness is complete, emit:

HARNESS_COMPLETE
Skills Info
Original Name:declutter-test-harnessAuthor:joowankim