Agent Skill
2/7/2026

validate-correctness

Validates discovered bugs with reproducing tests and validates fixes with regression tests. Called by other skills when bugs are found during optimization hunting. Creates unit tests and fuzz tests to prove bugs exist and fixes work.

npx skills add blt/datadog-skills

SKILL.md

name: validate-correctness
description: Validates discovered bugs with reproducing tests and validates fixes with regression tests. Called by other skills when bugs are found during optimization hunting. Creates unit tests and fuzz tests to prove bugs exist and fixes work.

Correctness Validation

When optimization hunting discovers a bug instead of an optimization opportunity, this skill validates the finding through tests. No bug fix should be merged without a test that would have caught it.

When This Skill Is Called

Other skills invoke /validate-correctness when:

  • /hunt-optimization discovers a bug instead of an optimization
  • /rescue-optimization finds broken code during salvage
  • /review-optimization identifies correctness issues during review

Optimization hunt discovers bug
            │
            ▼
    /validate-correctness
            │
            ├── Create reproducing test (proves bug exists)
            ├── Validate fix (proves fix works)
            ├── Add fuzz test (catches variants)
            └── Record in validations.yaml
            │
            ▼
    Return to calling skill with VALIDATED status

Philosophy

"A bug without a test is just an anecdote. A bug with a test is knowledge."

Finding bugs during optimization work is valuable, not a failure. But a bug fix without a reproducing test:

  1. Can't prove the bug existed
  2. Can't prove the fix works
  3. Can regress silently later

Every bug fix MUST include a test that fails before the fix and passes after.


Phase 1: Understand the Bug

1.1 Document the Bug

bug:
  file: pkg/foo/bar.go
  function: ProcessItems
  discovered_by: hunt-optimization
  description: "make([]T, n) + append creates n zero elements before real data"
  impact: "PIDs have leading zeros, causes lookup failures"
  root_cause: "Confused make([]T, n) with make([]T, 0, n)"

1.2 Identify Bug Category

| Category       | Example               | Test Strategy               |
|----------------|-----------------------|-----------------------------|
| Off-by-one     | Wrong slice bounds    | Unit test with edge cases   |
| Nil handling   | Missing nil check     | Unit test with nil input    |
| Initialization | make([]T, n) + append | Unit test checking output   |
| Concurrency    | Race condition        | Test with -race, fuzz test  |
| Overflow       | Integer overflow      | Fuzz test with large values |
| Logic error    | Wrong condition       | Unit test with failing case |

1.3 Find the Minimal Reproducer

Identify the smallest input that triggers the bug:

// What input demonstrates the bug?
input := []int{1, 2, 3}
expected := []int{1, 2, 3}
actual := BuggyFunction(input)
// actual = []int{0, 0, 0, 1, 2, 3}  // BUG: leading zeros
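
The reproducer above can be turned into a small program that runs on its own. This is only a sketch: `collectBuggy` is a hypothetical stand-in for the real function, assuming the bug is the `make([]T, n)` + `append` pattern described in the bug record.

```go
package main

import "fmt"

// collectBuggy is a hypothetical stand-in for the buggy function.
// make([]int, n) returns a slice whose LENGTH is n, so it already holds
// n zero values; append then places the real data after those zeros.
func collectBuggy(items []int) []int {
	out := make([]int, len(items)) // bug: should be make([]int, 0, len(items))
	for _, v := range items {
		out = append(out, v)
	}
	return out
}

func main() {
	input := []int{1, 2, 3}
	actual := collectBuggy(input)
	fmt.Println(actual)      // [0 0 0 1 2 3]
	fmt.Println(len(actual)) // 6
}
```

Three elements are enough to show the doubled length, so a larger input would add nothing to the reproducer.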

Phase 2: Create Reproducing Test

2.1 Write Test That Fails on Buggy Code

func TestFunctionName_BugDescription(t *testing.T) {
    // Arrange: Setup that triggers the bug
    input := createInputThatTriggersBug()

    // Act: Call the buggy function
    result := FunctionName(input)

    // Assert: What SHOULD happen (will fail on buggy code)
    expected := expectedCorrectOutput()
    if !reflect.DeepEqual(result, expected) {
        t.Errorf("BugDescription: got %v, want %v", result, expected)
    }
}

2.2 Test Naming Convention

TestFunctionName_BugDescription
TestProcessPIDs_NoLeadingZeros
TestFlush_EmptyInputReturnsEmptySlice
TestAppend_PreallocDoesNotPrependZeros
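
If the convention should be checked mechanically, a pattern match is one option. The sketch below is an assumption about how strictly to read the convention (an upper-case function name, one underscore, an upper-case description); `followsConvention` is a hypothetical helper, not part of the skill.

```go
package main

import (
	"fmt"
	"regexp"
)

// testNamePattern is one possible reading of the
// TestFunctionName_BugDescription convention: "Test", a name starting with
// an upper-case letter, an underscore, then a description starting with an
// upper-case letter. Names with more than one underscore are rejected.
var testNamePattern = regexp.MustCompile(`^Test[A-Z][A-Za-z0-9]*_[A-Z][A-Za-z0-9]*$`)

// followsConvention reports whether name matches the pattern above.
func followsConvention(name string) bool {
	return testNamePattern.MatchString(name)
}

func main() {
	names := []string{
		"TestProcessPIDs_NoLeadingZeros",
		"TestFlush_EmptyInputReturnsEmptySlice",
		"TestFix", // no bug description, so it does not match
	}
	for _, name := range names {
		fmt.Printf("%-40s %v\n", name, followsConvention(name))
	}
}
```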

2.3 Verify Test Fails Before Fix

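# Check out the code BEFORE the fix. The new test must still be present there:
# an untracked test file survives a plain `git stash`, but a test that is
# already committed on the fix branch has to be copied over separately.
git stash
git checkout origin/main

# Run the new test - MUST FAIL
go test -run TestFunctionName_BugDescription ./pkg/path/...
# Expected: FAIL

# Return to the fix branch
git checkout -
git stash pop   # only if `git stash` above actually saved changes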

If the test passes on the buggy code, it does not reproduce the bug. Rewrite it.
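
The fail-before, pass-after requirement can also be illustrated in a single file by running one assertion against both versions of a function. This is a sketch under stated assumptions: `listPIDsBefore`, `listPIDsAfter`, and `checkNoLeadingZeros` are hypothetical names, and in a real repository the two versions live on different commits rather than side by side.

```go
package main

import (
	"fmt"
	"reflect"
)

// listPIDsBefore is a hypothetical pre-fix version of the function.
func listPIDsBefore(raw []int) []int {
	out := make([]int, len(raw)) // bug: length n instead of capacity n
	return append(out, raw...)
}

// listPIDsAfter is the hypothetical fixed version.
func listPIDsAfter(raw []int) []int {
	out := make([]int, 0, len(raw))
	return append(out, raw...)
}

// checkNoLeadingZeros is the reproducing assertion. It states what SHOULD
// happen, so it returns an error for the buggy version and nil for the fix.
func checkNoLeadingZeros(fn func([]int) []int) error {
	input := []int{101, 202, 303}
	want := []int{101, 202, 303}
	if got := fn(input); !reflect.DeepEqual(got, want) {
		return fmt.Errorf("got %v, want %v", got, want)
	}
	return nil
}

func main() {
	fmt.Println("before fix:", checkNoLeadingZeros(listPIDsBefore))
	fmt.Println("after fix: ", checkNoLeadingZeros(listPIDsAfter))
}
```

An assertion that returned nil for both versions would be the case described above: it does not reproduce the bug.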


Phase 3: Validate the Fix

3.1 Verify Test Passes After Fix

# On the fix branch
go test -run TestFunctionName_BugDescription ./pkg/path/...
# Expected: PASS

3.2 Run Full Test Suite

# Ensure fix doesn't break anything else
go test ./pkg/path/...
go test -race ./pkg/path/...

3.3 Test Edge Cases

Add tests for related edge cases:

func TestFunctionName_EdgeCases(t *testing.T) {
    tests := []struct {
        name     string
        input    InputType
        expected OutputType
    }{
        {"empty input", nil, nil},
        {"single element", []int{1}, []int{1}},
        {"typical case", []int{1, 2, 3}, []int{1, 2, 3}},
        {"large input", makeLargeInput(), expectedLargeOutput()},
    }

    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            got := FunctionName(tt.input)
            if !reflect.DeepEqual(got, tt.expected) {
                t.Errorf("got %v, want %v", got, tt.expected)
            }
        })
    }
}

Phase 4: Add Fuzz Test (When Appropriate)

4.1 When to Add Fuzz Tests

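| Bug Type            | Fuzz Test? | Reason                |
|---------------------|------------|-----------------------|
| Input parsing       | YES        | Many edge cases       |
| Serialization       | YES        | Format variations     |
| Numeric operations  | YES        | Overflow, underflow   |
| String manipulation | YES        | Unicode, empty, long  |
| Simple logic error  | No         | Unit test sufficient  |
| Nil handling        | No         | Explicit cases enough |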

4.2 Fuzz Test Template

func FuzzFunctionName(f *testing.F) {
    // Seed corpus with known interesting inputs
    f.Add([]byte{})
    f.Add([]byte{1, 2, 3})
    f.Add([]byte{0, 0, 0})

    f.Fuzz(func(t *testing.T, data []byte) {
        // Should not panic
        result := FunctionName(data)

        // Invariants that must always hold
        if result == nil && len(data) > 0 {
            t.Error("non-empty input should not produce nil")
        }

        // Output validity check. isValidOutput is a placeholder for a
        // domain-specific check, such as a round trip (if applicable)
        if !isValidOutput(result) {
            t.Errorf("invalid output for input %v", data)
        }
    })
}
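
A concrete instance of the template might look like the sketch below. The names are assumptions for illustration: `copyPayload` is a hypothetical function under test and `checkCopy` is a hypothetical helper that holds the invariants. `FuzzCopyPayload` has to live in a `_test.go` file to be picked up by `go test -fuzz`; `main` is included only so the sketch also runs as a standalone program.

```go
package main

import (
	"bytes"
	"fmt"
	"testing"
)

// copyPayload is a hypothetical function under test (fixed version).
func copyPayload(data []byte) []byte {
	out := make([]byte, 0, len(data))
	return append(out, data...)
}

// checkCopy holds the invariants in one place so the fuzz target and
// ordinary unit tests can share them.
func checkCopy(data []byte) error {
	got := copyPayload(data)
	if len(got) != len(data) {
		return fmt.Errorf("length changed: got %d, want %d", len(got), len(data))
	}
	if !bytes.Equal(got, data) {
		return fmt.Errorf("content changed: got %v, want %v", got, data)
	}
	return nil
}

// FuzzCopyPayload seeds the corpus with the known interesting inputs and
// asserts the invariants for every generated input.
func FuzzCopyPayload(f *testing.F) {
	f.Add([]byte{})
	f.Add([]byte{1, 2, 3})
	f.Add([]byte{0, 0, 0})

	f.Fuzz(func(t *testing.T, data []byte) {
		if err := checkCopy(data); err != nil {
			t.Error(err)
		}
	})
}

func main() {
	// Run the invariants on the seed inputs without the fuzzing engine.
	for _, seed := range [][]byte{{}, {1, 2, 3}, {0, 0, 0}} {
		fmt.Println(checkCopy(seed))
	}
}
```

The length invariant alone would have caught the leading-zeros bug, because the buggy version returns twice as many elements as it was given.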

4.3 Run Fuzz Test

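# Quick fuzz (find obvious issues)
# -fuzz works on one package at a time, so name the package directly;
# a ./pkg/path/... pattern fails if it matches several packages.
go test -fuzz=FuzzFunctionName -fuzztime=30s ./pkg/path

# Longer fuzz (thorough exploration)
go test -fuzz=FuzzFunctionName -fuzztime=5m ./pkg/path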

Phase 5: Document and Record

5.1 Commit Message Format

git commit -m "$(cat <<'EOF'
fix(pkg): description of bug fix

Bug: make([]T, n) with append prepends n zero elements
Fix: Use make([]T, 0, n) for correct preallocation

Test: TestProcessPIDs_NoLeadingZeros fails before, passes after
Fuzz: FuzzProcessPIDs added for input variations

Discovered-by: /hunt-optimization
Validated-by: /validate-correctness

🤖 Generated with Claude Code
EOF
)"

5.2 MANDATORY: Record in validations.yaml

  - bug_id: containerd-pids-leading-zeros
    date: 2026-01-06
    file: pkg/util/containers/containerd/containerd_util.go
    function: ListRunningProcesses
    category: initialization
    description: "make([]T, n) + append prepends n zeros"
    discovered_by: hunt-optimization
    original_branch: mem-opt/containerd-pids-fix-rescued
    tests_added:
      - TestListRunningProcesses_NoLeadingZeros
      - TestListRunningProcesses_EdgeCases
    fuzz_added: false
    verified_fails_before: true
    verified_passes_after: true
    lesson: "Always use make([]T, 0, n) when building slice via append"

Phase 6: Return to Calling Skill

After validation is complete, return the status to the calling skill:

validation_result:
  status: VALIDATED  # or INVALID, NEEDS_WORK
  bug_confirmed: true
  fix_confirmed: true
  tests_added: 2
  fuzz_added: false
  ready_for_merge: true

The calling skill should:

  1. Record the bug discovery as a SUCCESS (not failure)
  2. Include validation status in the review
  3. Proceed with merge if VALIDATED
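
The skill names three statuses but does not spell out how they are chosen. The sketch below shows one possible mapping from the `validation_result` fields to a status. The rules in `Decide` are an assumption made for illustration, and the type and function names are hypothetical.

```go
package main

import "fmt"

// Status is one of the three values named in validation_result.
type Status string

const (
	Validated Status = "VALIDATED"
	Invalid   Status = "INVALID"
	NeedsWork Status = "NEEDS_WORK"
)

// Result mirrors a subset of the validation_result fields.
type Result struct {
	BugConfirmed bool // the reproducing test failed on the buggy code
	FixConfirmed bool // the same test passes on the fixed code
	TestsAdded   int
}

// Decide applies an ASSUMED mapping (not defined by the skill):
//   - no confirmed bug             -> INVALID
//   - bug and fix confirmed, tests -> VALIDATED
//   - anything else                -> NEEDS_WORK
func Decide(r Result) Status {
	switch {
	case !r.BugConfirmed:
		return Invalid
	case r.FixConfirmed && r.TestsAdded > 0:
		return Validated
	default:
		return NeedsWork
	}
}

// ReadyForMerge follows the rule "proceed with merge if VALIDATED".
func ReadyForMerge(r Result) bool {
	return Decide(r) == Validated
}

func main() {
	r := Result{BugConfirmed: true, FixConfirmed: true, TestsAdded: 2}
	fmt.Println(Decide(r), ReadyForMerge(r)) // VALIDATED true
}
```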

Usage

/validate-correctness

Checklist

Before returning VALIDATED:

  • Bug documented with root cause
  • Reproducing test written
  • Test FAILS on buggy code (verified)
  • Test PASSES on fixed code (verified)
  • Edge case tests added
  • Fuzz test added (if appropriate)
  • Full test suite passes
  • Race detector passes (go test -race)
  • Recorded in validations.yaml

Skills Info
Original Name: validate-correctness
Author: blt