GitHub – searlsco/prove_it


Certified Shovelware

By far the most frustrating thing about Claude Code is its penchant for prematurely declaring success. Out of the box, Claude will happily announce a task is complete. Did it run the tests? No. Did it add any tests? No. Did it run the code? Also no.

That’s why I (well, Claude) wrote prove_it: to introduce structured and unstructured verifiability checks into Claude’s workflow. It hooks into Claude Code’s lifecycle events and runs whatever checks you configure — test suites, lint scripts, AI code reviewers — blocking Claude until they pass.

If it’s not obvious, prove_it only works with Claude Code. If you’re not using Claude Code, this tool won’t do anything for you.

What does prove_it prove?

The two most important things prove_it does:

  • Blocks stop — each time Claude finishes its response and hands control back to the user, it fires “stop” hooks. prove_it runs your fast tests (script/test_fast) and blocks if they fail. It can also deploy a reviewer agent to check whether commensurate verification methods (e.g. test coverage) were introduced for whatever code was added during the response
  • Blocks commits — each time Claude attempts a git commit, prove_it runs ./script/test and blocks unless it passes. It can then deploy a reviewer agent that inspects all staged changes and hunts for potential bugs and dead code, blocking if it finds anything significant

Other stuff prove_it does:

  • Blocks human commits too — prove_it installs git pre-commit and pre-push hooks so the same test checks run whether Claude or a human is committing
  • Beads integration — if your project uses beads to track work, prove_it will stop Claude from editing code unless a current task is in progress, essentially forcing it to know what it’s working on before it starts working
  • Tracks runs — if code hasn’t changed since the last successful test run, prove_it skips re-running your tests (configurable per-check)
  • Config protection — blocks Claude from editing your prove_it config files directly

Installation

# Install the CLI
brew install searlsco/tap/prove_it

# Register prove_it hooks in ~/.claude/settings.json
prove_it install

Then, in each project:

cd your-project
prove_it init

This will interactively set up .claude/prove_it.json, create a script/test stub if you don’t have one, and install git hooks. Restart Claude Code and you’re live.

Pass flags to skip prompts (useful for CI or scripting):

prove_it init --git-hooks --default-checks

Available flags:

--[no-]git-hooks                Install git pre-commit/pre-push hooks (default: on)
--[no-]default-checks           Include beads gate, AI code review, AI coverage review (default: on)
--[no-]automatic-git-hook-merge Merge with existing git hooks (default: off — fails if hooks exist)

By default, prove_it looks for two test scripts by convention:

Script            Purpose                                              When it runs
script/test       Full test suite (units, integration, linters, etc.)  Before every git commit
script/test_fast  Fast unit tests only                                 Every time Claude stops work

For example, your script/test_fast script might run only your fast unit tests, and your full script/test command will probably run that and more:

#!/bin/bash
rake test standard:fix test:system

That’s it. Now Claude must see your tests pass before claiming the job’s done or committing your code.

prove_it is configured with a hooks array in .claude/prove_it.json. Each hook targets a lifecycle event and runs a list of checks:

{
  "configVersion": 2,
  "enabled": true,
  "sources": ["src/**/*.js", "lib/**/*.js", "test/**/*.js"],
  "hooks": [
    {
      "type": "claude",
      "event": "Stop",
      "checks": [
        { "name": "fast-tests", "type": "script", "command": "./script/test_fast" },
        { "name": "coverage-review", "type": "agent", "prompt": "Check coverage...\n\n{{session_diffs}}" }
      ]
    }
  ]
}
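
To make the shape of that config concrete, here is a minimal Python sketch of how a dispatcher could pick out the checks for one lifecycle event. The helper name checks_for and the handling of "enabled" are illustrative assumptions, not prove_it's actual implementation (prove_it itself runs on Node.js).

```python
import json

# Same structure as the example config above, trimmed to the fields used here.
CONFIG = json.loads("""
{
  "configVersion": 2,
  "enabled": true,
  "hooks": [
    {
      "type": "claude",
      "event": "Stop",
      "checks": [
        { "name": "fast-tests", "type": "script", "command": "./script/test_fast" },
        { "name": "coverage-review", "type": "agent", "prompt": "Check coverage..." }
      ]
    }
  ]
}
""")


def checks_for(config, hook_type, event):
    """Return the checks configured for one (type, event) pair, in order.

    Hypothetical helper: treating "enabled": false as "run nothing"
    is an assumption about how the flag behaves.
    """
    if not config.get("enabled", True):
        return []
    selected = []
    for hook in config.get("hooks", []):
        if hook.get("type") == hook_type and hook.get("event") == event:
            selected.extend(hook.get("checks", []))
    return selected


print([check["name"] for check in checks_for(CONFIG, "claude", "Stop")])
# prints ['fast-tests', 'coverage-review']
```

An event with no matching hook simply yields an empty list, so nothing runs for it.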

Config files merge (later overrides earlier):

  1. ~/.claude/prove_it/config.json — global defaults
  2. .claude/prove_it.json — project config (commit this)
  3. .claude/prove_it.local.json — local overrides (gitignored, per-developer)
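
The layering above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it assumes nested objects merge key by key while lists and scalars are replaced wholesale, which may not match prove_it's real merge rules.

```python
import json
from pathlib import Path


def deep_merge(base, override):
    """Return a new dict where values from `override` win over `base`.

    Assumption: nested objects merge key by key, while lists and
    scalars are replaced wholesale.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_effective_config(project_dir, home=None):
    """Merge the three config layers in the documented order."""
    home = Path(home) if home is not None else Path.home()
    project_dir = Path(project_dir)
    layers = [
        home / ".claude" / "prove_it" / "config.json",     # global defaults
        project_dir / ".claude" / "prove_it.json",         # project config
        project_dir / ".claude" / "prove_it.local.json",   # local overrides
    ]
    config = {}
    for path in layers:
        if path.is_file():  # missing layers are skipped
            config = deep_merge(config, json.loads(path.read_text()))
    return config
```

To see what the real tool resolves, run prove_it diagnose, which shows the effective config.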

Type    Event         What triggers it
claude  SessionStart  Claude boots up
claude  PreToolUse    Before Claude uses a tool (edit, commit, etc.)
claude  Stop          Claude finishes a task
git     pre-commit    Before any git commit (Claude or human)
git     pre-push      Before any git push

  • script — runs a shell command, fails on non-zero exit
  • agent — sends a prompt to an AI reviewer, expects PASS/FAIL response (see Agent checks)
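
As a rough illustration of the script check's contract (pass on exit status 0, fail otherwise), here is a small Python sketch. The function name run_script_check is hypothetical, and how prove_it captures or reports output is not specified above.

```python
import subprocess


def run_script_check(command, cwd=None):
    """Run a shell command and report (passed, output).

    A check passes only when the command exits with status 0.
    """
    result = subprocess.run(
        command,
        shell=True,           # the check's command is a shell command line
        cwd=cwd,
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stdout + result.stderr


passed, output = run_script_check("exit 0")
print("passed" if passed else "failed")
```

The practical consequence is that script/test and script/test_fast should exit non-zero whenever any step fails.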

PreToolUse hooks can filter by tool name and command patterns:

{
  "type": "claude",
  "event": "PreToolUse",
  "matcher": "Bash",
  "triggers": ["(^|\\s)git\\s+commit\\b"],
  "checks": [...]
}

matcher filters by Claude’s tool name (Edit, Write, Bash, etc.). triggers are regex patterns matched against the tool’s command argument. Both are optional — omit them to run on every PreToolUse.
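
To get a feel for what the trigger pattern does and does not match, the filter can be approximated in Python. This is a sketch under two assumptions: matcher is compared as an exact tool name, and a hook fires when any one trigger matches. Note that the doubled backslashes in the JSON become single backslashes once the file is parsed.

```python
import re

HOOK = {
    "type": "claude",
    "event": "PreToolUse",
    "matcher": "Bash",
    "triggers": [r"(^|\s)git\s+commit\b"],
}


def hook_applies(hook, tool_name, command):
    """Decide whether a PreToolUse hook should run for a tool call.

    Assumptions: `matcher` is an exact tool name, and a hook fires
    when ANY trigger pattern is found in the command string.
    """
    matcher = hook.get("matcher")
    if matcher is not None and matcher != tool_name:
        return False
    triggers = hook.get("triggers")
    if not triggers:
        return True  # no triggers: run for every call to the matched tool
    return any(re.search(pattern, command or "") for pattern in triggers)


for cmd in ["git commit -m 'wip'", "cd app && git commit", "git status", "legit commit"]:
    print(cmd, "->", hook_applies(HOOK, "Bash", cmd))
```

The leading (^|\s) keeps commands like legit commit from matching, and the trailing \b stops the pattern from matching a longer word that merely starts with commit.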

Checks can also be made conditional with a when clause:

{ "name": "beads-gate", "type": "script", "command": "prove_it builtin:beads-gate",
  "when": { "fileExists": ".beads" } }

Supported conditions: fileExists, envSet, envNotSet.
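
One way to read those three conditions is sketched below in Python. The semantics are assumptions: multiple conditions are ANDed, fileExists is resolved relative to the project directory, and envSet means the variable is present even if empty. The function name when_allows is hypothetical.

```python
import os
from pathlib import Path


def when_allows(when, project_dir, env=None):
    """Return True when every condition in a `when` clause holds.

    Assumptions: conditions are ANDed, fileExists is relative to the
    project directory, and envSet only checks that the variable exists.
    Unknown keys raise instead of passing silently.
    """
    env = os.environ if env is None else env
    for condition, value in (when or {}).items():
        if condition == "fileExists":
            ok = (Path(project_dir) / value).exists()
        elif condition == "envSet":
            ok = value in env
        elif condition == "envNotSet":
            ok = value not in env
        else:
            raise ValueError(f"unsupported condition: {condition}")
        if not ok:
            return False
    return True


print(when_allows({"fileExists": ".beads"}, "."))
```

Under this reading, the beads-gate check above is skipped entirely in projects without a .beads entry.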

Agent checks spawn a separate AI process to review Claude’s work with an independent PASS/FAIL verdict. This is useful because the reviewing agent has no stake in the code it’s judging.

By default, agent checks use claude -p (Claude Code's non-interactive print mode, which works well in pipes). The reviewer receives a wrapped prompt and must respond with either PASS or FAIL.

{
  "name": "code-review",
  "type": "agent",
  "prompt": "Review staged changes for:\n1. Test coverage gaps\n2. Logic errors or edge cases\n3. Dead code\n\n{{staged_diff}}"
}
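
For intuition about the PASS/FAIL contract, here is a minimal Python sketch of how a reviewer's reply could be interpreted. The parsing rules are assumptions (verdict on the first non-empty line, optional detail after a colon, anything unclear counts as a failure), and parse_verdict is a hypothetical name.

```python
def parse_verdict(response):
    """Interpret a reviewer's reply as (passed, detail).

    Assumptions: the verdict sits on the first non-empty line, an
    optional detail follows a colon, and anything that is not a clear
    PASS is treated as a failure.
    """
    for line in response.splitlines():
        line = line.strip()
        if not line:
            continue
        verdict, _, detail = line.partition(":")
        verdict = verdict.strip().upper()
        if verdict == "PASS":
            return True, detail.strip()
        if verdict == "FAIL":
            return False, detail.strip()
        return False, f"unrecognized verdict: {line}"
    return False, "empty response"


print(parse_verdict("FAIL: no tests for the new parser"))
# prints (False, 'no tests for the new parser')
```

Failing closed on an unrecognized reply is a deliberate choice in this sketch: a reviewer that rambles should not accidentally unblock a commit.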

These expand in agent prompts:

Variable           Contents
{{staged_diff}}    git diff --cached (staged changes)
{{staged_files}}   git diff --cached --name-only
{{working_diff}}   git diff (unstaged changes)
{{changed_files}}  git diff --name-only HEAD
{{session_diffs}}  All changes since session baseline
{{test_output}}    Output from the most recent script check
{{tool_command}}   The command Claude is trying to run
{{file_path}}      The file Claude is trying to edit
{{project_dir}}    Project directory
{{git_head}}       Current HEAD commit SHA
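
The expansion step can be pictured with a short Python sketch. The git commands are the ones listed in the table; the function names and the choice to leave unknown placeholders untouched are assumptions for illustration.

```python
import re
import subprocess

# Git-backed variables from the table above, with the commands it lists.
GIT_VARIABLES = {
    "staged_diff": ["git", "diff", "--cached"],
    "staged_files": ["git", "diff", "--cached", "--name-only"],
    "working_diff": ["git", "diff"],
    "changed_files": ["git", "diff", "--name-only", "HEAD"],
}


def git_value(name, project_dir):
    """Run the git command behind one variable and return its output."""
    result = subprocess.run(
        GIT_VARIABLES[name], cwd=project_dir, capture_output=True, text=True
    )
    return result.stdout


def expand(prompt, values):
    """Replace {{name}} placeholders with entries from `values`.

    Assumption: unknown placeholders are left untouched rather than
    replaced with an empty string.
    """
    def substitute(match):
        name = match.group(1)
        return values.get(name, match.group(0))

    return re.sub(r"\{\{(\w+)\}\}", substitute, prompt)


print(expand("Review:\n{{staged_diff}}", {"staged_diff": "+ added line"}))
```

In a real repository you might build the values with something like {"staged_diff": git_value("staged_diff", ".")} before calling expand.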

Adversarial cross-platform review

You can use a different AI for each reviewer, so the agent doing the work is checked by a competing model:

{
  "name": "code-review",
  "type": "agent",
  "prompt": "Review staged changes for bugs and missing tests.\n\n{{staged_diff}}"
},
{
  "name": "adversarial-review",
  "type": "agent",
  "command": "codex exec -",
  "prompt": "Second opinion: look for issues the primary reviewer might miss.\n\n{{staged_diff}}"
}

The command field accepts any CLI that reads a prompt from stdin and writes its response to stdout. Defaults to claude -p.
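
Because the contract is just stdin in, stdout out, even a tiny script can act as a reviewer. The Python sketch below is a toy: the TODO rule, the wording after FAIL, and the function names are all hypothetical, and a real reviewer would forward the prompt to a model instead.

```python
import sys


def review(prompt):
    """Toy rule: fail when an added diff line contains a TODO marker.

    The rule and the text after "FAIL:" are hypothetical stand-ins for
    a real model-backed review.
    """
    for line in prompt.splitlines():
        is_added = line.startswith("+") and not line.startswith("+++")
        if is_added and "TODO" in line:
            return f"FAIL: unfinished work left in diff: {line[1:].strip()}"
    return "PASS"


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read the whole prompt from stdin and write the verdict to stdout."""
    stdout.write(review(stdin.read()) + "\n")


print(review("+++ b/app.js\n+ // TODO handle errors"))
```

Saved to a file that ends with a main() call instead of the demo print, a script like this could be referenced from a check's command field. The file name and location are yours to choose.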

prove_it ships with built-in checks, invoked as prove_it builtin: followed by the check's name (for example, prove_it builtin:beads-gate):

Builtin             Event         What it does
session-baseline    SessionStart  Records git state for session diff tracking
beads-reminder      SessionStart  Reminds Claude about issue tracker workflow
config-protection   PreToolUse    Blocks direct edits to prove_it config files
beads-gate          PreToolUse    Requires an in-progress issue before code changes
soft-stop-reminder  Stop          Reminds Claude to push and clean up

prove_it install     Register global hooks (~/.claude/settings.json)
prove_it uninstall   Remove global hooks
prove_it init        Set up current project (interactive or with flags)
prove_it deinit      Remove prove_it from current project
prove_it diagnose    Check installation and show effective config
prove_it hook        Run a dispatcher directly, named as type:event (claude:Stop, git:pre-commit)

prove_it only runs in directories that contain a git repository, so casual use of Claude in ~/tmp or ~/bin won’t trigger it.

When you do need to disable it:

Ignore specific directories

Edit ~/.claude/prove_it/config.json:

{
  "ignoredPaths": ["~/bin", "~/dotfiles"]
}
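
How ignoredPaths entries are matched is not spelled out here. A plausible reading, sketched in Python, is that ~ expands to your home directory and an entry also covers everything beneath it. Both points are assumptions, as is the helper name is_ignored.

```python
from pathlib import Path


def is_ignored(directory, ignored_paths):
    """Return True if `directory` is an ignored path or lies beneath one.

    Assumptions: "~" expands to the user's home directory and an entry
    covers its subdirectories. Comparison is by whole path components,
    so "~/bin" does not match "~/binary".
    """
    target = Path(directory).expanduser().resolve()
    for entry in ignored_paths:
        ignored = Path(entry).expanduser().resolve()
        if target == ignored or ignored in target.parents:
            return True
    return False


print(is_ignored("~/dotfiles/zsh", ["~/bin", "~/dotfiles"]))
```

Matching by path components instead of by string prefix avoids accidentally ignoring sibling directories with similar names.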

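Disable for a single project

For all contributors, edit .claude/prove_it.json; for just you, edit .claude/prove_it.local.json. In either file, set the "enabled" key shown in the configuration example to false.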

Disable with an environment variable

export PROVE_IT_DISABLED=1

Troubleshooting

  • Hooks not firing — Restart Claude Code after prove_it install
  • Tests not running — Check ./script/test exists and is executable (chmod +x)
  • Hooks running in wrong directories — prove_it only activates in git repos

See example/basic/ and example/advanced/ for working projects with configs, test suites, and reviewer prompts.

Requirements

  • Node.js >= 18
  • Claude Code with hooks support

License: MIT


