Unified Code Review

ckb review runs 20 quality checks in a single command and returns a verdict, findings, and suggested reviewers, so you do not have to wire up individual gates manually.

Architecture

[Figure: Review Architecture]

Quick links: CI-CD-Integration for GitHub Action setup · Quality Gates for individual gate thresholds · Workflow Examples for production templates

Quick Start

# Review current branch against main
ckb review

# Custom base branch
ckb review --base=develop

# Review staged changes only (pre-commit)
ckb review --staged

# Scope to a path prefix or symbol
ckb review internal/query/
ckb review --scope=Engine

# Only run specific checks
ckb review --checks=breaking,secrets,health

# New analyzers
ckb review --checks=dead-code,test-gaps,blast-radius --max-fanout=20

# Bug pattern detection
ckb review --checks=bug-patterns

# CI mode (exit 1=fail, 2=warn)
ckb review --ci --format=json

# Markdown for PR comments
ckb review --format=markdown

# With AI-generated narrative summary
ckb review --llm

Checks

ckb review orchestrates 20 checks in three priority tiers. All checks run concurrently; tree-sitter calls are serialized via a mutex so git subprocess work overlaps with analysis.

Tier 1 — Blocking (Must Fix)

Check	What It Does	Detector	Gate Type
`breaking`	API breaking change detection (removed symbols, changed signatures)	SCIP CompareAPI	fail
`secrets`	Credential and secret scanning (entropy ≥ 3.5 + pattern matching)	Entropy scanner + allowlist	fail
`critical`	Safety-critical path enforcement (requires config)	Glob pattern matching	fail
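
The entropy threshold used by the secrets check can be illustrated with a short calculation. The sketch below assumes the 3.5 figure refers to Shannon entropy in bits per character; the function names are hypothetical, and the real scanner also applies pattern matching and an allowlist, which are not shown.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy of a string in bits per character (0.0 for an empty string)."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_high_entropy(token, threshold=3.5):
    """True if the token meets the entropy threshold (assumed: bits per character)."""
    return shannon_entropy(token) >= threshold
```

A token built from 16 distinct characters scores 4.0 bits per character and passes the threshold, while an ordinary word such as `password` scores well below it.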

Tier 2 — Important (Should Fix)

Check	What It Does	Detector	Gate Type
`complexity`	Cyclomatic/cognitive complexity delta per function	tree-sitter AST	warn
`health`	7-factor weighted code health score (A-F grades)	Multi-source (see below)	warn
`risk`	Multi-factor risk score (file count, LOC, hotspots, module spread)	Heuristic	warn
`coupling`	Missing co-changed files (≥70% correlation, 365-day window)	Git history	warn
`dead-code`	Unreferenced symbols and constants in changed files	SCIP + reference counting	warn
`blast-radius`	High fan-out symbol detection; informational unless threshold set	SCIP AnalyzeImpact	warn/info
`bug-patterns`	High-confidence Go AST bug patterns (10 rules, see below)	tree-sitter AST	warn

Tier 3 — Informational (Good to Know)

Check	What It Does	Detector	Gate Type
`hotspots`	Overlap with volatile/high-churn files (score > 0.5)	Churn ranking	info
`tests`	Affected test coverage	GetAffectedTests	warn*
`test-gaps`	Untested functions in changed files (≥5 lines)	tree-sitter + coverage	info
`comment-drift`	Numeric constant vs comment mismatch (e.g., `// 256` above `const MaxSize = 512`)	Regex scan	info
`format-consistency`	Cross-formatter output divergence (Human vs Markdown)	Function pair analysis	info
`generated`	Generated file detection and exclusion from all checks	Glob + header markers	info
`traceability`	Commit-to-ticket linkage (e.g., JIRA-\d+)	Regex + branch name	warn
`independence`	Author != reviewer enforcement (regulated industries)	Commit author extraction	warn
`classify`	Change classification (new/refactor/moved/churn/test/config)	Heuristics	—
`split`	Large PR split suggestion with cluster analysis	BFS graph clustering	warn

* tests only warns when requireTests: true is set.

Selecting Checks

# Run all (default)
ckb review

# Only security-related
ckb review --checks=breaking,secrets

# Only bug patterns
ckb review --checks=bug-patterns

# Skip slow checks by listing only the fast ones
ckb review --checks=breaking,secrets,tests,coupling

Bug Pattern Detection

The bug-patterns check uses tree-sitter AST analysis to detect 10 high-confidence Go bug patterns. Only newly introduced patterns are reported — pre-existing issues from the base branch are filtered out.

#	Rule	What It Catches
1	`defer-in-loop`	`defer` inside a loop — cleanup won't run per iteration, only at function exit
2	`unreachable-code`	Statements after `return` or `panic`
3	`empty-error-branch`	`if err != nil { }` with no body — silently swallowed error
4	`unchecked-type-assert`	`x.(T)` without `ok` check — panic risk at runtime
5	`self-assignment`	`a = a` — no-op assignment
6	`nil-after-deref`	Pointer dereference before nil check
7	`identical-branches`	`if/else` with identical code in both branches
8	`shadowed-err`	Inner `:=` redeclares outer `err` variable
9	`discarded-error`	Function call returning error with no assignment
10	`missing-defer-close`	Resource from `Open`/`Create` without `defer Close()`

Scope: First 20 non-test .go files in the changeset.

Deduplication: Findings are compared against the base branch. Only patterns newly introduced by the PR are reported. Pre-existing bugs show as "pass" with a note.

Hold-the-Line

By default, CKB Review only reports issues on lines actually changed in the PR, not pre-existing problems in unchanged code.

How It Works

1. Parse the unified diff to extract changed line numbers per file
2. Build a map: file → set of changed line numbers
3. After all checks run, filter findings:
- Keep: File-level findings (StartLine == 0)
- Keep: Findings on changed lines
- Keep: Findings on files not in the diff map (safety fallback)
- Drop: Findings on unchanged lines
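
The filtering steps above can be sketched in a few lines. This is a minimal illustration, not CKB's implementation: `changed_lines` and `hold_the_line` are hypothetical names, findings are assumed to be dicts with `file` and `startLine` keys, and the diff parser is deliberately simplified.

```python
import re

# Matches "@@ -old,count +new,count @@" and captures the new-file start line.
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def changed_lines(diff_text):
    """Map each file in a unified diff to the set of new-file lines it adds or modifies.

    Simplification: an added line whose content begins with "++ " would be
    mistaken for a file header; a production parser tracks hunk lengths.
    """
    changed = {}
    current, new_line = None, 0
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            path = line[4:].strip()
            if path == "/dev/null":
                current = None
            else:
                current = path[2:] if path.startswith("b/") else path
                changed.setdefault(current, set())
            continue
        if line.startswith("--- "):
            continue
        match = HUNK_RE.match(line)
        if match:
            new_line = int(match.group(1))
            continue
        if current is None:
            continue
        if line.startswith("+"):
            changed[current].add(new_line)
            new_line += 1
        elif line.startswith(" "):
            new_line += 1  # context line: exists in the new file but is unchanged
        # "-" lines exist only in the old file and do not advance new_line
    return changed


def hold_the_line(findings, changed):
    """Keep file-level findings, findings on changed lines, and findings on unmapped files."""
    kept = []
    for finding in findings:
        line = finding.get("startLine", 0)
        path = finding["file"]
        if line == 0 or path not in changed or line in changed[path]:
            kept.append(finding)
    return kept
```

The "file not in the diff map" branch mirrors the safety fallback: when the diff could not be mapped for a file, the sketch keeps the finding instead of silently dropping it.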

Why It Matters

In a codebase with historical technical debt, hundreds of pre-existing warnings would otherwise flood the report. With Hold-the-Line enabled, reviewers see only what this PR introduced.

Configuration

{
  "holdTheLine": true
}

Set to false to see all findings, including pre-existing ones.

Output Formats

Format	Flag	Use Case
Human	`--format=human`	Terminal output with colors and icons (default)
JSON	`--format=json`	Machine-readable, CI pipelines, AI consumption
Markdown	`--format=markdown`	PR comments (GitHub, GitLab)
GitHub Actions	`--format=github-actions`	Inline `::error`/`::warning` annotations
SARIF	`--format=sarif`	GitHub Code Scanning, GitLab SAST
CodeClimate	`--format=codeclimate`	GitLab Code Quality
Compliance	`--format=compliance`	Audit trail (IEC 61508, DO-178C, ISO 26262)

Review Policy

Configure quality gates via CLI flags or .ckb/config.json:

{
  "review": {
    "blockBreakingChanges": true,
    "blockSecrets": true,
    "requireTests": false,
    "maxRiskScore": 0.7,
    "maxComplexityDelta": 0,
    "maxFiles": 0,
    "maxBlastRadiusDelta": 0,
    "maxFanOut": 0,
    "deadCodeMinConfidence": 0.8,
    "testGapMinLines": 5,
    "failOnLevel": "error",
    "holdTheLine": true,
    "splitThreshold": 50,
    "criticalPaths": ["drivers/**", "protocol/**"],
    "criticalSeverity": "error",
    "generatedPatterns": ["*.pb.go", "*.generated.*", "*.pb.cc", "parser.tab.c", "lex.yy.c"],
    "generatedMarkers": ["DO NOT EDIT", "Generated by", "AUTO-GENERATED", "This file is generated"],
    "traceabilityPatterns": [],
    "traceabilitySources": ["commit-message", "branch-name"],
    "requireTraceability": false,
    "requireTraceForCriticalPaths": false,
    "requireIndependentReview": false,
    "minReviewers": 1
  }
}

Configuration Hierarchy

1. Hardcoded defaults (DefaultReviewPolicy)
2. Repository config from .ckb/config.json (overrides defaults)
3. CLI flags (override everything)
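
The precedence order can be pictured as a layered merge. The sketch below is illustrative only: `resolve_policy` is a hypothetical helper, the defaults shown are a small subset of the policy keys listed earlier, and treating an unset CLI flag as `None` is an assumption.

```python
DEFAULT_POLICY = {
    "blockBreakingChanges": True,
    "maxRiskScore": 0.7,
    "failOnLevel": "error",
    "holdTheLine": True,
}


def resolve_policy(defaults, repo_config=None, cli_flags=None):
    """Merge policy layers: defaults, then repository config, then CLI flags."""
    policy = dict(defaults)
    policy.update(repo_config or {})
    # Assumption: a CLI flag that was not passed is represented as None and ignored.
    policy.update({k: v for k, v in (cli_flags or {}).items() if v is not None})
    return policy
```

For example, a repository config that sets `maxRiskScore` to 0.5 is overridden by `--max-risk=0.8` on the command line, while keys that neither layer touches keep their defaults.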

CLI Overrides

# Quality gates
ckb review --block-breaking=true --block-secrets=true
ckb review --require-tests --max-risk=0.8
ckb review --max-complexity=10 --max-files=100
ckb review --max-fanout=20                    # blast-radius threshold
ckb review --dead-code-confidence=0.9         # stricter dead code
ckb review --test-gap-lines=10                # only flag larger functions

# Verdict control
ckb review --fail-on=warning   # fail on warnings too
ckb review --fail-on=none      # never fail (informational)

# Safety-critical paths
ckb review --critical-paths="drivers/**,auth/**"

# Traceability
ckb review --require-trace --trace-patterns="JIRA-\d+,GH-\d+"

# Reviewer independence
ckb review --require-independent --min-reviewers=2

# AI narrative
ckb review --llm

Code Health Scoring

The health check computes a 0-100 score per file using 7 weighted factors (the weights sum to 100%):

Factor	Weight	Source	Scale
Cyclomatic complexity	25%	tree-sitter	≤5: 100, 6-10: 85, 11-20: 65, 21-30: 40, >30: 20
Cognitive complexity	15%	tree-sitter	Same scale as cyclomatic
File size (LOC)	10%	Line count	≤100: 100, 101-300: 85, 301-500: 70, 501-1000: 50, >1000: 30
Churn (commits/30d)	15%	git log	≤2: 100, 3-5: 80, 6-10: 60, 11-20: 40, >20: 20
Coupling (co-changes)	10%	git log	≤2: 100, 3-5: 80, 6-10: 60, >10: 40
Bus factor (authors)	10%	git blame	≥5: 100, 3-4: 85, 2: 60, 1: 30
Age (days since change)	15%	git log	≤30: 100, 31-90: 85, 91-180: 70, 181-365: 50, >365: 30

Grades

Grade	Score Range
A	90-100
B	70-89
C	50-69
D	30-49
F	0-29
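
As a worked example of the tables above, the following sketch converts raw metrics into factor scores, combines them with the listed weights, and maps the result to a grade. The function and key names are hypothetical, and treating fractional scores (for example 89.5) as belonging to the lower grade is an assumption.

```python
# (upper bound, score) steps per factor; the second item is the score above the last bound.
SCALES = {
    "cyclomatic": ([(5, 100), (10, 85), (20, 65), (30, 40)], 20),
    "cognitive": ([(5, 100), (10, 85), (20, 65), (30, 40)], 20),
    "loc": ([(100, 100), (300, 85), (500, 70), (1000, 50)], 30),
    "churn": ([(2, 100), (5, 80), (10, 60), (20, 40)], 20),
    "coupling": ([(2, 100), (5, 80), (10, 60)], 40),
    "age_days": ([(30, 100), (90, 85), (180, 70), (365, 50)], 30),
}
# Weights in percent, as listed in the factor table.
WEIGHTS = {"cyclomatic": 25, "cognitive": 15, "loc": 10, "churn": 15,
           "coupling": 10, "authors": 10, "age_days": 15}


def factor_score(name, value):
    """Score one factor on the 0-100 scale from the table."""
    if name == "authors":  # bus factor: more authors scores higher
        if value >= 5:
            return 100
        if value >= 3:
            return 85
        return 60 if value == 2 else 30
    steps, above = SCALES[name]
    for upper, score in steps:
        if value <= upper:
            return score
    return above


def health_score(metrics):
    """Weighted 0-100 health score for one file."""
    return sum(WEIGHTS[n] * factor_score(n, metrics[n]) for n in WEIGHTS) / 100


def grade(score):
    """Map a score to a letter grade using the lower bound of each range."""
    for letter, minimum in (("A", 90), ("B", 70), ("C", 50), ("D", 30)):
        if score >= minimum:
            return letter
    return "F"
```

A file with cyclomatic complexity 8, cognitive complexity 12, 250 LOC, 4 commits in 30 days, 1 co-change, 2 authors, and 45 days since the last change scores 0.25·85 + 0.15·65 + 0.10·85 + 0.15·80 + 0.10·100 + 0.10·60 + 0.15·85 = 80.25, which is grade B.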

Confidence Factor

The confidence factor (0-1) indicates how reliable the score is:

Reduced by 0.4 if tree-sitter parsing fails
Reduced by 0.3 if all metrics are at default (no git data)
Reduced by 0.2 if only bus factor is at default

Health is computed for both base and head versions. Findings are generated when a file degrades by more than 5 points.

Score Calculation

The review score starts at 100 and deducts points per finding:

Severity	Points	Cap Per Check	Cap Per Rule
error	-10	20	10
warning	-3	20	10
info	-1	20	10

Maximum total deduction: 80 points, so the lowest score a review can reach is 20 (the score is also clamped at a floor of 0)

Three caps prevent noise from dominating the score:

Per-rule cap (10): A single noisy rule (e.g., ckb/bug/discarded-error) can't consume its check's entire budget.
Per-check cap (20): A single check (e.g., coupling with 100+ findings) can't overwhelm the score.
Total cap (80): Large PRs where many checks fire don't produce meaningless scores.
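
The interaction of the three caps is easiest to see in code. The sketch below assumes the caps are expressed in points and applied in the order rule, check, total; `review_score` is a hypothetical name and findings are assumed to carry `check`, `ruleId`, and `severity`.

```python
from collections import defaultdict

POINTS = {"error": 10, "warning": 3, "info": 1}
RULE_CAP, CHECK_CAP, TOTAL_CAP = 10, 20, 80


def review_score(findings):
    """Start at 100 and deduct per finding, capped per rule, per check, and in total."""
    per_rule = defaultdict(int)
    for f in findings:
        per_rule[(f["check"], f["ruleId"])] += POINTS[f["severity"]]

    per_check = defaultdict(int)
    for (check, _rule), points in per_rule.items():
        per_check[check] += min(points, RULE_CAP)

    total = sum(min(points, CHECK_CAP) for points in per_check.values())
    return 100 - min(total, TOTAL_CAP)
```

Under these assumptions, 50 warnings from a single coupling rule cost 10 points (the per-rule cap), not 150, so the review scores 90.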

Verdict Logic

fail-on-level = "error" (default), "warning", or "none"

"none"    → always pass
"warning" → fail if any check has status "fail" or "warn"
"error"   → fail if any check has status "fail"; warn if any has "warn"; otherwise pass

CI exit codes: 0 = pass, 1 = fail, 2 = warn
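
The verdict rules translate directly into a small function. This is a sketch with hypothetical names (`verdict`, `EXIT_CODES`), assuming each check reports one of the statuses "pass", "warn", or "fail".

```python
EXIT_CODES = {"pass": 0, "fail": 1, "warn": 2}


def verdict(check_statuses, fail_on="error"):
    """Derive the overall verdict from per-check statuses and the fail-on level."""
    if fail_on == "none":
        return "pass"
    has_fail = "fail" in check_statuses
    has_warn = "warn" in check_statuses
    if fail_on == "warning":
        return "fail" if (has_fail or has_warn) else "pass"
    if fail_on == "error":
        if has_fail:
            return "fail"
        return "warn" if has_warn else "pass"
    raise ValueError("unknown fail-on level: %r" % fail_on)
```

With the default level, a run whose checks report `["pass", "warn"]` yields the verdict "warn" and CI exit code 2; the same run with `--fail-on=warning` yields "fail" and exit code 1.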

Review Effort Estimation

CKB estimates human review effort based on Microsoft and Google code review research.

Time Model

Code Category	Review Speed
New code	200 LOC/hour
Refactored/modified	300 LOC/hour
Moved/test/config	500 LOC/hour (quick scan)

Additional Factors

Factor	Overhead
File switches (> 5 files)	2 min per file
Module context switches (> 1 module)	5 min per module beyond first
Safety-critical files	10 min each
Minimum	5 minutes

Complexity Scale

Level	Estimated Time
Trivial	< 20 min
Moderate	20-60 min
Complex	60-240 min
Very complex	> 240 min
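
The time model can be approximated as follows. Several details are assumptions made for this sketch: the file-switch overhead is charged for every file once a changeset has more than 5 files, a boundary value such as 60 minutes falls into the lower level, and the names `review_minutes` and `complexity_label` are hypothetical.

```python
# Review speed in LOC per hour, keyed by change category.
SPEED = {"new": 200, "refactoring": 300, "modified": 300,
         "moved": 500, "test": 500, "config": 500}


def review_minutes(loc_by_category, files, modules, critical_files):
    """Estimate review time in minutes from LOC per category plus overheads."""
    minutes = sum(loc * 60 / SPEED[cat] for cat, loc in loc_by_category.items())
    if files > 5:
        minutes += 2 * files          # assumption: charged per file, not per file beyond 5
    if modules > 1:
        minutes += 5 * (modules - 1)  # each module beyond the first
    minutes += 10 * critical_files
    return max(minutes, 5)            # 5-minute minimum


def complexity_label(minutes):
    """Map an estimate to the complexity scale (boundaries fall into the lower level)."""
    if minutes < 20:
        return "trivial"
    if minutes <= 60:
        return "moderate"
    if minutes <= 240:
        return "complex"
    return "very complex"
```

For 100 new LOC, 150 modified LOC, and 50 test LOC spread over 4 files in 2 modules with 1 safety-critical file, the sketch gives 30 + 30 + 6 + 5 + 10 = 81 minutes, which is labeled complex.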

Example Output

Estimated Review: ~45min (moderate)
  · 20 min from 120 LOC
  · 10 min from 5 file switches
  · 15 min for 3 critical files

Narrative Summary

CKB generates a 2-3 sentence review summary automatically.

Deterministic (Default)

Sentence 1: "Changes N files across M modules (languages)."
Sentence 2: Risk signal (failed checks, top warnings, or "No blocking issues found.")
Sentence 3: Focus area (split suggestion, critical files, or omitted)

Example: "Changes 25 files across 3 modules (Go, TypeScript). 2 breaking API changes detected; 2 safety-critical files changed. 2 safety-critical files need focused review."

LLM-Powered (`--llm`)

With --llm, Claude generates a context-aware summary based on the top 10 findings, verdict, score, and health report.

Model: claude-sonnet-4-20250514 (configurable)
Timeout: 30 seconds
Fallback: deterministic narrative on API failure

Change Classification

The classify check categorizes each file in the changeset:

Category	Heuristic	Review Priority
`new`	File doesn't exist at base	High
`refactoring`	Renamed + changes	Medium
`moved`	Renamed + ≤20% content change	Low
`churn`	≥3 commits in last 30 days	High
`config`	Makefiles, CI, Docker, etc.	Low
`test`	Test file patterns (`_test.go`, `test_.py`, etc.)	Medium
`generated`	Matches generated patterns/markers	Skip
`modified`	Default — none of the above	Medium

The review effort estimate uses classification to adjust time: generated files are skipped, tests review faster, new code gets full review time.

Generated File Detection

Generated files are detected early and excluded from all other checks. This is the most important token-saving mechanism: in a monorepo with Protocol Buffers, this alone can eliminate 30-50% of files.

Default Patterns (Glob)

*.generated.*
*.pb.go           # Protocol Buffers (Go)
*.pb.cc           # Protocol Buffers (C++)
parser.tab.c      # Yacc/Bison
lex.yy.c          # Flex

Default Header Markers (First 10 Lines)

"DO NOT EDIT"
"Generated by"
"AUTO-GENERATED"
"This file is generated"

Both patterns and markers are configurable via policy.
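
A detection pass along these lines might look like the sketch below. It assumes glob patterns are matched against the file's base name and that markers are case-sensitive substrings of the first 10 lines; `is_generated` is a hypothetical name and CKB's matching rules may differ.

```python
import fnmatch
import posixpath

PATTERNS = ["*.generated.*", "*.pb.go", "*.pb.cc", "parser.tab.c", "lex.yy.c"]
MARKERS = ["DO NOT EDIT", "Generated by", "AUTO-GENERATED", "This file is generated"]


def is_generated(path, content, patterns=PATTERNS, markers=MARKERS, header_lines=10):
    """Return (True, reason) if the file matches a glob pattern or a header marker."""
    name = posixpath.basename(path)
    for pattern in patterns:
        if fnmatch.fnmatchcase(name, pattern):
            return True, "pattern " + pattern
    header = content.splitlines()[:header_lines]
    for marker in markers:
        if any(marker in line for line in header):
            return True, "marker " + marker
    return False, ""
```

Checking the cheap file-name patterns first means file contents only need to be read for files that no pattern matches.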

PR Split Suggestion

When a PR exceeds splitThreshold files (default: 50), the split check analyzes the changeset and suggests independent clusters.

Algorithm

1. Build adjacency graph of files
2. Connect files in same module (fully connected)
3. Enrich edges using coupling analysis (top 20 files, correlation ≥ 0.5)
4. Find connected components via BFS
5. If ≤1 component → no split recommended
6. If >1 component → return independent clusters
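
The clustering steps can be sketched with a plain breadth-first search. The input shapes are assumptions: `files` maps each path to its module and `coupled_pairs` holds `(path_a, path_b, correlation)` tuples. The top-20 limit on coupling analysis is omitted, and `split_clusters` is a hypothetical name.

```python
from collections import defaultdict, deque
from itertools import combinations


def split_clusters(files, coupled_pairs=(), min_correlation=0.5):
    """Return independent file clusters, largest first, or [] if no split is recommended."""
    adjacency = {path: set() for path in files}

    # Files in the same module are fully connected.
    by_module = defaultdict(list)
    for path, module in files.items():
        by_module[module].append(path)
    for members in by_module.values():
        for a, b in combinations(members, 2):
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Coupling analysis adds edges across modules.
    for a, b, correlation in coupled_pairs:
        if correlation >= min_correlation and a in adjacency and b in adjacency:
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Connected components via BFS.
    seen, clusters = set(), []
    for start in files:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        clusters.append(sorted(component))

    if len(clusters) <= 1:
        return []
    return sorted(clusters, key=len, reverse=True)
```

Because a coupling edge merges two modules into one component, a strongly co-changed pair such as an API handler and its UI caller stays in the same suggested cluster.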

Cluster Metadata

Each cluster reports:

Name: dominant module
Files: list with additions/deletions
Languages: detected languages
Independent: always true (connected components are independent by definition)

Sorted by file count descending.

Risk Scoring

The risk check computes a 0-1 risk score from four factors:

Factor	Contribution
> 20 files changed	+0.3 (> 10: +0.15)
> 1000 LOC changes	+0.3 (> 500: +0.15)
Hotspot overlap	+0.1 per hotspot (max 0.3)
> 5 modules affected	+0.2

Risk levels: low (< 0.3), medium (0.3-0.6), high (> 0.6)
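
The factor table maps onto a short function. In this sketch the function names are hypothetical, and clamping the sum to 1.0 is an assumption made because the listed contributions can add up to 1.1.

```python
def risk_score(files, loc, hotspots, modules):
    """Combine the four risk factors into a score between 0 and 1."""
    score = 0.0
    if files > 20:
        score += 0.3
    elif files > 10:
        score += 0.15
    if loc > 1000:
        score += 0.3
    elif loc > 500:
        score += 0.15
    score += min(0.1 * hotspots, 0.3)
    if modules > 5:
        score += 0.2
    return min(round(score, 2), 1.0)  # assumption: clamp to the documented 0-1 range


def risk_level(score):
    """Map a score to low (< 0.3), medium (0.3-0.6), or high (> 0.6)."""
    if score < 0.3:
        return "low"
    return "medium" if score <= 0.6 else "high"
```

A PR touching 25 files and 600 LOC with one hotspot in two modules scores 0.3 + 0.15 + 0.1 = 0.55, which is medium risk and below the default `maxRiskScore` of 0.7.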

Traceability & Compliance

For regulated industries (IEC 61508, DO-178C, ISO 26262):

# Require ticket references in commits
ckb review --require-trace --trace-patterns="JIRA-\d+,GH-\d+"

# Require independent reviewer (author != reviewer)
ckb review --require-independent --min-reviewers=2

# Safety-critical path enforcement
ckb review --critical-paths="drivers/**,protocol/**"

# Full compliance output
ckb review --format=compliance

Traceability Sources

Configure where to look for ticket references:

commit-message — scan commit messages
branch-name — scan the branch name

Both are checked by default when traceability is enabled.
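
Ticket extraction from the two sources can be sketched as a regex scan. `find_ticket_refs` is a hypothetical helper, and the patterns are taken from the CLI example above.

```python
import re


def find_ticket_refs(patterns, commit_messages=(), branch_name="",
                     sources=("commit-message", "branch-name")):
    """Return the sorted, de-duplicated ticket references found in the enabled sources."""
    compiled = [re.compile(p) for p in patterns]
    texts = []
    if "commit-message" in sources:
        texts.extend(commit_messages)
    if "branch-name" in sources and branch_name:
        texts.append(branch_name)
    refs = set()
    for text in texts:
        for regex in compiled:
            refs.update(match.group(0) for match in regex.finditer(text))
    return sorted(refs)
```

An empty result is what a traceability gate would presumably treat as a missing reference; how CKB reports that case is not shown here.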

Finding Baselines

Track finding trends across releases:

# Save current findings as a baseline
ckb review baseline save --tag=v1.0

# List saved baselines
ckb review baseline list

# Compare two baselines
ckb review baseline diff v1.0 v2.0

Baseline diffs classify each finding as new, unchanged, or resolved.

Storage: .ckb/baselines/ (git-ignored)
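
The new/unchanged/resolved classification amounts to set arithmetic over finding identities. What CKB uses as a finding's identity is not stated here, so the sketch assumes rule ID, file, and message (line numbers are left out so that shifted code does not look new); both function names are hypothetical.

```python
def finding_key(finding):
    """Identity of a finding for comparison (assumed: rule ID, file, message)."""
    return (finding.get("ruleId", ""), finding["file"], finding["message"])


def diff_baselines(old_findings, new_findings):
    """Classify findings as new, unchanged, or resolved between two baselines."""
    old_keys = {finding_key(f) for f in old_findings}
    new_keys = {finding_key(f) for f in new_findings}
    return {
        "new": sorted(new_keys - old_keys),
        "unchanged": sorted(old_keys & new_keys),
        "resolved": sorted(old_keys - new_keys),
    }
```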

Token Savings for AI-Assisted Review

When used via MCP (reviewPR tool), CKB Review dramatically reduces token consumption for AI code reviewers. Five mechanisms work together:

1. Generated File Filtering

Generated files are excluded before any analysis. Protocol Buffers, build artifacts, lock files — none of this reaches the AI.

Savings: 30-50% of files in typical monorepos.

2. Structured Findings Instead of Raw Diffs

Instead of a 10KB+ unified diff, the AI receives a structured JSON response:

{
  "verdict": "warn",
  "score": 72,
  "findings": [
    {
      "check": "breaking",
      "severity": "error",
      "file": "api/handler.go",
      "startLine": 42,
      "message": "Removed public function HandleAuth()",
      "suggestion": "Update all call sites or provide a wrapper",
      "tier": 1
    }
  ]
}

Savings: 80-90% smaller than raw diff.

3. Hold-the-Line Filtering

Only new issues are reported. Historical debt is filtered out.

Savings: 40-70% fewer findings in codebases with existing debt.

4. Finding Tier Filtering

Only the top 10 findings, ranked by tier and then severity, are passed to the AI. Tier 3 (informational) findings are suppressed in the AI summary.
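
A ranking of this kind can be sketched as a filter followed by a sort. `top_findings` is a hypothetical name, and the severity order error, warning, info is an assumption consistent with the score table.

```python
SEVERITY_RANK = {"error": 0, "warning": 1, "info": 2}


def top_findings(findings, limit=10, suppressed_tier=3):
    """Drop the suppressed tier, then keep the highest-priority findings."""
    kept = [f for f in findings if f["tier"] != suppressed_tier]
    kept.sort(key=lambda f: (f["tier"], SEVERITY_RANK[f["severity"]]))
    return kept[:limit]
```

Python's sort is stable, so findings with the same tier and severity keep the order in which the checks produced them.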

5. Summary Statistics

Single numbers replace pages of analysis:

"riskScore": 0.42 instead of reading all changed files
"healthDelta": -0.8 instead of per-file health records

Cumulative Result

Step	Remaining Volume
Raw diff (600 files)	100% (~500K-1M tokens)
After generated-file filtering	~60%
After Hold-the-Line	~25%
Structured findings	~5%
After tier filtering	~2-3% (~10K-30K tokens)

Typical savings: 85-95% token reduction on PRs with 50+ files.

Performance

CKB Review is optimized for large PRs. The engine runs 20 checks in parallel with a mutex-based tree-sitter scheduler that overlaps git subprocess work with static analysis.

PR Size	Runtime	Git Calls	Tree-Sitter Calls
10 files	~2s	~15	~20
100 files	~8s	~40	~60
600 files	~15s	~50 (batched)	~60 (capped)

Parallelization Strategy

Without lock (run in parallel): breaking, secrets, tests, hotspots, risk, coupling, dead-code, blast-radius, critical, traceability, independence

With tree-sitter mutex (serialized): complexity, health, test-gaps, bug-patterns, comment-drift, format-consistency

Lock-free checks typically complete within 200ms while tree-sitter checks process their queue. Net effect: ~6-8x speedup vs sequential execution.
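
The scheduling idea (all checks submitted at once, with tree-sitter checks taking turns on a shared lock) can be illustrated with a thread pool. This is a conceptual sketch, not CKB's scheduler: `run_checks` is a hypothetical name and each check is modeled as a zero-argument callable.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Checks that share the tree-sitter parser, per the list above.
TREE_SITTER_CHECKS = {"complexity", "health", "test-gaps", "bug-patterns",
                      "comment-drift", "format-consistency"}
_tree_sitter_lock = threading.Lock()


def run_checks(checks):
    """Run {name: callable} concurrently; tree-sitter checks run one at a time."""
    def run(name, fn):
        if name in TREE_SITTER_CHECKS:
            with _tree_sitter_lock:  # serialize access to the parser
                return name, fn()
        return name, fn()            # lock-free checks proceed immediately

    with ThreadPoolExecutor(max_workers=len(checks) or 1) as pool:
        futures = [pool.submit(run, name, fn) for name, fn in checks.items()]
        return dict(future.result() for future in futures)
```

Lock-free checks never wait on the mutex, which is why their git subprocess work can overlap with whichever tree-sitter check currently holds it.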

Key Optimizations

Batched git operations: one git log --name-only replaces 120+ individual calls for health scoring
Parallel git blame: 5-worker pool instead of sequential calls
Cached hotspot scores: computed once, shared across all checks
Capped analysis: health limited to 30 files, coupling to 20, bug-patterns to 20

GitHub Action

CKB provides a composite action at action/ckb-review:

- uses: SimplyLiz/CodeMCP/action/ckb-review@main
  with:
    fail-on: 'error'         # or 'warning' / 'none'
    comment: 'true'          # post PR comment
    sarif: 'true'            # upload to Code Scanning
    checks: ''               # all checks (or comma-separated subset)
    critical-paths: 'drivers/**'
    require-trace: 'false'
    trace-patterns: ''
    require-independent: 'false'
    max-fanout: '0'          # blast-radius threshold (0 = disabled)
    dead-code-confidence: '0.8'
    test-gap-lines: '5'

Outputs: verdict (pass/warn/fail), score (0-100), findings (count)

For a complete workflow example, see the pr-review.yml template or the pr-review section of Workflow Examples.

MCP Tool

The reviewPR MCP tool exposes the same engine to AI assistants:

{
  "tool": "reviewPR",
  "arguments": {
    "baseBranch": "main",
    "checks": ["breaking", "secrets", "health", "bug-patterns"],
    "failOnLevel": "error",
    "criticalPaths": ["drivers/**"]
  }
}

Returns the full review response: verdict, score, checks, findings, health report, split suggestion, change breakdown, reviewers, review effort estimate, and narrative summary.

CLI Reference

ckb review [scope]

Flags:
  --format=human|json|markdown|github-actions|sarif|codeclimate|compliance
  --base=main                    # Base branch
  --head=                        # Head branch (default: HEAD)
  --checks=breaking,secrets,...  # Filter to specific checks
  --ci                           # CI mode: exit 0=pass, 1=fail, 2=warn
  --staged                       # Review staged changes instead of branch diff
  --scope=internal/query         # Filter to path prefix or symbol

  # Quality Gates
  --fail-on=error|warning|none   # Verdict threshold
  --block-breaking               # Fail on breaking changes
  --block-secrets                # Fail on secret leaks
  --require-tests                # Warn if no tests affected
  --max-risk=0.7                 # Risk score threshold
  --max-complexity=10            # Complexity delta threshold
  --max-files=0                  # Max file count (0=disabled)
  --max-blast-radius=0           # Blast radius threshold
  --max-fanout=0                 # Alias for --max-blast-radius

  # Safety-Critical Paths
  --critical-paths=drivers/**    # Glob patterns

  # Traceability
  --require-trace                # Enforce ticket references
  --trace-patterns=JIRA-\d+      # Ticket regex patterns

  # Reviewer Independence
  --require-independent          # Enforce independent review
  --min-reviewers=2              # Minimum independent reviewers

  # Analyzer Tuning
  --dead-code-confidence=0.8     # Dead code confidence threshold
  --test-gap-lines=5             # Min function lines for test gap reporting
  --llm                          # Use Claude for narrative generation

  # Lint Deduplication
  --lint-report=report.json      # External lint report to deduplicate against

Response Structure

The full ReviewPRResponse contains:

Field	Type	Description
`ckbVersion`	string	CKB version
`schemaVersion`	string	Response schema version
`verdict`	string	`"pass"`, `"warn"`, or `"fail"`
`score`	int	0-100 review score
`summary`	object	File counts, LOC, modules, languages
`checks`	array	20 check results with status, severity, duration
`findings`	array	Actionable items with file, line, message, suggestion, tier
`reviewers`	array	Suggested reviewers from CODEOWNERS + git blame
`generated`	array	Detected generated files with reasons
`splitSuggestion`	object	PR split clusters (if applicable)
`changeBreakdown`	object	Category counts (new/refactor/test/etc.)
`reviewEffort`	object	Time estimate with factors and complexity label
`clusterReviewers`	array	Per-cluster reviewer assignments
`healthReport`	object	Per-file health scores with deltas
`narrative`	string	2-3 sentence review summary
`prTier`	string	`"small"` (< 100 LOC), `"medium"` (≤ 600), `"large"` (> 600)
`provenance`	object	Repo state ID, dirty flag, query duration

Finding Structure

Each finding includes:

Field	Description
`check`	Check name (e.g., `"breaking"`, `"bug-patterns"`)
`severity`	`"error"`, `"warning"`, or `"info"`
`file`	File path
`startLine`	Line number (0 = file-level)
`message`	Human-readable description
`suggestion`	Optional fix recommendation
`ruleId`	Machine-readable ID (e.g., `ckb/breaking/removed-symbol`)
`hint`	Optional drilldown hint (e.g., `→ ckb explain Symbol`)
`tier`	1 (blocking), 2 (important), 3 (informational)

Quality Gates — Individual gate thresholds and CI enforcement
CI-CD-Integration — Full CI/CD integration guide
Workflow Examples — Production-ready workflow templates
Impact-Analysis — Blast radius and risk scoring details
Security — Secret detection configuration
Configuration — Global configuration options