Unified Code Review

ckb review runs 20 quality checks in a single command and returns a verdict, findings, and suggested reviewers, so you don't have to wire up individual gates by hand.

Architecture

[Figure: Review Architecture]

Quick links: CI-CD-Integration for GitHub Action setup · Quality Gates for individual gate thresholds · Workflow Examples for production templates


Quick Start

# Review current branch against main
ckb review

# Custom base branch
ckb review --base=develop

# Review staged changes only (pre-commit)
ckb review --staged

# Scope to a path prefix or symbol
ckb review internal/query/
ckb review --scope=Engine

# Only run specific checks
ckb review --checks=breaking,secrets,health

# New analyzers
ckb review --checks=dead-code,test-gaps,blast-radius --max-fanout=20

# Bug pattern detection
ckb review --checks=bug-patterns

# CI mode (exit 1=fail, 2=warn)
ckb review --ci --format=json

# Markdown for PR comments
ckb review --format=markdown

# With AI-generated narrative summary
ckb review --llm

Checks

ckb review orchestrates 20 checks in three priority tiers. All checks run concurrently; tree-sitter calls are serialized via a mutex so git subprocess work overlaps with analysis.

Tier 1 — Blocking (Must Fix)

| Check | What It Does | Detector | Gate Type |
|-------|--------------|----------|-----------|
| breaking | API breaking change detection (removed symbols, changed signatures) | SCIP CompareAPI | fail |
| secrets | Credential and secret scanning (entropy ≥ 3.5 + pattern matching) | Entropy scanner + allowlist | fail |
| critical | Safety-critical path enforcement (requires config) | Glob pattern matching | fail |
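
The entropy threshold used by the secrets check can be illustrated with a minimal sketch. It assumes the threshold refers to Shannon entropy in bits per character; the function names and the sample strings are hypothetical, and the pattern-matching and allowlist stages are not modeled.

```python
import math
from collections import Counter

ENTROPY_THRESHOLD = 3.5  # threshold quoted in the table above


def shannon_entropy(text):
    """Shannon entropy of a string, in bits per character."""
    if not text:
        return 0.0
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in Counter(text).values())


def looks_like_secret(token):
    """Hypothetical helper: flag tokens at or above the entropy threshold."""
    return shannon_entropy(token) >= ENTROPY_THRESHOLD


# The placeholder key ID from AWS documentation is varied enough to be flagged;
# a dictionary word is not.
candidates = ["AKIAIOSFODNN7EXAMPLE", "password"]
flagged = [t for t in candidates if looks_like_secret(t)]
```

Entropy alone tends to flag hashes and UUIDs as well, which is presumably why the check pairs it with pattern matching and an allowlist.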

Tier 2 — Important (Should Fix)

| Check | What It Does | Detector | Gate Type |
|-------|--------------|----------|-----------|
| complexity | Cyclomatic/cognitive complexity delta per function | tree-sitter AST | warn |
| health | 7-factor weighted code health score (A-F grades) | Multi-source (see below) | warn |
| risk | Multi-factor risk score (file count, LOC, hotspots, module spread) | Heuristic | warn |
| coupling | Missing co-changed files (≥70% correlation, 365-day window) | Git history | warn |
| dead-code | Unreferenced symbols and constants in changed files | SCIP + reference counting | warn |
| blast-radius | High fan-out symbol detection; informational unless threshold set | SCIP AnalyzeImpact | warn/info |
| bug-patterns | High-confidence Go AST bug patterns (10 rules, see below) | tree-sitter AST | warn |

Tier 3 — Informational (Good to Know)

| Check | What It Does | Detector | Gate Type |
|-------|--------------|----------|-----------|
| hotspots | Overlap with volatile/high-churn files (score > 0.5) | Churn ranking | info |
| tests | Affected test coverage | GetAffectedTests | warn* |
| test-gaps | Untested functions in changed files (≥5 lines) | tree-sitter + coverage | info |
| comment-drift | Numeric constant vs comment mismatch (e.g., // 256 above const MaxSize = 512) | Regex scan | info |
| format-consistency | Cross-formatter output divergence (Human vs Markdown) | Function pair analysis | info |
| generated | Generated file detection and exclusion from all checks | Glob + header markers | info |
| traceability | Commit-to-ticket linkage (e.g., JIRA-\d+) | Regex + branch name | warn |
| independence | Author != reviewer enforcement (regulated industries) | Commit author extraction | warn |
| classify | Change classification (new/refactor/moved/churn/test/config) | Heuristics | |
| split | Large PR split suggestion with cluster analysis | BFS graph clustering | warn |

* tests only warns when requireTests: true is set.

Selecting Checks

# Run all (default)
ckb review

# Only security-related
ckb review --checks=breaking,secrets

# Only bug patterns
ckb review --checks=bug-patterns

# Skip slow checks
ckb review --checks=breaking,secrets,tests,coupling

Bug Pattern Detection

The bug-patterns check uses tree-sitter AST analysis to detect 10 high-confidence Go bug patterns. Only newly introduced patterns are reported — pre-existing issues from the base branch are filtered out.

| # | Rule | What It Catches |
|---|------|-----------------|
| 1 | defer-in-loop | defer inside a loop: cleanup won't run per iteration, only at function exit |
| 2 | unreachable-code | Statements after return or panic |
| 3 | empty-error-branch | if err != nil { } with no body: silently swallowed error |
| 4 | unchecked-type-assert | x.(T) without ok check: panic risk at runtime |
| 5 | self-assignment | a = a: no-op assignment |
| 6 | nil-after-deref | Pointer dereference before nil check |
| 7 | identical-branches | if/else with identical code in both branches |
| 8 | shadowed-err | Inner := redeclares outer err variable |
| 9 | discarded-error | Function call returning error with no assignment |
| 10 | missing-defer-close | Resource from Open/Create without defer Close() |

Scope: First 20 non-test .go files in the changeset.

Deduplication: Findings are compared against the base branch. Only patterns newly introduced by the PR are reported. Pre-existing bugs show as "pass" with a note.


Hold-the-Line

By default, CKB Review only reports issues on lines actually changed in the PR, not pre-existing problems in unchanged code.

How It Works

  1. Parse unified diff to extract changed line numbers per file
  2. Build a map: file → set of changed line numbers
  3. After all checks run, filter findings:
    • Keep: File-level findings (StartLine == 0)
    • Keep: Findings on changed lines
    • Keep: Findings on files not in the diff map (safety fallback)
    • Drop: Findings on unchanged lines
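
The steps above can be sketched in a few lines. This is an illustration, not CKB's implementation: it assumes git-style unified diffs, and the function names and the finding dictionaries (with the file and startLine fields from the Finding Structure section) are hypothetical stand-ins.

```python
import re
from collections import defaultdict

# Matches a hunk header such as "@@ -10,3 +10,4 @@" and captures the new-side start line.
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def changed_lines(diff_text):
    """Map each file in a git-style unified diff to its set of added line numbers."""
    changed = defaultdict(set)
    current_file, new_line, in_hunk = None, 0, False
    for line in diff_text.splitlines():
        hunk = HUNK_RE.match(line)
        if line.startswith("diff --git "):
            in_hunk = False
        elif not in_hunk and line.startswith("+++ "):
            path = line[4:].strip()
            current_file = path[2:] if path.startswith("b/") else path
        elif hunk:
            new_line, in_hunk = int(hunk.group(1)), True
        elif in_hunk and current_file is not None:
            if line.startswith("+"):
                changed[current_file].add(new_line)
                new_line += 1
            elif line.startswith(" "):
                new_line += 1
            # "-" lines exist only on the old side, so the counter stays put.
    return dict(changed)


def hold_the_line(findings, changed):
    """Keep file-level findings, findings on changed lines, and findings on unmapped files."""
    kept = []
    for f in findings:
        if f["startLine"] == 0 or f["file"] not in changed:
            kept.append(f)  # file-level finding or safety fallback
        elif f["startLine"] in changed[f["file"]]:
            kept.append(f)  # finding sits on a changed line
    return kept


DIFF = "\n".join([
    "diff --git a/api/handler.go b/api/handler.go",
    "--- a/api/handler.go",
    "+++ b/api/handler.go",
    "@@ -10,3 +10,4 @@",
    " func HandleAuth() {",
    "+    x := 1",
    "     return",
    " }",
])
FINDINGS = [
    {"file": "api/handler.go", "startLine": 11},  # on the added line
    {"file": "api/handler.go", "startLine": 40},  # unchanged line
    {"file": "api/handler.go", "startLine": 0},   # file-level
    {"file": "db/store.go", "startLine": 7},      # file missing from the diff map
]
kept = hold_the_line(FINDINGS, changed_lines(DIFF))
```

Only the finding on line 40 is dropped; the other three survive for the reasons listed in step 3.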

Why It Matters

In a codebase with historical technical debt, hundreds of pre-existing warnings would otherwise flood the report. With Hold-the-Line, reviewers see only what this PR introduced.

Configuration

{
  "holdTheLine": true
}

Set to false to see all findings, including pre-existing ones.


Output Formats

| Format | Flag | Use Case |
|--------|------|----------|
| Human | --format=human | Terminal output with colors and icons (default) |
| JSON | --format=json | Machine-readable, CI pipelines, AI consumption |
| Markdown | --format=markdown | PR comments (GitHub, GitLab) |
| GitHub Actions | --format=github-actions | Inline ::error/::warning annotations |
| SARIF | --format=sarif | GitHub Code Scanning, GitLab SAST |
| CodeClimate | --format=codeclimate | GitLab Code Quality |
| Compliance | --format=compliance | Audit trail (IEC 61508, DO-178C, ISO 26262) |

Review Policy

Configure quality gates via CLI flags or .ckb/config.json:

{
  "review": {
    "blockBreakingChanges": true,
    "blockSecrets": true,
    "requireTests": false,
    "maxRiskScore": 0.7,
    "maxComplexityDelta": 0,
    "maxFiles": 0,
    "maxBlastRadiusDelta": 0,
    "maxFanOut": 0,
    "deadCodeMinConfidence": 0.8,
    "testGapMinLines": 5,
    "failOnLevel": "error",
    "holdTheLine": true,
    "splitThreshold": 50,
    "criticalPaths": ["drivers/**", "protocol/**"],
    "criticalSeverity": "error",
    "generatedPatterns": ["*.pb.go", "*.generated.*", "*.pb.cc", "parser.tab.c", "lex.yy.c"],
    "generatedMarkers": ["DO NOT EDIT", "Generated by", "AUTO-GENERATED", "This file is generated"],
    "traceabilityPatterns": [],
    "traceabilitySources": ["commit-message", "branch-name"],
    "requireTraceability": false,
    "requireTraceForCriticalPaths": false,
    "requireIndependentReview": false,
    "minReviewers": 1
  }
}

Configuration Hierarchy

  1. Hardcoded defaults (DefaultReviewPolicy)
  2. Repository config from .ckb/config.json (overrides defaults)
  3. CLI flags (override everything)
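
The precedence order can be pictured as a layered dictionary merge. The sketch below is illustrative only: resolve_policy is a hypothetical name, the defaults shown are a subset of the keys from the config example above, and treating an unset CLI flag as None is an assumption.

```python
# A few keys from the review policy shown above; not the full default set.
DEFAULTS = {
    "failOnLevel": "error",
    "holdTheLine": True,
    "maxRiskScore": 0.7,
    "splitThreshold": 50,
}


def resolve_policy(defaults, repo_config, cli_flags):
    """Layer repository config over defaults, then CLI flags over both."""
    policy = dict(defaults)
    policy.update(repo_config.get("review", {}))
    # Assumption: a flag the user did not pass is represented as None and is skipped.
    policy.update({k: v for k, v in cli_flags.items() if v is not None})
    return policy


policy = resolve_policy(
    DEFAULTS,
    {"review": {"maxRiskScore": 0.5, "failOnLevel": "warning"}},
    {"failOnLevel": "none", "maxRiskScore": None},
)
```

Here the repository config lowers maxRiskScore, the CLI flag wins for failOnLevel, and untouched keys keep their defaults.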

CLI Overrides

# Quality gates
ckb review --block-breaking=true --block-secrets=true
ckb review --require-tests --max-risk=0.8
ckb review --max-complexity=10 --max-files=100
ckb review --max-fanout=20                    # blast-radius threshold
ckb review --dead-code-confidence=0.9         # stricter dead code
ckb review --test-gap-lines=10                # only flag larger functions

# Verdict control
ckb review --fail-on=warning   # fail on warnings too
ckb review --fail-on=none      # never fail (informational)

# Safety-critical paths
ckb review --critical-paths="drivers/**,auth/**"

# Traceability
ckb review --require-trace --trace-patterns="JIRA-\d+,GH-\d+"

# Reviewer independence
ckb review --require-independent --min-reviewers=2

# AI narrative
ckb review --llm

Code Health Scoring

The health check computes a 0-100 score per file using 7 weighted factors (weights sum to 100%):

| Factor | Weight | Source | Scale |
|--------|--------|--------|-------|
| Cyclomatic complexity | 25% | tree-sitter | ≤5: 100, 6-10: 85, 11-20: 65, 21-30: 40, >30: 20 |
| Cognitive complexity | 15% | tree-sitter | Same scale as cyclomatic |
| File size (LOC) | 10% | Line count | ≤100: 100, 101-300: 85, 301-500: 70, 501-1000: 50, >1000: 30 |
| Churn (commits/30d) | 15% | git log | ≤2: 100, 3-5: 80, 6-10: 60, 11-20: 40, >20: 20 |
| Coupling (co-changes) | 10% | git log | ≤2: 100, 3-5: 80, 6-10: 60, >10: 40 |
| Bus factor (authors) | 10% | git blame | ≥5: 100, 3-4: 85, 2: 60, 1: 30 |
| Age (days since change) | 15% | git log | ≤30: 100, 31-90: 85, 91-180: 70, 181-365: 50, >365: 30 |

Grades

| Grade | Score Range |
|-------|-------------|
| A | 90-100 |
| B | 70-89 |
| C | 50-69 |
| D | 30-49 |
| F | 0-29 |
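
As a worked example of the factor and grade tables, the sketch below scores one file. It assumes each factor is supplied as a single per-file number and that fractional scores are graded by lower bound (for example, 89.5 is a B); the metric keys and function names are hypothetical.

```python
# Weights in percent, taken from the factor table.
WEIGHTS = {"cyclomatic": 25, "cognitive": 15, "loc": 10, "churn": 15,
           "coupling": 10, "authors": 10, "age_days": 15}

COMPLEXITY_SCALE = ([(5, 100), (10, 85), (20, 65), (30, 40)], 20)
# (upper bound, score) steps plus the score used beyond the last bound.
SCALES = {
    "cyclomatic": COMPLEXITY_SCALE,
    "cognitive": COMPLEXITY_SCALE,
    "loc": ([(100, 100), (300, 85), (500, 70), (1000, 50)], 30),
    "churn": ([(2, 100), (5, 80), (10, 60), (20, 40)], 20),
    "coupling": ([(2, 100), (5, 80), (10, 60)], 40),
    "age_days": ([(30, 100), (90, 85), (180, 70), (365, 50)], 30),
}


def factor_score(name, value):
    """Translate a raw metric into its 0-100 factor score."""
    if name == "authors":
        # Bus factor is the only scale where a larger value scores higher.
        if value >= 5:
            return 100
        if value >= 3:
            return 85
        return 60 if value == 2 else 30
    steps, beyond = SCALES[name]
    for upper, score in steps:
        if value <= upper:
            return score
    return beyond


def health_score(metrics):
    """Weighted sum of factor scores on a 0-100 scale."""
    return sum(WEIGHTS[n] * factor_score(n, metrics[n]) for n in WEIGHTS) / 100


def grade(score):
    """Map a score to a letter grade by lower bound."""
    for letter, lower in (("A", 90), ("B", 70), ("C", 50), ("D", 30)):
        if score >= lower:
            return letter
    return "F"


example = {"cyclomatic": 12, "cognitive": 8, "loc": 250, "churn": 4,
           "coupling": 1, "authors": 2, "age_days": 45}
score = health_score(example)  # 78.25
letter = grade(score)          # "B"
```

The example file lands at 78.25: moderate complexity and a bus factor of 2 pull it down from an A to a B.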

Confidence Factor

The confidence factor (0-1) indicates how reliable the score is:

  • Reduced by 0.4 if tree-sitter parsing fails
  • Reduced by 0.3 if all metrics are at default (no git data)
  • Reduced by 0.2 if only bus factor is at default

Health is computed for both base and head versions. Findings are generated when a file degrades by more than 5 points.


Score Calculation

The review score starts at 100 and deducts points per finding:

| Severity | Points | Cap Per Check | Cap Per Rule |
|----------|--------|---------------|--------------|
| error | -10 | 20 | 10 |
| warning | -3 | 20 | 10 |
| info | -1 | 20 | 10 |

Maximum total deduction: 80 points, so the score cannot drop below 20.

Three caps prevent noise from dominating the score:

  • Per-rule cap (10): A single noisy rule (e.g., ckb/bug/discarded-error) can't consume its check's entire budget.
  • Per-check cap (20): A single check (e.g., coupling with 100+ findings) can't overwhelm the score.
  • Total cap (80): Large PRs where many checks fire don't produce meaningless scores.
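
One way to read the three caps is as nested clamps: rule totals are clamped first, then check totals, then the grand total. The sketch below follows that reading, which is an assumption about the order of application; review_score is a hypothetical name, and the shadowed-err and defer-in-loop rule IDs are made up to match the pattern of the IDs shown elsewhere on this page.

```python
from collections import defaultdict

POINTS = {"error": 10, "warning": 3, "info": 1}
RULE_CAP, CHECK_CAP, TOTAL_CAP = 10, 20, 80


def review_score(findings):
    """Start at 100 and subtract capped deductions."""
    per_rule = defaultdict(int)
    for f in findings:
        per_rule[(f["check"], f["ruleId"])] += POINTS[f["severity"]]

    per_check = defaultdict(int)
    for (check, _rule), points in per_rule.items():
        per_check[check] += min(points, RULE_CAP)      # per-rule cap

    total = sum(min(p, CHECK_CAP) for p in per_check.values())  # per-check cap
    return 100 - min(total, TOTAL_CAP)                 # total cap


def make(check, rule, severity, count):
    """Build `count` identical findings for the example."""
    return [{"check": check, "ruleId": rule, "severity": severity}] * count


findings = (
    make("bug-patterns", "ckb/bug/discarded-error", "warning", 5)  # 15 -> capped to 10
    + make("bug-patterns", "ckb/bug/shadowed-err", "warning", 4)   # 12 -> capped to 10
    + make("bug-patterns", "ckb/bug/defer-in-loop", "warning", 1)  # 3
    + make("breaking", "ckb/breaking/removed-symbol", "error", 1)  # 10
)
score = review_score(findings)  # bug-patterns: 23 -> 20, breaking: 10, score 70
```

Without the caps the same findings would deduct 40 points; with them, the ten bug-pattern warnings cost 20 and the single breaking change still carries its full weight.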

Verdict Logic

fail-on-level = "error" (default), "warning", or "none"

"none"    → always pass
"warning" → fail if any check has status "fail" or "warn"
"error"   → fail if any check has status "fail"; warn if any has "warn"; otherwise pass

CI exit codes: 0 = pass, 1 = fail, 2 = warn
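
The verdict rules and exit codes translate directly into a small function. The sketch below is a restatement of the rules above; the function name and the list-of-statuses input are hypothetical.

```python
EXIT_CODES = {"pass": 0, "fail": 1, "warn": 2}


def verdict(check_statuses, fail_on="error"):
    """Derive the overall verdict from per-check statuses ("pass", "warn", "fail")."""
    if fail_on == "none":
        return "pass"
    has_fail = "fail" in check_statuses
    has_warn = "warn" in check_statuses
    if fail_on == "warning":
        return "fail" if (has_fail or has_warn) else "pass"
    # fail_on == "error" (default)
    if has_fail:
        return "fail"
    return "warn" if has_warn else "pass"


statuses = ["pass", "warn", "pass"]
default_exit = EXIT_CODES[verdict(statuses)]            # warn -> 2
strict_exit = EXIT_CODES[verdict(statuses, "warning")]  # fail -> 1
```

A CI script that treats any non-zero exit as failure will also fail on exit code 2, so pipelines that want warnings to pass need to handle that code explicitly.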


Review Effort Estimation

CKB estimates human review effort based on Microsoft and Google code review research.

Time Model

| Code Category | Review Speed |
|---------------|--------------|
| New code | 200 LOC/hour |
| Refactored/modified | 300 LOC/hour |
| Moved/test/config | 500 LOC/hour (quick scan) |

Additional Factors

| Factor | Overhead |
|--------|----------|
| File switches (> 5 files) | 2 min per file |
| Module context switches (> 1 module) | 5 min per module beyond first |
| Safety-critical files | 10 min each |
| Minimum | 5 minutes |

Complexity Scale

| Level | Estimated Time |
|-------|----------------|
| Trivial | < 20 min |
| Moderate | 20-60 min |
| Complex | 60-240 min |
| Very complex | > 240 min |
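
The time model can be approximated as follows. Several details are assumptions rather than documented behavior: the file-switch overhead is applied to every file once the PR has more than 5 files, a result of exactly 60 minutes counts as moderate, and minutes are rounded to the nearest integer. The function names and input shape are hypothetical.

```python
# Review speed in LOC per hour, keyed by change category.
SPEED = {"new": 200, "refactoring": 300, "modified": 300,
         "moved": 500, "test": 500, "config": 500}


def estimate_minutes(loc_by_category, files, modules, critical_files):
    """Estimate review time in minutes from LOC per category plus overheads."""
    minutes = sum(loc / SPEED[cat] * 60 for cat, loc in loc_by_category.items())
    if files > 5:
        minutes += 2 * files            # assumption: every file counts once past 5
    if modules > 1:
        minutes += 5 * (modules - 1)    # each module beyond the first
    minutes += 10 * critical_files
    return max(5, round(minutes))       # 5-minute minimum


def complexity_label(minutes):
    """Map minutes to the complexity scale (60 is treated as moderate here)."""
    if minutes < 20:
        return "trivial"
    if minutes <= 60:
        return "moderate"
    if minutes <= 240:
        return "complex"
    return "very complex"


# 30 min each for new, modified, and test code, then 16 + 5 + 10 min of overhead.
total = estimate_minutes({"new": 100, "modified": 150, "test": 250},
                         files=8, modules=2, critical_files=1)
label = complexity_label(total)  # 121 minutes -> "complex"
```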

Example Output

Estimated Review: ~45min (moderate)
  · 20 min from 120 LOC
  · 10 min from 5 file switches
  · 15 min for 3 critical files

Narrative Summary

CKB generates a 2-3 sentence review summary automatically.

Deterministic (Default)

Sentence 1: "Changes N files across M modules (languages)."
Sentence 2: Risk signal (failed checks, top warnings, or "No blocking issues found.")
Sentence 3: Focus area (split suggestion, critical files, or omitted)

Example: "Changes 25 files across 3 modules (Go, TypeScript). 2 breaking API changes detected; 2 safety-critical files changed. 2 safety-critical files need focused review."

LLM-Powered (--llm)

With --llm, Claude generates a context-aware summary based on the top 10 findings, verdict, score, and health report.

  • Model: claude-sonnet-4-20250514 (configurable)
  • Timeout: 30 seconds
  • Fallback: deterministic narrative on API failure

Change Classification

The classify check categorizes each file in the changeset:

| Category | Heuristic | Review Priority |
|----------|-----------|-----------------|
| new | File doesn't exist at base | High |
| refactoring | Renamed + changes | Medium |
| moved | Renamed + ≤20% content change | Low |
| churn | ≥3 commits in last 30 days | High |
| config | Makefiles, CI, Docker, etc. | Low |
| test | Test file patterns (*_test.go, test_*.py, etc.) | Medium |
| generated | Matches generated patterns/markers | Skip |
| modified | Default: none of the above | Medium |

The review effort estimate uses classification to adjust time: generated files are skipped, tests review faster, new code gets full review time.


Generated File Detection

Generated files are detected early and excluded from all other checks. This is the most important token-saving mechanism: in a monorepo with Protocol Buffers, this alone can eliminate 30-50% of files.

Default Patterns (Glob)

*.generated.*
*.pb.go           # Protocol Buffers (Go)
*.pb.cc           # Protocol Buffers (C++)
parser.tab.c      # Yacc/Bison
lex.yy.c          # Flex

Default Header Markers (First 10 Lines)

"DO NOT EDIT"
"Generated by"
"AUTO-GENERATED"
"This file is generated"

Both patterns and markers are configurable via policy.
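
A minimal sketch of the two-stage detection, using the default patterns and markers listed above. Matching globs against the file's base name and matching markers case-sensitively are both assumptions; is_generated is a hypothetical helper, not a CKB function.

```python
import fnmatch
import posixpath

PATTERNS = ["*.generated.*", "*.pb.go", "*.pb.cc", "parser.tab.c", "lex.yy.c"]
MARKERS = ["DO NOT EDIT", "Generated by", "AUTO-GENERATED", "This file is generated"]
HEADER_LINES = 10


def is_generated(path, content, patterns=PATTERNS, markers=MARKERS):
    """Return (True, reason) if the file looks generated, else (False, "")."""
    name = posixpath.basename(path)  # assumption: globs apply to the base name
    for pattern in patterns:
        if fnmatch.fnmatchcase(name, pattern):
            return True, "pattern " + pattern
    # Only the first few lines are scanned for header markers.
    for line in content.splitlines()[:HEADER_LINES]:
        for marker in markers:
            if marker in line:
                return True, "marker " + marker
    return False, ""


by_name = is_generated("api/v1/user.pb.go", "")
by_header = is_generated("cmd/stringer.go",
                         "// Code generated by stringer. DO NOT EDIT.\npackage cmd\n")
handwritten = is_generated("cmd/main.go", "package main\n")
```

Running the name check first means most generated files are excluded without their contents being read at all.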


PR Split Suggestion

When a PR exceeds splitThreshold files (default: 50), the split check analyzes the changeset and suggests independent clusters.

Algorithm

  1. Build adjacency graph of files
  2. Connect files in same module (fully connected)
  3. Enrich edges using coupling analysis (top 20 files, correlation ≥ 0.5)
  4. Find connected components via BFS
  5. If ≤1 component → no split recommended
  6. If >1 component → return independent clusters

Cluster Metadata

Each cluster reports:

  • Name: dominant module
  • Files: list with additions/deletions
  • Languages: detected languages
  • Independent: always true (connected components are independent by definition)

Sorted by file count descending.
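
The clustering steps can be sketched as a plain BFS over an adjacency map. This is an illustration under stated assumptions: a file's module is taken to be its top-level directory, coupling data arrives as (file, file, correlation) tuples, and the function names are hypothetical. Linking each module's files to one representative yields the same connected components as fully connecting them.

```python
from collections import defaultdict, deque


def top_level_dir(path):
    """Assumed module key: the first path segment, or "." for root files."""
    return path.split("/", 1)[0] if "/" in path else "."


def split_clusters(files, coupled_pairs=(), module_of=top_level_dir, min_correlation=0.5):
    """Return independent clusters, largest first, or [] if no split is suggested."""
    adjacency = {f: set() for f in files}

    # Same-module files belong together.
    by_module = defaultdict(list)
    for f in files:
        by_module[module_of(f)].append(f)
    for members in by_module.values():
        for other in members[1:]:
            adjacency[members[0]].add(other)
            adjacency[other].add(members[0])

    # Coupling edges join files across modules.
    for a, b, correlation in coupled_pairs:
        if correlation >= min_correlation and a in adjacency and b in adjacency:
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Connected components via BFS.
    seen, clusters = set(), []
    for start in files:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        clusters.append(sorted(component))

    if len(clusters) <= 1:
        return []
    return sorted(clusters, key=lambda c: (-len(c), c[0]))


clusters = split_clusters(
    ["api/a.go", "api/b.go", "db/x.go", "docs/readme.md"],
    coupled_pairs=[("api/a.go", "db/x.go", 0.8)],
)
```

The coupling edge pulls db/x.go into the api cluster, leaving the docs change as a second, independently reviewable cluster.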


Risk Scoring

The risk check computes a 0-1 risk score from four factors:

| Factor | Contribution |
|--------|--------------|
| > 20 files changed | +0.3 (> 10: +0.15) |
| > 1000 LOC changes | +0.3 (> 500: +0.15) |
| Hotspot overlap | +0.1 per hotspot (max 0.3) |
| > 5 modules affected | +0.2 |

Risk levels: low (< 0.3), medium (0.3-0.6), high (> 0.6)
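
The factor table adds up as shown in this sketch. The contributions can sum to 1.1, so clamping the result to 1.0 is an assumption, as is treating a score of exactly 0.6 as medium. The function names are hypothetical, and the arithmetic is done in hundredths to avoid floating-point drift.

```python
def risk_score(files, loc, hotspots, modules):
    """Sum the four risk contributions and return a score in [0, 1]."""
    points = 0  # hundredths of a risk point
    if files > 20:
        points += 30
    elif files > 10:
        points += 15
    if loc > 1000:
        points += 30
    elif loc > 500:
        points += 15
    points += min(10 * hotspots, 30)
    if modules > 5:
        points += 20
    return min(points, 100) / 100  # assumption: clamp to the 0-1 range


def risk_level(score):
    """Map a score to low / medium / high."""
    if score < 0.3:
        return "low"
    return "medium" if score <= 0.6 else "high"


medium_pr = risk_score(files=12, loc=600, hotspots=1, modules=2)  # 0.15 + 0.15 + 0.1
large_pr = risk_score(files=25, loc=1200, hotspots=4, modules=6)  # clamped to 1.0
```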


Traceability & Compliance

For regulated industries (IEC 61508, DO-178C, ISO 26262):

# Require ticket references in commits
ckb review --require-trace --trace-patterns="JIRA-\d+,GH-\d+"

# Require independent reviewer (author != reviewer)
ckb review --require-independent --min-reviewers=2

# Safety-critical path enforcement
ckb review --critical-paths="drivers/**,protocol/**"

# Full compliance output
ckb review --format=compliance

Traceability Sources

Configure where to look for ticket references:

  • commit-message — scan commit messages
  • branch-name — scan the branch name

Both are checked by default when traceability is enabled.
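
To illustrate how the patterns and sources interact, the sketch below scans commit messages and a branch name with the same patterns used in the CLI examples. The function name and its signature are hypothetical; only the source names come from the list above.

```python
import re


def find_ticket_refs(patterns, commit_messages, branch_name,
                     sources=("commit-message", "branch-name")):
    """Return the sorted set of ticket references found in the enabled sources."""
    compiled = [re.compile(p) for p in patterns]
    texts = []
    if "commit-message" in sources:
        texts.extend(commit_messages)
    if "branch-name" in sources:
        texts.append(branch_name)

    refs = set()
    for text in texts:
        for regex in compiled:
            refs.update(match.group(0) for match in regex.finditer(text))
    return sorted(refs)


PATTERNS = [r"JIRA-\d+", r"GH-\d+"]
COMMITS = ["JIRA-123: fix parser", "cleanup"]
BRANCH = "feature/GH-45-retry"

refs = find_ticket_refs(PATTERNS, COMMITS, BRANCH)
commit_only = find_ticket_refs(PATTERNS, COMMITS, BRANCH, sources=("commit-message",))
```

A traceability gate would then pass when the returned list is non-empty; how CKB reports the result per commit is not shown here.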


Finding Baselines

Track finding trends across releases:

# Save current findings as a baseline
ckb review baseline save --tag=v1.0

# List saved baselines
ckb review baseline list

# Compare two baselines
ckb review baseline diff v1.0 v2.0

Baseline diffs classify each finding as new, unchanged, or resolved.

Storage: .ckb/baselines/ (git-ignored)
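
The new / unchanged / resolved classification is essentially a set comparison. The sketch below assumes a finding is identified by its rule ID, file, and message (line numbers are left out so that shifted code does not register as a new finding); the actual fingerprint CKB uses is not documented here, and the function names are hypothetical.

```python
def fingerprint(finding):
    """Assumed identity of a finding across baselines."""
    return (finding["ruleId"], finding["file"], finding["message"])


def diff_baselines(old_findings, new_findings):
    """Classify findings as new, unchanged, or resolved between two baselines."""
    old_keys = {fingerprint(f) for f in old_findings}
    new_keys = {fingerprint(f) for f in new_findings}
    return {
        "new": sorted(new_keys - old_keys),
        "unchanged": sorted(new_keys & old_keys),
        "resolved": sorted(old_keys - new_keys),
    }


v1 = [
    {"ruleId": "ckb/bug/discarded-error", "file": "db/store.go", "message": "error ignored"},
    {"ruleId": "ckb/breaking/removed-symbol", "file": "api/handler.go", "message": "removed HandleAuth"},
]
v2 = [
    {"ruleId": "ckb/bug/discarded-error", "file": "db/store.go", "message": "error ignored"},
    {"ruleId": "ckb/bug/discarded-error", "file": "db/cache.go", "message": "error ignored"},
]
trend = diff_baselines(v1, v2)
```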


Token Savings for AI-Assisted Review

When used via MCP (reviewPR tool), CKB Review dramatically reduces token consumption for AI code reviewers. Five mechanisms work together:

1. Generated File Filtering

Generated files are excluded before any analysis. Protocol Buffers, build artifacts, lock files — none of this reaches the AI.

Savings: 30-50% of files in typical monorepos.

2. Structured Findings Instead of Raw Diffs

Instead of a 10KB+ unified diff, the AI receives a structured JSON response:

{
  "verdict": "warn",
  "score": 72,
  "findings": [
    {
      "check": "breaking",
      "severity": "error",
      "file": "api/handler.go",
      "startLine": 42,
      "message": "Removed public function HandleAuth()",
      "suggestion": "Update all call sites or provide a wrapper",
      "tier": 1
    }
  ]
}

Savings: 80-90% smaller than raw diff.

3. Hold-the-Line Filtering

Only new issues are reported. Historical debt is filtered out.

Savings: 40-70% fewer findings in codebases with existing debt.

4. Finding Tier Filtering

Only the top 10 findings, ranked by tier and then severity, are passed on. Tier 3 (informational) findings are suppressed for the AI summary.
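
A minimal sketch of this ranking, assuming findings carry the tier and severity fields from the Finding Structure section. The function name is hypothetical, and ties keep their original order.

```python
SEVERITY_RANK = {"error": 0, "warning": 1, "info": 2}


def top_findings(findings, limit=10):
    """Drop tier 3, then keep the highest-priority findings by tier and severity."""
    visible = [f for f in findings if f["tier"] < 3]
    visible.sort(key=lambda f: (f["tier"], SEVERITY_RANK[f["severity"]]))
    return visible[:limit]


sample = [
    {"check": "hotspots", "tier": 3, "severity": "info"},
    {"check": "complexity", "tier": 2, "severity": "warning"},
    {"check": "breaking", "tier": 1, "severity": "error"},
    {"check": "coupling", "tier": 2, "severity": "warning"},
]
ranked = [f["check"] for f in top_findings(sample)]
```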

5. Summary Statistics

Single numbers replace pages of analysis:

  • "riskScore": 0.42 instead of reading all changed files
  • "healthDelta": -0.8 instead of per-file health records

Cumulative Result

| Step | Remaining Volume |
|------|------------------|
| Raw diff (600 files) | 100% (~500K-1M tokens) |
| After generated-file filtering | ~60% |
| After Hold-the-Line | ~25% |
| Structured findings | ~5% |
| After tier filtering | ~2-3% (~10K-30K tokens) |

Typical savings: 85-95% token reduction on PRs with 50+ files.


Performance

CKB Review is optimized for large PRs. The engine runs 20 checks in parallel with a mutex-based tree-sitter scheduler that overlaps git subprocess work with static analysis.

| PR Size | Runtime | Git Calls | Tree-Sitter Calls |
|---------|---------|-----------|-------------------|
| 10 files | ~2s | ~15 | ~20 |
| 100 files | ~8s | ~40 | ~60 |
| 600 files | ~15s | ~50 (batched) | ~60 (capped) |

Parallelization Strategy

Without lock (run in parallel): breaking, secrets, tests, hotspots, risk, coupling, dead-code, blast-radius, critical, traceability, independence

With tree-sitter mutex (serialized): complexity, health, test-gaps, bug-patterns, comment-drift, format-consistency

Lock-free checks typically complete within 200ms while tree-sitter checks process their queue. Net effect: ~6-8x speedup vs sequential execution.
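
The scheduling idea can be sketched with a thread pool and a single lock. This is a simplified model, not the engine's code: it holds the lock for an entire parser-backed check, whereas the text above describes serializing the individual tree-sitter calls. The stand-in checks and helper names are hypothetical.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

TREE_SITTER_CHECKS = {"complexity", "health", "test-gaps", "bug-patterns",
                      "comment-drift", "format-consistency"}
tree_sitter_lock = threading.Lock()


def run_check(name, fn):
    """Run one check, taking the shared lock only for parser-backed checks."""
    if name in TREE_SITTER_CHECKS:
        with tree_sitter_lock:
            return name, fn()
    return name, fn()


def run_all(checks):
    """Run every check on its own worker thread and collect {name: status}."""
    with ThreadPoolExecutor(max_workers=max(1, len(checks))) as pool:
        futures = [pool.submit(run_check, name, fn) for name, fn in checks.items()]
        return dict(f.result() for f in futures)


# Demo: record how many parser-backed checks are ever active at once.
state = {"active": 0, "peak": 0}
state_guard = threading.Lock()


def fake_parser_check():
    with state_guard:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
    time.sleep(0.01)  # stand-in for parsing work
    with state_guard:
        state["active"] -= 1
    return "pass"


def fake_git_check():
    time.sleep(0.01)  # stand-in for a git subprocess
    return "pass"


results = run_all({
    "complexity": fake_parser_check,
    "health": fake_parser_check,
    "bug-patterns": fake_parser_check,
    "secrets": fake_git_check,
    "coupling": fake_git_check,
})
```

The peak counter stays at 1 for the parser-backed checks, while the git-backed checks run alongside them without waiting on the lock.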

Key Optimizations

  • Batched git operations: one git log --name-only replaces 120+ individual calls for health scoring
  • Parallel git blame: 5-worker pool instead of sequential calls
  • Cached hotspot scores: computed once, shared across all checks
  • Capped analysis: health limited to 30 files, coupling to 20, bug-patterns to 20

GitHub Action

CKB provides a composite action at action/ckb-review:

- uses: SimplyLiz/CodeMCP/action/ckb-review@main
  with:
    fail-on: 'error'         # or 'warning' / 'none'
    comment: 'true'          # post PR comment
    sarif: 'true'            # upload to Code Scanning
    checks: ''               # all checks (or comma-separated subset)
    critical-paths: 'drivers/**'
    require-trace: 'false'
    trace-patterns: ''
    require-independent: 'false'
    max-fanout: '0'          # blast-radius threshold (0 = disabled)
    dead-code-confidence: '0.8'
    test-gap-lines: '5'

Outputs: verdict (pass/warn/fail), score (0-100), findings (count)

For a complete workflow example, see the pr-review.yml template or the pr-review section of Workflow Examples.


MCP Tool

The reviewPR MCP tool exposes the same engine to AI assistants:

{
  "tool": "reviewPR",
  "arguments": {
    "baseBranch": "main",
    "checks": ["breaking", "secrets", "health", "bug-patterns"],
    "failOnLevel": "error",
    "criticalPaths": ["drivers/**"]
  }
}

Returns the full review response: verdict, score, checks, findings, health report, split suggestion, change breakdown, reviewers, review effort estimate, and narrative summary.


CLI Reference

ckb review [scope]

Flags:
  --format=human|json|markdown|github-actions|sarif|codeclimate|compliance
  --base=main                    # Base branch
  --head=                        # Head branch (default: HEAD)
  --checks=breaking,secrets,...  # Filter to specific checks
  --ci                           # CI mode: exit 0=pass, 1=fail, 2=warn
  --staged                       # Review staged changes instead of branch diff
  --scope=internal/query         # Filter to path prefix or symbol

  # Quality Gates
  --fail-on=error|warning|none   # Verdict threshold
  --block-breaking               # Fail on breaking changes
  --block-secrets                # Fail on secret leaks
  --require-tests                # Warn if no tests affected
  --max-risk=0.7                 # Risk score threshold
  --max-complexity=10            # Complexity delta threshold
  --max-files=0                  # Max file count (0=disabled)
  --max-blast-radius=0           # Blast radius threshold
  --max-fanout=0                 # Alias for --max-blast-radius

  # Safety-Critical Paths
  --critical-paths=drivers/**    # Glob patterns

  # Traceability
  --require-trace                # Enforce ticket references
  --trace-patterns=JIRA-\d+      # Ticket regex patterns

  # Reviewer Independence
  --require-independent          # Enforce independent review
  --min-reviewers=2              # Minimum independent reviewers

  # Analyzer Tuning
  --dead-code-confidence=0.8     # Dead code confidence threshold
  --test-gap-lines=5             # Min function lines for test gap reporting
  --llm                          # Use Claude for narrative generation

  # Lint Deduplication
  --lint-report=report.json      # External lint report to deduplicate against

Response Structure

The full ReviewPRResponse contains:

| Field | Type | Description |
|-------|------|-------------|
| ckbVersion | string | CKB version |
| schemaVersion | string | Response schema version |
| verdict | string | "pass", "warn", or "fail" |
| score | int | 0-100 review score |
| summary | object | File counts, LOC, modules, languages |
| checks | array | 20 check results with status, severity, duration |
| findings | array | Actionable items with file, line, message, suggestion, tier |
| reviewers | array | Suggested reviewers from CODEOWNERS + git blame |
| generated | array | Detected generated files with reasons |
| splitSuggestion | object | PR split clusters (if applicable) |
| changeBreakdown | object | Category counts (new/refactor/test/etc.) |
| reviewEffort | object | Time estimate with factors and complexity label |
| clusterReviewers | array | Per-cluster reviewer assignments |
| healthReport | object | Per-file health scores with deltas |
| narrative | string | 2-3 sentence review summary |
| prTier | string | "small" (< 100 LOC), "medium" (≤ 600), "large" (> 600) |
| provenance | object | Repo state ID, dirty flag, query duration |

Finding Structure

Each finding includes:

| Field | Description |
|-------|-------------|
| check | Check name (e.g., "breaking", "bug-patterns") |
| severity | "error", "warning", or "info" |
| file | File path |
| startLine | Line number (0 = file-level) |
| message | Human-readable description |
| suggestion | Optional fix recommendation |
| ruleId | Machine-readable ID (e.g., ckb/breaking/removed-symbol) |
| hint | Optional drilldown hint (e.g., → ckb explain Symbol) |
| tier | 1 (blocking), 2 (important), 3 (informational) |