Unified Code Review
ckb review runs 20 quality checks in a single command and returns a verdict, findings, and suggested reviewers, so individual gates no longer need to be wired up manually.
Quick links: CI-CD-Integration for GitHub Action setup · Quality Gates for individual gate thresholds · Workflow Examples for production templates
Quick Start
# Review current branch against main
ckb review
# Custom base branch
ckb review --base=develop
# Review staged changes only (pre-commit)
ckb review --staged
# Scope to a path prefix or symbol
ckb review internal/query/
ckb review --scope=Engine
# Only run specific checks
ckb review --checks=breaking,secrets,health
# New analyzers
ckb review --checks=dead-code,test-gaps,blast-radius --max-fanout=20
# Bug pattern detection
ckb review --checks=bug-patterns
# CI mode (exit 1=fail, 2=warn)
ckb review --ci --format=json
# Markdown for PR comments
ckb review --format=markdown
# With AI-generated narrative summary
ckb review --llm
Checks
ckb review orchestrates 20 checks in three priority tiers. All checks run concurrently; tree-sitter calls are serialized via a mutex so git subprocess work overlaps with analysis.
Tier 1 — Blocking (Must Fix)
| Check | What It Does | Detector | Gate Type |
|---|---|---|---|
| breaking | API breaking change detection (removed symbols, changed signatures) | SCIP CompareAPI | fail |
| secrets | Credential and secret scanning (entropy ≥ 3.5 + pattern matching) | Entropy scanner + allowlist | fail |
| critical | Safety-critical path enforcement (requires config) | Glob pattern matching | fail |
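The secrets row cites an entropy threshold of 3.5. The sketch below assumes this means Shannon entropy in bits per character, which the source does not state. The function name `shannon_entropy` is hypothetical and is not part of CKB.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Assumed threshold from the table above; candidates at or above it
# would still need to pass pattern matching and the allowlist.
ENTROPY_THRESHOLD = 3.5


def looks_random(token):
    """True when a token's entropy reaches the assumed threshold."""
    return shannon_entropy(token) >= ENTROPY_THRESHOLD
```

Under this assumption, a dictionary word such as "password" scores well below the threshold, while a string of 16 distinct characters scores 4.0 and would be passed on to the pattern stage.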
Tier 2 — Important (Should Fix)
| Check | What It Does | Detector | Gate Type |
|---|---|---|---|
| complexity | Cyclomatic/cognitive complexity delta per function | tree-sitter AST | warn |
| health | 8-factor weighted code health score (A-F grades) | Multi-source (see below) | warn |
| risk | Multi-factor risk score (file count, LOC, hotspots, module spread) | Heuristic | warn |
| coupling | Missing co-changed files (≥70% correlation, 365-day window) | Git history | warn |
| dead-code | Unreferenced symbols and constants in changed files | SCIP + reference counting | warn |
| blast-radius | High fan-out symbol detection; informational unless threshold set | SCIP AnalyzeImpact | warn/info |
| bug-patterns | High-confidence Go AST bug patterns (10 rules, see below) | tree-sitter AST | warn |
Tier 3 — Informational (Good to Know)
| Check | What It Does | Detector | Gate Type |
|---|---|---|---|
| hotspots | Overlap with volatile/high-churn files (score > 0.5) | Churn ranking | info |
| tests | Affected test coverage | GetAffectedTests | warn* |
| test-gaps | Untested functions in changed files (≥5 lines) | tree-sitter + coverage | info |
| comment-drift | Numeric constant vs comment mismatch (e.g., // 256 above const MaxSize = 512) | Regex scan | info |
| format-consistency | Cross-formatter output divergence (Human vs Markdown) | Function pair analysis | info |
| generated | Generated file detection and exclusion from all checks | Glob + header markers | info |
| traceability | Commit-to-ticket linkage (e.g., JIRA-\d+) | Regex + branch name | warn |
| independence | Author != reviewer enforcement (regulated industries) | Commit author extraction | warn |
| classify | Change classification (new/refactor/moved/churn/test/config) | Heuristics | n/a |
| split | Large PR split suggestion with cluster analysis | BFS graph clustering | warn |
* tests only warns when requireTests: true is set.
Selecting Checks
# Run all (default)
ckb review
# Only security-related
ckb review --checks=breaking,secrets
# Only bug patterns
ckb review --checks=bug-patterns
# Skip slow checks
ckb review --checks=breaking,secrets,tests,coupling
Bug Pattern Detection
The bug-patterns check uses tree-sitter AST analysis to detect 10 high-confidence Go bug patterns. Only newly introduced patterns are reported — pre-existing issues from the base branch are filtered out.
| # | Rule | What It Catches |
|---|---|---|
| 1 | defer-in-loop | defer inside a loop; cleanup won't run per iteration, only at function exit |
| 2 | unreachable-code | Statements after return or panic |
| 3 | empty-error-branch | if err != nil { } with no body (silently swallowed error) |
| 4 | unchecked-type-assert | x.(T) without ok check (panic risk at runtime) |
| 5 | self-assignment | a = a (no-op assignment) |
| 6 | nil-after-deref | Pointer dereference before nil check |
| 7 | identical-branches | if/else with identical code in both branches |
| 8 | shadowed-err | Inner := redeclares outer err variable |
| 9 | discarded-error | Function call returning error with no assignment |
| 10 | missing-defer-close | Resource from Open/Create without defer Close() |
Scope: First 20 non-test .go files in the changeset.
Deduplication: Findings are compared against the base branch. Only patterns newly introduced by the PR are reported. Pre-existing bugs show as "pass" with a note.
Hold-the-Line
By default, CKB Review only reports issues on lines actually changed in the PR, not pre-existing problems in unchanged code.
How It Works
- Parse the unified diff to extract changed line numbers per file
- Build a map: file → set of changed line numbers
- After all checks run, filter findings:
  - Keep: File-level findings (StartLine == 0)
  - Keep: Findings on changed lines
  - Keep: Findings on files not in the diff map (safety fallback)
  - Drop: Findings on unchanged lines
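The steps above can be sketched roughly as follows. This is a minimal illustration, not CKB's implementation: the function names `changed_lines` and `hold_the_line`, the dictionary-based finding shape, and the sample diff are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Matches a unified-diff hunk header and captures the new-file start line.
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def changed_lines(diff_text):
    """Map each file in a unified diff to its set of added line numbers."""
    changed = defaultdict(set)
    current_file, new_line = None, 0
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            path = line[4:]
            current_file = path[2:] if path.startswith("b/") else path
        elif (m := HUNK_RE.match(line)):
            new_line = int(m.group(1))
        elif current_file is None:
            continue
        elif line.startswith("+"):
            changed[current_file].add(new_line)
            new_line += 1
        elif line.startswith("-"):
            continue  # removed lines do not advance the new-file counter
        else:
            new_line += 1  # context line
    return changed


def hold_the_line(findings, changed):
    """Keep file-level findings, findings on changed lines, and findings
    on files missing from the diff map; drop everything else."""
    kept = []
    for f in findings:
        if f["startLine"] == 0 or f["file"] not in changed:
            kept.append(f)
        elif f["startLine"] in changed[f["file"]]:
            kept.append(f)
    return kept


SAMPLE_DIFF = "\n".join([
    "--- a/api/handler.go",
    "+++ b/api/handler.go",
    "@@ -10,3 +10,4 @@",
    " ctx := r.Context()",
    '+token := r.Header.Get("X-Token")',
    " user := lookup(ctx)",
    "-return user",
    "+return user, token",
])
```

With the sample diff, only lines 11 and 13 of api/handler.go count as changed, so a finding on line 12 of that file would be dropped while a finding in a file absent from the diff would be kept by the fallback rule.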
Why It Matters
In a codebase with historical technical debt, hundreds of pre-existing warnings would otherwise flood the report. With Hold-the-Line, reviewers see only what this PR introduced.
Configuration
{
"holdTheLine": true
}
Set to false to see all findings, including pre-existing ones.
Output Formats
| Format | Flag | Use Case |
|---|---|---|
| Human | --format=human | Terminal output with colors and icons (default) |
| JSON | --format=json | Machine-readable, CI pipelines, AI consumption |
| Markdown | --format=markdown | PR comments (GitHub, GitLab) |
| GitHub Actions | --format=github-actions | Inline ::error/::warning annotations |
| SARIF | --format=sarif | GitHub Code Scanning, GitLab SAST |
| CodeClimate | --format=codeclimate | GitLab Code Quality |
| Compliance | --format=compliance | Audit trail (IEC 61508, DO-178C, ISO 26262) |
Review Policy
Configure quality gates via CLI flags or .ckb/config.json:
{
"review": {
"blockBreakingChanges": true,
"blockSecrets": true,
"requireTests": false,
"maxRiskScore": 0.7,
"maxComplexityDelta": 0,
"maxFiles": 0,
"maxBlastRadiusDelta": 0,
"maxFanOut": 0,
"deadCodeMinConfidence": 0.8,
"testGapMinLines": 5,
"failOnLevel": "error",
"holdTheLine": true,
"splitThreshold": 50,
"criticalPaths": ["drivers/**", "protocol/**"],
"criticalSeverity": "error",
"generatedPatterns": ["*.pb.go", "*.generated.*", "*.pb.cc", "parser.tab.c", "lex.yy.c"],
"generatedMarkers": ["DO NOT EDIT", "Generated by", "AUTO-GENERATED", "This file is generated"],
"traceabilityPatterns": [],
"traceabilitySources": ["commit-message", "branch-name"],
"requireTraceability": false,
"requireTraceForCriticalPaths": false,
"requireIndependentReview": false,
"minReviewers": 1
}
}
Configuration Hierarchy
- Hardcoded defaults (DefaultReviewPolicy)
- Repository config from .ckb/config.json (overrides defaults)
- CLI flags (override everything)
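The precedence order can be pictured as a layered merge. The sketch below is an assumption about how such a merge might look, not the actual resolution code; `resolve_policy` is a hypothetical helper, and the default values are copied from the sample config above for illustration only.

```python
# Illustrative subset of defaults, taken from the sample config above.
DEFAULTS = {
    "failOnLevel": "error",
    "holdTheLine": True,
    "maxRiskScore": 0.7,
    "splitThreshold": 50,
}


def resolve_policy(defaults, repo_config, cli_flags):
    """Merge policy layers: defaults < .ckb/config.json < CLI flags.

    cli_flags maps policy keys to values; None means the flag was not
    passed and must not override lower layers (an assumption here).
    """
    policy = dict(defaults)
    policy.update(repo_config.get("review", {}))
    policy.update({k: v for k, v in cli_flags.items() if v is not None})
    return policy
```

For example, a repository config that sets maxRiskScore to 0.5 is still overridden by a CLI value of 0.8, while keys that neither layer sets keep their defaults.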
CLI Overrides
# Quality gates
ckb review --block-breaking=true --block-secrets=true
ckb review --require-tests --max-risk=0.8
ckb review --max-complexity=10 --max-files=100
ckb review --max-fanout=20 # blast-radius threshold
ckb review --dead-code-confidence=0.9 # stricter dead code
ckb review --test-gap-lines=10 # only flag larger functions
# Verdict control
ckb review --fail-on=warning # fail on warnings too
ckb review --fail-on=none # never fail (informational)
# Safety-critical paths
ckb review --critical-paths="drivers/**,auth/**"
# Traceability
ckb review --require-trace --trace-patterns="JIRA-\d+,GH-\d+"
# Reviewer independence
ckb review --require-independent --min-reviewers=2
# AI narrative
ckb review --llm
Code Health Scoring
The health check computes a 0-100 score per file using 8 weighted factors:
| Factor | Weight | Source | Scale |
|---|---|---|---|
| Cyclomatic complexity | 25% | tree-sitter | ≤5: 100, 6-10: 85, 11-20: 65, 21-30: 40, >30: 20 |
| Cognitive complexity | 15% | tree-sitter | Same scale as cyclomatic |
| File size (LOC) | 10% | Line count | ≤100: 100, 101-300: 85, 301-500: 70, 501-1000: 50, >1000: 30 |
| Churn (commits/30d) | 15% | git log | ≤2: 100, 3-5: 80, 6-10: 60, 11-20: 40, >20: 20 |
| Coupling (co-changes) | 10% | git log | ≤2: 100, 3-5: 80, 6-10: 60, >10: 40 |
| Bus factor (authors) | 10% | git blame | ≥5: 100, 3-4: 85, 2: 60, 1: 30 |
| Age (days since change) | 15% | git log | ≤30: 100, 31-90: 85, 91-180: 70, 181-365: 50, >365: 30 |
Grades
| Grade | Score Range |
|---|---|
| A | 90-100 |
| B | 70-89 |
| C | 50-69 |
| D | 30-49 |
| F | 0-29 |
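As a worked illustration of the tables above, the following sketch maps raw metrics to sub-scores, combines them with the listed weights, and assigns a grade. It assumes the per-file score is a plain weighted sum of the tabulated sub-scores; the metric keys and function names are hypothetical.

```python
# (inclusive upper bound, sub-score); None marks the catch-all bucket.
COMPLEXITY = [(5, 100), (10, 85), (20, 65), (30, 40), (None, 20)]
SCALES = {
    "cyclomatic": COMPLEXITY,
    "cognitive": COMPLEXITY,
    "loc": [(100, 100), (300, 85), (500, 70), (1000, 50), (None, 30)],
    "churn": [(2, 100), (5, 80), (10, 60), (20, 40), (None, 20)],
    "coupling": [(2, 100), (5, 80), (10, 60), (None, 40)],
    "age_days": [(30, 100), (90, 85), (180, 70), (365, 50), (None, 30)],
}
WEIGHTS = {
    "cyclomatic": 0.25, "cognitive": 0.15, "loc": 0.10, "churn": 0.15,
    "coupling": 0.10, "bus_factor": 0.10, "age_days": 0.15,
}


def sub_score(factor, value):
    """Look up the sub-score for one factor from the scale table."""
    if factor == "bus_factor":  # more authors is better, so buckets invert
        if value >= 5:
            return 100
        if value >= 3:
            return 85
        return 60 if value == 2 else 30
    for upper, score in SCALES[factor]:
        if upper is None or value <= upper:
            return score


def health_score(metrics):
    """Weighted sum of sub-scores (assumed combination rule)."""
    return round(sum(w * sub_score(f, metrics[f]) for f, w in WEIGHTS.items()), 2)


def grade(score):
    """Map a 0-100 score to the letter grades in the table above."""
    for floor, letter in ((90, "A"), (70, "B"), (50, "C"), (30, "D")):
        if score >= floor:
            return letter
    return "F"


EXAMPLE = {"cyclomatic": 12, "cognitive": 8, "loc": 250, "churn": 4,
           "coupling": 1, "bus_factor": 2, "age_days": 45}
```

For the example metrics the sub-scores are 65, 85, 85, 80, 100, 60, and 85, which combine to 78.25 under these weights, a grade of B.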
Confidence Factor
The confidence factor (0-1) indicates how reliable the score is:
- Reduced by 0.4 if tree-sitter parsing fails
- Reduced by 0.3 if all metrics are at default (no git data)
- Reduced by 0.2 if only bus factor is at default
Health is computed for both base and head versions. Findings are generated when a file degrades by more than 5 points.
Score Calculation
The review score starts at 100 and deducts points per finding:
| Severity | Points | Cap Per Check | Cap Per Rule |
|---|---|---|---|
| error | -10 | 20 | 10 |
| warning | -3 | 20 | 10 |
| info | -1 | 20 | 10 |
Maximum total deduction: 80 points (score floor: 20)
Three caps prevent noise from dominating the score:
- Per-rule cap (10): A single noisy rule (e.g., ckb/bug/discarded-error) can't consume its check's entire budget.
- Per-check cap (20): A single check (e.g., coupling with 100+ findings) can't overwhelm the score.
- Total cap (80): Large PRs where many checks fire don't produce meaningless scores.
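One way the three caps could compose is shown below. The nesting order (rule totals capped first, then check totals, then the grand total) is an assumption, as is the `review_score` name; only the point values and cap sizes come from the table.

```python
from collections import defaultdict

POINTS = {"error": 10, "warning": 3, "info": 1}
RULE_CAP, CHECK_CAP, TOTAL_CAP = 10, 20, 80


def review_score(findings):
    """Compute a 0-100 score from findings with capped deductions."""
    # Sum raw deductions per (check, rule) pair.
    per_rule = defaultdict(int)
    for f in findings:
        per_rule[(f["check"], f["ruleId"])] += POINTS[f["severity"]]

    # Cap each rule, then accumulate into its check.
    per_check = defaultdict(int)
    for (check, _rule), points in per_rule.items():
        per_check[check] += min(points, RULE_CAP)

    # Cap each check, then cap the grand total.
    total = sum(min(points, CHECK_CAP) for points in per_check.values())
    return 100 - min(total, TOTAL_CAP)
```

In this sketch, 50 discarded-error warnings cost only 10 points because of the per-rule cap, so adding one breaking-change error yields a score of 80. With every cap saturated the result is 100 - 80 = 20.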
Verdict Logic
fail-on-level = "error" (default), "warning", or "none"
"none" → always pass
"warning" → fail if any check has status "fail" or "warn"
"error" → fail if any check has status "fail"; warn if any has "warn"; otherwise pass
CI exit codes: 0 = pass, 1 = fail, 2 = warn
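The rules above translate directly into a small decision function. The sketch below is illustrative: `verdict` and `EXIT_CODES` are hypothetical names, and check statuses are assumed to be the strings "pass", "warn", and "fail".

```python
EXIT_CODES = {"pass": 0, "fail": 1, "warn": 2}


def verdict(check_statuses, fail_on="error"):
    """Derive the overall verdict from per-check statuses."""
    has_fail = "fail" in check_statuses
    has_warn = "warn" in check_statuses
    if fail_on == "none":
        return "pass"
    if fail_on == "warning":
        return "fail" if (has_fail or has_warn) else "pass"
    if fail_on == "error":
        if has_fail:
            return "fail"
        return "warn" if has_warn else "pass"
    raise ValueError(f"unknown fail-on level: {fail_on!r}")
```

A CI wrapper could then exit with `EXIT_CODES[verdict(statuses, fail_on)]`, which mirrors the exit codes listed above.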
Review Effort Estimation
CKB estimates human review effort based on Microsoft and Google code review research.
Time Model
| Code Category | Review Speed |
|---|---|
| New code | 200 LOC/hour |
| Refactored/modified | 300 LOC/hour |
| Moved/test/config | 500 LOC/hour (quick scan) |
Additional Factors
| Factor | Overhead |
|---|---|
| File switches (> 5 files) | 2 min per file |
| Module context switches (> 1 module) | 5 min per module beyond first |
| Safety-critical files | 10 min each |
| Minimum | 5 minutes |
Complexity Scale
| Level | Estimated Time |
|---|---|
| Trivial | < 20 min |
| Moderate | 20-60 min |
| Complex | 60-240 min |
| Very complex | > 240 min |
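The time model and overheads can be combined as in the sketch below. Several details are assumptions the tables leave open: file-switch overhead is charged for each file beyond the fifth, a total of exactly 60 minutes is labeled moderate, and the category keys and function names are hypothetical.

```python
# Review speed in LOC per hour, by change category.
SPEED = {"new": 200, "modified": 300, "refactoring": 300,
         "moved": 500, "test": 500, "config": 500}


def estimate_minutes(loc_by_category, files, modules, critical_files):
    """Estimate review time in minutes from the tables above."""
    minutes = sum(60 * loc / SPEED[cat] for cat, loc in loc_by_category.items())
    minutes += 2 * max(0, files - 5)    # assumed: per file beyond the fifth
    minutes += 5 * max(0, modules - 1)  # per module beyond the first
    minutes += 10 * critical_files      # safety-critical files
    return max(5, round(minutes))       # five-minute minimum


def complexity_label(minutes):
    """Map an estimate to the complexity scale."""
    if minutes < 20:
        return "trivial"
    if minutes <= 60:
        return "moderate"
    if minutes <= 240:
        return "complex"
    return "very complex"
```

For instance, 100 new and 150 modified lines across 8 files in 2 modules with 1 critical file come to 30 + 30 + 6 + 5 + 10 = 81 minutes under these assumptions, which lands in the complex band.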
Example Output
Estimated Review: ~45min (moderate)
· 20 min from 120 LOC
· 10 min from 5 file switches
· 15 min for 3 critical files
Narrative Summary
CKB generates a 2-3 sentence review summary automatically.
Deterministic (Default)
Sentence 1: "Changes N files across M modules (languages)."
Sentence 2: Risk signal (failed checks, top warnings, or "No blocking issues found.")
Sentence 3: Focus area (split suggestion, critical files, or omitted)
Example: "Changes 25 files across 3 modules (Go, TypeScript). 2 breaking API changes detected; 2 safety-critical files changed. 2 safety-critical files need focused review."
LLM-Powered (--llm)
With --llm, Claude generates a context-aware summary based on the top 10 findings, verdict, score, and health report.
- Model: claude-sonnet-4-20250514 (configurable)
- Timeout: 30 seconds
- Fallback: deterministic narrative on API failure
Change Classification
The classify check categorizes each file in the changeset:
| Category | Heuristic | Review Priority |
|---|---|---|
| new | File doesn't exist at base | High |
| refactoring | Renamed + changes | Medium |
| moved | Renamed + ≤20% content change | Low |
| churn | ≥3 commits in last 30 days | High |
| config | Makefiles, CI, Docker, etc. | Low |
| test | Test file patterns (*_test.go, test_*.py, etc.) | Medium |
| generated | Matches generated patterns/markers | Skip |
| modified | Default: none of the above | Medium |
The review effort estimate uses classification to adjust time: generated files are skipped, tests review faster, new code gets full review time.
Generated File Detection
Generated files are detected early and excluded from all other checks. This is the most important token-saving mechanism: in a monorepo with Protocol Buffers, this alone can eliminate 30-50% of files.
Default Patterns (Glob)
*.generated.*
*.pb.go # Protocol Buffers (Go)
*.pb.cc # Protocol Buffers (C++)
parser.tab.c # Yacc/Bison
lex.yy.c # Flex
Default Header Markers (First 10 Lines)
"DO NOT EDIT"
"Generated by"
"AUTO-GENERATED"
"This file is generated"
Both patterns and markers are configurable via policy.
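A detector along these lines could combine both signals as follows. This is a sketch under assumptions: globs are matched against the file's base name, markers are matched case-sensitively as substrings, and `is_generated` is a hypothetical name.

```python
import fnmatch
import os

PATTERNS = ["*.generated.*", "*.pb.go", "*.pb.cc", "parser.tab.c", "lex.yy.c"]
MARKERS = ["DO NOT EDIT", "Generated by", "AUTO-GENERATED", "This file is generated"]
HEADER_LINES = 10  # only the first 10 lines are scanned for markers


def is_generated(path, content):
    """Return (matched, reason) for a file path and its text content."""
    name = os.path.basename(path)
    for pattern in PATTERNS:
        if fnmatch.fnmatchcase(name, pattern):
            return True, f"pattern {pattern}"
    header = content.splitlines()[:HEADER_LINES]
    for marker in MARKERS:
        if any(marker in line for line in header):
            return True, f"marker {marker!r}"
    return False, ""
```

Checking the cheap glob first avoids reading file content for the common Protocol Buffers case; the header scan then catches generated files with ordinary names.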
PR Split Suggestion
When a PR exceeds splitThreshold files (default: 50), the split check analyzes the changeset and suggests independent clusters.
Algorithm
- Build adjacency graph of files
- Connect files in same module (fully connected)
- Enrich edges using coupling analysis (top 20 files, correlation ≥ 0.5)
- Find connected components via BFS
- If ≤1 component → no split recommended
- If >1 component → return independent clusters
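The algorithm above amounts to finding connected components in a file graph. The sketch below illustrates that idea with a breadth-first search; the input shapes, the `suggest_split` name, and the omission of the top-20 limit on coupling analysis are simplifications made for the example.

```python
from collections import deque


def suggest_split(files_by_module, coupled_pairs, min_correlation=0.5):
    """Return independent file clusters, largest first, or [] if no split.

    files_by_module: {module: [file, ...]}
    coupled_pairs: iterable of (file_a, file_b, correlation)
    """
    graph = {}
    # Files in the same module are fully connected.
    for files in files_by_module.values():
        for a in files:
            graph.setdefault(a, set()).update(b for b in files if b != a)
    # Coupling analysis adds cross-module edges.
    for a, b, correlation in coupled_pairs:
        if correlation >= min_correlation and a in graph and b in graph:
            graph[a].add(b)
            graph[b].add(a)

    seen, clusters = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:  # breadth-first traversal of one component
            node = queue.popleft()
            component.append(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        clusters.append(sorted(component))

    clusters.sort(key=len, reverse=True)
    return clusters if len(clusters) > 1 else []
```

In this sketch a strong coupling edge between two modules merges them into a single cluster, so only groups with no module or coupling link between them are proposed as separate PRs.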
Cluster Metadata
Each cluster reports:
- Name: dominant module
- Files: list with additions/deletions
- Languages: detected languages
- Independent: always true (connected components are independent by definition)
Sorted by file count descending.
Risk Scoring
The risk check computes a 0-1 risk score from four factors:
| Factor | Contribution |
|---|---|
| > 20 files changed | +0.3 (> 10: +0.15) |
| > 1000 LOC changes | +0.3 (> 500: +0.15) |
| Hotspot overlap | +0.1 per hotspot (max 0.3) |
| > 5 modules affected | +0.2 |
Risk levels: low (< 0.3), medium (0.3-0.6), high (> 0.6)
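The table can be read as an additive formula. The sketch below follows it literally and additionally clamps the result to 1.0, which is an assumption based on the stated 0-1 range; the function names are hypothetical.

```python
def risk_score(files, loc, hotspots, modules):
    """Additive risk score from the four factors, clamped to 0-1."""
    score = 0.0
    if files > 20:
        score += 0.3
    elif files > 10:
        score += 0.15
    if loc > 1000:
        score += 0.3
    elif loc > 500:
        score += 0.15
    score += min(0.1 * hotspots, 0.3)  # +0.1 per hotspot, max 0.3
    if modules > 5:
        score += 0.2
    return round(min(score, 1.0), 2)


def risk_level(score):
    """Map a score to low / medium / high."""
    if score < 0.3:
        return "low"
    return "medium" if score <= 0.6 else "high"
```

A PR touching 25 files and 600 lines with 2 hotspot overlaps in 3 modules scores 0.3 + 0.15 + 0.2 = 0.65 here, which is high risk but still under the 0.7 threshold shown in the sample policy.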
Traceability & Compliance
For regulated industries (IEC 61508, DO-178C, ISO 26262):
# Require ticket references in commits
ckb review --require-trace --trace-patterns="JIRA-\d+,GH-\d+"
# Require independent reviewer (author != reviewer)
ckb review --require-independent --min-reviewers=2
# Safety-critical path enforcement
ckb review --critical-paths="drivers/**,protocol/**"
# Full compliance output
ckb review --format=compliance
Traceability Sources
Configure where to look for ticket references:
- commit-message: scan commit messages
- branch-name: scan the branch name
Both are checked by default when traceability is enabled.
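Ticket matching over these sources can be sketched with plain regular expressions. The function `find_ticket_refs` and its parameters are hypothetical; the patterns are the ones used in the CLI examples above.

```python
import re


def find_ticket_refs(patterns, commit_messages, branch_name,
                     sources=("commit-message", "branch-name")):
    """Collect ticket references found in the enabled sources."""
    regexes = [re.compile(p) for p in patterns]
    texts = []
    if "commit-message" in sources:
        texts.extend(commit_messages)
    if "branch-name" in sources:
        texts.append(branch_name)
    refs = set()
    for text in texts:
        for regex in regexes:
            refs.update(m.group(0) for m in regex.finditer(text))
    return sorted(refs)


refs = find_ticket_refs(
    [r"JIRA-\d+", r"GH-\d+"],
    ["Fix parser (JIRA-123)", "cleanup"],
    "feature/GH-42-split",
)
```

An empty result would correspond to a change with no ticket linkage, which is the case the traceability check is meant to flag when enforcement is enabled.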
Finding Baselines
Track finding trends across releases:
# Save current findings as a baseline
ckb review baseline save --tag=v1.0
# List saved baselines
ckb review baseline list
# Compare two baselines
ckb review baseline diff v1.0 v2.0
Baseline diffs classify each finding as new, unchanged, or resolved.
Storage: .ckb/baselines/ (git-ignored)
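Conceptually, a baseline diff is a set comparison over finding identities. The sketch below assumes a finding is identified by its rule ID, file, and message; the source does not say how CKB fingerprints findings, and `diff_baselines` is a hypothetical name.

```python
def finding_key(finding):
    """Assumed identity of a finding; line numbers are ignored so that
    unrelated edits shifting code do not make a finding look new."""
    return (finding["ruleId"], finding["file"], finding["message"])


def diff_baselines(old_findings, new_findings):
    """Classify findings as new, unchanged, or resolved."""
    old_keys = {finding_key(f) for f in old_findings}
    new_keys = {finding_key(f) for f in new_findings}
    return {
        "new": sorted(new_keys - old_keys),
        "unchanged": sorted(new_keys & old_keys),
        "resolved": sorted(old_keys - new_keys),
    }
```

Comparing counts in the three buckets across releases gives the trend: a growing "new" bucket signals accumulating debt, while "resolved" shows cleanup.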
Token Savings for AI-Assisted Review
When used via MCP (reviewPR tool), CKB Review dramatically reduces token consumption for AI code reviewers. Five mechanisms work together:
1. Generated File Filtering
Generated files are excluded before any analysis. Protocol Buffers, build artifacts, lock files — none of this reaches the AI.
Savings: 30-50% of files in typical monorepos.
2. Structured Findings Instead of Raw Diffs
Instead of a 10KB+ unified diff, the AI receives a structured JSON response:
{
"verdict": "warn",
"score": 72,
"findings": [
{
"check": "breaking",
"severity": "error",
"file": "api/handler.go",
"startLine": 42,
"message": "Removed public function HandleAuth()",
"suggestion": "Update all call sites or provide a wrapper",
"tier": 1
}
]
}
Savings: 80-90% smaller than raw diff.
3. Hold-the-Line Filtering
Only new issues are reported. Historical debt is filtered out.
Savings: 40-70% fewer findings in codebases with existing debt.
4. Finding Tier Filtering
Top 10 findings by tier + severity. Tier 3 (informational) suppressed for AI summary.
5. Summary Statistics
Single numbers replace pages of analysis:
- "riskScore": 0.42 instead of reading all changed files
- "healthDelta": -0.8 instead of per-file health records
Cumulative Result
| Step | Remaining Volume |
|---|---|
| Raw diff (600 files) | 100% (~500K-1M tokens) |
| After generated-file filtering | ~60% |
| After Hold-the-Line | ~25% |
| Structured findings | ~5% |
| After tier filtering | ~2-3% (~10K-30K tokens) |
Typical savings: 85-95% token reduction on PRs with 50+ files.
Performance
CKB Review is optimized for large PRs. The engine runs 20 checks in parallel with a mutex-based tree-sitter scheduler that overlaps git subprocess work with static analysis.
| PR Size | Runtime | Git Calls | Tree-Sitter Calls |
|---|---|---|---|
| 10 files | ~2s | ~15 | ~20 |
| 100 files | ~8s | ~40 | ~60 |
| 600 files | ~15s | ~50 (batched) | ~60 (capped) |
Parallelization Strategy
Without lock (run in parallel): breaking, secrets, tests, hotspots, risk, coupling, dead-code, blast-radius, critical, traceability, independence
With tree-sitter mutex (serialized): complexity, health, test-gaps, bug-patterns, comment-drift, format-consistency
Lock-free checks typically complete within 200ms while tree-sitter checks process their queue. Net effect: ~6-8x speedup vs sequential execution.
Key Optimizations
- Batched git operations: one git log --name-only replaces 120+ individual calls for health scoring
- Parallel git blame: 5-worker pool instead of sequential calls
- Cached hotspot scores: computed once, shared across all checks
- Capped analysis: health limited to 30 files, coupling to 20, bug-patterns to 20
GitHub Action
CKB provides a composite action at action/ckb-review:
- uses: SimplyLiz/CodeMCP/action/ckb-review@main
  with:
    fail-on: 'error'              # or 'warning' / 'none'
    comment: 'true'               # post PR comment
    sarif: 'true'                 # upload to Code Scanning
    checks: ''                    # all checks (or comma-separated subset)
    critical-paths: 'drivers/**'
    require-trace: 'false'
    trace-patterns: ''
    require-independent: 'false'
    max-fanout: '0'               # blast-radius threshold (0 = disabled)
    dead-code-confidence: '0.8'
    test-gap-lines: '5'
Outputs: verdict (pass/warn/fail), score (0-100), findings (count)
For a complete workflow example, see the pr-review.yml template or Workflow Examples#pr-review.
MCP Tool
The reviewPR MCP tool exposes the same engine to AI assistants:
{
"tool": "reviewPR",
"arguments": {
"baseBranch": "main",
"checks": ["breaking", "secrets", "health", "bug-patterns"],
"failOnLevel": "error",
"criticalPaths": ["drivers/**"]
}
}
Returns the full review response: verdict, score, checks, findings, health report, split suggestion, change breakdown, reviewers, review effort estimate, and narrative summary.
CLI Reference
ckb review [scope]
Flags:
--format=human|json|markdown|github-actions|sarif|codeclimate|compliance
--base=main # Base branch
--head= # Head branch (default: HEAD)
--checks=breaking,secrets,... # Filter to specific checks
--ci # CI mode: exit 0=pass, 1=fail, 2=warn
--staged # Review staged changes instead of branch diff
--scope=internal/query # Filter to path prefix or symbol
# Quality Gates
--fail-on=error|warning|none # Verdict threshold
--block-breaking # Fail on breaking changes
--block-secrets # Fail on secret leaks
--require-tests # Warn if no tests affected
--max-risk=0.7 # Risk score threshold
--max-complexity=10 # Complexity delta threshold
--max-files=0 # Max file count (0=disabled)
--max-blast-radius=0 # Blast radius threshold
--max-fanout=0 # Alias for --max-blast-radius
# Safety-Critical Paths
--critical-paths=drivers/** # Glob patterns
# Traceability
--require-trace # Enforce ticket references
--trace-patterns=JIRA-\d+ # Ticket regex patterns
# Reviewer Independence
--require-independent # Enforce independent review
--min-reviewers=2 # Minimum independent reviewers
# Analyzer Tuning
--dead-code-confidence=0.8 # Dead code confidence threshold
--test-gap-lines=5 # Min function lines for test gap reporting
--llm # Use Claude for narrative generation
# Lint Deduplication
--lint-report=report.json # External lint report to deduplicate against
Response Structure
The full ReviewPRResponse contains:
| Field | Type | Description |
|---|---|---|
| ckbVersion | string | CKB version |
| schemaVersion | string | Response schema version |
| verdict | string | "pass", "warn", or "fail" |
| score | int | 0-100 review score |
| summary | object | File counts, LOC, modules, languages |
| checks | array | 20 check results with status, severity, duration |
| findings | array | Actionable items with file, line, message, suggestion, tier |
| reviewers | array | Suggested reviewers from CODEOWNERS + git blame |
| generated | array | Detected generated files with reasons |
| splitSuggestion | object | PR split clusters (if applicable) |
| changeBreakdown | object | Category counts (new/refactor/test/etc.) |
| reviewEffort | object | Time estimate with factors and complexity label |
| clusterReviewers | array | Per-cluster reviewer assignments |
| healthReport | object | Per-file health scores with deltas |
| narrative | string | 2-3 sentence review summary |
| prTier | string | "small" (< 100 LOC), "medium" (≤ 600), "large" (> 600) |
| provenance | object | Repo state ID, dirty flag, query duration |
Finding Structure
Each finding includes:
| Field | Description |
|---|---|
| check | Check name (e.g., "breaking", "bug-patterns") |
| severity | "error", "warning", or "info" |
| file | File path |
| startLine | Line number (0 = file-level) |
| message | Human-readable description |
| suggestion | Optional fix recommendation |
| ruleId | Machine-readable ID (e.g., ckb/breaking/removed-symbol) |
| hint | Optional drilldown hint (e.g., → ckb explain Symbol) |
| tier | 1 (blocking), 2 (important), 3 (informational) |
Related Pages
- Quality Gates — Individual gate thresholds and CI enforcement
- CI-CD-Integration — Full CI/CD integration guide
- Workflow Examples — Production-ready workflow templates
- Impact-Analysis — Blast radius and risk scoring details
- Security — Secret detection configuration
- Configuration — Global configuration options