How CKB Works

CKB analyzes your code through a multi-stage pipeline that builds a queryable knowledge graph. Here's what happens under the hood.

The Analysis Pipeline

Your Code → Parsing → Symbol Extraction → Graph Building → Enrichment → Query Engine → Answers

1. Parsing with Tree-sitter

CKB uses Tree-sitter to parse your source files into concrete syntax trees (CSTs). This gives us:

Language-agnostic parsing — Same approach works for Go, TypeScript, Python, Rust, Java, and more
Incremental updates — Only re-parse changed files
Error tolerance — Partially broken code still parses

// Your code
func ProcessOrder(order Order) error {
    return db.Save(order)
}

// Tree-sitter sees (simplified; @function.name is a query capture, not part of the tree)
(function_declaration
  name: (identifier) @function.name
  parameters: (parameter_list ...)
  result: (type_identifier)
  body: (block ...))
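
To make the parsing step concrete, here is a minimal sketch that parses the same snippet and lists its function declarations. It uses Go's standard `go/parser` package as a stand-in for Tree-sitter so that it runs without extra dependencies; it is not CKB's parser, and the helper name `funcNames` and the file name `handler.go` are chosen only for illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// src wraps the ProcessOrder snippet in a package clause so it forms a complete file.
const src = `package orders

func ProcessOrder(order Order) error {
	return db.Save(order)
}
`

// funcNames parses Go source and returns the names of its top-level
// function and method declarations. go/parser stands in for Tree-sitter here.
func funcNames(source string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "handler.go", source, 0)
	if err != nil {
		return nil, err
	}
	var names []string
	for _, decl := range file.Decls {
		if fn, ok := decl.(*ast.FuncDecl); ok {
			names = append(names, fn.Name.Name)
		}
	}
	return names, nil
}

func main() {
	names, err := funcNames(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(names) // prints [ProcessOrder]
}
```

The parser only checks syntax, so the undefined names `Order` and `db` do not stop it. A purely syntax-based backend has the same property.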

2. Symbol Extraction

From the syntax tree, CKB extracts symbols — the meaningful identifiers in your code:

Symbol Type	Examples
Functions	`ProcessOrder`, `handleRequest`
Types	`Order`, `UserService`
Variables	`db`, `config`
Imports	`"github.com/your/package"`

Each symbol gets a unique identifier based on its fully-qualified path, so pkg/orders.ProcessOrder is distinct from pkg/returns.ProcessOrder.
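
A small sketch of what such an identifier scheme could look like. The `Symbol` struct, its fields, and the `Kind` constants are hypothetical and not CKB's actual data model; only the `pkg/orders.ProcessOrder` format comes from the text above, and the `pkg/returns/handler.go` path is invented for the example.

```go
package main

import "fmt"

// Kind labels the category of a symbol (names are illustrative).
type Kind string

const (
	KindFunction Kind = "function"
	KindType     Kind = "type"
	KindVariable Kind = "variable"
	KindImport   Kind = "import"
)

// Symbol is a hypothetical record for one extracted identifier.
type Symbol struct {
	Package string // e.g. "pkg/orders"
	Name    string // e.g. "ProcessOrder"
	Kind    Kind
	File    string
}

// ID joins package path and name in the "pkg/orders.ProcessOrder" form.
func (s Symbol) ID() string {
	return s.Package + "." + s.Name
}

func main() {
	a := Symbol{Package: "pkg/orders", Name: "ProcessOrder", Kind: KindFunction, File: "pkg/orders/handler.go"}
	b := Symbol{Package: "pkg/returns", Name: "ProcessOrder", Kind: KindFunction, File: "pkg/returns/handler.go"}

	// Keying an index by ID keeps same-named symbols in different packages apart.
	index := map[string]Symbol{a.ID(): a, b.ID(): b}
	fmt.Println(a.ID())     // pkg/orders.ProcessOrder
	fmt.Println(b.ID())     // pkg/returns.ProcessOrder
	fmt.Println(len(index)) // 2
}
```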

3. Building the Knowledge Graph

CKB connects symbols into a directed graph:

┌─────────────┐     calls      ┌──────────────┐
│ProcessOrder │───────────────▶│   db.Save    │
└─────────────┘                └──────────────┘
       │                              │
       │ uses type                    │ uses type
       ▼                              ▼
┌─────────────┐                ┌──────────────┐
│    Order    │                │    Record    │
└─────────────┘                └──────────────┘

The graph captures:

Call relationships — What functions call what
Type usage — What types are used where
Import dependencies — What packages depend on what
File membership — What symbols live in what files

4. Enrichment Layers

On top of the base graph, CKB adds contextual information:

Ownership — Who owns what code, derived from:

Git blame history
CODEOWNERS files
Commit patterns

Complexity Metrics — How complex each function is:

Cyclomatic complexity
Cognitive complexity
Lines of code
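
As an illustration of the first metric, the sketch below computes a common approximation of cyclomatic complexity: one plus the number of decision points (`if`, `for`, `range`, non-default `case`, `&&`, `||`). It uses Go's `go/ast` package and an invented sample function; how CKB itself counts is not specified here, so treat the exact rules as an assumption.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// sample is an invented function with a loop, a branch, and two boolean operators.
const sample = `package orders

func Validate(items []int, strict bool) bool {
	for _, n := range items {
		if n < 0 || (strict && n == 0) {
			return false
		}
	}
	return true
}
`

// cyclomatic returns, per function, 1 plus the number of decision points.
func cyclomatic(source string) (map[string]int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "sample.go", source, 0)
	if err != nil {
		return nil, err
	}
	scores := map[string]int{}
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Body == nil {
			continue
		}
		score := 1
		ast.Inspect(fn.Body, func(n ast.Node) bool {
			switch x := n.(type) {
			case *ast.IfStmt, *ast.ForStmt, *ast.RangeStmt:
				score++
			case *ast.CaseClause:
				if x.List != nil { // a default clause has a nil List
					score++
				}
			case *ast.BinaryExpr:
				if x.Op == token.LAND || x.Op == token.LOR {
					score++
				}
			}
			return true
		})
		scores[fn.Name.Name] = score
	}
	return scores, nil
}

func main() {
	scores, err := cyclomatic(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(scores) // prints map[Validate:5]
}
```

`Validate` scores 5: the base path, plus one each for `range`, `if`, `||`, and `&&`.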

Change History — How often code changes:

Commit frequency (hotspots)
Recent modifications
Churn rate
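
A hotspot ranking can be sketched as a per-file commit count inside a time window. The `Commit` type below stands for an already-parsed git log entry and is hypothetical, as are the file paths and dates; the sketch does not call git.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Commit is a hypothetical, already-parsed git log entry.
type Commit struct {
	When  time.Time
	Files []string
}

// Hotspot pairs a file with the number of commits that touched it in the window.
type Hotspot struct {
	File    string
	Commits int
}

// hotspots counts commits per file since the cutoff, most-changed first.
func hotspots(log []Commit, since time.Time) []Hotspot {
	counts := map[string]int{}
	for _, c := range log {
		if c.When.Before(since) {
			continue
		}
		for _, f := range c.Files {
			counts[f]++
		}
	}
	result := make([]Hotspot, 0, len(counts))
	for f, n := range counts {
		result = append(result, Hotspot{File: f, Commits: n})
	}
	sort.Slice(result, func(i, j int) bool {
		if result[i].Commits != result[j].Commits {
			return result[i].Commits > result[j].Commits
		}
		return result[i].File < result[j].File // stable order for ties
	})
	return result
}

// exampleLog returns invented history plus a cutoff 30 days before "now".
func exampleLog() ([]Commit, time.Time) {
	day := 24 * time.Hour
	now := time.Date(2024, 1, 31, 0, 0, 0, 0, time.UTC)
	log := []Commit{
		{When: now.Add(-1 * day), Files: []string{"pkg/orders/handler.go"}},
		{When: now.Add(-3 * day), Files: []string{"pkg/orders/handler.go", "pkg/batch/run.go"}},
		{When: now.Add(-90 * day), Files: []string{"pkg/batch/run.go"}},
	}
	return log, now.Add(-30 * day)
}

func main() {
	log, since := exampleLog()
	for _, h := range hotspots(log, since) {
		fmt.Println(h.File, h.Commits)
	}
	// prints:
	// pkg/orders/handler.go 2
	// pkg/batch/run.go 1
}
```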

Documentation — What's documented:

Doc comments
README references
ADR (Architecture Decision Record) links

5. The Query Engine

When you ask CKB a question, the query engine:

Routes your query to the right analyzer
Traverses the knowledge graph
Merges results from multiple sources
Compresses output for efficient LLM consumption
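
The merge and compress steps might look roughly like the sketch below. The `Result` type, the precision ranking, and the truncation-based `compress` are all assumptions made for illustration; the text above does not specify how CKB merges or compresses.

```go
package main

import (
	"fmt"
	"sort"
)

// Result is a hypothetical hit reported by one source.
type Result struct {
	SymbolID string
	Source   string // e.g. "scip" or "tree-sitter"
}

// precision ranks sources; the higher value wins when two sources report
// the same symbol. The ranking is an assumption for this sketch.
var precision = map[string]int{"scip": 2, "tree-sitter": 1}

// merge deduplicates by symbol ID, keeping the hit from the most precise source.
func merge(batches ...[]Result) []Result {
	best := map[string]Result{}
	for _, batch := range batches {
		for _, r := range batch {
			cur, seen := best[r.SymbolID]
			if !seen || precision[r.Source] > precision[cur.Source] {
				best[r.SymbolID] = r
			}
		}
	}
	merged := make([]Result, 0, len(best))
	for _, r := range best {
		merged = append(merged, r)
	}
	sort.Slice(merged, func(i, j int) bool { return merged[i].SymbolID < merged[j].SymbolID })
	return merged
}

// compress caps the list at limit entries and reports how many were dropped.
func compress(results []Result, limit int) ([]Result, int) {
	if len(results) <= limit {
		return results, 0
	}
	return results[:limit], len(results) - limit
}

func main() {
	fromSCIP := []Result{{"OrderController.Create", "scip"}, {"OrderService.Submit", "scip"}}
	fromSyntax := []Result{{"OrderController.Create", "tree-sitter"}, {"BatchProcessor.Run", "tree-sitter"}}

	merged := merge(fromSCIP, fromSyntax)
	for _, r := range merged {
		fmt.Println(r.SymbolID, "via", r.Source)
	}
	kept, dropped := compress(merged, 2)
	fmt.Println(len(kept), "kept,", dropped, "omitted")
}
```

Reporting the number of omitted entries, rather than silently truncating, lets the consumer know the answer is partial.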

Example: Impact Analysis

Let's trace what happens when you ask "What breaks if I change ProcessOrder?"

ckb impact ProcessOrder

Step 1: Find the symbol

Query: "ProcessOrder"
Result: pkg/orders/handler.go:ProcessOrder (function)

Step 2: Traverse callers (reverse call graph)

ProcessOrder
├── called by: OrderController.Create
├── called by: BatchProcessor.Run
└── called by: OrderService.Submit

Step 3: Expand transitively

OrderController.Create
├── called by: HTTP handler /api/orders
└── tested by: orders_test.go

BatchProcessor.Run
├── called by: CronJob scheduler
└── tested by: batch_test.go

Step 4: Assess risk

Impact Score: HIGH
- 3 direct callers
- 2 test files affected
- 1 public API endpoint
- Last changed: 3 days ago (active code)

Step 5: Format response

Changing ProcessOrder affects:
- 3 functions in 2 packages
- Tests: orders_test.go, batch_test.go
- Risk: HIGH (public API surface)
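
Steps 2 through 4 amount to a breadth-first walk over the reverse call graph followed by a scoring rule. The sketch below reuses the callers from the walkthrough and omits the "tested by" edges for brevity. The scoring thresholds are invented for illustration and are not CKB's actual formula.

```go
package main

import "fmt"

// callers maps a function to its direct callers (the reverse call graph).
// The entries mirror the walkthrough above.
var callers = map[string][]string{
	"ProcessOrder":           {"OrderController.Create", "BatchProcessor.Run", "OrderService.Submit"},
	"OrderController.Create": {"HTTP handler /api/orders"},
	"BatchProcessor.Run":     {"CronJob scheduler"},
}

// publicEntry marks nodes that belong to the public API surface.
var publicEntry = map[string]bool{"HTTP handler /api/orders": true}

// Impact summarizes what a change to one symbol reaches.
type Impact struct {
	Direct     []string
	Transitive []string // everything reachable, in breadth-first order
	Risk       string
}

// analyze walks callers breadth-first and applies a hypothetical risk rule.
func analyze(symbol string) Impact {
	imp := Impact{Direct: callers[symbol], Risk: "LOW"}
	seen := map[string]bool{symbol: true}
	queue := []string{symbol}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		for _, c := range callers[cur] {
			if seen[c] {
				continue // guards against cycles and repeated callers
			}
			seen[c] = true
			imp.Transitive = append(imp.Transitive, c)
			queue = append(queue, c)
		}
	}
	// Hypothetical rule: three or more direct callers means MEDIUM,
	// and reaching any public entry point means HIGH.
	if len(imp.Direct) >= 3 {
		imp.Risk = "MEDIUM"
	}
	for _, n := range imp.Transitive {
		if publicEntry[n] {
			imp.Risk = "HIGH"
		}
	}
	return imp
}

func main() {
	imp := analyze("ProcessOrder")
	fmt.Println("direct callers:", len(imp.Direct))   // direct callers: 3
	fmt.Println("reachable:", len(imp.Transitive))    // reachable: 5
	fmt.Println("risk:", imp.Risk)                    // risk: HIGH
}
```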

Storage Architecture

CKB stores the knowledge graph in SQLite for portability:

.ckb/
├── index.db          # Symbol graph, relationships
├── ownership.db      # Code ownership data
├── cache/            # Query result cache
└── telemetry.db      # Usage analytics (opt-in)

Why SQLite?

No server to run — works offline
Fast queries — optimized for graph traversal
Portable — copy the folder to share the index
Incremental — update without full rebuild

Backends and Fallbacks

CKB uses multiple backends depending on what's available:

Backend	What it provides	When used
SCIP	Precise cross-references	When SCIP index exists
Tree-sitter	Syntax-based symbols	Always (fallback)
LSP	Real-time analysis	IDE integrations
Git	History and blame	Ownership queries

If you have a SCIP index (generated by scip-go, scip-typescript, etc.), CKB uses it for compiler-accurate references. Otherwise, Tree-sitter provides syntax-based analysis that needs no setup, at the cost of less precise cross-references.
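
The fallback behavior can be sketched as picking the first available backend from a preference list. The `Backend` type and both function names are hypothetical; how CKB actually detects a SCIP index is not covered here, so availability is passed in as a plain flag.

```go
package main

import "fmt"

// Backend is a hypothetical descriptor: a name plus an availability probe.
type Backend struct {
	Name      string
	Available func() bool
}

// pick returns the first available backend in preference order.
func pick(preferred []Backend) (string, bool) {
	for _, b := range preferred {
		if b.Available() {
			return b.Name, true
		}
	}
	return "", false
}

// referenceBackends builds the order described above: SCIP first,
// Tree-sitter as the always-available fallback.
func referenceBackends(hasSCIPIndex bool) []Backend {
	return []Backend{
		{Name: "scip", Available: func() bool { return hasSCIPIndex }},
		{Name: "tree-sitter", Available: func() bool { return true }},
	}
}

func main() {
	withIndex, _ := pick(referenceBackends(true))
	withoutIndex, _ := pick(referenceBackends(false))
	fmt.Println(withIndex)    // scip
	fmt.Println(withoutIndex) // tree-sitter
}
```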

What CKB Doesn't Do

Understanding limitations helps set expectations:

No runtime analysis — CKB analyzes static code, not execution paths
No modification — CKB is read-only; it never changes your code
No dynamic dispatch resolution — Calls through interfaces are not resolved to concrete implementations without explicit hints
No analysis of external dependencies — CKB analyzes your code, not third-party packages such as node_modules
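
The dynamic dispatch limitation is easiest to see in code. In the sketch below, `ProcessOrder` calls `Save` through an interface, so the call site names only `Saver.Save`; which concrete method runs depends on the value passed at run time. All type names here are invented for the example.

```go
package main

import "fmt"

type Order struct{ ID int }

// Saver is the interface ProcessOrder depends on.
type Saver interface {
	Save(Order) error
}

// SQLStore and MemoryStore are two hypothetical implementations.
type SQLStore struct{}

func (SQLStore) Save(o Order) error {
	fmt.Println("sql save", o.ID)
	return nil
}

type MemoryStore struct{ saved []Order }

func (m *MemoryStore) Save(o Order) error {
	m.saved = append(m.saved, o)
	return nil
}

// ProcessOrder calls Save through the interface. A syntax-level call graph
// sees an edge to Saver.Save only, not to either concrete method.
func ProcessOrder(s Saver, o Order) error {
	return s.Save(o)
}

func main() {
	mem := &MemoryStore{}
	_ = ProcessOrder(SQLStore{}, Order{ID: 1}) // runs SQLStore.Save
	_ = ProcessOrder(mem, Order{ID: 2})        // runs (*MemoryStore).Save
	fmt.Println(len(mem.saved))                // 1
}
```

An impact query on `SQLStore.Save` would therefore miss `ProcessOrder` unless something tells the analyzer that `SQLStore` implements `Saver`.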

Performance Characteristics

Operation	Typical Time	Notes
Initial index	10-60s	Depends on codebase size
Incremental update	1-5s	Only changed files
Simple query	10-50ms	Single symbol lookup
Impact analysis	100-500ms	Graph traversal
Full-repo search	200ms-2s	Depends on result count

CKB is designed for interactive use: symbol lookups return in tens of milliseconds, and most graph queries finish in well under a second.

Next Steps

Quick Start — Index your first repository
Architecture — Deep dive into system design
Impact Analysis — Learn about blast radius queries