Architecture

CommitBee is ~18K lines of Rust compiled to a single static binary with LTO.

Crate Structure

txt
src/
├── main.rs              # Entry point, tracing setup
├── lib.rs               # Library exports (for integration tests)
├── app.rs               # Application orchestrator (all the glue)
├── cli.rs               # CLI argument parsing (clap derive)
├── config.rs            # Configuration loading (figment layered)
├── error.rs             # Error types (thiserror + miette diagnostics)
├── domain/
│   ├── change.rs        # FileChange, StagedChanges, ChangeStatus
│   ├── symbol.rs        # CodeSymbol, SymbolKind, SpanChangeKind
│   ├── diff.rs          # SymbolDiff, ChangeDetail (structural AST diffs)
│   ├── context.rs       # PromptContext: assembles the LLM prompt
│   └── commit.rs        # CommitType enum (single source of truth)
└── services/
    ├── git.rs           # GitService: gix for discovery, git CLI for diffs
    ├── analyzer.rs      # AnalyzerService: tree-sitter parsing via rayon
    ├── context.rs       # ContextBuilder: evidence flags, token budget
    ├── differ.rs        # AstDiffer: structural comparison of old/new symbols
    ├── safety.rs        # Secret scanning (24 patterns), conflict detection
    ├── sanitizer.rs     # CommitSanitizer + CommitValidator
    ├── splitter.rs      # CommitSplitter: diff-shape + Jaccard clustering
    ├── progress.rs      # Progress indicators (indicatif spinners, TTY-aware)
    └── llm/
        ├── mod.rs       # LlmBackend enum dispatch, SYSTEM_PROMPT
        ├── ollama.rs    # OllamaProvider: streaming NDJSON
        ├── openai.rs    # OpenAiProvider: SSE streaming
        └── anthropic.rs # AnthropicProvider: SSE streaming

Key Design Decisions

Hybrid Git: gix (pure Rust) is used for fast repo discovery, but the git CLI is used for diffs and staging operations.
This avoids the complexity of reimplementing diff parsing in pure Rust while keeping startup fast.
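The CLI half of that split can be sketched with only the standard library. This is a minimal illustration, not CommitBee's actual code: the function names `staged_diff_command` and `staged_diff` are hypothetical, and the exact flag set passed to `git diff` is an assumption.

```rust
use std::io;
use std::process::Command;

/// Builds the `git diff` invocation for staged changes.
/// The flags chosen here are an assumption for illustration.
fn staged_diff_command(repo_root: &str) -> Command {
    let mut cmd = Command::new("git");
    cmd.arg("-C")
        .arg(repo_root) // run as if git had been started in repo_root
        .args(["diff", "--cached", "--no-color", "--unified=3"]);
    cmd
}

/// Runs the command and returns the raw diff text, or git's stderr as an error.
fn staged_diff(repo_root: &str) -> io::Result<String> {
    let output = staged_diff_command(repo_root).output()?;
    if !output.status.success() {
        let msg = String::from_utf8_lossy(&output.stderr).into_owned();
        return Err(io::Error::new(io::ErrorKind::Other, msg));
    }
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```

Keeping command construction separate from execution makes the argument list testable without a real repository.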

Full File Parsing: Tree-sitter parses the complete staged and HEAD versions of files, not just the diff hunks.
Diff hunks are then mapped to symbol spans. This means CommitBee knows the full context of what changed, not just the changed lines.
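Mapping hunks to symbol spans comes down to a line-range overlap test. The sketch below assumes a simplified data model: `SymbolSpan`, `Hunk`, and `touched_symbols` are hypothetical names, and the real `CodeSymbol` type likely carries more fields.

```rust
/// A parsed symbol with an inclusive, 1-based line span (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
struct SymbolSpan {
    name: String,
    start_line: usize,
    end_line: usize,
}

/// A diff hunk on the new side of the file: starting line plus line count.
#[derive(Debug, Clone, Copy)]
struct Hunk {
    start: usize,
    len: usize,
}

/// Returns the names of all symbols whose span overlaps at least one hunk.
fn touched_symbols<'a>(symbols: &'a [SymbolSpan], hunks: &[Hunk]) -> Vec<&'a str> {
    symbols
        .iter()
        .filter(|s| {
            hunks.iter().any(|h| {
                // A zero-length hunk (pure deletion) is treated as touching
                // the single line at `start`.
                let hunk_end = h.start + h.len.max(1) - 1;
                h.start <= s.end_line && s.start_line <= hunk_end
            })
        })
        .map(|s| s.name.as_str())
        .collect()
}
```

With symbols spanning lines 1-10, 12-20, and 22-30, a hunk covering lines 9-13 touches the first two symbols and leaves the third alone.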

Enum Dispatch: The LLM provider uses an enum (LlmBackend) rather than a trait object.
This avoids async-trait overhead and the complexity of dyn dispatch for async methods.
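The pattern looks roughly like this. The provider and enum names come from the crate layout above, but the `generate` method, the `model` field, and the stubbed bodies are assumptions; the real methods are async and stream their output, while this sketch is synchronous to stay dependency-free.

```rust
struct OllamaProvider { model: String }
struct OpenAiProvider { model: String }
struct AnthropicProvider { model: String }

// Stub implementations: each provider has its own inherent method.
impl OllamaProvider {
    fn generate(&self, prompt: &str) -> String {
        format!("[ollama:{}] {}", self.model, prompt)
    }
}
impl OpenAiProvider {
    fn generate(&self, prompt: &str) -> String {
        format!("[openai:{}] {}", self.model, prompt)
    }
}
impl AnthropicProvider {
    fn generate(&self, prompt: &str) -> String {
        format!("[anthropic:{}] {}", self.model, prompt)
    }
}

/// One variant per provider; no `Box<dyn Trait>` involved.
enum LlmBackend {
    Ollama(OllamaProvider),
    OpenAi(OpenAiProvider),
    Anthropic(AnthropicProvider),
}

impl LlmBackend {
    /// Dispatch is a plain `match`, resolved statically for each arm.
    fn generate(&self, prompt: &str) -> String {
        match self {
            LlmBackend::Ollama(p) => p.generate(prompt),
            LlmBackend::OpenAi(p) => p.generate(prompt),
            LlmBackend::Anthropic(p) => p.generate(prompt),
        }
    }
}
```

The trade-off is that adding a provider means adding a variant and a match arm, which the compiler enforces through exhaustiveness checking.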

Streaming with Cancellation: All providers support Ctrl+C cancellation via tokio_util::CancellationToken.
The streaming display runs in a separate tokio task with tokio::select! for responsive cancellation.

Token Budget: The context builder tracks character usage (~4 chars per token) and truncates the diff if it exceeds the budget, prioritizing the most important files.
The budget adapts based on available information: when structural AST diffs are present, the symbol allocation shrinks (20%) since the diffs carry precise detail; when only
signatures are available, symbols get 30%. The default 24K char budget (~6K tokens) is safe for 8K context models.
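The arithmetic behind those numbers can be sketched as follows. The percentages, the 24K default, and the 4-chars-per-token estimate come from the text above; the names `Budget`, `split_budget`, `estimate_tokens`, and `truncate_to_budget` are hypothetical, and giving the entire remainder to the diff is an assumption.

```rust
const CHARS_PER_TOKEN: usize = 4;
const DEFAULT_BUDGET_CHARS: usize = 24_000;

/// How a total character budget is divided (hypothetical shape).
#[derive(Debug, PartialEq)]
struct Budget {
    symbol_chars: usize,
    diff_chars: usize,
}

/// Symbols get 20% when structural AST diffs are present, 30% otherwise.
/// Assumption: everything left over goes to the raw diff.
fn split_budget(total_chars: usize, has_structural_diffs: bool) -> Budget {
    let symbol_percent = if has_structural_diffs { 20 } else { 30 };
    let symbol_chars = total_chars * symbol_percent / 100;
    Budget { symbol_chars, diff_chars: total_chars - symbol_chars }
}

/// Rough token estimate, rounded up.
fn estimate_tokens(chars: usize) -> usize {
    (chars + CHARS_PER_TOKEN - 1) / CHARS_PER_TOKEN
}

/// Cuts `diff` to at most `max_chars` characters without splitting a UTF-8 character.
fn truncate_to_budget(diff: &str, max_chars: usize) -> &str {
    match diff.char_indices().nth(max_chars) {
        Some((byte_idx, _)) => &diff[..byte_idx],
        None => diff, // already within budget
    }
}
```

With the default budget, `split_budget(DEFAULT_BUDGET_CHARS, true)` reserves 4,800 characters for symbols and 19,200 for the diff, and `estimate_tokens(DEFAULT_BUDGET_CHARS)` gives 6,000 tokens, which leaves headroom in an 8K context window for the system prompt and the response.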

Single Source of Truth for Types: CommitType::ALL is a const array that defines all valid commit types.
The system prompt's type list is checked against this array by a #[test], so any drift between the two fails the test suite rather than reaching a release.
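A check of this kind might look like the sketch below. `CommitType::ALL` and `SYSTEM_PROMPT` are named in the source, but the variant list, the prompt wording, the `as_str` method, and the `prompt_types` helper are all hypothetical stand-ins.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CommitType { Feat, Fix, Docs, Refactor, Test, Chore }

impl CommitType {
    /// The single list every other part of the program derives from.
    /// (Illustrative variants; the real list may differ.)
    const ALL: [CommitType; 6] = [
        CommitType::Feat, CommitType::Fix, CommitType::Docs,
        CommitType::Refactor, CommitType::Test, CommitType::Chore,
    ];

    fn as_str(self) -> &'static str {
        match self {
            CommitType::Feat => "feat",
            CommitType::Fix => "fix",
            CommitType::Docs => "docs",
            CommitType::Refactor => "refactor",
            CommitType::Test => "test",
            CommitType::Chore => "chore",
        }
    }
}

// Hypothetical prompt text; only the "Allowed types" line matters here.
const SYSTEM_PROMPT: &str =
    "You write commit messages.\nAllowed types: feat, fix, docs, refactor, test, chore\n";

/// Extracts the type list from the prompt's "Allowed types: " line.
fn prompt_types(prompt: &str) -> Vec<&str> {
    prompt
        .lines()
        .find_map(|line| line.strip_prefix("Allowed types: "))
        .map(|list| list.split(", ").collect())
        .unwrap_or_default()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn system_prompt_lists_exactly_the_known_types() {
        let expected: Vec<&str> = CommitType::ALL.iter().map(|t| t.as_str()).collect();
        assert_eq!(prompt_types(SYSTEM_PROMPT), expected);
    }
}
```

Adding a variant to the enum without updating the prompt text makes this test fail, which is the point of the guard.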

Error Philosophy

Every error in CommitBee is:

  • Actionable: Tells you what went wrong and how to fix it (via miette help messages)
  • Typed: Uses thiserror for structured error variants, not string errors
  • Diagnostic: Error codes like commitbee::git::no_staged for programmatic handling

No panics in user-facing code paths. The sanitizer and validator are tested with proptest to ensure they never panic on arbitrary input.

Testing Strategy

CommitBee has 424 tests across multiple strategies:

| Strategy | What It Covers |
| --- | --- |
| Unit tests | Individual functions (sanitizer rules, type parsing, config defaults) |
| Snapshot tests (insta) | Output format stability |
| Property tests (proptest) | Never-panic guarantees for parsers |
| Integration tests (wiremock) | Full provider round-trips with mocked HTTP |
| Git fixture tests | Real git operations in temp directories |

Run them:

bash
cargo test                    # All 424 tests
cargo test --test sanitizer   # Just sanitizer tests
cargo test --test integration # LLM provider mocks
COMMITBEE_LOG=debug cargo test -- --nocapture  # With logging