
Overview

KODEGEN.ᴀɪ is a blazing-fast, Rust-native Model Context Protocol (MCP) server with 75 elite auto-coding tools designed for professional, autonomous code generation with predictable, high-quality results.

Every tool has been hyper-optimized for:

  • Speed - Code it faster with native Rust performance
  • Context Efficiency - Code it cheaper with minimal token usage
  • Agent-First Design - Built for AI workflows, not humans

Key Features

🗂️ Warp Speed Mods

14 filesystem tools optimized for coding workflows with atomic operations and concurrent traversal.

💻 Terminal as a Tool

Full VT100 pseudoterminal sessions with smart state detection and real-time output streaming.

🧠 Reasoning Chains

Stateful thinking sessions with branching, revision, and unlimited context across extended problem-solving.

🔮 Agents with Agents

N-depth agent delegation with full prompt control for hierarchical, coordinated agent pyramids.

📊 LLM Observability

Track tool usage, analyze patterns, and optimize workflows with built-in introspection.

📝 Agents Manage Prompts

Create and manage reusable prompt templates with Jinja2 rendering and schema validation.

Installation

Quick Install

Install KODEGEN.ᴀɪ with a single command:

curl -fsSL https://kodegen.ai/install | sh

Automatic Editor Configuration

After installation, automatically configure all detected MCP clients with one command:

kodegen install

This will scan your system and automatically configure:

  • ✅ Claude Desktop - auto-configures claude_desktop_config.json
  • ✅ Windsurf - auto-configures Windsurf MCP settings
  • ✅ Cursor - auto-configures Cursor AI settings
  • ✅ Zed - auto-configures Zed editor settings
  • ✅ Roo Code - auto-configures Roo Code settings

What it does:

  • 🔍 Scans for installed MCP-compatible editors
  • 📝 Creates config files if they don't exist
  • ⚡ Injects KODEGEN into existing configs
  • 💾 Creates backups before modification
  • ✅ Reports results with detailed status

Manual Installation

For manual installation or to build from source:

# Clone the repository
git clone https://github.com/cyrup-ai/kodegen.git
cd kodegen

# Build with Cargo
cargo build --release

# Install to system
cargo install --path .

Manual MCP Client Configuration

Claude Desktop (Manual)

Add to your Claude Desktop configuration file (on macOS: ~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "kodegen": {
      "command": "kodegen"
    }
  }
}

Other MCP Clients

KODEGEN.ᴀɪ works with any MCP-compatible client. Use kodegen install for automatic configuration, or manually add the server config to:

  • Claude Code
  • Continue
  • Cline
  • VSCode MCP extension
  • Any other MCP-compatible client

Configuration

Runtime Tool Selection

Control which tools are active at runtime by passing arguments to the kodegen binary in your MCP client configuration:

Method 1: Comma-Separated List

{
  "mcpServers": {
    "kodegen": {
      "command": "kodegen",
      "args": ["--tools", "filesystem,terminal,sequential_thinking"]
    }
  }
}

Method 2: Multiple Flags

{
  "mcpServers": {
    "kodegen": {
      "command": "kodegen",
      "args": [
        "--tool", "filesystem",
        "--tool", "terminal",
        "--tool", "sequential_thinking"
      ]
    }
  }
}

Available Tool Categories

  • filesystem - 14 file operation tools
  • terminal - 5 terminal/process management tools
  • process - 2 process management tools
  • sequential_thinking - 1 reasoning chain tool
  • claude_agent - 5 sub-agent orchestration tools
  • citescrape - 4 web scraping and search tools
  • prompt - 4 prompt template management tools
  • introspection - 2 observability tools
  • git - 20 git repository tools
  • github - 16 GitHub API tools
  • config - 2 configuration management tools

If no arguments are provided, all compiled tool categories are enabled by default.

Runtime Configuration

AI agents can modify configuration at runtime using the set_config_value tool:

set_config_value({
  "key": "file_read_line_limit",
  "value": 5000
})

Available Configuration Options

File Operations

  • file_read_line_limit - Maximum lines to read per file (default: 2000)
  • file_write_line_limit - Maximum lines to write per operation (default: 1000)
  • fuzzy_search_threshold - Similarity threshold for fuzzy matching (default: 0.8)

Security

  • blocked_commands - List of commands to block in terminal sessions
  • allowed_paths - Whitelist of paths for file operations

Performance

  • max_search_results - Maximum search results to return (default: 100)
  • terminal_timeout - Terminal command timeout in seconds (default: 300)

Advanced Users

Custom Builds with Feature Gates

Create hyper-optimized binaries by compiling only the tools you need. KODEGEN uses Cargo feature gates to enable/disable tool categories at compile time, resulting in smaller binaries and faster startup.

Building with Specific Features

By default, all features are enabled. To build a minimal binary with only specific tools:

# Build with only filesystem and terminal tools
cargo build --release \
  --no-default-features \
  --features "filesystem,terminal"

# Install custom build
cargo install --path . \
  --no-default-features \
  --features "filesystem,terminal,sequential_thinking"

Available Feature Flags

  • filesystem - 14 tools for file operations, searching, and editing (~800KB)
  • terminal - 5 tools for terminal sessions and command execution (~300KB)
  • sequential_thinking - 1 tool for stateful reasoning chains (~150KB)
  • claude_agent - 5 tools for sub-agent orchestration (~400KB)
  • prompt - 4 tools for prompt template management (~250KB)
  • introspection - 2 tools for usage tracking and observability (~100KB)
  • process - 2 tools for process management (~150KB)

Common Build Profiles

Minimal Coding Assistant (Filesystem + Terminal)

cargo install --path . \
  --no-default-features \
  --features "filesystem,terminal"

# Smallest binary: ~1.2MB (vs ~5.5MB full build)
# Perfect for: Basic file operations and command execution

Thinking Agent (Filesystem + Sequential Thinking + Agents)

cargo install --path . \
  --no-default-features \
  --features "filesystem,sequential_thinking,claude_agent"

# Medium binary: ~1.8MB
# Perfect for: Research, analysis, and multi-step reasoning

Full-Featured Build (Default)

cargo install --path .

# Complete binary: ~5.5MB
# Includes: All 75 tools across 11 categories

Combining Compile-Time and Runtime Filtering

For maximum optimization, combine feature gates with runtime arguments:

# 1. Build with only filesystem and terminal features
cargo install --path . \
  --no-default-features \
  --features "filesystem,terminal"

# 2. Configure MCP client to use only filesystem tools
{
  "mcpServers": {
    "kodegen": {
      "command": "kodegen",
      "args": ["--tool", "filesystem"]
    }
  }
}

# Result: Smallest binary + fastest startup + minimal memory footprint

Checking Available Tools

List all tool categories compiled into your binary:

kodegen --list-categories

This shows which features were enabled at compile time.

Performance Comparison

Build Configuration             | Binary Size | Startup Time | Memory Usage
--------------------------------|-------------|--------------|-------------
Full Build (All Features)       | ~5.5MB      | ~30ms        | ~12MB
Minimal (filesystem + terminal) | ~1.2MB      | ~12ms        | ~4MB
Filesystem Only                 | ~900KB      | ~8ms         | ~3MB

Note: Measurements are approximate and may vary by platform and usage patterns.

What Your Coding Agent Can Do

KODEGEN empowers LLM coding agents with 75 specialized tools. Here's what makes each category special:

🗂️ Filesystem Tools (14 tools)

📖 Memory-Efficient File Reading

What it enables: Your agent can read any file - from tiny configs to gigabyte-sized logs - without loading everything into memory.

Technical innovations:

  • Ring Buffer Streaming: Reading the last 100 lines of a 1GB file uses only ~10KB of memory thanks to a VecDeque-based circular buffer
  • One-Pass Efficiency: Counts total lines while reading rather than in a separate pass - one traversal instead of two
  • Negative Offset Magic: Want the last 30 lines? Just specify offset: -30 and the agent gets tail behavior automatically
  • Smart Image Handling: Automatically detects PNG, JPEG, GIF, WebP and returns base64-encoded data
  • URL Support: Can fetch files from HTTP/HTTPS endpoints with 30-second timeout protection
  • Helpful Annotations: Partial reads include headers like "[Reading lines 100-200 of 1000 total]"
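The ring-buffer tail read described above can be sketched in a few lines. This is an illustrative implementation (not KODEGEN's actual code) showing how a `VecDeque` keeps memory bounded by the requested line count while still counting every line in a single pass:

```rust
use std::collections::VecDeque;
use std::io::{BufRead, BufReader, Read};

/// Stream a reader, keeping only the last `n` lines in memory,
/// and count the total number of lines in the same pass.
fn tail_lines<R: Read>(reader: R, n: usize) -> (Vec<String>, usize) {
    let mut ring: VecDeque<String> = VecDeque::with_capacity(n);
    let mut total = 0;
    for line in BufReader::new(reader).lines().map_while(Result::ok) {
        total += 1;
        if ring.len() == n {
            ring.pop_front(); // discard oldest line; buffer never exceeds n
        }
        ring.push_back(line);
    }
    (ring.into_iter().collect(), total)
}

fn main() {
    let data = "a\nb\nc\nd\ne\n";
    let (last, total) = tail_lines(data.as_bytes(), 2);
    assert_eq!(last, vec!["d", "e"]);
    // Matches the annotation style described above:
    println!("[Reading lines {}-{} of {} total]", total - last.len() + 1, total, total);
}
```

Because the buffer holds at most `n` lines, memory usage depends on the tail size requested, not on the file size.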

✏️ Intelligent Code Editing

What it enables: Surgical precision edits with automatic error recovery when exact matches fail.

Technical innovations:

  • Fuzzy Matching Fallback: When exact match fails, automatically tries fuzzy search with configurable similarity threshold (default: 0.8)
  • Character-Level Diff Analysis: Shows exactly what's different with Unicode-aware character analysis
  • Whitespace Detection: Identifies when differences are only whitespace (spaces vs tabs, CRLF vs LF)
  • Line Ending Normalization: Automatically handles Windows (CRLF) vs Unix (LF) line endings
  • Intelligent Error Messages: When a fuzzy match is found, displays character codes for invisible characters and suggests fixes
  • Fire-and-Forget Logging: Async logging that never blocks execution, tracking all edit attempts for debugging
  • Expected Count Validation: Warns if actual replacement count differs from expected
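A minimal sketch of the fallback logic, assuming a plain Levenshtein similarity as the metric (KODEGEN's actual fuzzy matcher and character-level diff analysis are more sophisticated):

```rust
/// Classic dynamic-programming edit distance between two strings.
fn levenshtein(a: &str, b: &str) -> usize {
    let (a, b): (Vec<char>, Vec<char>) = (a.chars().collect(), b.chars().collect());
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            cur.push((prev[j] + cost).min(prev[j + 1] + 1).min(cur[j] + 1));
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Similarity in [0, 1]; 1.0 means identical.
fn similarity(a: &str, b: &str) -> f64 {
    let max_len = a.chars().count().max(b.chars().count());
    if max_len == 0 { return 1.0; }
    1.0 - levenshtein(a, b) as f64 / max_len as f64
}

/// Exact match first; on failure, normalize CRLF line endings and
/// accept a fuzzy match above the threshold (default 0.8).
fn matches(haystack: &str, needle: &str, threshold: f64) -> bool {
    let (h, n) = (haystack.replace("\r\n", "\n"), needle.replace("\r\n", "\n"));
    h.contains(n.as_str()) || similarity(&h, &n) >= threshold
}

fn main() {
    assert!(matches("fn main() {}\r\n", "fn main() {}\n", 0.8)); // CRLF vs LF
    assert!(matches("let x = 1;", "let x = 2;", 0.8));           // near match, similarity 0.9
    assert!(!matches("completely", "different!!", 0.8));         // genuinely different
}
```

Normalizing line endings before comparing is what lets an edit written against LF content still apply cleanly to a CRLF file.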

🔍 Progressive Search

What it enables: Stream search results as they're found instead of waiting for complete scan.

  • Streaming Results: Get results progressively, cancel early when found what you need
  • Regex Support: Full regular expression pattern matching across codebases
  • Concurrent Traversal: Multi-threaded directory walking for faster results

📦 Batch Operations

What it enables: Read multiple files in parallel, move/delete/create with atomic operations.

  • Parallel Reading: read_multiple_files processes files concurrently
  • Atomic Moves: Safe file relocation with validation
  • Smart Metadata: get_file_info returns size, timestamps, permissions, line counts
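The parallel-read idea can be sketched with one thread per file. This is an illustrative stand-in for read_multiple_files (the real server presumably uses async tasks rather than OS threads), returning results in input order with per-file errors rather than failing the whole batch:

```rust
use std::fs;
use std::path::PathBuf;
use std::thread;

/// Read several files concurrently, one thread per file,
/// preserving input order and reporting errors per-file.
fn read_multiple(paths: Vec<PathBuf>) -> Vec<(PathBuf, std::io::Result<String>)> {
    let handles: Vec<_> = paths
        .into_iter()
        .map(|p| thread::spawn(move || {
            let contents = fs::read_to_string(&p); // an error here is local to this file
            (p, contents)
        }))
        .collect();
    handles.into_iter().map(|h| h.join().expect("reader thread panicked")).collect()
}

fn main() {
    let dir = std::env::temp_dir();
    let a = dir.join("kodegen_demo_a.txt");
    let b = dir.join("kodegen_demo_b.txt");
    fs::write(&a, "alpha").unwrap();
    fs::write(&b, "beta").unwrap();
    let results = read_multiple(vec![a.clone(), b.clone()]);
    assert_eq!(results[0].1.as_deref().unwrap(), "alpha");
    assert_eq!(results[1].1.as_deref().unwrap(), "beta");
    fs::remove_file(a).ok();
    fs::remove_file(b).ok();
}
```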

💻 Terminal Tools (5 tools)

🖥️ Full Pseudoterminal Emulation

What it enables: LLM agents can run commands exactly like you would in a terminal - builds, tests, interactive CLIs, anything.

Technical innovations:

  • True PTY Support: Full VT100 pseudoterminal implementation, not just command execution
  • Smart State Detection: Automatically detects three states: command finished, still running, or waiting for REPL input
  • REPL Recognition: Knows when interactive prompts (Python >>>, Node >, etc.) are ready for input
  • Session Management: Track multiple running processes by PID, read output streams independently
  • Output Streaming: Real-time output access for long-running builds without blocking
  • Command Validation: Security layer blocks dangerous commands (rm -rf, sudo, etc.) with smart parsing
  • Initial Delay Control: Configurable wait (default 100ms) for quick commands to complete before first response
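The waiting-for-input case can be approximated with a prompt heuristic on the session's latest output line. The prompt strings below are illustrative examples, not KODEGEN's actual detection rules:

```rust
/// Heuristic: does the last line of output look like an interactive
/// prompt waiting for input? Prompt list is illustrative only.
fn awaiting_repl_input(output: &str) -> bool {
    const PROMPTS: [&str; 3] = [">>> ", "node> ", "irb> "];
    let last_line = output.lines().last().unwrap_or("");
    PROMPTS
        .iter()
        .any(|p| last_line == p.trim_end() || last_line.ends_with(p))
}

fn main() {
    assert!(awaiting_repl_input("Python 3.12.0\n>>> "));  // Python REPL ready
    assert!(!awaiting_repl_input("cargo build finished\n")); // command output, no prompt
}
```

A real implementation would combine this with process-exit status to distinguish the other two states (finished vs. still running).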

🎯 Interactive Command Sessions

What it enables: Your coding assistant can interact with REPLs, debuggers, and interactive CLIs just like a human developer.

  • Send Input: Type into running sessions (Python REPL, database CLI, interactive installers)
  • Read Output: Stream output in real-time as commands execute
  • Stop Commands: Gracefully terminate long-running processes
  • List Sessions: See all active terminal sessions with status

🧠 Sequential Thinking (1 tool)

🌊 Stateful Reasoning Engine

What it enables: Agents can break down complex problems across multiple reasoning steps, branch into parallel exploration, and revise thinking as insights emerge.

Technical innovations:

  • Actor-Model Concurrency: Each session runs in its own actor task with exclusive state ownership - completely lock-free!
  • MPSC Message Passing: Commands sent via channels, zero shared memory contention
  • Thought Branching: Explore multiple solution paths simultaneously - "What if we try REST vs GraphQL?"
  • Revision Support: Mark thoughts as revising previous reasoning when new insights emerge
  • Unlimited Context: Maintain complete thought history across extended problem-solving sessions
  • Auto-Persistence: Orphaned sessions automatically saved to disk, cleaned up after 24 hours
  • Colored Terminal Output: Thought streams rendered with syntax highlighting to stderr for debugging
  • Auto-Termination: Session actor gracefully terminates when thinking is complete
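The actor pattern described above can be sketched with std::sync::mpsc. Message and function names here are illustrative (the real server runs async tasks rather than OS threads), but the core property is the same: the thought history is owned exclusively by one task, so no locks are ever needed:

```rust
use std::sync::mpsc;
use std::thread;

/// Messages the session actor understands (illustrative protocol).
enum Msg {
    Think(String),
    Finish(mpsc::Sender<Vec<String>>), // reply channel returns the full history
}

/// Spawn an actor that exclusively owns its thought history.
fn spawn_session() -> mpsc::Sender<Msg> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let mut history = Vec::new(); // state mutated only inside this task
        for msg in rx {
            match msg {
                Msg::Think(t) => history.push(t),
                Msg::Finish(reply) => {
                    let _ = reply.send(history);
                    return; // actor terminates gracefully when thinking is complete
                }
            }
        }
    });
    tx
}

fn main() {
    let session = spawn_session();
    session.send(Msg::Think("Branch A: REST".into())).unwrap();
    session.send(Msg::Think("Branch B: GraphQL".into())).unwrap();
    let (reply_tx, reply_rx) = mpsc::channel();
    session.send(Msg::Finish(reply_tx)).unwrap();
    let history = reply_rx.recv().unwrap();
    assert_eq!(history.len(), 2);
}
```

Because all mutation happens through the channel, many sessions (or branches) can run concurrently with zero shared-memory contention.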

🔮 Agent Orchestration (5 tools)

🤖 True N-Depth Agent Delegation

What it enables: Agents can spawn specialized sub-agents, which can spawn their own sub-agents, creating hierarchical agent pyramids for complex tasks.

Why it's powerful:

  • Infinite Depth: No arbitrary limits - agents spawn agents spawn agents as deep as needed
  • Custom Prompts: Full control over sub-agent instructions via prompt templates
  • Real-Time Streaming: Watch sub-agent output as it generates, not after completion
  • Parallel Execution: Spawn multiple specialist agents working on different aspects simultaneously
  • Conversation Management: Send follow-up prompts, read responses, terminate when done
  • Coordinated Workflows: Main agent orchestrates research agent + code generation agent + testing agent

📝 Prompt Management (4 tools)

🎨 Template Library with Jinja2

What it enables: Agents can create, store, and reuse prompt templates with dynamic variable rendering.

  • Jinja2 Rendering: Full template engine with variables, conditionals, loops
  • Schema Validation: Ensure templates meet requirements before execution
  • A/B Testing: Store multiple instruction variations and compare results
  • Version Control Ready: Templates stored as files for git tracking
  • CRUD Operations: Add, edit, delete, retrieve templates programmatically
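To illustrate what variable rendering buys you, here is a toy `{{ variable }}` substitution; the actual tools use a full Jinja2-style engine with conditionals and loops, and this function name is hypothetical:

```rust
use std::collections::HashMap;

/// Toy {{ name }} substitution, standing in for a real Jinja2 engine.
fn render(template: &str, vars: &HashMap<&str, &str>) -> String {
    let mut out = template.to_string();
    for (k, v) in vars {
        // Replace every occurrence of "{{ key }}" with its value.
        out = out.replace(&format!("{{{{ {} }}}}", k), v);
    }
    out
}

fn main() {
    let mut vars = HashMap::new();
    vars.insert("task", "refactor the auth module");
    assert_eq!(
        render("You are a specialist. Your task: {{ task }}", &vars),
        "You are a specialist. Your task: refactor the auth module"
    );
}
```

Storing templates as files (as noted above) means the same rendering step works whether the template was hand-written or created by an agent.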

📊 Introspection (2 tools)

🔍 Self-Observability

What it enables: Agents can track their own behavior, analyze patterns, and optimize workflows.

  • Usage Statistics: See which tools are called most, success rates, execution patterns
  • Full Call History: Inspect recent invocations with complete arguments and responses
  • Performance Metrics: Execution time tracking for bottleneck identification
  • Failure Analysis: Spot recurring errors and debugging opportunities
  • Self-Improvement: LLM agents can analyze their own tool usage to improve efficiency
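At its core, usage tracking reduces to per-tool counters. A minimal sketch (struct and method names are hypothetical, not KODEGEN's API):

```rust
use std::collections::HashMap;

#[derive(Default)]
struct ToolStats {
    calls: u64,
    failures: u64,
}

#[derive(Default)]
struct Introspector {
    stats: HashMap<String, ToolStats>,
}

impl Introspector {
    /// Record one tool invocation and whether it succeeded.
    fn record(&mut self, tool: &str, ok: bool) {
        let s = self.stats.entry(tool.to_string()).or_default();
        s.calls += 1;
        if !ok {
            s.failures += 1;
        }
    }

    /// Success rate in [0, 1], or None if the tool was never called.
    fn success_rate(&self, tool: &str) -> Option<f64> {
        self.stats
            .get(tool)
            .map(|s| 1.0 - s.failures as f64 / s.calls as f64)
    }
}

fn main() {
    let mut intro = Introspector::default();
    intro.record("read_file", true);
    intro.record("read_file", true);
    intro.record("edit_block", false);
    assert_eq!(intro.success_rate("read_file"), Some(1.0));
    assert_eq!(intro.success_rate("edit_block"), Some(0.0));
}
```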

⚙️ Process & Configuration (4 tools)

🔧 System Control

Process Management:

  • List Processes: View running system processes with CPU/memory stats and filtering
  • Kill Process: Terminate processes by PID (with safety checks)

Runtime Configuration:

  • Get Config: Inspect current settings (line limits, thresholds, paths)
  • Set Config: Modify behavior at runtime without restart (fuzzy search threshold, file limits, etc.)
  • Dynamic Tuning: Agents can adjust their own configuration based on task requirements

🔗 Git & GitHub (36 tools)

Full Repository Operations: Complete git workflow automation plus GitHub API integration for issues, PRs, releases, and more. (Detailed tool list available in source code)

What Your Coding Agent Can Actually Do

Real-world workflows that autonomous agents can execute with KODEGEN:

🔄 Autonomous Refactoring

The Task: Convert synchronous functions to async throughout a Rust project.

What the agent does:

  1. Search: Finds all function definitions matching the pattern across the entire codebase using streaming search
  2. Read: Opens each file to understand context and dependencies
  3. Edit: Makes surgical changes with fuzzy matching (handles whitespace variations automatically)
  4. Test: Runs cargo test in a terminal session, monitoring output in real-time
  5. Validate: Checks test output, rolls back if failures detected
  6. Commit: Creates git commit when all tests pass

Why it works: Memory-efficient file reading means the agent can process massive codebases. Fuzzy matching recovers from minor mismatches. Terminal state detection knows exactly when tests finish.

🕵️ Multi-Agent Code Analysis

The Task: Research best practices and generate production-ready error handling code.

What happens:

  1. Main Agent: Spawns a research specialist sub-agent with custom prompt: "Research Rust async error handling patterns in 2024"
  2. Research Agent: Searches web, analyzes documentation, compiles findings into structured report
  3. Main Agent: Spawns code generation specialist with findings from research agent
  4. Code Agent: Generates example implementations following discovered best practices
  5. Main Agent: Reviews output from both sub-agents, synthesizes final implementation

Why it works: N-depth delegation allows specialization. Real-time streaming means main agent sees progress. Each sub-agent can spawn its own helpers if needed.

🏗️ Architectural Design with Branching Thought

The Task: Design a scalable API architecture with exploration of alternatives.

The reasoning process:

  1. Initial Analysis: "Need to design a scalable API layer - considering performance, maintainability, team expertise"
  2. Branch A: Explores REST API approach - analyzes versioning strategies, caching benefits, tooling ecosystem
  3. Branch B: Explores GraphQL approach - analyzes federation patterns, over-fetching elimination, learning curve
  4. Synthesis: Compares branches, discovers REST better fits due to caching requirements and team experience
  5. Revision: Updates initial architecture decision with evidence from both branches

Why it works: Lock-free actor model means zero contention even with many branches. Auto-persistence means long planning sessions never lose progress. Revision support allows refinement as understanding deepens.

🚀 Full-Stack Feature Implementation

The Task: Add authentication to an existing web application.

End-to-end automation:

  1. Planning: Uses sequential thinking to break down requirements into implementation steps
  2. Research: Spawns sub-agent to research OAuth 2.0 best practices and security considerations
  3. Database: Edits migration files, runs migrations via terminal, validates schema changes
  4. Backend: Searches for route definitions, adds auth middleware with fuzzy-matched edits, implements JWT handling
  5. Frontend: Adds login UI components, updates API client with auth headers
  6. Testing: Runs unit tests in parallel terminal sessions, monitors output streams
  7. Integration: Spawns test agent to write E2E tests covering auth flows
  8. Documentation: Updates README with auth setup instructions
  9. Review: Uses introspection tools to analyze what tools were used most, optimize future implementations

Why it works: The combination of all 75 tools creates truly autonomous capability. File operations handle code changes, terminal tools run tests/migrations, agent orchestration delegates specialized tasks, sequential thinking maintains coherent strategy across hundreds of steps.

Contributing

KODEGEN.ᴀɪ is open source and welcomes contributions!

Getting Started

  1. Fork the repository on GitHub
  2. Clone your fork: git clone https://github.com/YOUR_USERNAME/kodegen.git
  3. Create a feature branch: git checkout -b feature/amazing-tool
  4. Make your changes
  5. Run tests: cargo test
  6. Commit and push: git push origin feature/amazing-tool
  7. Open a Pull Request

Development Guidelines

  • All tools must implement the Tool trait
  • Follow the pattern in packages/filesystem/src/read_file.rs
  • Write comprehensive prompts for LLM learning
  • Add JsonSchema to all Args types
  • Update documentation when adding features

Community

Join the discussion: