Documentation

Everything you need to install, configure, and extend woozcode in your development workflow.

Installation

Woozcode installs globally as a CLI tool via npm, yarn, or pnpm. Node.js 18 or later is required.

npm

$ npm install -g woozcode

yarn (Classic 1.x; the global command was removed in Yarn 2 and later)

$ yarn global add woozcode

pnpm

$ pnpm add -g woozcode

Verify the installation:

$ wooz --version
1.0.0

Setup Guide

Once installed, initialize woozcode in your project directory. This creates a local config file that you can version-control.

$ cd your-project
$ wooz init

  woozcode initialized
  Config written to .woozcode.json
  Run `wooz start` to begin compression

Configuration file

The generated .woozcode.json controls compression behavior:

{
  "model": "gpt-4o",
  "compression": {
    "level": "balanced",
    "stripComments": true,
    "deduplicateImports": true,
    "contextWindow": 8000
  },
  "cache": {
    "enabled": false,
    "ttl": 3600
  }
}

Option	Values	Description
compression.level	light / balanced / aggressive	Controls how aggressively tokens are trimmed.
compression.stripComments	true / false	Removes inline and block comments from code.
compression.deduplicateImports	true / false	Collapses duplicate import statements.
compression.contextWindow	number	Target token limit for compressed output.
cache.enabled	true / false	Turns the semantic cache on or off.
cache.ttl	number	Lifetime of a cache entry before it expires.
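
If you edit the config by hand, a small check against the options listed above can catch typos early. The sketch below is not part of woozcode: validateConfig is a hypothetical helper, and it assumes contextWindow must be a positive number, which the table does not state explicitly.

```javascript
// Hypothetical helper (not shipped with woozcode): checks a parsed
// .woozcode.json object against the compression options documented above.
const LEVELS = ['light', 'balanced', 'aggressive'];

function validateConfig(config) {
  const errors = [];
  const compression = config?.compression ?? {};

  if (!LEVELS.includes(compression.level)) {
    errors.push(`compression.level must be one of: ${LEVELS.join(', ')}`);
  }

  for (const key of ['stripComments', 'deduplicateImports']) {
    if (typeof compression[key] !== 'boolean') {
      errors.push(`compression.${key} must be true or false`);
    }
  }

  // Assumption: the token limit has to be a positive number.
  const limit = compression.contextWindow;
  if (typeof limit !== 'number' || !(limit > 0)) {
    errors.push('compression.contextWindow must be a positive number');
  }

  return errors; // an empty array means no problems were found
}
```

Pass it the result of JSON.parse on the file contents and print any returned messages before starting the daemon.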

Concepts

Compression pipeline

Woozcode operates as a proxy layer between your code and any LLM API. When you send a request, the compression pipeline strips tokens that carry no semantic weight: duplicate imports, verbose docstrings, commented-out code, and repeated context. The resulting prompt is intended to be functionally equivalent to the original while being 40-60% smaller.
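
To make the idea of a staged pipeline concrete, here is a deliberately naive sketch. It is not woozcode's implementation: the function names are hypothetical, the comment stripping uses simple regular expressions that ignore string literals, and only whole-line // comments and /* */ blocks are handled.

```javascript
// Illustrative only: a toy pipeline in the spirit of the stages described above.

// Remove /* ... */ blocks and lines that contain nothing but a // comment.
function stripComments(source) {
  return source
    .replace(/\/\*[\s\S]*?\*\//g, '')
    .replace(/^[ \t]*\/\/.*$/gm, '');
}

// Keep the first occurrence of each identical import line, drop the rest.
function deduplicateImports(source) {
  const seen = new Set();
  return source
    .split('\n')
    .filter((line) => {
      const trimmed = line.trim();
      if (!trimmed.startsWith('import ')) return true;
      if (seen.has(trimmed)) return false;
      seen.add(trimmed);
      return true;
    })
    .join('\n');
}

// Drop lines that are empty or whitespace-only.
function dropBlankLines(source) {
  return source
    .split('\n')
    .filter((line) => line.trim() !== '')
    .join('\n');
}

// Run each stage in order, feeding the output of one into the next.
function compressSource(source) {
  const stages = [stripComments, deduplicateImports, dropBlankLines];
  return stages.reduce((text, stage) => stage(text), source);
}
```

Each stage is a plain string-to-string function, so stages can be added, removed, or reordered without touching the others.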

Local-first architecture

All processing runs in a local daemon (wooz start). No payload ever reaches a woozcode server. Your prompts, source files, and API keys remain entirely on your machine. The daemon communicates with your AI provider directly using your own credentials.

Semantic cache

When enabled, the cache stores embeddings of previous requests and returns cached responses for semantically similar queries. This can eliminate API calls entirely for repeated or near-repeated questions. Cache entries expire after the configured TTL.
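
The lookup described above can be sketched as a similarity search over stored embeddings. This is an illustration under several assumptions, not woozcode's internals: the SemanticCache class, the 0.95 similarity threshold, and the interpretation of the TTL as seconds are all hypothetical, and the embeddings are supplied by the caller.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical cache: stores (embedding, response) pairs with an expiry time.
class SemanticCache {
  constructor({ ttlSeconds, threshold = 0.95, now = () => Date.now() }) {
    this.ttlMs = ttlSeconds * 1000; // assumption: TTL is given in seconds
    this.threshold = threshold;     // assumption: minimum similarity for a hit
    this.now = now;                 // injectable clock, useful for testing
    this.entries = [];
  }

  set(embedding, response) {
    this.entries.push({ embedding, response, expiresAt: this.now() + this.ttlMs });
  }

  // Returns the response of the most similar unexpired entry at or above
  // the threshold, or undefined when nothing qualifies.
  get(embedding) {
    const current = this.now();
    this.entries = this.entries.filter((entry) => entry.expiresAt > current);

    let best;
    let bestScore = this.threshold;
    for (const entry of this.entries) {
      const score = cosineSimilarity(embedding, entry.embedding);
      if (score >= bestScore) {
        best = entry;
        bestScore = score;
      }
    }
    return best ? best.response : undefined;
  }
}
```

A linear scan is fine for a small cache; a larger one would typically use an approximate nearest-neighbor index instead.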

API Usage

Woozcode exposes a local REST API on port 7350 when the daemon is running. You can also use the Node.js SDK directly.

SDK (Node.js)

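import { Woozcode } from 'woozcode'

// Placeholder input; replace with the prompt you want to compress
const longPromptString = '...'

const wooz = new Woozcode()

// Top-level await requires an ES module (for example, a .mjs file)
const result = await wooz.compress({
  prompt: longPromptString,
  level: 'balanced',
})

console.log(result.compressed)    // compressed string
console.log(result.tokensSaved)   // number of tokens removed
console.log(result.ratio)         // e.g. 0.61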

REST API

POST http://localhost:7350/v1/compress
Content-Type: application/json

{
  "prompt": "...",
  "level": "balanced"
}

// Response
{
  "compressed": "...",
  "tokensSaved": 760,
  "ratio": 0.61,
  "latencyMs": 2
}
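
If you prefer to call the REST endpoint without the SDK, the request above can be sent with the fetch function built into Node.js 18 and later. This is a minimal sketch: buildCompressRequest and compressViaDaemon are hypothetical helper names, and treating any non-2xx status as an error is an assumption, since the daemon's error responses are not documented here.

```javascript
// Default daemon address from the documentation above.
const BASE_URL = 'http://localhost:7350';

// Builds the URL and fetch options for a compress call (no network access).
function buildCompressRequest(prompt, level = 'balanced') {
  return {
    url: `${BASE_URL}/v1/compress`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ prompt, level }),
    },
  };
}

// Sends the request to the local daemon and returns the parsed JSON body.
async function compressViaDaemon(prompt, level) {
  const { url, init } = buildCompressRequest(prompt, level);
  const response = await fetch(url, init);
  if (!response.ok) {
    // Assumption: a non-2xx status means the compression did not succeed.
    throw new Error(`Compression request failed: HTTP ${response.status}`);
  }
  return response.json();
}
```

Keeping the request construction separate from the network call makes it easy to inspect or log the payload before anything is sent.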

Troubleshooting

The daemon fails to start

Check whether port 7350 is already in use: lsof -i :7350. If another process holds the port, either stop that process or set a different port in .woozcode.json, for example "port": 7351.

Output quality has degraded

Switch compression.level from "aggressive" to "balanced" or "light". Aggressive mode removes more context and may affect model reasoning on complex tasks.

Token count is not reduced

Verify that the daemon is running (wooz start) and that your integration points to localhost:7350, or to your custom port if you changed it. Run wooz status to check daemon health.

Permission denied errors on macOS

Run sudo wooz init once to set the required filesystem permissions, then switch back to a non-root user for all subsequent commands.