Blog
Writing
Thoughts on AI tooling, token economics, and local-first development.
Engineering | May 12, 2025 | 7 min read
How we cut token costs by 60% without changing models
A breakdown of the compression techniques we use internally, and why context pruning beats prompt engineering for cost reduction.
Product | Apr 28, 2025 | 5 min read
The case for local-first AI tooling
Why every serious development team should be running AI infrastructure on their own hardware, and what trade-offs to expect.
Engineering | Apr 10, 2025 | 9 min read
Semantic cache: skip the API call entirely
A deep dive into how embedding-based caching works, when it helps, and how we tune the similarity threshold for precision.
Insights | Mar 22, 2025 | 6 min read
Token economics in 2025: what developers actually pay
We analyzed 100 anonymized codebases to understand where token spend goes. The results were surprising.
Release | Mar 5, 2025 | 4 min read
Announcing woozcode v1.0
After six months in private beta, woozcode is now available to everyone. Here is what we built and where we are headed next.