Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.
-
Updated
Jun 29, 2026 - Python
Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.
What are the principles we can use to build LLM-powered software that is actually good enough to put in the hands of production customers?
A super light-weight embedded code search engine CLI (AST based) that just works - saves 70% token and improves speed for coding agent 🌟 Star if you like it!
Cut AI token costs 95%+ on code exploration. The leading MCP server for precise, symbol-level GitHub code retrieval via tree-sitter AST. Works with Claude Code, Cursor & any MCP client. 313B+ tokens saved.
Find the ghost tokens. Fix them. Survive compaction. Avoid context quality decay.
High-performance code-intelligence engine for AI agents and IDE, supports 257 languages, multi repositories, based on graph, with access via CLI, MCP Server, and API. AI coding agents teammate - expose only needed information, cutting token usage up to 50x. 100% local.
[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
Give Claude Code photographic memory in ONE portable file. No database, no SQLite, no ChromaDB - just a single .mv2 file you can git commit, scp, or share. Native Rust core with sub-ms operations.
LLM-supervised persistent memory for AI agents — graph-based recall, cross-session knowledge, single binary. Works with Claude Code, OpenClaw, and any CLI agent.
Make your OpenClaw AI agent faster, smarter, and cheaper. Speed optimization, memory architecture, context management, model selection, and one-shot development guide.
CLI proxy that reduces LLM token usage by 60-90%. Declarative YAML filters for Claude Code, Cursor, Copilot, Gemini. rtk alternative in Go.
VeritasGraph — open-source Knowledge Graph & GraphRAG framework on GitHub. Build multi-hop reasoning, ontology-aware retrieval, and verifiable attribution over your own data. Nodes, edges, RDF, linked-data — runs locally or in the cloud.
Supercharge AI Agents, Safely
Open Source Context infrastructure for AI agents. Auto-capture and share your agents' context everywhere.
Config-driven CLI tool that compresses command output before it reaches an LLM context
Hook-based token compressor for 5 AI CLI hosts (Claude Code, Copilot CLI, OpenCode, Gemini CLI, Codex CLI). Up to 95% bash compression, signature-mode for code reads, cross-call dedup, MCP server, self-teaching protocol. Zero runtime deps.
🏆 AI Hackathon 2026 @ UC Berkeley Intelligent context management for developers
Claude Code usage governor: compact professional output, context slimming, tool-output filtering, telemetry, and drift guardrails.
A discovery and compression tool for your Python codebase. Creates a knowledge graph for a LLM context window, efficiently outlining your project | Code structure visualization | LLM Context Window Efficiency | Static analysis for AI | Large Language Model tooling #LLM #AI #Python #CodeAnalysis #ContextWindow #DeveloperTools
Local-first permanent, persistent memory for all agents and humans
Add a description, image, and links to the context-window topic page so that developers can more easily learn about it.
To associate your repository with the context-window topic, visit your repo's landing page and select "manage topics."