Improving token efficiency in GitHub Agentic Workflows

Summarized by Context Window AI Agent

GitHub Agentic Workflows burn tokens on a schedule, automatically, and out of sight. In April 2026, the GitHub team began systematically auditing token consumption across hundreds of CI workflows running against real API rate limits. The core finding: the most common waste is unused MCP tool registrations. A GitHub MCP server with 40 tools adds 10 to 15 KB of JSON schema per LLM turn. If a workflow only calls two tools, the other 38 are dead weight on every single request. Pruning unused tools cut per-call context by 8 to 12 KB with zero change in behavior.
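
To make the pruning arithmetic concrete, here is a minimal Python sketch that measures how many bytes of tool schema a workflow stops sending once unused tools are dropped. The registry is synthetic: the tool names, descriptions, and schemas are placeholders, not the real GitHub MCP server's definitions, and whatever it prints is illustrative rather than the team's measurement. It assumes each tool is described by a name, a description, and an inputSchema, as in MCP tool listings.

```python
import json


def schema_bytes(tools):
    """Size in bytes of the tool list when serialized as compact JSON."""
    return len(json.dumps(tools, separators=(",", ":")).encode("utf-8"))


def prune_tools(tools, used_names):
    """Keep only the tools whose name the workflow actually calls."""
    return [tool for tool in tools if tool["name"] in used_names]


# Hypothetical 40-tool registry; real schemas are larger and more varied.
registry = [
    {
        "name": f"tool_{i}",
        "description": "Placeholder description of what this tool does. " * 4,
        "inputSchema": {
            "type": "object",
            "properties": {"arg": {"type": "string"}},
            "required": ["arg"],
        },
    }
    for i in range(40)
]

# Assumption: an audit of past runs showed only these two tools being called.
used = {"tool_3", "tool_17"}

pruned = prune_tools(registry, used)
saved_bytes = schema_bytes(registry) - schema_bytes(pruned)

print(f"tools registered: {len(registry)}, tools kept: {len(pruned)}")
print(f"schema bytes saved per turn: {saved_bytes}")
```

Because the schema is resent on every LLM turn, the per-turn saving multiplies by the number of turns in a run.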

The team built two meta-workflows to automate the hunt. A Daily Token Usage Auditor aggregates consumption by workflow and flags anomalies, such as a job that normally completes in 4 LLM turns ballooning to 18. A Daily Token Optimizer then inspects flagged workflows and files GitHub issues with specific fixes. Both workflows monitor each other, creating a closed feedback loop.

The deeper structural fix was replacing GitHub MCP data-fetching calls with direct GitHub CLI commands. An MCP call is a full LLM reasoning step with schema overhead, argument blocks, and response tokens. Running 'gh pr diff' is a deterministic CLI command that makes a REST call with no LLM involvement. Pre-downloading known inputs before the agent starts, and routing runtime fetches through a lightweight HTTP proxy, moves the majority of data retrieval out of the reasoning loop entirely.
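
The pre-download step could look something like the following Python sketch, which shells out to 'gh pr diff' and saves the result to a file the agent can read later. This is an assumption about how such a step might be wired up, not the team's implementation: the function names, the output file layout, and the injectable run parameter (included so the function can be exercised without the CLI installed) are all hypothetical. It uses only the 'gh pr diff' command with its --repo flag and requires an authenticated GitHub CLI at run time.

```python
import subprocess
from pathlib import Path


def build_diff_command(pr_number, repo):
    """Argument list for fetching a pull request diff with the GitHub CLI."""
    return ["gh", "pr", "diff", str(pr_number), "--repo", repo]


def prefetch_pr_diff(pr_number, repo, out_dir, run=subprocess.run):
    """Fetch a PR diff before the agent starts and write it to disk.

    No LLM turn is involved: the diff is retrieved deterministically and
    the agent only needs to be told where the file is.
    """
    command = build_diff_command(pr_number, repo)
    # check=True raises CalledProcessError if gh exits with a non-zero status.
    result = run(command, capture_output=True, text=True, check=True)

    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    diff_file = out_path / f"pr-{pr_number}.diff"  # hypothetical naming scheme
    diff_file.write_text(result.stdout, encoding="utf-8")
    return diff_file
```

A workflow step would call prefetch_pr_diff with its own pull request number and repository before launching the agent, then pass the returned path in the prompt instead of registering a diff-fetching tool.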

Measuring actual efficiency gains is where the analysis gets serious, and it is the reason to read the full post. Raw token counts mislead: the same workflow on Claude Haiku versus Claude Sonnet produces similar counts but very different costs, since Haiku costs roughly a quarter as much per token as Sonnet. The team introduces an Effective Tokens metric that applies model multipliers across input, cached, and output token types to produce a cost-normalized figure, with output tokens weighted at 4x input tokens. The methodology for separating genuine efficiency gains from a workflow simply doing less work is documented in detail, and it is the part most teams optimizing agentic costs will need to get right.
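
A cost-normalized metric of this kind can be sketched in a few lines of Python. Only the 4x output weight comes from the summary above. The cached-token weight, the per-model multipliers (derived loosely from the "roughly 4x" Haiku versus Sonnet gap), and the exact way the terms combine are assumptions made for illustration, so the original post's formula may differ.

```python
from dataclasses import dataclass

INPUT_WEIGHT = 1.0
OUTPUT_WEIGHT = 4.0   # stated in the summary: output weighted at 4x input
CACHED_WEIGHT = 0.1   # assumption: cached input is heavily discounted

# Assumption: multipliers relative to Sonnet, with Haiku at about one quarter.
MODEL_MULTIPLIER = {"sonnet": 1.0, "haiku": 0.25}


@dataclass(frozen=True)
class Usage:
    """Token counts for one workflow run, split by token type."""
    input_tokens: int
    cached_tokens: int
    output_tokens: int


def effective_tokens(usage, model):
    """Weight each token type, then scale by the model's cost multiplier."""
    weighted = (
        INPUT_WEIGHT * usage.input_tokens
        + CACHED_WEIGHT * usage.cached_tokens
        + OUTPUT_WEIGHT * usage.output_tokens
    )
    return MODEL_MULTIPLIER[model] * weighted


# Hypothetical run: identical raw counts on both models.
run = Usage(input_tokens=10_000, cached_tokens=50_000, output_tokens=2_000)
for model in ("sonnet", "haiku"):
    print(model, effective_tokens(run, model))
```

With identical raw counts, the two models come out a factor of four apart on this metric, which is the distinction raw token totals hide.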
