Skip to content
3 min read · 550 words

Tokenizable

Primitives covers the eight-primitive overview and the four rules every primitive obeys.

Every piece of text the ADK has to reason about lives inside a Tokenizable. The reasoning content in a Thought, the message body, the recalled memory, the retrieved chunk, the participant's representation, the system prompt, the standing instructions — all of it. The reason is that nothing the rest of the library does about a turn — deciding what fits in the context window, what to compact, what to shed when the budget runs out — can be done responsibly without knowing how many tokens a piece of content is actually worth. A bare string can't answer that question; a Tokenizable can.

The honest part of the story is that "how many tokens" only has an exact answer for the models whose tokenizers are public. Where that's true, Tokenizable counts exactly. Where it isn't, it falls back to a conservative heuristic and says so, so a budget built against it under-spends rather than overruns the context window mid-turn. Exact tokenizers produce exact counts. Everything else uses the conservative local fallback (ceil(length/4)). The contract is not clairvoyance; the contract is one budget call site that rounds toward not blowing the window — so the budget logic upstream can treat token cost as a first-class property of content instead of something to guess at.

Some counts are heuristic on purpose

Models whose tokenizers aren't publicly available (closed providers, custom fine-tunes the ADK has no way to introspect, anything it has never heard of) are counted by a conservative character-per-token heuristic. The fallback over-estimates rather than under-estimates so the budget rounds in the safe direction; the API reference names which encodings count exactly today, and the wrapper is built so a precise tokenizer can drop in the moment one becomes available.

Why not just call the provider's tokenizer endpoint?

Because every budgeting decision the ADK makes — every middleware that asks "will this fit?", every retrieval pass that sorts memories by cost, every compaction step — would turn into a network round-trip to a third-party service. A turn that touches a hundred pieces of content would pay a hundred extra latencies, on a hot path, against a dependency that can rate-limit you, go down, or change its pricing. A conservative local estimate is faster than a remote exact one and degrades gracefully when the network doesn't cooperate. It is also one fewer third party your ADK has to depend on to do its job, which is the kind of trade Tokenizable exists to make on your behalf.

Once a count is taken it is cached against the encoding it was taken for, so re-asking is free. A Tokenizable coerces to its underlying string anywhere a string is expected when you only need text. That convenience is not permission to do budget math on raw strings — if the code is deciding fit, compacting, or shedding, call estimateTokens.

TIP

If you find yourself doing someContent.length / 4 somewhere in middleware, you have lost the thread — that math is what Tokenizable exists to centralise, with as much precision as the active model permits. It is also what lets the budget story in Budgets be a contract instead of a wish.