---
url: 'https://adk.nht.io/api/@nhtio/adk/common/variables/TokenEncoding.md'
description: The set of supported token encoding identifiers.
---

# Variable: TokenEncoding

```ts
const TokenEncoding: readonly [
  "gpt2",
  "r50k_base",
  "p50k_base",
  "p50k_edit",
  "cl100k_base",
  "o200k_base",
  "gemini",
  "llama2",
  "claude",
];
```

Defined in: [lib/classes/tokenizable.ts:26](https://github.com/NHTIO/ADK/blob/v1.20260605.0/src/lib/classes/tokenizable.ts#L26)

The set of supported token encoding identifiers.
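
Because the value is a readonly tuple of string literals, a union type and a runtime check can both be derived from it. The following is a minimal sketch, not part of the documented API: `TokenEncodingName` and `isTokenEncoding` are hypothetical names, and the tuple is redeclared locally only so the snippet is self-contained (in real code it would be imported from the package).

```ts
// Local copy of the tuple for illustration; in real code, import it from the package.
const TokenEncoding = [
  "gpt2",
  "r50k_base",
  "p50k_base",
  "p50k_edit",
  "cl100k_base",
  "o200k_base",
  "gemini",
  "llama2",
  "claude",
] as const;

// Hypothetical alias: the union of every identifier in the tuple.
type TokenEncodingName = (typeof TokenEncoding)[number];

// Hypothetical type guard: narrows an arbitrary string to a supported identifier.
function isTokenEncoding(value: string): value is TokenEncodingName {
  return TokenEncoding.some((encoding) => encoding === value);
}
```

A guard like this is useful when the identifier comes from configuration or user input and needs validating before it is passed on.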

## Remarks

Each value maps to a specific estimation backend:

* `gpt2`, `r50k_base`, `p50k_base`, `p50k_edit`, `cl100k_base`, `o200k_base` — exact counts
  via `js-tiktoken` (OpenAI / tiktoken-compatible models).
* `gemini` — exact counts via `@lenml/tokenizer-gemini`, which embeds Gemini's actual
  SentencePiece vocabulary locally with no API call required.
* `llama2` — exact counts via `llama-tokenizer-js` (Llama 1 and 2). Llama 3+ uses a
  different vocabulary and is expected to use a separate `llama3` identifier, which is not
  yet part of this set, once a suitable synchronous backend is available.
* `claude` — heuristic approximation using Anthropic's published ~3.5 chars/token ratio.
  No local tokenizer is available for Claude 3+ models; the Anthropic SDK's
  `messages.countTokens()` API is the only exact path but requires a network call.
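
The `claude` heuristic can be pictured as a simple division by the quoted ratio. This is a sketch under stated assumptions rather than the library's actual implementation: `approximateClaudeTokens` is a hypothetical helper, rounding up is an assumption, and the length is measured in UTF-16 code units, which may differ from the character count the real estimator uses.

```ts
// Assumed ratio, taken from the ~3.5 chars/token figure quoted in the remarks.
const CLAUDE_CHARS_PER_TOKEN = 3.5;

// Hypothetical helper, not part of the ADK API.
// Rounding up is an assumption; the library may round differently.
function approximateClaudeTokens(text: string): number {
  if (text.length === 0) {
    return 0;
  }
  // text.length counts UTF-16 code units, not Unicode code points.
  return Math.ceil(text.length / CLAUDE_CHARS_PER_TOKEN);
}
```

An estimate of this kind is only a rough budget figure; where an exact count matters, the network-based `messages.countTokens()` path mentioned in the remarks is the one to use.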

When adding a new encoding, add a case to [Tokenizable.estimateTokens](../classes/Tokenizable.md#estimatetokens).
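
One way to make a missing case hard to overlook is an exhaustive `switch` with a `never` check, so that extending the tuple without handling the new identifier fails at compile time. The sketch below only illustrates that pattern: `backendFor`, `Backend`, and `TokenEncodingName` are hypothetical names, the tuple is a local copy, and the real `estimateTokens` may be structured differently.

```ts
// Local copy of the tuple for illustration; in real code, import it from the package.
const TokenEncoding = [
  "gpt2",
  "r50k_base",
  "p50k_base",
  "p50k_edit",
  "cl100k_base",
  "o200k_base",
  "gemini",
  "llama2",
  "claude",
] as const;

type TokenEncodingName = (typeof TokenEncoding)[number];

// Hypothetical labels for the backends listed in the remarks.
type Backend =
  | "js-tiktoken"
  | "@lenml/tokenizer-gemini"
  | "llama-tokenizer-js"
  | "heuristic";

// Hypothetical dispatcher: maps each identifier to the backend that handles it.
function backendFor(encoding: TokenEncodingName): Backend {
  switch (encoding) {
    case "gpt2":
    case "r50k_base":
    case "p50k_base":
    case "p50k_edit":
    case "cl100k_base":
    case "o200k_base":
      return "js-tiktoken";
    case "gemini":
      return "@lenml/tokenizer-gemini";
    case "llama2":
      return "llama-tokenizer-js";
    case "claude":
      return "heuristic";
    default: {
      // If a new identifier is added to the tuple without a case above,
      // this assignment stops compiling.
      const unreachable: never = encoding;
      throw new Error("Unhandled encoding: " + String(unreachable));
    }
  }
}
```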
