What is a token, and why do AI models count tokens instead of words?

Question

Accepted Answer

A token is a chunk of text, often a piece of a word rather than a whole word. Before a model can read your text, a tokenizer splits it into tokens and maps each one to an integer ID. Limits and pricing are expressed in tokens, not words, because tokens are the unit the model actually processes. The same idea can cost more or less depending on how it splits: common English words are often a single token, while code, rare words, and text in many non-English languages tend to break into more tokens per word.
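To make the split-then-map step concrete, here is a minimal sketch of a toy tokenizer. Everything in it is an assumption for illustration: the vocabulary PIECES, the IDs, and the greedy longest-match rule are invented here and are not taken from any real model. Production tokenizers learn their vocabulary from data (byte-pair encoding is one common approach) and will split text differently.

```python
import string

# Hypothetical vocabulary: a few word-pieces plus single letters as a fallback.
# The ID of each piece is simply its position in this list.
PIECES = ["the", "cat", "token", "ization", "un", "believ", "able", " "]
PIECES += list(string.ascii_lowercase)
VOCAB = {piece: idx for idx, piece in enumerate(PIECES)}
MAX_LEN = max(len(piece) for piece in PIECES)


def tokenize(text):
    """Split text into pieces by always taking the longest known piece."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink one character at a time.
        for length in range(min(MAX_LEN, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in VOCAB:
                tokens.append(piece)
                i += length
                break
        else:
            raise ValueError(f"no vocabulary entry covers {text[i]!r}")
    return tokens


def encode(text):
    """Map each token to its integer ID, which is what the model consumes."""
    return [VOCAB[token] for token in tokenize(text)]


if __name__ == "__main__":
    for sample in ["the cat", "tokenization", "unbelievable", "zebu"]:
        pieces = tokenize(sample)
        print(f"{sample!r}: {len(pieces)} tokens {pieces} ids {encode(sample)}")
```

In this toy setup "the cat" is two words and three tokens (the space counts), "tokenization" is one word and two tokens, and the rare word "zebu" falls back to one token per letter. That is the same effect described above: word count and token count drift apart, and unfamiliar text costs more tokens.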

Tokens and tokenization, explained

What people get wrong

Where you see it in real products
