Tokens vs Words vs Characters: The Most Expensive Confusion in AI Development
Tutorials

Javier Echeverria · 4 min read

If you've ever tried to estimate how much an AI API call is going to cost, you've probably run into a moment where you weren't sure whether to think in words, characters, or tokens. They're related, but they're not the same thing, and the difference between them matters more than most people realize when you're actually building something and trying to keep costs predictable.

This is one of those foundational things that's worth getting clear on early, because a lot of other decisions, from how you structure prompts to how you estimate budgets, get easier once you have a solid mental model for how these three units relate to each other.

What characters actually are

Characters are the simplest unit. A character is a single letter, number, punctuation mark, space, or symbol. The word "hello" is five characters. The sentence "I like AI." is ten characters, including the period and the two spaces. For plain English text there's no ambiguity here: counting characters is straightforward and gives the same result on every system.

Most programming languages have built-in ways to count characters, and the number you get will be the same regardless of what AI model you're working with. Characters are the raw material that everything else gets built from.
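As a quick illustration, here is how the two examples above could be counted in Python with the built-in len(). The last few lines go one step beyond the article: they are a small caveat, added here as an assumption about what you might run into, that accented or non-Latin text can give different numbers depending on whether you count code points or bytes.

```python
# Character counts for the two examples above, using Python's built-in len().
examples = ["hello", "I like AI."]

for text in examples:
    print(f"{text!r}: {len(text)} characters")

# Caveat (not covered in the article): len() counts Unicode code points,
# which is not the same as the number of bytes once the text is encoded.
accented = "caf\u00e9"  # "cafe" with an accented final letter
code_points = len(accented)
utf8_bytes = len(accented.encode("utf-8"))
print(f"{accented!r}: {code_points} code points, {utf8_bytes} UTF-8 bytes")
```

For plain ASCII text the two counts are identical, which is why the simple model in this section holds up for typical English prompts.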

What words are

Words are the next level up. In everyday language, a word is pretty easy to recognize: it's the thing between the spaces. But when you're dealing with text programmatically, the definition gets fuzzier. Is "don't" one word or two? Is "GPT-4o" one word or three? What about a URL, or a line of code, or a string of numbers?

Different systems count words differently depending on how they handle these edge cases, which is why word count is actually a less reliable unit than it seems. It's useful as a rough estimate but it's not precise enough to use for planning API costs with any confidence.
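To make the fuzziness concrete, here is a minimal sketch comparing two plausible word-counting rules on the edge cases mentioned above. Both function names are made up for this example, and neither rule is "the" correct one; the point is only that reasonable definitions disagree.

```python
import re

def count_by_whitespace(text: str) -> int:
    """Rule 1: a word is anything separated by whitespace."""
    return len(text.split())

def count_by_word_chars(text: str) -> int:
    """Rule 2: a word is an unbroken run of letters, digits, or underscores."""
    return len(re.findall(r"\w+", text))

# The URL is a placeholder chosen for illustration.
samples = ["don't", "GPT-4o", "https://example.com/a/b"]

for sample in samples:
    print(
        f"{sample!r}: whitespace rule = {count_by_whitespace(sample)}, "
        f"word-character rule = {count_by_word_chars(sample)}"
    )
```

The two rules give different answers for every one of these samples, which is the kind of drift that makes word counts shaky for cost planning.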

What tokens are and why they're different from both

Tokens sit somewhere between characters and words, but they don't map neatly to either one. A token is the chunk of text that an AI model's tokenizer treats as a single unit, and those chunks are determined by patterns in the training data rather than by any linguistic rule.

Common short words in English are usually one token each. Longer or less common words often get split into two or three tokens. Punctuation marks and spaces sometimes get attached to the word before or after them, and sometimes they're tokens on their own. The exact behavior depends on the specific tokenizer the model uses.

The practical result is that the relationship between tokens and words is approximate. As a rough guide, one token is about three quarters of a word in English, or about four characters. So a hundred words is somewhere around 130 tokens, and a thousand characters is somewhere around 250 tokens. These are starting points, not exact numbers, and they shift depending on the content.
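The rule of thumb can be written down as two tiny helper functions. This is only a sketch of the approximation for English prose: the function and constant names are invented for this example, and a real tokenizer will give different numbers. Applying the four-characters figure directly, 1,000 characters works out to about 250 tokens.

```python
import math

# Rule-of-thumb ratios from the text. They apply to English prose only.
WORDS_PER_TOKEN = 0.75
CHARS_PER_TOKEN = 4

def tokens_from_words(word_count: int) -> int:
    """Rough token estimate from a word count, rounded up."""
    return math.ceil(word_count / WORDS_PER_TOKEN)

def tokens_from_chars(char_count: int) -> int:
    """Rough token estimate from a character count, rounded up."""
    return math.ceil(char_count / CHARS_PER_TOKEN)

print(tokens_from_words(100))    # 134
print(tokens_from_chars(1000))   # 250
```

Rounding up is a deliberate choice here: when the estimate feeds a budget, guessing slightly high is usually the safer mistake.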

According to Stanford's Human-Centered AI research, the way language models process and segment text has direct implications for both model performance and cost efficiency, which is why understanding the basic units matters even for people who aren't building models from scratch.

Why mixing these up causes real problems

The most common mistake people make is writing a prompt, estimating its length in words, and then using a generic conversion to guess the token count. That works well enough for plain English, but it falls apart in a few common situations.

If your prompt includes code, the token count will be higher than the word-based estimate because code has a lot of special characters and short tokens that don't match natural language patterns. If your prompt includes text in another language, the estimate can be off by a significant margin because non-English text often tokenizes less efficiently. If your prompt has a lot of numbers, dates, or structured data, the same issue applies.

The other common mistake is confusing character limits with token limits. Some older APIs and some simpler tools express limits in characters, but most modern AI APIs express limits in tokens. If you're trying to stay within a context window and you're thinking in characters, you might be leaving a lot of usable space on the table, or you might be going over the limit without realizing it.
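One way to avoid the character-versus-token mix-up is to convert before comparing. The sketch below is a rough pre-check, not a substitute for a real tokenizer: the function name, the default ratio of four characters per token, and the 20% safety margin are all assumptions made for this example.

```python
import math

def fits_in_context(
    text: str,
    context_window_tokens: int,
    reserved_output_tokens: int = 0,
    chars_per_token: float = 4.0,   # rough English-prose ratio (assumption)
    safety_margin: float = 0.2,     # pad the estimate by 20% (assumption)
) -> bool:
    """Estimate whether text fits in a token-based context window."""
    estimated_tokens = math.ceil(len(text) / chars_per_token)
    padded_tokens = math.ceil(estimated_tokens * (1 + safety_margin))
    return padded_tokens + reserved_output_tokens <= context_window_tokens

prompt = "a" * 4000  # 4,000 characters, roughly 1,000 tokens by the rule of thumb

print(fits_in_context(prompt, context_window_tokens=2000))
print(fits_in_context(prompt, context_window_tokens=1100))
```

The comparison is always tokens against tokens. A 4,000-character prompt is nowhere near a 4,000-token limit, and treating the two as the same is exactly the mistake described above.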

A concrete example to make this stick

Take this sentence: "The quick brown fox jumps over the lazy dog."

In characters, that's 44. In words, that's 9. In tokens, depending on the model, it's somewhere around 10 to 12, because some of the words get split or have the punctuation counted separately.

Now take a line of code like: "const response = await fetch(url, { method: 'POST' });"

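In characters, that's 54. In words, splitting on spaces gives 9 chunks, but only about 6 of them look like words; the rest are bare symbols such as "=" and "{". In tokens, the count is typically in the mid-teens and can approach 20 depending on the tokenizer, because every special character, bracket, and symbol adds up in ways that the word count completely misses.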
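You can check the character and word side of both examples yourself with a few lines of Python. This sketch deliberately stops short of token counts, since those need a model-specific tokenizer; instead it counts symbol characters as a rough hint of why code tokenizes so differently. The helper name is made up for this example.

```python
sentence = "The quick brown fox jumps over the lazy dog."
code_line = "const response = await fetch(url, { method: 'POST' });"

def symbol_count(text: str) -> int:
    """Count characters that are neither letters, digits, nor whitespace."""
    return sum(1 for ch in text if not ch.isalnum() and not ch.isspace())

for label, text in [("sentence", sentence), ("code", code_line)]:
    chunks = text.split()  # whitespace-separated pieces
    print(
        f"{label}: {len(text)} characters, "
        f"{len(chunks)} whitespace-separated chunks, "
        f"{symbol_count(text)} symbol characters"
    )
```

The two strings come out similar in length and chunk count, but the code line carries far more symbol characters, and those are the pieces a tokenizer tends to split off on their own.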

This is exactly why you need a real token counter rather than a formula when you're working with anything other than plain conversational text. The Tokens to Words Converter on Prompt Toolbox helps you move between these units quickly so you always know what you're actually working with.

How to think about this when building something

The mental model that works best in practice is to think of characters as the raw input, words as the human-readable estimate, and tokens as the billable unit. When you're writing, think in words. When you're planning costs or checking limits, think in tokens. When you're doing anything at a technical level like parsing text or checking string lengths in code, think in characters.

Keep a token counter accessible when you're working on prompts, especially system prompts that get sent with every request. Run your most common input types through it before you finalize your design so you have real numbers to work with instead of estimates.
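To see why system prompts deserve the extra attention, here is a back-of-the-envelope calculation. Everything in it is a placeholder: the function name is invented for this example, and the price is a made-up figure, not any provider's real rate. Plug in the token count from a real counter and the price from your provider's pricing page.

```python
def system_prompt_cost(
    prompt_tokens: int,
    requests: int,
    price_per_million_input_tokens: float,
) -> float:
    """Total input cost of resending the same system prompt on every request."""
    total_tokens = prompt_tokens * requests
    return total_tokens * price_per_million_input_tokens / 1_000_000

HYPOTHETICAL_PRICE = 1.00   # dollars per million input tokens (placeholder)
MONTHLY_REQUESTS = 100_000  # placeholder traffic figure

lean = system_prompt_cost(800, MONTHLY_REQUESTS, HYPOTHETICAL_PRICE)
bloated = system_prompt_cost(1600, MONTHLY_REQUESTS, HYPOTHETICAL_PRICE)

print(f"800-token system prompt:  ${lean:.2f} per month")
print(f"1600-token system prompt: ${bloated:.2f} per month")
```

Because the prompt is multiplied by every request, a token count that is off by a factor of two means a bill that is off by a factor of two.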

The difference between a prompt that costs what you expected and one that costs twice as much often comes down to whether you were thinking in the right unit when you built it.
