Tokens in LLMs For TypeScript Developers
- Staff Desk

Many developers work with Large Language Models (LLMs) every day, yet a surprising number don't fully understand what tokens are or how tokenization works. This post is a practical, technical deep dive into tokens, tokenization, vocabularies, encoding, decoding, and how different LLM providers treat the same text differently. Everything is explained in clear language and supported by TypeScript examples rather than Python.
1. What Tokens Really Are in LLMs
Tokens are the fundamental unit of text that LLMs work with. You may think LLMs process words, sentences, or characters, but under the hood, every LLM only understands numbers, and those numbers represent tokens.
Tokens as the currency of LLMs
When you send "hello world" to an LLM:
Your text is broken into tokens
Each token corresponds to a number from the model's vocabulary
You’re billed based on input tokens and output tokens
For example, if "hello world" becomes 3 input tokens and the model replies with 3 output tokens, a simplified cost formula is:
(input_tokens + output_tokens) / 1000 × price_per_1k_tokens
In practice, models charge different rates for input and output tokens, so each side is priced at its own rate.
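Here is a minimal TypeScript sketch of that calculation, assuming made-up per-1K prices and separate input/output rates:

```ts
// Hypothetical prices, purely for illustration -- check your provider's pricing page.
const INPUT_PRICE_PER_1K = 0.001;  // dollars per 1,000 input tokens
const OUTPUT_PRICE_PER_1K = 0.005; // dollars per 1,000 output tokens

function requestCost(inputTokens: number, outputTokens: number): number {
  // Each side is billed at its own rate.
  return (
    (inputTokens / 1000) * INPUT_PRICE_PER_1K +
    (outputTokens / 1000) * OUTPUT_PRICE_PER_1K
  );
}

console.log(requestCost(3, 3)); // cost of the 3-in / 3-out example above
```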
This is why understanding tokens is important: your cost depends on them, and different models tokenize text differently.
2. A Practical Example Using TypeScript and the AI SDK
The example uses Claude 3.5 Haiku first.
// Input: "hello world"
The model responds with:
"Hello, how are you doing today? Is there anything I can help you with?"
And the usage numbers might show:
11 input tokens
20 output tokens
This is strange because "hello world" is only two words. You then send the same "hello world" to Google’s Gemini 2.0 Flash:
4 input tokens
11 output tokens
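With the AI SDK, the two calls look roughly like this. Treat it as a sketch: the model IDs and the exact shape of the usage object depend on the SDK and provider package versions you have installed.

```ts
import { generateText } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { google } from "@ai-sdk/google";

async function compareUsage(prompt: string) {
  // Same prompt, two different providers.
  const claude = await generateText({
    model: anthropic("claude-3-5-haiku-latest"),
    prompt,
  });
  const gemini = await generateText({
    model: google("gemini-2.0-flash"),
    prompt,
  });

  // The usage object reports how many tokens each provider counted.
  console.log("Claude usage:", claude.usage);
  console.log("Gemini usage:", gemini.usage);
}

compareUsage("hello world");
```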
The exact same input text produces different token counts across providers. This confusion disappears once you understand how token vocabularies are built.
3. What Is a Token Vocabulary?
Every LLM has its own vocabulary — a giant list of:
words
subwords
characters
Each entry is assigned a unique number, and that number is the token.
When you send text to the model:
The text is split into the largest possible chunks that exist in the vocabulary
Each chunk is replaced by its token number
Only numbers are sent into the model for processing
This explains why "hello world" can be 2 tokens in one model and 5 in another: the models use different vocabularies.
4. Encoding and Decoding Tokens in TypeScript (Using TikToken / js-tiktoken)
OpenAI uses a tokenizer library called tiktoken. The JavaScript port is js-tiktoken.
Example:
You encode a text file that begins:
“The wise owl of moonlight forest where ancient trees stretch their branches toward the starry sky.”
Character length: ~2,300 characters
Tokens (using the GPT-4o tokenizer): ~500 tokens
A short example:
input: "hello world"
characters: 11
tokens: 3
The model never sees the characters — it only sees an array like:
[ 1845, 21233, 108, ... ]
Decoding reverses the process:
You give it the array of numbers
It returns the text
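With js-tiktoken (assuming a recent version that ships the o200k_base encoding used by OpenAI's newer models), encoding and decoding look like this; the token IDs you see will differ from the illustrative array above:

```ts
import { getEncoding } from "js-tiktoken";

const enc = getEncoding("o200k_base");

// Encoding: text -> array of token IDs.
const tokens = enc.encode("hello world");
console.log(tokens);        // an array of numbers
console.log(tokens.length); // how many tokens this text costs

// Decoding: array of token IDs -> text.
const text = enc.decode(tokens);
console.log(text); // "hello world"
```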
This is how LLMs convert text → tokens → processing → tokens → text.
5. How Tokenizers Are Built From Data
To understand differing token counts, we need to understand how tokenizers are trained.
Tokenizers are trained from the same large datasets that the model is trained on. But to keep the explanation simple, let's look at a tiny dataset:
"the cat sat on the mat"
A) Character-Level Tokenizer (Very Inefficient)
Extract every unique character:
t
h
e
space
c
a
s
o
n
m
This creates a vocabulary of just 10 tokens. Encoding "cat sat mat":
11 characters
11 tokens
This is extremely inefficient. More tokens = slower = more expensive for the model to process.
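A character-level tokenizer is simple enough to build in a few lines. Here is a toy version of the idea, "trained" on the sentence above:

```ts
// Build a character-level vocabulary from a tiny training corpus.
const corpus = "the cat sat on the mat";
const vocab = new Map<string, number>();
for (const ch of corpus) {
  if (!vocab.has(ch)) vocab.set(ch, vocab.size);
}
console.log(vocab.size); // 10 unique characters -> a 10-token vocabulary

// Encoding maps every single character to its token ID.
function encode(text: string): number[] {
  return [...text].map((ch) => {
    const id = vocab.get(ch);
    if (id === undefined) throw new Error(`Unknown character: ${ch}`);
    return id;
  });
}

console.log(encode("cat sat mat").length); // 11 characters -> 11 tokens
```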
6. Subword-Level Tokenizers (Much More Efficient)
To improve efficiency, tokenizers create subwords by noticing patterns:
"th" appears in "the"
"he" appears in "the"
"at" appears in "cat", "sat", "mat"
In a simple subword tokenizer example:
Input: "cat sat mat"
Characters: 11
Tokens: 8
Subwords like "at" reduce the token count.
When logging the vocabulary in code, you may see:
"at" → token
"the" → token
"he" → token
Real tokenizers go far beyond this. They build:
subwords
subwords of subwords
frequent patterns across millions of documents
This leads to token vocabularies of:
50,000
100,000
200,000 tokens
The larger the vocabulary, the longer each token can be, and the fewer tokens your text becomes.
But vocabularies can’t grow forever:
Larger vocabularies require bigger models
They need more memory
They slow down inference
So every model provider balances performance, cost, memory, and dataset characteristics differently.
7. Why Different Models Produce Different Token Counts
Because they use different vocabularies.
Example:
"hello world" → 11 tokens in Claude
"hello world" → 4 tokens in Gemini
Their tokenizers:
were trained on different datasets
chose different subwords
compress text differently
This is why identical prompts result in different token bills.
8. How Tokenizers Handle Unusual or Rare Words
Let’s take:
oFrabjusDay
A made-up word from Lewis Carroll's "Jabberwocky" ("O frabjous day!").
A tokenizer like OpenAI’s o200k might split it like:
o
Fra
bjus
Day
Total: 4 tokens
Why?
Because:
The word isn’t common in training data
The tokenizer doesn’t have a subword for "Frabjus"
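You can check how a real tokenizer splits it by decoding each token ID on its own; the exact pieces you get may differ from the split shown above:

```ts
import { getEncoding } from "js-tiktoken";

const enc = getEncoding("o200k_base");
const tokens = enc.encode("oFrabjusDay");

// Decode each ID individually to see which chunk of text it covers.
const pieces = tokens.map((id) => enc.decode([id]));
console.log(tokens.length, pieces);
```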
This also affects:
rare languages
coding languages
rare file formats
uncommon names
Example with coding languages:
20 lines of JavaScript → fewer tokens
20 lines of Haskell → more tokens
Because the tokenizer knows JavaScript patterns better.
9. Full Summary of the Tokenization Process
A) Encoding
Take your input text ("hello world")
Split it into the largest possible vocabulary chunks
Map each chunk to its token number
Send the array of numbers into the LLM
B) LLM Computation
The model thinks only in numbers
All text meaning is carried by vector representations (embeddings) of these token numbers
C) Decoding
Take the output token numbers
Look up their matching vocab entries
Join them to form output text
Tokens are the actual medium of computation, not text.
10. Key Takeaways
Tokens are the currency of LLM usage and cost.
Models have different tokenizers, so the same prompt yields different token counts.
Tokenizers break text into words, subwords, or characters based on their training.
Larger vocabularies → longer subwords → fewer tokens.
But overly large vocabularies make models too big and slow.
Uncommon words split into more tokens.
Popular coding languages tokenize more efficiently.
Final Thoughts
This deep dive shows why understanding tokens matters:
It affects cost
It affects performance
It affects how your prompts behave in different models
It explains output differences between Claude, Gemini, OpenAI, etc.
If you're building AI-powered applications with TypeScript, understanding tokens is foundational. Tokenization is happening behind the scenes every single time you call an LLM.