Tokens in LLMs For TypeScript Developers
- Staff Desk

Many developers work with Large Language Models (LLMs) every day, yet a surprising number don't fully understand what tokens are or how tokenization works. This post is a practical, technical deep dive into tokens, tokenization, vocabularies, encoding, decoding, and how different LLM providers treat the same text differently. Everything is explained in clear language and supported by TypeScript examples rather than Python.
1. What Tokens Really Are in LLMs
Tokens are the fundamental unit of text that LLMs work with. You may think LLMs process words, sentences, or characters, but under the hood, every LLM only understands numbers, and those numbers represent tokens.
Tokens as the currency of LLMs
When you send "hello world" to an LLM:
Your text is broken into tokens
Each token corresponds to a number from the model's vocabulary
You’re billed based on input tokens and output tokens
For example, if "hello world" becomes 3 input tokens and the model replies with 3 output tokens, a simplified cost formula is:
(input_tokens + output_tokens) / 1000 × price_per_1k_tokens
In practice, models charge different rates for input and output tokens, so each side is priced at its own rate.
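Here is a minimal TypeScript sketch of that calculation, assuming made-up per-1K prices and separate input/output rates:

```ts
// Hypothetical prices, purely for illustration -- check your provider's pricing page.
const INPUT_PRICE_PER_1K = 0.001;  // dollars per 1,000 input tokens
const OUTPUT_PRICE_PER_1K = 0.005; // dollars per 1,000 output tokens

function requestCost(inputTokens: number, outputTokens: number): number {
  // Each side is billed at its own rate.
  return (
    (inputTokens / 1000) * INPUT_PRICE_PER_1K +
    (outputTokens / 1000) * OUTPUT_PRICE_PER_1K
  );
}

console.log(requestCost(3, 3)); // cost of the 3-in / 3-out example above
```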
This is why understanding tokens is important: your cost depends on them, and different models tokenize text differently.
2. A Practical Example Using TypeScript and the AI SDK
The example uses Claude 3.5 Haiku first.
// Input: "hello world"
The model responds with:
"Hello, how are you doing today? Is there anything I can help you with?"
And the usage numbers might show:
11 input tokens
20 output tokens
This is strange because "hello world" is only two words. You then send the same "hello world" to Google’s Gemini 2.0 Flash:
4 input tokens
11 output tokens
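With the AI SDK, the two calls look roughly like this. Treat it as a sketch: the model IDs and the exact shape of the usage object depend on the SDK and provider package versions you have installed.

```ts
import { generateText } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { google } from "@ai-sdk/google";

async function compareUsage(prompt: string) {
  // Same prompt, two different providers.
  const claude = await generateText({
    model: anthropic("claude-3-5-haiku-latest"),
    prompt,
  });
  const gemini = await generateText({
    model: google("gemini-2.0-flash"),
    prompt,
  });

  // The usage object reports how many tokens each provider counted.
  console.log("Claude usage:", claude.usage);
  console.log("Gemini usage:", gemini.usage);
}

compareUsage("hello world");
```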
The exact same input text produces different token counts across providers. This confusion disappears once you understand how token vocabularies are built.
3. What Is a Token Vocabulary?
Every LLM has its own vocabulary — a giant list of:
words
subwords
characters
Each entry is assigned a unique number, and that number is the token.
When you send text to the model:
The text is split into the largest possible chunks that exist in the vocabulary
Each chunk is replaced by its token number
Only numbers are sent into the model for processing
This explains why "hello world" can be 2 tokens in one model and 5 in another: the models use different vocabularies.
4. Encoding and Decoding Tokens in TypeScript (Using TikToken / js-tiktoken)
OpenAI uses a tokenizer library called tiktoken. The JavaScript port is js-tiktoken.
Example:
You encode a text file that begins:
“The wise owl of moonlight forest where ancient trees stretch their branches toward the starry sky.”
Character length: ~2,300 characters
Tokens (using the GPT-4o tokenizer): ~500 tokens
A short example:
input: "hello world"
characters: 11
tokens: 3
The model never sees the characters — it only sees an array like:
[ 1845, 21233, 108, ... ]
Decoding reverses the process:
You give it the array of numbers
It returns the text
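With js-tiktoken (assuming a recent version that ships the o200k_base encoding used by OpenAI's newer models), encoding and decoding look like this; the token IDs you see will differ from the illustrative array above:

```ts
import { getEncoding } from "js-tiktoken";

const enc = getEncoding("o200k_base");

// Encoding: text -> array of token IDs.
const tokens = enc.encode("hello world");
console.log(tokens);        // an array of numbers
console.log(tokens.length); // how many tokens this text costs

// Decoding: array of token IDs -> text.
const text = enc.decode(tokens);
console.log(text); // "hello world"
```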
This is how LLMs convert text → tokens → processing → tokens → text.
5. How Tokenizers Are Built From Data
To understand differing token counts, we need to understand how tokenizers are trained.
Tokenizers are trained from the same large datasets that the model is trained on. But to keep the explanation simple, let's look at a tiny dataset:
"the cat sat on the mat"
A) Character-Level Tokenizer (Very Inefficient)
Extract every unique character:
t
h
e
space
c
a
s
o
n
m
This creates a vocabulary of just 10 tokens. Encoding "cat sat mat":
11 characters
11 tokens
This is extremely inefficient. More tokens = slower = more expensive for the model to process.
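A character-level tokenizer is simple enough to build in a few lines. Here is a toy version of the idea, "trained" on the sentence above:

```ts
// Build a character-level vocabulary from a tiny training corpus.
const corpus = "the cat sat on the mat";
const vocab = new Map<string, number>();
for (const ch of corpus) {
  if (!vocab.has(ch)) vocab.set(ch, vocab.size);
}
console.log(vocab.size); // 10 unique characters -> a 10-token vocabulary

// Encoding maps every single character to its token ID.
function encode(text: string): number[] {
  return [...text].map((ch) => {
    const id = vocab.get(ch);
    if (id === undefined) throw new Error(`Unknown character: ${ch}`);
    return id;
  });
}

console.log(encode("cat sat mat").length); // 11 characters -> 11 tokens
```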
6. Subword-Level Tokenizers (Much More Efficient)
To improve efficiency, tokenizers create subwords by noticing patterns:
"th" appears in "the"
"he" appears in "the"
"at" appears in "cat", "sat", "mat"
In a simple subword tokenizer example:
Input: "cat sat mat"
Characters: 11
Tokens: 8
Subwords like "at" reduce the token count.
When logging the vocabulary in code, you may see:
"at" → token
"the" → token
"he" → token
Real tokenizers go far beyond this. They build:
subwords
subwords of subwords
frequent patterns across millions of documents
This leads to token vocabularies of:
50,000
100,000
200,000 tokens
The larger the vocabulary, the longer each token can be, and the fewer tokens your text becomes.
But vocabularies can’t grow forever:
Larger vocabularies require bigger models
They need more memory
They slow down inference
So every model provider balances performance, cost, memory, and dataset characteristics differently.
7. Why Different Models Produce Different Token Counts
Because they use different vocabularies.
Example:
"hello world" → 11 tokens in Claude
"hello world" → 4 tokens in Gemini
Their tokenizers:
were trained on different datasets
chose different subwords
compress text differently
This is why identical prompts result in different token bills.
8. How Tokenizers Handle Unusual or Rare Words
Let’s take:
oFrabjusDay
A made-up word from Lewis Carroll's "Jabberwocky" ("O frabjous day!").
A tokenizer like OpenAI’s o200k might split it like:
o
Fra
bjus
Day
Total: 4 tokens
Why?
Because:
The word isn’t common in training data
The tokenizer doesn’t have a subword for "Frabjus"
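You can check how a real tokenizer splits it by decoding each token ID on its own; the exact pieces you get may differ from the split shown above:

```ts
import { getEncoding } from "js-tiktoken";

const enc = getEncoding("o200k_base");
const tokens = enc.encode("oFrabjusDay");

// Decode each ID individually to see which chunk of text it covers.
const pieces = tokens.map((id) => enc.decode([id]));
console.log(tokens.length, pieces);
```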
This also affects:
rare languages
coding languages
rare file formats
uncommon names
Example with coding languages:
20 lines of JavaScript → fewer tokens
20 lines of Haskell → more tokens
Because the tokenizer knows JavaScript patterns better.
9. Full Summary of the Tokenization Process
A) Encoding
Take your input text ("hello world")
Split it into the largest possible vocabulary chunks
Map each chunk to its token number
Send the array of numbers into the LLM
B) LLM Computation
The model thinks only in numbers
All text meaning is carried by vector representations (embeddings) of these token numbers
C) Decoding
Take the output token numbers
Look up their matching vocab entries
Join them to form output text
Tokens are the actual medium of computation, not text.
10. Key Takeaways
Tokens are the currency of LLM usage and cost.
Models have different tokenizers, so the same prompt yields different token counts.
Tokenizers break text into words, subwords, or characters based on their training.
Larger vocabularies → longer subwords → fewer tokens.
But overly large vocabularies make models too big and slow.
Uncommon words split into more tokens.
Popular coding languages tokenize more efficiently.
Final Thoughts
This deep dive shows why understanding tokens matters:
It affects cost
It affects performance
It affects how your prompts behave in different models
It explains output differences between Claude, Gemini, OpenAI, etc.
If you're building AI-powered applications with TypeScript, understanding tokens is foundational. Tokenization is happening behind the scenes every single time you call an LLM.