
“Tokenizer Tax” — The Hidden Cost of Anthropic Models

Lavanya Gupta

TL;DR: In this article, we present a cost comparison of two frontier model families: OpenAI’s GPT models and Anthropic’s Claude models. Although their advertised “cost per token” figures appear similar, our experiments reveal that Anthropic models can be 20%–30% more expensive than GPT models in practice.
We expose a critical limitation of Anthropic models, which we call the “Tokenizer Tax”: the hidden cost induced by the verbose nature of the Anthropic tokenizer. Simply put, for the exact same input prompt, Anthropic models generate 20%–30% more tokens than OpenAI’s GPT models, ultimately leading to hidden higher costs.

Introduction

The “per-token pricing” advertised by model providers tells only half the story when comparing costs in real-world use cases.

It is well known that different model families (can) use different tokenizers. However, there has been limited analysis of how the tokenization process itself varies across these tokenizers. Do all tokenizers produce the same number of tokens for a given input text? If not, how different are the generated tokens, and how significant are the differences? A quick way to start probing this is sketched below.
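As a rough illustration of how such a comparison can be run, the sketch below counts tokens for the same prompt using OpenAI’s open-source tiktoken library and Anthropic’s token-counting endpoint. This is not the exact benchmark used in our experiments; the model names and the count_tokens call are assumptions based on the publicly documented SDKs.

```python
# A minimal sketch (not the exact benchmark from this article): count tokens for
# the same prompt with OpenAI's tiktoken and Anthropic's token-counting API.
# Assumes `pip install tiktoken anthropic` and ANTHROPIC_API_KEY in the environment;
# model names are illustrative and may need updating.
import tiktoken
import anthropic

prompt = "Write a short summary of the quarterly sales report for the leadership team."

# OpenAI: tiktoken runs locally and maps a model name to its encoding.
enc = tiktoken.encoding_for_model("gpt-4o")
openai_tokens = len(enc.encode(prompt))

# Anthropic: no public offline tokenizer, so ask the API to count tokens for us.
client = anthropic.Anthropic()
resp = client.messages.count_tokens(
    model="claude-3-5-sonnet-latest",
    messages=[{"role": "user", "content": prompt}],
)
anthropic_tokens = resp.input_tokens

print(f"OpenAI tokens:    {openai_tokens}")
print(f"Anthropic tokens: {anthropic_tokens}")
print(f"Overhead: {100 * (anthropic_tokens - openai_tokens) / openai_tokens:.1f}%")
```

Note that Anthropic’s endpoint counts the full message payload, including message framing, so a small absolute difference is expected even before tokenizer verbosity comes into play; running the comparison over longer, realistic prompts gives a fairer picture.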

In this article, we explore these questions and examine the practical…
