AI Fundamentals
Token
The basic unit of text that AI models process, roughly equivalent to 4 characters or 0.75 words.
A token is the basic unit of text that AI language models process. Tokenization breaks text into subword units the model can represent; in English, one token corresponds to roughly 4 characters or 0.75 words. For example, "Artificial intelligence" might be split into three tokens: "Art", "ificial", and "intelligence". Token counts matter because they determine both cost (APIs charge per token) and how much text fits in the model's context window. Understanding tokenization helps you optimize prompts and estimate API costs.
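The 4-characters-per-token heuristic above can be turned into a quick back-of-the-envelope estimator. This is a minimal sketch, not a real tokenizer: the function names (`estimate_tokens`, `estimate_cost`) and the per-1k-token price parameter are illustrative assumptions, and real tokenizers (which vary by model) will produce somewhat different counts.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters per token heuristic.

    Real tokenizers are model-specific; this only gives a ballpark figure.
    """
    return max(1, round(len(text) / 4))


def estimate_cost(text: str, price_per_1k_tokens: float) -> float:
    """Approximate API cost for a prompt, given a hypothetical per-1k-token price."""
    return estimate_tokens(text) / 1000 * price_per_1k_tokens


prompt = "Artificial intelligence is transforming software."
print(estimate_tokens(prompt))                 # ballpark token count
print(estimate_cost(prompt, 0.5))              # cost at an assumed $0.50 per 1k tokens
```

For accurate counts you would use the tokenizer that matches the target model, since providers bill on the model's actual token count, not on character length.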