AI & Automation

Token

The basic unit of text that language models process — roughly three-quarters of a word in English. Token counts determine API costs, context limits, and how much information a model can consider at once.

Tokens and words

A token is not the same as a word. Common words like "the" or "and" are single tokens, while longer or less common words get split into multiple tokens — "unforgettable" might become three tokens: "un", "forget", "table". On average, 100 tokens corresponds to roughly 75 English words.
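The 100-tokens-to-75-words ratio implies another common rule of thumb: about four characters of English text per token. A minimal sketch of a rough estimator built on that heuristic (the function name and the 4-characters-per-token constant are illustrative assumptions; real tokenizers give exact counts):

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for typical English text.
    # A real tokenizer (e.g. the model provider's own) gives exact counts.
    return max(1, len(text) // 4)

print(estimate_tokens("The quick brown fox jumps over the lazy dog."))
```

Estimates like this are good enough for budgeting, but always use the provider's tokenizer when exact counts matter (e.g. staying under a context limit).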

Understanding tokens matters because language model pricing is per-token, and every model has a maximum context window measured in tokens.

Context windows

A model's context window is the total number of tokens it can process in a single request — including both your input (prompt + documents) and the model's output. GPT-4 Turbo supports 128K tokens (roughly 96,000 words). Claude supports up to 200K tokens.

For RAG applications, context window size determines how many document chunks you can include alongside a question. Larger windows allow more context but cost more per request.
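The chunk-budgeting arithmetic above can be sketched in a few lines. All the numbers in the example call (window size, prompt size, output reservation, chunk size) are illustrative assumptions, not recommendations:

```python
def max_chunks(window: int, prompt_tokens: int,
               reserved_output: int, chunk_tokens: int) -> int:
    # Tokens left for retrieved context after subtracting the prompt
    # and the budget reserved for the model's answer.
    available = window - prompt_tokens - reserved_output
    return max(0, available // chunk_tokens)

# Example: 128K window, 500-token prompt, 1,000 tokens reserved for
# the answer, 512-token document chunks (all hypothetical figures).
print(max_chunks(128_000, 500, 1_000, 512))
```

Note that the output reservation matters: the context window covers input and output combined, so filling it entirely with document chunks leaves no room for the model to answer.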

Managing costs

Input tokens (what you send) are cheaper than output tokens (what the model generates). Efficient prompt engineering and smart document chunking keep token usage — and therefore costs — under control. For most SMB automation tasks, API costs are measured in pennies per interaction.
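Since input and output tokens are priced differently, a per-request cost estimate needs both rates. A minimal sketch, assuming per-million-token pricing (the $3/$15 rates below are placeholders, not any provider's actual prices):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    # Prices are expressed per million tokens, as most providers quote them.
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Hypothetical rates: $3 per million input tokens, $15 per million output.
cost = request_cost(2_000, 500, 3.00, 15.00)
print(f"${cost:.4f}")
```

At these placeholder rates, a 2,000-token prompt with a 500-token answer costs about a cent and a half — consistent with the "pennies per interaction" scale described above.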