Pricing page

The final cost of a request depends on the model you use: each model has its own pricing parameters, which are listed on the Pricing page.

You can access it by clicking on "Pricing" in the top menu or by following the link.

Instructions for use

Choose a model to view its cost. Use categories or search to locate the model quickly.

The table has the following columns, which give you the information needed to calculate the cost:

• MODEL: The model name, including its provider

• CONTEXT: How much information the model can "hold in memory"

• PRICE (per X): Different models are billed by different units. For example, chat models charge based on the number of input and output tokens consumed, while some video models charge per second of generated video.

Context

The number in the Context column is the maximum number of tokens the model can process at once. A token is a unit of text such as a word, part of a word, or a single character. This value determines how much information the model can "hold in memory" when analyzing input and generating a response.

For example, a model listed with Context 128K can process up to 128,000 tokens at once.
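
To make the limit concrete, here is a minimal Python sketch that converts a Context label into a token count and checks whether a request fits. The function names are hypothetical, the "M" suffix is an assumption (the page only shows "K"), and the sketch assumes that the prompt and the generated response share the same context limit, which may not hold for every model.

```python
def parse_context(label: str) -> int:
    """Convert a Context label such as "128K" into a token count."""
    label = label.strip().upper()
    if label.endswith("K"):
        return round(float(label[:-1]) * 1_000)
    if label.endswith("M"):  # assumption: "M" means millions of tokens
        return round(float(label[:-1]) * 1_000_000)
    return int(label)


def fits_in_context(prompt_tokens: int, max_output_tokens: int, context: int) -> bool:
    """Assumes input and output tokens count against the same limit."""
    return prompt_tokens + max_output_tokens <= context


context = parse_context("128K")
print(context)
print(fits_in_context(120_000, 4_000, context))
print(fits_in_context(126_000, 4_000, context))
```

The second check fails because 126,000 prompt tokens plus 4,000 response tokens exceed 128,000.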

What does this mean in practice?

• Long texts: The model can analyze or generate lengthy documents, articles, or dialogues without losing context.

• Complex tasks: The model can take into account more information from previous requests or data, which is useful for tasks that require deep understanding (e.g., analyzing legal documents or long correspondence).

• Improved quality: With a larger context, the model has more of the relevant material available at once, which generally helps it interpret requests accurately and return relevant results.

Example:

If a model with a 128K context is used for chat, it can "remember" up to 128,000 tokens from previous messages, enabling long and meaningful conversations.
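
One common way to stay within the limit in a long chat is to keep only the most recent messages that fit. The sketch below illustrates that idea; it is not a description of how any particular model or this service handles history. The function name is hypothetical, and the per-message token counts are assumed to be known already.

```python
def trim_history(message_tokens: list[int], context: int) -> list[int]:
    """Return the indices of the most recent messages that fit in the context."""
    kept = []
    total = 0
    # Walk backwards from the newest message and stop at the first one
    # that would push the running total over the limit.
    for index in range(len(message_tokens) - 1, -1, -1):
        if total + message_tokens[index] > context:
            break
        total += message_tokens[index]
        kept.append(index)
    kept.reverse()  # restore chronological order
    return kept


# Hypothetical token counts for four messages, oldest first.
history = [60_000, 50_000, 40_000, 30_000]
print(trim_history(history, 128_000))
```

Here the oldest message is dropped, because all four together total 180,000 tokens.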

In the case of text analysis, the model can process a long document or a large dataset in a single request, as long as it fits within the 128,000-token limit.

Thus, Context 128K indicates the model's capacity: it sets an upper bound on how much material the model can work with in one task.

Price

This column shows the cost of using each model. The way usage is billed may vary between models. The main pricing types are listed below.

Per Input Tokens

Input refers to the data or requests that a user or system sends to an AI for processing. This can include text, images, audio, video, or any other type of information that the AI is capable of analyzing. For example, when you ask a voice assistant a question (e.g., "What's the weather today?"), your question is the input. With this pricing type, you are charged according to the number of tokens in the input.

Per Output Tokens

Output is the result generated by the AI based on the input data. This could be an answer to a question, a recommendation, a data classification, a generated image, or even the execution of a task. In the voice assistant example, the response "Today is sunny, +25 degrees" is the output. With this pricing type, you are charged according to the number of tokens in the output.
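
For token-based pricing, the cost of a request can be estimated from the two token counts and the two prices. The sketch below is illustrative only: the function name, the prices, and the assumption that prices are quoted per 1,000,000 tokens are all hypothetical, so take the real unit and values from the PRICE column.

```python
def token_cost(
    input_tokens: int,
    output_tokens: int,
    input_price: float,
    output_price: float,
    per_tokens: int = 1_000_000,  # assumption: prices are quoted per 1M tokens
) -> float:
    """Estimate the cost of one request billed per input and output tokens."""
    input_cost = input_tokens / per_tokens * input_price
    output_cost = output_tokens / per_tokens * output_price
    return input_cost + output_cost


# Hypothetical prices: 2.00 per 1M input tokens, 6.00 per 1M output tokens.
cost = token_cost(2_000, 500, input_price=2.00, output_price=6.00)
print(f"{cost:.4f}")
```

With these made-up numbers, the input contributes 0.004 and the output 0.003.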

Per Generation

Per-generation pricing charges a fixed fee per completed request (i.e., per generated asset), regardless of token count.
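
In contrast to token-based billing, a per-generation estimate depends only on the number of completed requests. A minimal sketch, with a hypothetical function name and a made-up price:

```python
def generation_cost(generations: int, price_per_generation: float) -> float:
    """Cost of requests billed at a fixed fee each; token counts play no role."""
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return generations * price_per_generation


# Hypothetical price of 0.04 per generated asset, four assets requested.
print(f"{generation_cost(4, 0.04):.2f}")
```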

Additional Price

For a small number of models, the cost of a call has an additional component. The value in this column is added to the cost calculated from the Price column.
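
A rough sketch of that addition follows. It assumes the additional price is a flat amount applied once per call, which the page does not specify; the function name and the numbers are hypothetical.

```python
def total_call_cost(base_cost: float, additional_price: float = 0.0) -> float:
    """Add the Additional Price value to the cost derived from the Price column.

    Assumption: the additional price is a flat amount charged once per call.
    """
    return base_cost + additional_price


# Hypothetical values: 0.007 from the Price column plus 0.01 additional.
print(f"{total_call_cost(0.007, 0.01):.3f}")
```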