GLM 5 API
OpenAI-compatible chat completions, models, API keys, billing, streaming, and function calling.
The GLM 5 API is an OpenAI-compatible developer API for chat completions. Use your GLM 5 API key with the /api/v1 base URL, choose a public model ID, and pay with the same credits used by Chat.
Base URL
https://glm5.app/api/v1
Authentication
Create an API key at /settings/apikeys. The plaintext key is shown only once, at creation. Only its hash is stored, so it cannot be retrieved later.
Send the key as a bearer token:
Authorization: Bearer sk-glm5-...
Quickstart
curl https://glm5.app/api/v1/chat/completions \
-H "Authorization: Bearer $GLM5_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "glm-5.2",
"messages": [
{
"role": "user",
"content": "Write a short launch checklist for a developer API."
}
]
}'
from openai import OpenAI
client = OpenAI(
api_key="sk-glm5-...",
base_url="https://glm5.app/api/v1",
)
completion = client.chat.completions.create(
model="glm-5.2",
messages=[
{"role": "user", "content": "Write a short launch checklist."}
],
)
print(completion.choices[0].message.content)
import OpenAI from "openai";
const client = new OpenAI({
apiKey: process.env.GLM5_API_KEY,
baseURL: "https://glm5.app/api/v1",
});
const completion = await client.chat.completions.create({
model: "glm-5.2",
messages: [
{ role: "user", content: "Write a short launch checklist." },
],
});
console.log(completion.choices[0]?.message?.content);
Chat Completions
POST /chat/completions
Parameters
| Parameter | Type | Required | Default | Notes |
|---|---|---|---|---|
| model | string | Yes | - | One of glm-5.2, glm-5, kimi-k2, deepseek-r1. |
| messages | array | Yes | - | OpenAI chat messages: system, user, assistant, and tool. |
| max_tokens | integer | No | 8192 | Capped by the selected model. |
| temperature | number | No | provider default | Sampling temperature. |
| top_p | number | No | provider default | Nucleus sampling. |
| top_k | integer | No | provider default | Restrict choices to the top K tokens. |
| seed | integer | No | - | Deterministic sampling when supported. |
| stop | string or array | No | - | Stop sequences. |
| stream | boolean | No | false | Streams OpenAI-style SSE chunks. |
| tools | array | No | - | Function tools. Supported by glm-5.2, glm-5, and kimi-k2. |
| tool_choice | string or object | No | auto | auto, none, required, or a named function choice. |
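As a rough sketch of how the table above could be checked on the client before a request is sent, the helper below builds a request body, rejects unknown model IDs, and rejects tools on a model the table lists as not supporting them. `build_chat_request`, `KNOWN_MODELS`, and `TOOL_MODELS` are hypothetical names chosen for illustration, not part of any SDK, and the server remains the source of truth for validation.

```python
from typing import Any, Dict, List

# Model IDs and tool support copied from the parameter table above.
KNOWN_MODELS = {"glm-5.2", "glm-5", "kimi-k2", "deepseek-r1"}
TOOL_MODELS = {"glm-5.2", "glm-5", "kimi-k2"}


def build_chat_request(
    model: str, messages: List[Dict[str, Any]], **options: Any
) -> Dict[str, Any]:
    """Build a /chat/completions body (hypothetical helper)."""
    if model not in KNOWN_MODELS:
        raise ValueError(f"unknown model: {model}")
    if options.get("tools") and model not in TOOL_MODELS:
        raise ValueError(f"{model} does not support tools")
    body: Dict[str, Any] = {"model": model, "messages": messages}
    # Send only the options the caller actually set, so that the
    # defaults listed in the table apply to everything else.
    body.update({k: v for k, v in options.items() if v is not None})
    return body
```

The returned dictionary can be serialized with `json.dumps` and sent as the POST body, or unpacked into `client.chat.completions.create(**body)`.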
Streaming
Set stream to true to receive Server-Sent Events:
curl https://glm5.app/api/v1/chat/completions \
-H "Authorization: Bearer $GLM5_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "glm-5.2",
"stream": true,
"messages": [{"role": "user", "content": "Say hello in one sentence."}]
}'
The stream emits OpenAI-compatible chat.completion.chunk events and ends with:
data: [DONE]
Function Calling
Function calling works with multi-turn tool loops. Send tools, receive tool_calls, execute the tool in your application, then send the tool result back with role: "tool".
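The "execute the tool in your application" step might look like the sketch below. It assumes the assistant message has already been converted to a plain dictionary in the OpenAI tool-call shape. `TOOLS`, `run_tool_calls`, and the stubbed `get_weather` are hypothetical names for illustration only.

```python
import json
from typing import Any, Callable, Dict, List


def get_weather(city: str) -> str:
    # Stand-in tool: a real application would query a weather service.
    return f"{city} is 24C and clear."


# Maps function names advertised in "tools" to local callables.
TOOLS: Dict[str, Callable[..., str]] = {"get_weather": get_weather}


def run_tool_calls(assistant_message: Dict[str, Any]) -> List[Dict[str, str]]:
    """Run each requested tool and return the role: "tool" messages."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        fn = TOOLS[call["function"]["name"]]
        # Arguments arrive as a JSON-encoded string, not as an object.
        args = json.loads(call["function"]["arguments"])
        results.append(
            {
                "role": "tool",
                "tool_call_id": call["id"],
                "content": fn(**args),
            }
        )
    return results
```

Append the assistant message and then the returned tool messages to `messages`, call the endpoint again, and repeat until a response arrives without `tool_calls`. The raw request bodies for the same exchange follow.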
{
"model": "glm-5.2",
"messages": [
{
"role": "user",
"content": "What is the weather in Shanghai?"
}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current weather for a city.",
"parameters": {
"type": "object",
"properties": {
"city": { "type": "string" }
},
"required": ["city"]
}
}
}
]
}
Send the result back:
{
"model": "glm-5.2",
"messages": [
{
"role": "user",
"content": "What is the weather in Shanghai?"
},
{
"role": "assistant",
"tool_calls": [
{
"id": "call_123",
"type": "function",
"function": {
"name": "get_weather",
"arguments": "{\"city\":\"Shanghai\"}"
}
}
]
},
{
"role": "tool",
"tool_call_id": "call_123",
"content": "Shanghai is 24C and clear."
}
]
}
Models
GET /models
Returns the public model IDs available through GLM 5.
| Model | Tools | Max output tokens | Input price | Output price |
|---|---|---|---|---|
| glm-5.2 | Yes | 8192 | $2.50 / 1M tokens | $7.50 / 1M tokens |
| glm-5 | Yes | 8192 | $1.50 / 1M tokens | $5.00 / 1M tokens |
| kimi-k2 | Yes | 8192 | $1.50 / 1M tokens | $6.00 / 1M tokens |
| deepseek-r1 | No | 8192 | $1.75 / 1M tokens | $6.50 / 1M tokens |
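To make the per-token prices concrete, the following sketch estimates the dollar cost of a single request from its token counts. It assumes the counts come from an OpenAI-style `usage` object on the response (`prompt_tokens`, `completion_tokens`). `PRICES` and `estimate_cost_usd` are hypothetical names, and how dollar amounts convert to credits is not covered here.

```python
from decimal import Decimal

# (input, output) USD per 1M tokens, copied from the table above.
PRICES = {
    "glm-5.2": (Decimal("2.50"), Decimal("7.50")),
    "glm-5": (Decimal("1.50"), Decimal("5.00")),
    "kimi-k2": (Decimal("1.50"), Decimal("6.00")),
    "deepseek-r1": (Decimal("1.75"), Decimal("6.50")),
}


def estimate_cost_usd(
    model: str, prompt_tokens: int, completion_tokens: int
) -> Decimal:
    """Estimate the USD cost of one request from its token counts."""
    input_price, output_price = PRICES[model]
    total = prompt_tokens * input_price + completion_tokens * output_price
    return total / Decimal(1_000_000)
```

For example, 1,200 prompt tokens and 400 completion tokens on glm-5.2 come to (1200 × 2.50 + 400 × 7.50) / 1,000,000 = $0.006.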
Billing
API usage is billed from your existing GLM 5 credits. Credits power both Chat and API calls; there is no separate API balance or separate API subscription tier.
Credits are reserved before an upstream request is made, then reconciled after final usage is known. Failed requests are refunded.
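The reserve, reconcile, and refund steps can be pictured with a toy ledger. This is a conceptual model only, not the service's implementation: the class, its method names, integer credit units, and the way a reservation is estimated are all assumptions made for illustration.

```python
from typing import Dict


class CreditLedger:
    """Toy model of reserve -> reconcile or refund (illustrative only)."""

    def __init__(self, balance: int) -> None:
        self.balance = balance
        self._holds: Dict[str, int] = {}

    def reserve(self, request_id: str, estimate: int) -> None:
        # Hold credits before the upstream request is made.
        if estimate > self.balance:
            raise RuntimeError("insufficient_quota")
        self.balance -= estimate
        self._holds[request_id] = estimate

    def reconcile(self, request_id: str, actual: int) -> None:
        # Release the hold, then charge what was really used.
        self.balance += self._holds.pop(request_id) - actual

    def refund(self, request_id: str) -> None:
        # Failed request: the whole hold returns to the balance.
        self.balance += self._holds.pop(request_id)
```

In this model a request that reserves 40 credits and ends up using 25 leaves the balance 25 lower, while a failed request leaves it unchanged.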
Rate Limits
The default per-key limit is 60 requests per minute. If a key exceeds its limit, the API returns 429 rate_limit_exceeded.
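One way to stay under the limit is to throttle on the client. The sketch below is a sliding-window limiter with an injectable clock. Whether the server counts requests in a sliding or a fixed window is not stated, so treat this as a conservative approximation. `MinuteWindowLimiter` is a hypothetical name.

```python
import time
from collections import deque
from typing import Callable, Deque


class MinuteWindowLimiter:
    """Allow at most `limit` requests per `window` seconds (client side)."""

    def __init__(
        self,
        limit: int = 60,
        window: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock
        self._sent: Deque[float] = deque()

    def wait_time(self) -> float:
        """Seconds to wait before the next request may be sent."""
        now = self.clock()
        # Forget requests that have left the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) < self.limit:
            return 0.0
        return self.window - (now - self._sent[0])

    def record(self) -> None:
        """Call once for every request actually sent."""
        self._sent.append(self.clock())
```

Before each call, run `time.sleep(limiter.wait_time())` and then `limiter.record()`. A 429 can still occur if the same key is used from several processes, so the response status should be checked as well.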
Errors
Errors use the OpenAI shape:
{
"error": {
"message": "Invalid API key provided.",
"type": "invalid_request_error",
"code": "invalid_api_key",
"param": null
}
}
| Status | Code | Meaning |
|---|---|---|
| 400 | invalid_request_error | Malformed request body or messages. |
| 400 | unsupported_parameter | A parameter is unsupported for the selected model. |
| 401 | invalid_api_key | Missing or invalid API key. |
| 402 | insufficient_quota | Not enough credits. |
| 404 | model_not_found | Unknown model ID. |
| 429 | rate_limit_exceeded | Per-key rate limit exceeded. |
| 500 | internal_error | Unexpected server error. |
| 503 | service_unavailable | Public API is temporarily disabled. |
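A client might turn these responses into exceptions along the following lines. The sketch assumes the body has already been decoded from JSON. `GLM5APIError` and `parse_error` are hypothetical names, and treating 429, 500, and 503 as retryable is an assumption of this example rather than documented guidance.

```python
from typing import Any, Dict, Optional

# Assumption: only these statuses from the table are worth retrying.
RETRYABLE_STATUSES = {429, 500, 503}


class GLM5APIError(Exception):
    """Hypothetical exception carrying the fields of an error response."""

    def __init__(self, status: int, code: Optional[str], message: str) -> None:
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.retryable = status in RETRYABLE_STATUSES


def parse_error(status: int, body: Dict[str, Any]) -> GLM5APIError:
    """Build an exception from an HTTP status and a decoded error body."""
    err = body.get("error") or {}
    return GLM5APIError(status, err.get("code"), err.get("message", "unknown error"))
```

Callers can then branch on `error.retryable` to decide between backing off and surfacing the failure, for example stopping immediately on 401 or 402.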