
GLM-5.1 is our next-generation flagship model for agentic engineering, with significantly stronger coding capabilities than its predecessor. It achieves state-of-the-art performance on SWE-Bench Pro and leads GLM-5 by a wide margin on NL2Repo (repo generation) and Terminal-Bench 2.0 (real-world terminal tasks).
Context
202,752
tokens
Input
0.980
per 1M tokens
Output
3.08
per 1M tokens
Downloads
241,258
GLM-5.1 is our next-generation flagship model for agentic engineering, with significantly stronger coding capabilities than its predecessor. It achieves state-of-the-art performance on SWE-Bench Pro and leads GLM-5 by a wide margin on NL2Repo (repo generation) and Terminal-Bench 2.0 (real-world terminal tasks).
State-of-the-Art Agentic Coding Performance
GLM-5.1 achieves a score of 58.4 on SWE-Bench Pro, outperforming comparable frontier models at the time of its release. It also leads significantly on NL2Repo (repository generation) and Terminal-Bench 2.0 (real-world terminal tasks), making it one of the strongest publicly benchmarked models for complex, real-world software engineering work.
8-Hour Sustained Autonomous Execution
Unlike models that plateau quickly when used as agents, GLM-5.1 is built to maintain productive, goal-aligned execution for up to 8 hours on a single task — running experiments, analyzing results, revising strategies, and iterating across hundreds of rounds and thousands of tool calls without human intervention. Demonstrated examples include building a complete Linux desktop environment from scratch and optimizing a CUDA kernel from 2.6× to 35.7× speedup through autonomous iteration.
200K Context Window with Advanced Tool Support
GLM-5.1 operates with a 200K token context window and supports up to 128K maximum output tokens — essential for holding large codebases and extended reasoning chains in memory. It natively supports function calling, structured output, streaming, thinking mode, context caching, and MCP integration for connecting external tools and data sources.
Open Weights, Flexible Deployment
Released under the MIT license on Hugging Face (zai-org/GLM-5.1), GLM-5.1 supports local deployment via SGLang, vLLM, xLLM, Transformers, and KTransformers, as well as API access through Z.AI's platform with OpenAI SDK compatibility. Teams can self-host the full model or access it via API — no vendor lock-in required.