Z.ai launches GLM-5.2 trained on Huawei Ascend chips

Z.ai released GLM-5.2, a 744-billion-parameter model trained on Huawei Ascend chips and published under an MIT license. It scored 74.4 on the FrontierSWE benchmark.

Z.ai released GLM-5.2 on June 16. The model has 744 billion parameters, uses a mixture-of-experts architecture, and is published under an MIT license. Z.ai made the open-source weights and quantized versions available for download on Hugging Face and for limited testing on its platform.

GLM-5.2 provides a one-million-token context window. That is five times the 200,000-token limit of GLM-5.1. The model’s architecture and extended context are intended for workflows that need long documents, whole-repo navigation, multi-file refactors and long agentic pipelines.

On the FrontierSWE benchmark, which measures an AI agent’s ability to complete multi-hour technical projects, GLM-5.2 scored a dominance rate of 74.4. Claude Opus 4.8 scored 75.1 on the same test and GPT-5.5 scored 72.6. On SWE-bench Pro, which assesses autonomous resolution of real-world GitHub issues, GLM-5.2 scored 62.1; GPT-5.5 scored 58.6 and GLM-5.1 scored 58.4. On the prolonged engineering test SWE-Marathon, GLM-5.2 scored 13.0 compared with Opus 4.8’s 26.0. Aggregate rankings list GLM-5.2 as a leading open-source model on several quality indices.

The model was trained entirely on Huawei Ascend hardware; Z.ai reports no NVIDIA GPUs were used in the training pipeline. An industry estimate put total training costs at about $25 million, with roughly 80% of that attributed to post-training work. Z.ai has been on the U.S. Entity List since January 2025. The company’s stock rose about 90% in the week around the model’s release and recent U.S. restrictions on some AI services.

Compressed 2-bit GGUF quantizations reduce the raw model size from roughly 1.51 terabytes to about 238 gigabytes while retaining about 82% of the original accuracy, according to published files. Running the model locally typically requires about 256 GB of unified memory or a combination of system RAM and GPU VRAM with mixture-of-experts offloading. Cloud-hosted APIs are likely to be the more practical option for many users.

Z.ai set API pricing at $1.40 per million input tokens and $4.40 per million output tokens. The company offers a Coding Plan that starts at about $18 per month and integrates with several agentic development environments.

Early hands-on tests reported that GLM-5.2 produced more varied outputs on zero-shot creative prompts, generating diverse game states and shifting enemy behaviors in a prototype task. Reviewers noted that user-interface output quality trailed some competitors on single-shot polished outputs.

GLM-5.2 follows GLM-5.1 and joins other large models that provide extended context windows and open licensing. The open weights, quantized files and API access are available now.

The material on GNcrypto is intended solely for informational use and must not be regarded as financial advice. We make every effort to keep the content accurate and current, but we cannot warrant its precision, completeness, or reliability. GNcrypto does not take responsibility for any mistakes, omissions, or financial losses resulting from reliance on this information. Any actions you take based on this content are done at your own risk. Always conduct independent research and seek guidance from a qualified specialist. For further details, please review our Terms, Privacy Policy and Disclaimers.

Articles by this author