Is GLM 5.2 better than Claude?

Not proven. On some agentic coding benchmarks (Terminal-Bench 2.1, FrontierSWE) Zhipu's self-reported numbers put it near Claude Opus 4.8, and some developers genuinely can't tell the outputs apart on specific tasks. But independent verification was still ongoing at the time of writing, and practitioner reports say it often needs three or four prompts to do what Opus does in one. "Near-frontier on benchmarks, behind on prompt efficiency" is the honest current read.

Is GLM 5.2 free?

The weights are free under an MIT licence, so you can download and self-host it without paying Zhipu anything. But it's a ~750B-parameter model, so 'free' means serious GPU hardware or a cloud GPU bill. The hosted API is cheap rather than free: roughly $1.40 USD per million input tokens and $4.40 per million output tokens, commonly cited as about a sixth of Western frontier pricing before you adjust for prompt efficiency.

Can Australian businesses use GLM 5.2?

Legally, yes: the MIT licence permits commercial use. Practically, it depends on who you are. Self-hosting gives genuine data sovereignty because nothing leaves your infrastructure, but AU-hosted providers are thin, so that means real hardware or an Australian cloud GPU bill. Using Zhipu's API sends your data to a Chinese vendor's servers, which most Australian government and healthcare organisations won't accept regardless of price.

What is an open-weights model?

A model whose trained parameters (the weights) are published for anyone to download and run on their own hardware, rather than being accessible only through the vendor's API. Open weights under a permissive licence like MIT mean the vendor can stop distributing the model, but they can't retract the copies already downloaded. That's the property that made GLM 5.2's timing interesting: it launched the day after a US export order switched off Claude Fable 5 worldwide.

GLM 5.2 is the model nobody can switch off

June 2026 gave us a split-screen I don't think anyone in this industry will forget in a hurry.

On one side: the US government orders Anthropic to switch off Claude Fable 5 worldwide, three days after launch, and it stays dark for 18 days. OpenAI ships GPT-5.6 to roughly 20 government-approved companies and everyone else gets to read about it. Two frontier American models, both effectively controlled by a letter from Washington.

On the other side: a Beijing lab called Zhipu AI (trading as Z.ai) releases GLM 5.2, a frontier-class coding model, under an MIT licence, weights downloadable by anyone, at roughly a sixth of the price of its Western equivalents. It landed on 13 June. Fable 5 was suspended on 12 June. One day apart.

Nobody planned that timing as a piece of theatre, but it worked as one. Because here's the thing an export-control order genuinely cannot do: it can't un-download an MIT-licensed model that people have already pulled onto their own hardware. The Commerce Department flipped a switch and Fable 5 vanished for every customer on the planet simultaneously. There is no equivalent switch for GLM 5.2. Once the weights are on your hardware, they're yours, and no government, Zhipu's included, can reach into your server room and take them back.

To be clear, that argument itself isn't new. DeepSeek shipped R1 under MIT back in January 2025, and the Llama and Qwen lines have been making the open-weights case for a couple of years. What June did is give the argument its first proper demonstration: for eighteen days, everyone could see exactly what the alternative looks like when it fails. That's worth an honest look. Which also means looking at the parts of GLM 5.2 that aren't nearly as tidy as the headline, because there are a few, and one of them quietly eats most of the famous price advantage.

What GLM 5.2 actually is

The specs first, because they're worth taking seriously. GLM 5.2 is a mixture-of-experts model of roughly 744 billion parameters, with about 40 billion active per token and a 1-million-token context window. It's built for long-horizon agentic coding, the grinding multi-step work where a model has to hold a plan together across a whole session rather than answer one question well. It went out to GLM Coding Plan subscribers around 13 June 2026, with the open weights and broader availability following around 16 June.
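To put rough numbers on that size, here is a back-of-envelope sketch of how much memory the weights alone would occupy. The parameter counts come from the specs above; the precisions (16-bit, 8-bit, 4-bit) are illustrative assumptions rather than formats Zhipu has confirmed shipping, and the estimate ignores KV cache, activations, and runtime overhead.

```python
TOTAL_PARAMS = 744e9   # total parameters, from the specs above
ACTIVE_PARAMS = 40e9   # parameters active per token, from the specs above


def weight_memory_gb(params: float, bits_per_param: int) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return params * bits_per_param / 8 / 1e9


# Assumed precisions, for illustration only.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gb(TOTAL_PARAMS, bits):,.0f} GB")

active_share = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"Active per token: {active_share:.1%} of the model")
```

Mixture-of-experts reduces the compute spent per token, not the number of weights that have to be stored, so the full-model figure is generally the one that has to fit somewhere.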

On the benchmarks, GLM 5.2 clears last year's GPT cleanly and sits behind both of Anthropic's frontier models. On SWE-bench Pro it scores 62.1, ahead of GPT-5.5's 58.6 and its own predecessor's 58.4, but behind Claude Opus 4.8's 69.2 and a long way behind Claude Fable 5's 80.3 on the same test. On FrontierSWE it lands a 74% win-rate, a genuine near-tie with Opus 4.8 and just ahead of GPT-5.5, though Fable 5 towers over the whole field there too. So "frontier-adjacent" is fair; "beats the frontier" isn't.

Coding / agentic benchmark	GLM 5.2	Claude Fable 5	Claude Opus 4.8	GPT-5.5
SWE-bench Pro (vendor-reported)	62.1	80.3	69.2	58.6
FrontierSWE Best@5 win-rate	74	90	75	73

The two rows aren't on the same scale, so compare models across each row rather than reading numbers down a column. SWE-bench Pro figures are self-reported by each vendor (none of these models sit on Scale's public leaderboard yet), so treat them as "the lab's own telling." FrontierSWE is a relative win-rate against the field published by the benchmark's owner (Proximal Labs), not a percentage of problems solved. GPT-5.6, the current American frontier, isn't in the table at all: OpenAI has published no benchmark numbers for its gated preview, which tells you something in itself. And note the model topping both rows is Claude Fable 5, the one the US government switched off for 18 days.
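One way to keep the two scales apart is to compare models only within a single benchmark. The sketch below takes the figures from the table above and prints a ranking and GLM 5.2's point gap to each other model, one benchmark at a time; the dictionary layout and the function name are illustrative.

```python
SCORES = {
    "SWE-bench Pro (vendor-reported)": {
        "GLM 5.2": 62.1, "Claude Fable 5": 80.3,
        "Claude Opus 4.8": 69.2, "GPT-5.5": 58.6,
    },
    "FrontierSWE Best@5 win-rate": {
        "GLM 5.2": 74, "Claude Fable 5": 90,
        "Claude Opus 4.8": 75, "GPT-5.5": 73,
    },
}


def gaps_to(model: str, row: dict[str, float]) -> dict[str, float]:
    """Point gap between `model` and every other model on one benchmark."""
    base = row[model]
    return {name: round(base - score, 1)
            for name, score in row.items() if name != model}


for benchmark, row in SCORES.items():
    ranking = sorted(row, key=row.get, reverse=True)
    print(benchmark)
    print("  ranking:", " > ".join(ranking))
    print("  GLM 5.2 gap:", gaps_to("GLM 5.2", row))
```

A gap of a point on FrontierSWE and a gap of seven points on SWE-bench Pro are not comparable quantities; the code deliberately never subtracts across benchmarks.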

Self-reported numbers from the company selling the model deserve the same scepticism whether the company is in San Francisco or Beijing. I said exactly this about OpenAI's GPT-5.6 preview figures last week, and I'm not going to develop selective amnesia because the accent changed. File them as "plausible, unverified."

Pricing is where the eyebrows go up. Zhipu's API runs about $1.40 USD per million input tokens and $4.40 per million output tokens, commonly cited as around one sixth of the cost of GPT-5.5 or Claude Opus equivalents. Third-party hosts like OpenRouter can undercut even that. And if you self-host, the model itself costs nothing at all, because MIT licence means MIT licence. You pay for hardware and electricity and the therapy sessions after provisioning GPU clusters, but not for the model.
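As a rough illustration of what those per-token prices mean for a single call, here is a minimal cost sketch. The two prices are the ones quoted above; the token counts in the example are hypothetical, picked only to resemble a long-context coding request.

```python
INPUT_USD_PER_M = 1.40    # quoted above, USD per million input tokens
OUTPUT_USD_PER_M = 4.40   # quoted above, USD per million output tokens


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one API call at the quoted list prices."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Hypothetical call: 200k tokens of context in, 8k tokens out.
cost = request_cost_usd(200_000, 8_000)
print(f"${cost:.4f} per call")
```

On long-context work the input side dominates the bill, which is worth keeping in mind when every extra prompt resends that context.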

One more spec, offered with a raised eyebrow: Zhipu says the model was trained entirely on Huawei chips, no Nvidia anywhere. Nobody outside Zhipu can audit training hardware, so file that with the benchmarks as "plausible, unverified." And strictly speaking it changes nothing about whether your downloaded copy can be revoked; it matters to Zhipu's supply chain, not to your server room. It's in the article because you'll see it quoted everywhere, not because it should drive a decision.

If the idea of running serious AI on your own hardware sounds like something only Google-sized companies do, it isn't, and we've written about the small-business end of exactly that.

Related: Small businesses are building their own AI command centre. Here's what it actually costs.

The developer reaction: genuine, not astroturf

The response from working developers has been the interesting kind: not press-release enthusiasm, but people running their own tests and coming back surprised.

Patrick C Toulme

@PatrickToulme

I ran GLM 5.2 with OpenCode harness against Claude Opus this week deployed locally.

Bottom line: It is a real frontier coding model and insanely good for the price (free). Open source model + open source harness + local serving on my own chips is an amazing value proposition.

20 June 2026

That's someone running GLM 5.2 locally, head to head against Claude Opus, and concluding most people couldn't tell the outputs apart. Deployed locally. For free. A year ago that sentence would've been science fiction, or at least marketing fiction.

crashout

@0xCRASHOUT

initial glm-5.2 review, im late but i literally own claude 20x and chatgpt 20x plans, there was no reason to try it out until now.

i gave it a simple open ended prompt "look for improvements and simplifications in my *** package, dont edit"

glm-5.2 - 20 findings, 8m 13s run

26 June 2026

"Never been an open source llm advocate until today" is the kind of conversion story you can't buy, and there were plenty like it. The most notable endorsement for our purposes came from Jeremy Howard, the Australian co-founder of fast.ai and someone with two decades of credibility in this field, who called GLM 5.2 "at least as good as Opus 4.8 and GPT 5.5. It's super fast, inexpensive." When Howard says a model is good, I pay attention, partly because he's earned it and partly, I'll admit, because it's nice when the person validating your article's Australian angle is genuinely one of the most respected practitioners in the world and not someone I had to squint to make relevant.

So: strong benchmarks, real developer enthusiasm, a serious endorsement, one sixth the price, and weights you own forever. Case closed?

No. Now the part most of the excited posts skipped.

The catch, and it's a proper one

Start with the launch itself, which was a mess by any standard.

BridgeMind

@bridgemindai

GLM 5.2 just dropped from Z.ai.

And the release is a mess.

No benchmarks. No API.

They released it on a Saturday in response to the US government banning Claude Fable 5.

The only way to touch it is their GLM Coding Plan.

A flagship model launch with zero

13 June 2026

That reaction was fair on day one. A flagship model launch with no benchmarks published and access paywalled behind a subscription is a strange way to announce you're competing with Opus 4.8. The numbers and the open weights came within days, so the "hiding a weak model" theory didn't survive, but "the release is a mess" absolutely did. Western labs have spoiled us with launch-day system cards and API access; Zhipu shipped like a company that hadn't thought hard about how sceptical Western developers would read a paywalled, benchmark-free debut.

The bigger issue is the one the same account landed on a week later, after actually using the thing.

BridgeMind

@bridgemindai

GLM 5.2 is a massive jump from GLM 5.1.

But there is a tradeoff nobody is talking about.

You will prompt GLM 5.2 three or four times to accomplish what GPT 5.5 or Opus 4.8 nails in a single prompt.

Open source models are missing a level of intelligence that benchmarks do not

19 June 2026

Read that one twice, because it's the single most important caveat in this entire article. Three or four prompts to accomplish what GPT-5.5 or Opus 4.8 nails in one.

Let me do the maths on that, honestly, because "one sixth the price" is the number everyone's quoting and it doesn't survive contact with this observation. I know, I'm about to inflict arithmetic on you in a Saturday article, but this is the whole ballgame.

If GLM 5.2 costs one sixth as much per token but needs three to four prompts per task, then on a per-task basis you're paying somewhere between three and four sixths of the Western price, which is half to two-thirds. Call it half, to be generous. Still cheaper! Genuinely cheaper. But "half price" is a very different pitch from "one sixth," and half price comes with a cost the token bill doesn't capture: your developer is sitting there re-prompting, re-explaining, and re-checking three times instead of once. Developer time in Australia isn't free. On complex work, the human minutes spent shepherding a less prompt-efficient model can quietly wipe out the remaining saving altogether, and in some cases flip it negative. We learned a version of this lesson ourselves years ago with offshore development quotes that looked like a third of the price until you counted the rework cycles. Cheap per unit and cheap per outcome are not the same number, and confusing them has burned more project budgets than I can count, including a couple of ours.
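The arithmetic above can be sketched in a few lines. The one-sixth price ratio and the three-to-four prompt range come from the text; everything in the second half (token cost per prompt, minutes of developer attention, hourly rate) is a placeholder chosen to show the mechanism, not a measurement. The sketch also assumes each retry costs about as many tokens and as much attention as the first attempt.

```python
def per_task_token_ratio(price_ratio: float, prompts_per_task: float) -> float:
    """Token spend per finished task, relative to a one-prompt baseline at full price."""
    return price_ratio * prompts_per_task


def per_task_total(token_cost_per_prompt: float, prompts: float,
                   minutes_per_prompt: float, dev_rate_per_hour: float) -> float:
    """Token cost plus the human time spent prompting and checking each attempt."""
    human_cost = prompts * minutes_per_prompt / 60 * dev_rate_per_hour
    return token_cost_per_prompt * prompts + human_cost


for prompts in (3, 4):
    ratio = per_task_token_ratio(1 / 6, prompts)
    print(f"{prompts} prompts: {ratio:.2f}x the baseline token cost")

# Placeholder inputs, not measurements: $6.00 vs $1.00 of tokens per prompt,
# 5 minutes of developer attention per prompt, $100 per hour.
baseline = per_task_total(6.00, 1, 5, 100)
cheaper_per_token = per_task_total(1.00, 3.5, 5, 100)
print(f"baseline ${baseline:.2f} vs cheaper-per-token ${cheaper_per_token:.2f}")
```

With these placeholder inputs the cheaper-per-token model ends up costing more per task once human time is counted; with a lower hourly rate or heavier token usage the comparison swings back, which is why the inputs matter more than the headline ratio.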

Add to that: the benchmarks are self-reported, the independent verification is still catching up, and self-hosting a 750-billion-parameter model needs the kind of GPU hardware that makes your finance person go quiet. The sceptics aren't wrong about any of this. GLM 5.2 is a very good model that's probably not quite as good as its launch-week legend, at a real-world price advantage closer to 2x than 6x on hard tasks.

Which is why the interesting argument for it was never really "it's just as good."

The actual argument: nobody can switch it off

Here's where the June timing stops being trivia and becomes the point.

Related: The real Fable 5 lesson isn't the model, it's that it can vanish overnight

When the export order hit Anthropic on 12 June, every Fable 5 customer on Earth lost access simultaneously, with zero notice, for 18 days. Not because Anthropic failed them. Because a government told Anthropic to flip the switch, and there was a switch to flip. GPT-5.6 launched into the middle of that shutdown already gated, with OpenAI publicly grumbling about its own launch terms.

Related: OpenAI built its best model yet. The government decided who gets to use it.

An open-weights model under an MIT licence is the one deployment shape that switch doesn't exist for. Zhipu can stop distributing GLM 5.2 tomorrow. Beijing could order it pulled from every download mirror on the planet. Neither action retracts the copies already sitting on hardware in Sydney, Berlin, or São Paulo. The model you've downloaded keeps running, keeps working, and answers to nobody but your electricity bill. After the month we've just had, that's not an abstract property. It's the exact property Fable 5 customers discovered they didn't have, at the worst possible moment to discover it.

But, and I want to be surgical about this, that is a completely separate claim from "GLM 5.2 is as good as Opus." The availability argument is about who controls the off switch. The capability argument is about benchmark parity and prompt efficiency, and as we just covered, that one's unproven at best. Plenty of launch-week commentary blurred the two into "frontier model, sixth of the price, can't be revoked, why would you use anything else," and every clause in that sentence needs an asterisk except the revocation one. Keep them separate and you can reason clearly. Blur them and you're buying a narrative.

Here's the part that should unsettle everyone, whichever side of the regulation debate they sit on. Fable 5 was pulled over the fear its safety guardrails could be jailbroken. With an open-weights model you don't need to jailbreak anything, you can just remove the guardrails. Three weeks after GLM 5.2 shipped there was already an "abliterated" build on Hugging Face, the refusal behaviour surgically stripped out, a couple of thousand downloads a month, free to anyone. Even a crude one. And by the identical logic that protects you from a government letter, nobody can switch that off either. The resilience is symmetric: it shields the de-safetied fork from the vendor, the regulator, and everyone else, permanently. This isn't a GLM quirk, the same is true of Llama, Qwen and DeepSeek. Which leaves the export-control effort looking stranger than it did: Washington switched off the one deployment model where an off switch exists, while the models whose safety is genuinely, permanently removable were never the ones it could reach.

The counterargument that actually lands

Now let me argue against my own headline, because there's a version of this pushback that's too good to bury in a footnote.

For most businesses, "nobody can switch it off" is rhetorically lovely and practically hollow. The overwhelming majority of companies will never self-host a 750-billion-parameter model. They'll use GLM 5.2 the way they use every model: through an API. And GLM-via-API is just another vendor with another switch. This time the switch is held in Beijing instead of Washington, which for an Australian business isn't obviously an upgrade. I'll concede that completely, because it's true. The irrevocability argument holds for genuine self-hosting, or for weight-escrow arrangements where you've got the model stored even if you serve it through someone else. It does not hold for "I signed up to Z.ai's API," which carries the same availability risk profile as any other hosted model, plus a few new ones. And for that majority, the cheaper insurance against a Fable-style shutdown was never self-hosting anyway. It's a second API vendor and a tested fallback path, which handles the exact failure mode June demonstrated without buying a single GPU. Worth remembering, too, what the insurance is actually priced against: Fable 5 came back in 18 days, while a self-hosted model is a frozen asset that never gets patched or improved and needs its GPU bill paid whether or not the disaster ever recurs.

And those new ones deserve naming. Open weights from a Chinese lab don't remove sovereignty risk, they swap it for a different flavour. The training data is opaque. The relationship between Zhipu and the Chinese government is not something you or I can audit. Security researchers will spend months probing these weights for anything untoward, and "we haven't found a backdoor" is a weaker guarantee than people treat it as. For a certain class of organisation, and this includes a good chunk of Webcoda's actual client base, the licence is irrelevant because the provenance is disqualifying on its own. That's not paranoia, it's their threat model, and it's a defensible one.

So the honest scope of the thesis is narrower than the headline: GLM 5.2's open weights make it irrevocable for the minority who'll genuinely run it themselves, and a trust trade-off rather than a trust upgrade for everyone else. Narrower. Still real. The Fable 5 shutdown proved availability risk is a live category; GLM 5.2 proved a frontier-adjacent answer to it exists. Both facts stand even after you've sanded the hype off.
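For the majority who will stay on hosted APIs, the "second API vendor and a tested fallback path" mentioned in this section might look something like the sketch below. It is a minimal illustration of the pattern only: the `Provider` type, the function names, and the two stand-in providers are all hypothetical, and real code would wrap each vendor's actual client and catch its specific error types.

```python
from typing import Callable, Sequence

# Hypothetical shape: a provider takes a prompt and returns a completion.
Provider = Callable[[str], str]


class AllProvidersFailed(RuntimeError):
    """Raised when every configured provider has errored."""


def complete_with_fallback(
    prompt: str, providers: Sequence[tuple[str, Provider]]
) -> tuple[str, str]:
    """Try providers in order; return (name, completion) from the first that works."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # real code should catch narrower error types
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stand-in providers for illustration only.
def primary(prompt: str) -> str:
    raise ConnectionError("service suspended")


def secondary(prompt: str) -> str:
    return f"[secondary] handled: {prompt}"


used, answer = complete_with_fallback(
    "refactor utils.py", [("primary", primary), ("secondary", secondary)]
)
print(used, "->", answer)
```

The "tested" half is the part that matters: a fallback path that is exercised regularly is the only kind you can count on when the primary disappears without notice.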

The Australian picture

Two things make this concretely relevant here rather than "Australian" in costume.

First, Jeremy Howard. The most credible individual endorsement GLM 5.2 has received came from an Australian, and not a random one: fast.ai's whole philosophy has been making serious AI usable outside big-lab walls, so a frontier-class open-weights model is roughly his life's argument showing up with receipts.

Jeremy Howard

@jeremyphoward

Wow.

@Zai_org GLM 5.2 is a marvel! It is *at least* as good as Opus 4.8 and GPT 5.5. It's super fast, inexpensive, and not too verbose.

It responds with nuance and judgement, & handles long context VERY well.

I've never experienced an open weights model like this before.

18 June 2026

When the local expert with the least incentive to hype a Chinese lab says it's at least as good as Opus 4.8, Australian developers notice, and rightly so. Worth noting he also flagged its biggest gap in the same breath: no vision support, so if your workflows lean on screenshots or document images, that's a hard stop for now.

Second, data sovereignty, which for our government and healthcare clients isn't a talking point, it's a procurement gate. A self-hosted model is the one deployment where the answer to "where does our data go?" is genuinely "nowhere." No US CLOUD Act exposure, no Chinese server either, nothing leaving your infrastructure at all. On paper, an MIT-licensed frontier-class model is the best answer to that question anyone's ever offered.

In practice, the gap between paper and rack is wide. AU-hosted providers for GLM 5.2 are thin on the routing platforms, so "sovereign" deployment means buying real GPU hardware or paying an Australian cloud GPU bill, and for a model this size neither is small money. I'm deliberately not quoting an AUD figure because I don't have one I'd defend, and a made-up number would be worse than none. And the provenance problem cuts hardest exactly here: the government and healthcare organisations who'd benefit most from a model no foreign order can revoke are precisely the ones whose security postures won't touch Chinese-lab weights regardless of licence. The clients with the strongest sovereignty need and this particular model are, for now, ships passing in the night. I don't get to pretend otherwise just because it would make a neater ending.

Usual disclosure: we use Claude every day at Webcoda, and this site's tooling is built on it. Factor that in.

Where that leaves it

Not a switch recommendation. We haven't moved a single workload to GLM 5.2 and I'm not telling you to. The capability story is promising but unverified, the price advantage shrinks from a sixth of the Western price to maybe half once you price the prompt-efficiency gap honestly, and the trust questions are real for exactly the organisations that most need what it offers.

What June 2026 did establish is smaller than the launch-week hype, but it's real. Availability risk on closed frontier models stopped being a hypothetical slide in somebody's risk deck and became a dated timeline with invoices attached, and a frontier-adjacent open-weights answer to it exists for the minority genuinely prepared to run it themselves. I'm not going to claim more than that. Open-weights models existed before June and their existence plainly didn't stop Washington issuing the Fable 5 order, so this release doesn't rewrite anyone's negotiating table, and for most businesses the sensible response is still the boring one: a second vendor and a fallback plan, not a GPU rack.

But if the Fable timeline is what finally gets that fallback plan written, the specific model that prompted the conversation almost doesn't matter. That part's done before anyone's deployed anything.

Sources

Z.ai / Zhipu AI. GLM 5.2 release and GLM Coding Plan availability. June 2026. https://z.ai
Patrick Toulme (@PatrickToulme). Local GLM 5.2 vs Claude Opus comparison. 20 June 2026. https://x.com/PatrickToulme/status/206813421258...
BridgeMind AI (@bridgemindai). GLM 5.2 launch reaction. 13 June 2026. https://x.com/bridgemindai/status/2065770088821...
BridgeMind AI (@bridgemindai). GLM 5.2 prompt-efficiency observation. 19 June 2026. https://x.com/bridgemindai/status/2067925035168...
@0xCRASHOUT. GLM 5.2 real-world test vs GPT/Claude. 26 June 2026. https://x.com/0xCRASHOUT/status/207058730605519...
Jeremy Howard (@jeremyphoward). GLM 5.2 endorsement. 18 June 2026. https://x.com/jeremyphoward/status/206775746818...
Anthropic (@AnthropicAI). Fable 5 global restoration announcement. 30 June 2026. https://x.com/AnthropicAI/status/20721638844302...
OpenAI. GPT-5.6 preview announcement and government-gated availability. 26 June 2026. https://openai.com
huihui-ai. Abliterated (refusal-removed) build of GLM 5.2, GGUF format. Hugging Face. 2026. https://huggingface.co/huihui-ai/Huihui-GLM-5.2...
Proximal Labs. FrontierSWE leaderboard (Best@5 win-rate figures for GLM 5.2, Claude Fable 5, Claude Opus 4.8, GPT-5.5). 2026. https://www.frontierswe.com
Anthropic. Introducing Claude Opus 4.8 (SWE-bench Pro 69.2). 28 May 2026. https://www.anthropic.com/news/claude-opus-4-8