Was Anthropic's Mythos AI hacked?

Not in the traditional sense. A small Discord group gained unauthorised access on 7 April 2026 by combining URL inference (based on naming patterns that had previously leaked via Mercor, one of Anthropic's training data partners) with contractor credentials held by at least one group member. The breach was publicly reported on 21-23 April 2026. Anthropic confirmed it was 'investigating a report claiming unauthorized access to Claude Mythos Preview through one of our third-party vendor environments.'

What is Project Glasswing?

Project Glasswing was Anthropic's restricted partner program that controlled access to Claude Mythos. It launched with around 50 named organisations including major corporations such as AWS, Apple, Google, Microsoft, and CrowdStrike. It was designed to keep the model away from general public access. The program's structure was reportedly the mechanism through which the breach occurred.

What is the Mercor connection to the Anthropic Mythos breach?

Mercor is an AI training data and human evaluation company that worked with Anthropic (and other major AI labs). Earlier in 2026, Mercor confirmed a data breach that exposed metadata including URL naming conventions used by Anthropic for restricted systems. A Discord group used those leaked URL patterns to infer the Mythos endpoint, combined with contractor credentials. This is the supply chain failure at the heart of the story.

How should Australian businesses respond to AI vendor security incidents?

Australian businesses covered under the Privacy Act (generally those with annual turnover above $3 million) should understand their exposure under the Notifiable Data Breaches scheme. If your vendor suffers a breach involving systems you've integrated, you may have notification obligations. Practically: review vendor contracts for breach disclosure timelines, ask vendors about their supply chain vetting including subcontractors and training partners, and don't conflate a vendor's safety research credentials with their operational security maturity.

Anthropic locked their most dangerous AI in a vault. Four guys in a Discord opened it.

Here's a compressed incident report.

Step one: announce your AI model is too dangerous for public release. Explain that you're only allowing access through a restricted partner program for vetted enterprises and government bodies. Watch the tech press write admiringly about your sense of responsibility.

Step two: watch a small Discord group access the model on the same day by guessing the URL.

Step three: issue a statement confirming you're "investigating a report claiming unauthorized access through one of our third-party vendor environments."

This isn't satire. This is what happened to Anthropic in April 2026. And I'll be honest: we've had nearly three months to process this story, and it somehow keeps getting more interesting the longer you sit with it.

I've written three other Anthropic articles this year. We covered the Fable 5 and Mythos general release in June. We covered the US government pulling both models offline on export control grounds two days after that. We were so caught up in the arc that we nearly skipped over the April chapter entirely. That chapter is the one that makes everything else make sense.

A model Anthropic called too dangerous for public release was accessed by a Discord group before the press release had finished loading. Then it got pulled offline two months later because the US government decided it was still too dangerous. Both things are true. Simultaneously.

Let's go back to the beginning.

What Anthropic was selling

On 7 April 2026, Anthropic announced Claude Mythos (sometimes called Claude Mythos Preview or Mythos 5). The positioning was striking. This wasn't a launch in the usual sense. It was more of a controlled reveal: we've built something significant, but we're not going to let most people touch it.

The UK AI Security Institute had evaluated Mythos and found a 73% success rate on expert-level cybersecurity CTF challenges. That's a category of tasks that every prior AI model had scored at or near zero on. The model could also tackle a simulated 32-step corporate network penetration sequence and discover zero-day vulnerabilities in operating systems. These weren't Anthropic's benchmarks. This was an independent finding from a UK government body.

The access arrangement was called Project Glasswing. Think of it as a velvet rope. To get in, you had to be a vetted enterprise or government partner, with around 50 named organisations in the initial cohort: AWS, Google, Microsoft, CrowdStrike and others. No general public access. No developer API. You applied, Anthropic decided, and if you were lucky you got a seat.

The rationale was straightforward. A model that can autonomously plan attacks on corporate networks is a model that needs to sit behind something more serious than a terms-of-service checkbox. This is not an unreasonable position. Genuinely.

I'd like to pause here and be transparent about something. I've now written about Anthropic more times in 2026 than I've written about any other single company. The Fable launch. The government shutdown. The Pentagon court case. And now this. Usual disclosure: we use Claude every day at Webcoda, and this site's tooling is built on it. Factor that in. I'm not running some kind of Anthropic beat journalism operation from a digital agency in Sydney. It's just that they keep doing things that are genuinely hard to ignore.

Back to the model. The capabilities were real. The restricted access approach was defensible. The press coverage was admiring. And somewhere in the same news cycle, a Discord chat was getting interesting.

How it actually happened

Here's the part that keeps me up at night, in the most literal sense, because I spent an embarrassing amount of time trying to understand how this worked mechanically.

It wasn't a traditional hack. There was no SQL injection, no social engineering campaign, no sophisticated exploit chain. What happened is almost worse than that, because it exposes a different class of problem entirely.

Earlier in 2026, Mercor confirmed a data breach. Mercor is an AI training data and human evaluation company that worked with Anthropic as a third-party training partner. The breach exposed metadata including URL naming conventions Anthropic used for restricted systems. The kind of quiet operational slip that barely makes news: some metadata, some structural patterns. The sort of thing that shows up in a minor infosec report and gets filed away.

The Discord group spotted the Mythos announcement. They had the Mercor data. They looked at the URL patterns Anthropic used for similar restricted systems, made educated guesses about how Mythos would be structured, and typed. At least one member reportedly held third-party contractor credentials through a vendor relationship with Anthropic, which made the whole exercise considerably more informed than pure guessing.

The combination worked. They were in, and not just for a few hours: the group reportedly still had access in some form when Bloomberg published the story two weeks later.

If you can defeat the access controls on your most restricted model by combining leaked naming patterns with a contractor's credentials, you might want to reconsider calling the whole arrangement "security." I don't mean that as a cheap shot. I mean it as a diagnostic observation. A lock that opens when someone knows your naming convention and knows someone with the right contractor badge isn't really a lock.
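To make that diagnostic concrete, here is a minimal Python sketch of the difference between a gate that accepts "knows the name and holds some valid credential" and a gate that also requires an explicit grant for the specific resource. Every name in it (Principal, GRANTS, "restricted-model", "contractor-x", "partner-a") is hypothetical. Nothing here is drawn from Anthropic's systems, whose actual design hasn't been reported.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    name: str
    credential_valid: bool  # e.g. a contractor login that authenticates fine


# Hypothetical explicit grants: which principal may reach which resource.
GRANTS = {("partner-a", "restricted-model")}


def weak_gate(principal: Principal, resource: str) -> bool:
    """Lets in anyone with a valid credential who knows the resource name."""
    return principal.credential_valid


def strict_gate(principal: Principal, resource: str) -> bool:
    """Requires a valid credential AND an explicit grant for this resource."""
    return principal.credential_valid and (principal.name, resource) in GRANTS


contractor = Principal("contractor-x", credential_valid=True)
partner = Principal("partner-a", credential_valid=True)

print(weak_gate(contractor, "restricted-model"))    # contractor gets through
print(strict_gate(contractor, "restricted-model"))  # contractor is refused
print(strict_gate(partner, "restricted-model"))     # vetted partner gets through
```

Under the weak gate, knowing the name is the only thing separating a contractor from the restricted resource. Under the strict gate, a leaked naming convention tells an outsider where the door is but does nothing to open it.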

@HighWireTalk on X put it well at the time: "The arm of the company designed to protect against unauthorised access, a restricted partner program called Project Glasswing, turned out to be the exact mechanism through which the breach occurred."

The Mercor connection is the thing that should be in every vendor risk conversation happening right now. Mercor isn't a household name. It wouldn't appear on a standard vendor risk questionnaire. But it was close enough to Anthropic's infrastructure to expose the patterns that made this breach possible. That's a supply chain problem, not a hacking problem.

@JoshKale on X, 22 April 2026: "Anthropic said Mythos was too dangerous to release. Then four random guys in a Discord gained access on day one by guessing the URL... This is pretty insane: → Group in a private Discord guessed the endpoint from Anthropic's naming conventions → They figured out the..."

@CrYpTo_GaWd_ on X gave the cleanest summary I've seen: "not a jailbreak. not a prompt injection. just... access. that's a different kind of problem."

That's exactly right. It's a different kind of problem. And it's a harder one to fix, because it doesn't show up in the threat models most security teams are running.

This is a vendor trust story

I want to be careful here, because there are two things you can say about Anthropic that are both true but that sound contradictory.

First: Mythos is a genuinely capable model. The AISI cybersecurity benchmark is real. The capability to engage with multi-step network attack simulations is real. Anthropic's safety research is serious. They employ people who think deeply about these problems.

Second: on the day they announced the most security-sensitive model they'd ever built, a small Discord group accessed it by combining leaked URL patterns with a contractor's credentials. The group reportedly kept that access for the two weeks before it was publicly reported.

A model can be technically impressive AND have weak access controls. A company can be genuinely serious about safety research AND fail operationally. These are different functions with different teams and different metrics. Conflating them is where businesses get into trouble.

I'll be self-deprecating about this. When I first wrote about Mythos's offensive cyber capabilities, I described them with some degree of awe. The 73% cybersecurity benchmark. The zero-day discovery. I'd like to revise my position slightly. The model is impressive. The vault is not.

It's not a door with a sophisticated lock. It's an open door with a "Beware of Dog" sign. The sign is doing a lot of the work, and the dog is a URL pattern you can guess from someone else's leaked metadata, if you also happen to know someone with a contractor badge.

Sam Altman's reaction, widely quoted in threads at the time, was that the broader Mythos marketing posture was "fear-based marketing." His argument was that withholding a powerful AI from the public by emphasising its danger is a competitive positioning move as much as a safety decision. Worth noting: Altman runs Anthropic's primary commercial competitor and made this comment on the same day Bloomberg broke the story. He's not a disinterested observer. His timing was pointed. The argument is still worth engaging with, but treat it as competitive positioning that happens to have a point, not as independent expert commentary.

To Anthropic's credit: they later confirmed they found "no evidence that the supposedly unauthorized activity has impacted Anthropic's systems in any way." The group's apparent purpose was curiosity, not attack. Worth acknowledging. The access was still unauthorised, and the monitoring gap remained real, but the scope of demonstrated harm is not what the worst-case reading of this story would suggest.

It's not the first time Anthropic's operational security has made headlines this year.

See: "512,000 lines of Claude Code leaked via npm packaging error". Anthropic shipped their entire Claude Code source in every npm install. A missing config line exposed KAIROS, 44 hidden features, and a stealth mode...

The Mercor breach was confirmed on 31 March. Mythos launched on 7 April, seven days later. Whether Mercor specifically disclosed to Anthropic, within that week, that URL naming conventions were among the exfiltrated data isn't confirmed by available reporting. If they didn't, the stronger criticism is about what Anthropic's vendor contracts require Mercor to disclose, not about Anthropic failing to rotate endpoints it may not have known were compromised. Either way, it's an operational maturity failure, just one that may belong more to the breach notification terms in the Mercor-Anthropic contract than to Anthropic's response alone. They're different targets, and precision matters.

What this means for vendor selection

I'm going to make this practical, because the abstract conversation about AI safety can obscure what businesses actually need to do.

The first question to ask your AI vendor is not "do you take security seriously?" Everyone answers yes. The useful questions are more specific.

Ask about contractor and training partner vetting. Mercor wouldn't have appeared on a standard vendor questionnaire. "Do you use third-party training data partners, and what access do those partners have to your URL and infrastructure conventions?" is not a question most vendor selection processes include. It should be.

Ask about anomaly detection, not just incident response. When Bloomberg reported the breach on 21 April, Anthropic confirmed it was investigating. But here's the detail that matters: Anthropic learned about the breach from Bloomberg, not from their own monitoring. Two weeks had passed since launch day. The group reportedly still had some access. Anthropic's own vendor-tier systems apparently hadn't flagged it. The question isn't whether your vendor responds quickly once they know. It's whether their monitoring is capable of detecting this class of access at all. Ask: "How would you know if someone was using a restricted endpoint through an unauthorised pathway?" If the answer involves Bloomberg calling them, that's your answer.
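For a sense of what a good answer to that question might look like, here is a minimal Python sketch of the simplest form of the check: compare who actually reached a restricted endpoint, and through which environment, against who was expected to. The event fields, the endpoint name and the EXPECTED allowlist are all assumptions made up for illustration. Real monitoring would read from actual access logs and be considerably richer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessEvent:
    principal: str  # who authenticated
    endpoint: str   # what they called
    via: str        # which environment the request came through


# Hypothetical restricted endpoints and the (principal, pathway) pairs
# that are expected to reach them.
RESTRICTED = {"restricted-model"}
EXPECTED = {("partner-a", "partner-gateway")}


def flag_unexpected(events):
    """Return events that hit a restricted endpoint outside the expected pairs."""
    return [
        e for e in events
        if e.endpoint in RESTRICTED and (e.principal, e.via) not in EXPECTED
    ]


events = [
    AccessEvent("partner-a", "restricted-model", "partner-gateway"),
    AccessEvent("contractor-x", "restricted-model", "vendor-env"),
    AccessEvent("contractor-x", "eval-tool", "vendor-env"),
]

flagged = flag_unexpected(events)
for event in flagged:
    print(f"review: {event.principal} reached {event.endpoint} via {event.via}")
```

The point of the sketch is the shape of the rule, not the implementation: a valid credential arriving through an unexpected pathway should be an alert, not a successful request that nobody looks at.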

Ask about the difference between safety research and security operations. They're genuinely different disciplines. A vendor can have world-class safety researchers and an operations team that hasn't rotated URL conventions after a supply chain metadata exposure. The marketing materials won't tell you which situation you're in.

For Australian businesses that have integrated AI vendor APIs into your systems: if you're covered under the Privacy Act (generally annual turnover above $3 million), there's a Notifiable Data Breaches angle worth understanding. If you share personal information with a third-party AI service that then suffers a breach, you may have assessment and notification obligations of your own, not just the vendor. Where that vendor is overseas, APP 8.1 also expects you to have taken reasonable steps to ensure the recipient handles the information consistently with the Australian Privacy Principles. Your vendor contracts need to specify breach disclosure timelines. If they don't, that's a gap worth closing now rather than after something happens.

I'm not saying don't use Anthropic. I use Anthropic products. We use them at Webcoda. I'm saying read every press release about AI safety as marketing, then ask what the operational controls actually look like. Because those are two separate documents, written by two different teams, and they don't always describe the same company.

The April chapter

We're writing this in June. The June chapters are dramatic: Fable 5 released to the public, then pulled by the US government on export control grounds two days later. That's genuinely wild. But the April chapter is the one that makes the whole story coherent.

A model that can solve expert-level cybersecurity challenges was accessed because a group combined leaked URL patterns with contractor credentials. The AI that tackles 32-step corporate network attack simulations was reached because a training partner's breach included the right naming conventions. The model marketed on the premise of dangerous capability couldn't protect its own endpoint.

The joke writes itself, but I'll resist writing it too loudly. The lesson underneath the joke is serious.

Vendor branding is not vendor security. A company's capability claims and a company's operational maturity are different things, measured differently, maintained by different people. When your vendor says "trust us, we're careful," the useful follow-up question is: careful about what, exactly? The research? The deployment? The third-party training partners? The URL conventions?

If your AI vendor's response to the Mythos story is "that wouldn't happen to us," ask them how they know. Ask about their training partners. Ask what URL conventions they're using right now. Ask about their rotation policy after any metadata exposure. If they can't answer, you have your answer.
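One way to frame the URL-convention question is guessability: how many names would an outsider have to try? The Python sketch below compares a convention-based name with a random identifier. The slot lists are invented for illustration and are not the patterns exposed in the Mercor breach, which haven't been published.

```python
import math
import secrets


def candidate_count(slots):
    """Number of names to try if each slot's option list is known."""
    total = 1
    for options in slots:
        total *= len(options)
    return total


# Hypothetical convention: <codename>-<stage>-<region>.
slots = [
    ["alpha", "beta", "gamma"],
    ["preview", "staging", "internal", "prod"],
    ["us", "eu"],
]

guessable = candidate_count(slots)     # 3 * 4 * 2 candidates
bits_guessable = math.log2(guessable)  # search space expressed in bits

token = secrets.token_urlsafe(16)      # identifier built from 16 random bytes
bits_random = 16 * 8                   # 128 bits of randomness

print(f"convention-based: {guessable} candidates (~{bits_guessable:.1f} bits)")
print(f"random identifier: {bits_random} bits, e.g. {token}")
```

Once a convention leaks, the search space collapses to a few dozen guesses. A random identifier closes that particular gap and is cheap to rotate after an exposure, but it's still not access control. An unguessable name only helps if the endpoint behind it also checks who is asking.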

My prediction: Anthropic will publish formal, updated partner vetting and access control documentation before the end of 2026. If they don't, that absence will be its own kind of statement. I'll note this and check back. (If they prove me right, I'll be relieved. If they prove me wrong, I'll write about it.)

This was the April chapter. The June chapters were louder. But the April chapter is the one where the interesting structural question actually lives: can you trust a vendor's safety marketing? And what does it cost if you can't?

---

Sources

Bloomberg. "Anthropic's Mythos Model Is Being Accessed by Unauthorized Users." 21 April 2026. https://www.bloomberg.com/news/articles/2026-04...
TechCrunch. "Unauthorized Group Has Gained Access to Anthropic's Exclusive Cyber Tool, Report Claims." 21 April 2026. https://techcrunch.com/2026/04/21/unauthorized-...
Fortune. "Anthropic Mythos Leak: Dario Amodei CEO, Cybersecurity, Hackers, AI." 23 April 2026. https://fortune.com/2026/04/23/anthropic-mythos...
Engadget. "Anthropic Is Investigating Unauthorized Access of Its Mythos Cybersecurity Tool." April 2026. https://www.engadget.com/ai/anthropic-is-invest...
Cybernews. "Anthropic Mythos AI: Unauthorized Access." April 2026. https://cybernews.com/security/anthropic-mythos...
UK AI Security Institute. "Our Evaluation of Claude Mythos Preview's Cyber Capabilities." April 2026. https://www.aisi.gov.uk/blog/our-evaluation-of-...
Security Magazine. "AI Startup Mercor, Which Works with OpenAI and Anthropic, Confirms Data Breach." 2026. https://www.securitymagazine.com/articles/10220...
TechCrunch. "Sam Altman Throws Shade at Anthropic's Cyber Model Mythos: 'Fear-Based Marketing'." 21 April 2026. https://techcrunch.com/2026/04/21/sam-altman-th...
@JoshKale on X, 22 April 2026. https://x.com/JoshKale/status/2046774243799511156
@MarioNawfal on X, 22 April 2026. https://x.com/MarioNawfal/status/20467985440324...