I watched a client's IT manager turn pale last month. He'd just discovered that three of his developers had been pasting proprietary source code into ChatGPT for months. Not maliciously. They were just trying to debug faster. But every line of that code now sits on OpenAI's servers, potentially training future models, definitely outside his company's control.

That's when he asked me: "Can we run something like ChatGPT ourselves? On our own hardware?"

The answer used to be "not really." The hardware was too expensive, the models too limited, the performance too slow. But something's changed in 2025, and it's changed dramatically. Running genuinely capable AI locally isn't just possible anymore. For many Australian businesses, it's becoming essential.

The 77% Problem: Your Data Is Already Leaking

Let's start with the uncomfortable truth. According to LayerX Security's Enterprise AI and SaaS Data Security Report 2025, 77% of employees paste corporate data into AI chatbots. That's not a typo. More than three-quarters of your workforce is potentially exposing sensitive information to third-party AI systems.

But here's what really keeps me up at night: 82% of those paste events come from unmanaged personal accounts, completely outside enterprise oversight (LayerX Security, 2025). Your IT department can't see it happening. Your security team can't stop it. And your compliance officer is probably having nightmares about it right now.

Harmonic Security's Q2 2025 research makes it even clearer: 72.6% of sensitive AI prompts go through ChatGPT. Nearly 22% of uploaded files contain sensitive content. And 26.3% of sensitive data still flows through free ChatGPT accounts with zero enterprise protections.

AI has become the number one data exfiltration channel in the enterprise, surpassing file sharing, email, and every other channel that security teams have spent years trying to secure (The Hacker News, 2025). It's not malware. It's not hackers. It's your own employees, trying to be productive.

And here's the cost: IBM's 2025 Cost of a Data Breach Report found that having high levels of "shadow AI" (where workers use unapproved AI tools) adds an extra USD $670,000 to the average breach cost (IBM, 2025). That's not the total breach cost. That's the additional cost just from shadow AI exposure.

Meanwhile, only 17% of organisations have technical controls that actually block sensitive data from entering public AI tools (Kiteworks, 2025). The other 83%? They're running on the honour system, hoping employees won't do anything catastrophic with company data.

Samsung, OpenAI Credentials, and the Incidents That Should Scare You

Theory is one thing. Let's talk about what's actually happened.

In April 2023, Samsung allowed employees at its semiconductor division to use ChatGPT. Within three weeks, they'd recorded three separate incidents of confidential data exposure (TechCrunch, 2023). The first engineer pasted faulty source code from a chip measurement database to find a fix. The second uploaded code for identifying defective equipment and asked for optimisation suggestions. The third converted a meeting recording to text and asked ChatGPT to generate meeting minutes.

All three employees faced disciplinary action. Samsung banned ChatGPT company-wide. But here's the thing: that data is gone. It's been absorbed into OpenAI's systems, potentially used for training, and Samsung has no way to retrieve it or guarantee its deletion (Cybernews, 2023).

Then there's the credential situation. Between January and October 2023, over 225,000 compromised ChatGPT credentials appeared for sale on dark web markets (The Hacker News, 2024). These weren't stolen from OpenAI directly. They came from devices infected with infostealer malware like LummaC2, Raccoon, and RedLine. But the result is the same: attackers potentially accessing entire conversation histories, including whatever sensitive information employees had shared.

By 2024, KELA collected over 3 million compromised OpenAI accounts (SC Media, 2024). Three million accounts worth of conversation history, accessible to whoever pays for the credentials.

And regulators are taking notice. In December 2024, Italy's data protection authority fined OpenAI EUR 15 million for GDPR violations (Euronews, 2024). The Italian watchdog found that OpenAI processed users' personal data for training without adequate legal basis, failed to notify regulators of a security breach, and lacked age verification mechanisms. OpenAI called the fine "disproportionate" and is appealing, but the precedent is set.

This isn't abstract anymore. This is regulators actively pursuing AI companies for data handling practices. And Australian regulators are watching closely.

December 2026: The Australian Deadline Nobody's Talking About

Australia's Privacy and Other Legislation Amendment Act 2024 received Royal Assent on 10 December 2024 (Norton Rose Fulbright, 2024). Most provisions took effect immediately. But one major change has a two-year grace period ending 10 December 2026: the automated decision-making transparency requirements.

Here's what that means for businesses using AI. You'll need to update your privacy policy to disclose when decisions are made using automated processes that significantly affect individuals' rights or interests (Law Society Journal, 2025). That includes decisions about granting benefits, affecting contract rights, or impacting access to significant services.

The catch? Once these amendments come into effect, they'll apply to all automated decisions, regardless of whether the system was set up before or after the law's commencement (MinterEllison, 2024). You can't grandfather in your existing AI systems.

The new three-tiered penalty structure should focus minds. Tier one administrative failures can attract infringement notices up to $330,000 per contravention. Tier two "non-serious" privacy interferences carry penalties up to $3.3 million for companies (Hall & Wilcox, 2024). And serious interferences? Up to $50 million, three times the benefit obtained, or 30% of annual turnover for the relevant period.

On 29 September 2025, Australian Clinical Labs agreed to pay $5.8 million following a data breach that affected 223,000 customers (Clyde & Co, 2025). The OAIC alleged "serious and systemic failures" that left the company vulnerable to cyberattack. If that's the price for a healthcare data breach, what's the price for pumping client data through an uncontrolled foreign AI system?

For government contractors and healthcare organisations, there's an additional consideration: data sovereignty. When your client contracts specify that data must remain in Australia, running queries through OpenAI's US-based infrastructure isn't just a compliance risk. It's a potential contract breach.

I've had government clients ask me point-blank: "If we use ChatGPT for document summarisation, are we sending protected data offshore?" The honest answer is yes. And for many contracts, that's a deal-breaker.

Local AI in 2026: What Can Actually Run on Your Hardware?

This is where things get interesting. The AI you can run locally in 2026 isn't the same as what was available even a year ago.

Meta's Llama 3.3 70B, released in December 2024, delivers performance comparable to Llama 3.1 405B with significantly lower computational demands (DataCamp, 2024). It supports a 128,000-token context window, covers eight languages natively, and achieves scores competitive with GPT-4o class models on multiple benchmarks.

On MMLU Pro, a challenging benchmark for general knowledge, Llama 3.3 70B scores 68.9. On MATH, it hits 77.0, outperforming both its predecessor and Amazon Nova Pro (Artificial Analysis, 2024). For coding tasks, it achieves 88.4 on HumanEval, just behind the much larger 405B model.

Then Meta dropped Llama 4 in April 2025 (TechCrunch, 2025). Llama 4 Scout, with 17 billion active parameters across 16 experts (109 billion total), supports context windows up to 10 million tokens and fits on a single H100 GPU. Llama 4 Maverick packs 400 billion total parameters with only 17 billion active at inference, beating GPT-4o and Gemini 2.0 Flash across broad benchmarks while using less than half the active parameters of DeepSeek v3.

Mistral Large 3, released in December 2025, takes things even further: 675 billion total parameters with 41 billion active (sparse MoE architecture), a 256,000-token context window, and full Apache 2.0 licensing (Mistral AI, 2025). It's the strongest fully open model developed outside China, with frontier-level performance in coding, multilingual tasks, and multimodal understanding.

And that's just scratching the surface. Here's what else dropped in the second half of 2025:

GLM-4.7 from Zhipu AI (December 2025) brings 355 billion parameters under an MIT license, with 200,000-token context and particularly strong coding capabilities. It's one of the few frontier-scale models you can fully self-host and customise without API lock-in (Zhipu AI, 2025).

DeepSeek-V3.1 (August 2025) packs 671 billion total parameters with only 37 billion active per token. The clever bit? It switches between "thinking" and "non-thinking" modes depending on task complexity. MIT licensed, runs on local infrastructure, and matches frontier cloud models on most benchmarks (DeepSeek AI, 2025).

Qwen3 from Alibaba (April 2025, with updates through September) offers up to 235 billion parameters, supports 119 languages natively, and carries Apache 2.0 licensing. The 1-million-token context window in Qwen3-2507 is genuinely useful for processing entire codebases or lengthy documents (Alibaba Cloud, 2025).

For smaller deployments, Ministral 3 (December 2025) runs on just 4GB of VRAM. That means laptops, smartphones, even embedded systems. Completely offline operation, no cloud required. It's not as capable as the big models, but for edge deployment and privacy-critical applications, it's a genuine option (Mistral AI, 2025).
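A quick sketch of why the sparse-MoE designs above matter: all of a model's weights must still fit in memory, but only the active fraction is exercised per token, which is what makes these models tractable to serve. The figures below use DeepSeek-V3.1's published parameter counts; the 4-bit precision is an illustrative assumption.

```python
def moe_costs(total_params_b: float, active_params_b: float,
              bits_per_weight: int = 4) -> tuple[float, float]:
    """Memory scales with TOTAL parameters (every expert must be resident);
    per-token compute scales with ACTIVE parameters."""
    vram_gb = total_params_b * bits_per_weight / 8
    active_fraction = active_params_b / total_params_b
    return vram_gb, active_fraction

# DeepSeek-V3.1: 671B total parameters, 37B active per token
vram, frac = moe_costs(671, 37)
print(f"{vram:.1f} GB of 4-bit weights, {frac:.1%} of weights active per token")
# 335.5 GB of 4-bit weights, 5.5% of weights active per token
```

That asymmetry is the whole trick: you pay for total parameters in VRAM, but only for active parameters in inference speed.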

The point isn't that these models beat GPT-5 at everything. They don't, necessarily. The point is that they're good enough for most business use cases, and you can run them on hardware you control.

For tasks like document summarisation, code review, customer service draft responses, and internal knowledge base queries, local models now deliver production-quality results. You're not settling for some hobbled version of AI. You're getting genuinely capable systems that never phone home.

The Hardware Economics That Changed Everything

I remember when running a 70-billion parameter model meant spending $200,000 on enterprise hardware. That's not the world we live in anymore.

The NVIDIA RTX 5090, released in January 2025, packs 32GB of GDDR7 memory with approximately 1,792 GB/s bandwidth (NVIDIA, 2025). That's a 77% improvement over the 4090. The MSRP sits at $1,999, though real-world availability has pushed prices to $2,500-$3,800 depending on the supplier (Tom's Hardware, 2025).

With 32GB of VRAM, the 5090 can run quantised 70-100B parameter models locally. Not in full precision, where a 70B model needs well over 100GB, but with 4-bit quantisation (which maintains excellent output quality) you're looking at roughly 35-43GB for a 70B model (Local AI Master, 2025). That's at the edge of a single 5090, but achievable with aggressive quantisation or CPU offloading.
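A back-of-the-envelope way to check these VRAM figures: weight memory is roughly parameters times bytes per weight, plus some overhead for the KV cache and runtime. This is a simplification (real usage varies with context length and batch size), but it reproduces the numbers above.

```python
def estimate_vram_gb(params_billions: float, bits_per_weight: int,
                     overhead_factor: float = 1.2) -> float:
    """Rough VRAM estimate: model weights at the given precision, plus
    ~20% for KV cache, activations, and framework overhead."""
    weight_gb = params_billions * bits_per_weight / 8  # 1B params at 8 bits = 1 GB
    return round(weight_gb * overhead_factor, 1)

# A 70B model in FP16 is far beyond any consumer card...
print(estimate_vram_gb(70, 16))                      # 168.0
# ...but at 4-bit it lands right in the 35-43GB range quoted above.
print(estimate_vram_gb(70, 4, overhead_factor=1.0))  # 35.0 (weights alone)
print(estimate_vram_gb(70, 4))                       # 42.0 (with overhead)
```

The same arithmetic explains why 30-40B models are the comfortable ceiling for a single 24GB card.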

For dual-GPU setups, two RTX 4090s (48GB total) or two 5090s (64GB total) handle 70B models comfortably. Real-world testing shows the RTX 4090 achieving 13-60 tokens per second on Llama 70B models depending on quantisation level (LocalLLM.in, 2025). That's not cloud-fast, but it's certainly usable for real work.

Let me run some rough numbers. A capable local AI setup might look like this:

Budget Tier (Around $6,500 AUD):

  • RTX 4090 24GB (used): $2,500
  • 64GB DDR5 RAM: $900
  • High-speed SSD: $400
  • Supporting hardware: $1,200
  • Contingency for RAM volatility: $500
  • Result: Run 30-40B models smoothly, 70B models only with heavy quantisation

Professional Tier (Around $15,000 AUD):

  • 2x RTX 4090 24GB: $5,000
  • 128GB DDR5 RAM: $2,000
  • NVMe storage array: $800
  • Workstation motherboard and CPU: $3,000
  • Power supply and cooling: $1,000
  • Result: Run 70B models at good speeds, 100B+ with quantisation

Enterprise Tier (Around $38,000 AUD):

  • 2x RTX 5090 32GB: $10,000
  • Professional workstation base: $8,000
  • 256GB DDR5 RAM: $3,500
  • High-speed storage: $2,000
  • Enterprise support and redundancy: $6,000
  • Result: Run large models with full context windows, multiple concurrent users

A quick note on RAM pricing, because it's genuinely absurd right now. As I write this in early 2026, DDR5 prices have increased 80-130% since September 2025. A 64GB kit that cost $300 last year now runs $900 or more. Why? Because every memory manufacturer on the planet is prioritising AI data centre chips over consumer RAM. Micron literally shut down their Crucial consumer brand to focus entirely on AI. (I had a client delay their build by three months hoping prices would drop. They didn't. They went up another 30%.) The forecasts suggest this won't stabilise until late 2026 at the earliest, possibly 2028. So if you're planning a build, factor in the RAM situation. It's the hidden cost that catches everyone off guard.

Compare that to cloud AI costs. ChatGPT Enterprise runs roughly $60 per user per month. For a 50-person company, that's $36,000 annually, $108,000 over three years (CloudZero, 2025). And that assumes no usage overages.

The professional tier setup, at $15,000 one-time cost plus maybe $2,000 annually in electricity and maintenance, still pays for itself in under two years compared to enterprise cloud subscriptions. After that, it's effectively free. (Well, free until you need to upgrade RAM again. At current prices, that stings.)
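To make that break-even arithmetic explicit, here's the comparison as a quick calculation, using the rough figures above. Your seat count, power costs, and actual subscription pricing will differ.

```python
def cumulative_cost(upfront: float, annual: float, years: int) -> float:
    """Total cost of ownership over a period: one-off spend plus running costs."""
    return upfront + annual * years

# Figures from the comparison above (AUD, approximate).
cloud_annual = 50 * 60 * 12            # 50 seats at ~$60/user/month = $36,000/yr
local_upfront, local_annual = 15_000, 2_000  # hardware, then power + maintenance

for years in (1, 2, 3):
    cloud = cumulative_cost(0, cloud_annual, years)
    local = cumulative_cost(local_upfront, local_annual, years)
    print(f"Year {years}: cloud ${cloud:,.0f} vs local ${local:,.0f}")
# Year 1: cloud $36,000 vs local $17,000
# Year 2: cloud $72,000 vs local $19,000
# Year 3: cloud $108,000 vs local $21,000
```

On these assumptions the local build is ahead within the first year; the "under two years" claim above is the conservative version.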

But cost isn't really the point. The point is that your data never leaves your building. Your queries never get logged on someone else's server. Your compliance posture becomes dramatically simpler because you control the entire chain.

Implementation Roadmap: From Cloud-Dependent to Data Sovereign

I've helped several clients through this transition. Here's what actually works.

Phase 1: Audit Your Current AI Usage (Week 1-2)

You can't fix what you can't see. Start by surveying actual AI tool usage across your organisation. I promise it'll be worse than you think. (At one client, we found 17 different AI tools in use across 200 employees. Seventeen. Nobody in IT knew about 14 of them.)

Document what's being used for what purpose. Customer service drafting? Code assistance? Document summarisation? Research? Each use case has different requirements and different risk profiles.

Identify your highest-risk data flows. Where's confidential client data potentially entering AI systems? Where's intellectual property going? Where's protected health information or financial data at risk?
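As a starting point for that audit, even a crude scan of proxy or DNS logs will surface shadow AI usage. The domain list and log format below are illustrative assumptions, not an exhaustive rule set; adapt both to your own environment.

```python
from collections import Counter

# Illustrative, not exhaustive: extend with whatever AI endpoints matter to you.
AI_DOMAINS = ("chatgpt.com", "chat.openai.com", "api.openai.com",
              "claude.ai", "gemini.google.com", "copilot.microsoft.com")

def shadow_ai_hits(log_lines):
    """Count requests to known AI endpoints in proxy/DNS log lines.
    Assumes one destination hostname per line; adapt to your log format."""
    counts = Counter()
    for line in log_lines:
        for domain in AI_DOMAINS:
            if domain in line:
                counts[domain] += 1
    return counts

sample = [
    "10:01 user=alice dest=chatgpt.com bytes=48213",
    "10:02 user=bob dest=intranet.local bytes=1200",
    "10:04 user=alice dest=claude.ai bytes=9931",
]
print(shadow_ai_hits(sample))  # Counter({'chatgpt.com': 1, 'claude.ai': 1})
```

Run something like this over a month of logs before the survey: employees under-report, logs don't.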

Phase 2: Deploy Sandboxed Local AI for Testing (Week 3-6)

Don't try to replace everything at once. Pick one moderate-risk use case and deploy a local alternative for testing. Ollama makes this remarkably straightforward for basic setups (Ollama Documentation, 2025).

An 8B model such as Llama 3.1 8B runs on surprisingly modest hardware and gives you a genuine sense of what local AI feels like. It's not as capable as the 70B models, but it'll handle most summarisation and simple Q&A tasks.
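Ollama serves models through a local REST API (port 11434 by default), so the pilot can be scripted with nothing but the standard library. A minimal sketch, assuming the Ollama daemon is running and the model has been pulled; the model tag and prompt are just examples.

```python
import json
import urllib.request

def build_request(model: str, prompt: str) -> dict:
    """Payload for Ollama's local /api/generate endpoint (non-streaming)."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask_local(model: str, prompt: str,
              url: str = "http://localhost:11434/api/generate") -> str:
    """Send a prompt to a locally running Ollama server.
    The query and the response never leave this machine."""
    payload = json.dumps(build_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(url, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires a running Ollama daemon and `ollama pull llama3.1:8b`):
# print(ask_local("llama3.1:8b", "Summarise: local AI keeps data on-site."))
```

The fact that this is a plain HTTP endpoint on localhost is the point: nothing here touches the internet.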

Get feedback from actual users. What works? What's frustratingly slow? What capabilities are missing? This informs your hardware decisions for production deployment.

Phase 3: Size Your Production Hardware (Week 7-8)

Based on your pilot, determine what you actually need. If you're doing basic document processing for a small team, a single RTX 4090 might suffice. If you're running a customer service operation with multiple concurrent users, you're looking at dual high-end GPUs or potentially a dedicated inference server.

Consider your growth trajectory. Hardware that barely handles today's load won't serve you in two years. Budget for headroom.

Phase 4: Production Deployment with Governance (Week 9-12)

This isn't just a technology project. It's a governance project. Document your AI usage policies. Train your staff on what's appropriate for local AI versus what still needs human review. Implement logging and auditing so you can demonstrate compliance.

Set up monitoring. How's the system performing? Are users actually using it, or reverting to ChatGPT because it's faster? Where are the bottlenecks?
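The logging requirement above can start as simply as wrapping every model call in an audit function. This is a hypothetical sketch, not a full audit system; note that it records prompt length rather than content, so the audit trail itself doesn't become a second copy of your sensitive data.

```python
import json
import time

def audited_query(backend, user: str, prompt: str, audit_log: list) -> str:
    """Run a query through the local model while appending an audit record.
    `backend` is any callable prompt -> response; in production, write the
    record to append-only storage rather than an in-memory list."""
    record = {"ts": time.time(), "user": user, "prompt_chars": len(prompt)}
    response = backend(prompt)
    record["response_chars"] = len(response)
    audit_log.append(json.dumps(record))
    return response

log: list = []
echo = lambda p: f"[draft] {p}"  # stand-in for a real local model call
print(audited_query(echo, "alice", "Summarise Q3 figures", log))
print(len(log))  # 1
```

Logging lengths and timestamps rather than raw prompts is a deliberate choice: it gives you usage evidence for compliance without creating a new honeypot.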

Phase 5: Hybrid Strategy for Remaining Use Cases (Ongoing)

Not everything needs to run locally. For non-sensitive research queries, cloud AI might still make sense. The goal isn't zero cloud AI. The goal is zero cloud AI touching sensitive data.

Create clear policies about what goes where. Make it easy for employees to use the right tool for the right task. If your local AI is slow and awkward to access, people will route around it.
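Those policies can also be enforced in code rather than trusted to memory. A toy router like the one below (the patterns are illustrative, not a real DLP rule set) makes "right tool for the right task" the default rather than a judgement call.

```python
import re

# Illustrative patterns only; a real deployment would use a proper
# DLP classifier tuned to your own data types.
SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{3}\s?\d{3}\s?\d{3}\b"),  # looks like a TFN/ABN fragment
    re.compile(r"(?i)\b(confidential|medicare|patient|client)\b"),
]

def route(prompt: str) -> str:
    """Send anything that looks sensitive to the local model;
    everything else may use an approved cloud tool."""
    if any(p.search(prompt) for p in SENSITIVE_PATTERNS):
        return "local"
    return "cloud"

print(route("Draft a blog intro about productivity"))      # cloud
print(route("Summarise this confidential patient report")) # local
```

Even a crude gate like this beats the honour system: the safe path happens automatically, and the audit question becomes "did the router work?" rather than "did every employee remember the policy?"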

The Honest Limitations (Because Nothing's Perfect)

I'd be doing you a disservice if I didn't acknowledge the downsides.

Local AI is slower than cloud AI. Not dramatically slower, but noticeably. When you're used to cloud AI's response speed, waiting an extra few seconds for each query feels sluggish. For batch processing, this matters less. For interactive work, it's a real consideration.

The largest models still require significant hardware. You're not running a full Llama 4 Behemoth on a desktop machine. The 400B-parameter class models need enterprise infrastructure. For most businesses, that means working with smaller models or accepting some capability tradeoffs.

Setup and maintenance require technical expertise. This isn't plug-and-play like signing up for ChatGPT. You need someone who understands GPU configurations, model quantisation, and inference optimisation. For smaller businesses without internal IT depth, this can be a real barrier.

And you're responsible for updates and security. When OpenAI patches a vulnerability, they roll it out automatically. When you're running local infrastructure, patching is your job. That's both a feature (you control the timeline) and a burden (you can't ignore it).

Key Takeaways

For IT Decision-Makers:

  • 83% of organisations lack technical controls for AI data governance. Assess your current exposure before regulators do.
  • Local AI hardware costs have dropped to the point where three-year TCO often favours on-premise deployment, especially for consistent workloads.
  • Start with a sandboxed pilot on moderate-risk use cases before committing to full deployment.

For Business Owners:

  • AI data exfiltration is now the top channel for corporate data leakage, ahead of email and file sharing.
  • The December 2026 Privacy Act deadline requires disclosure of automated decision-making. Get your governance house in order before then.
  • Shadow AI usage is probably worse in your organisation than you think. Survey actual behaviour, not just official policy.

For Compliance Officers:

  • Italy's EUR 15 million fine against OpenAI signals regulatory willingness to pursue AI companies. Australian regulators are watching.
  • The new three-tiered penalty structure includes up to $3.3 million for non-serious privacy interferences, $50 million for serious ones.
  • Local AI deployment doesn't eliminate compliance obligations, but it simplifies data sovereignty and control documentation.

I'll be honest: I'm still figuring out where this all lands. The technology is moving faster than regulatory frameworks, faster than organisational change management, faster than most businesses can absorb. Some of what I've written here will probably be outdated within six months.

But here's what I'm confident about. The days of casually piping sensitive business data through cloud AI without consequences are ending. Australian regulators have new tools and new appetite for enforcement. And for the first time, genuinely capable alternatives exist.

Whether you deploy local AI, implement strict cloud AI governance, or find some hybrid approach, doing nothing isn't really an option anymore. Not with 77% of your employees already sharing data with AI systems you don't control.

Welcome to the club. We're all figuring this out together.

---

Sources
  1. LayerX Security. "The LayerX Enterprise AI & SaaS Data Security Report 2025". 2025. https://go.layerxsecurity.com/the-layerx-enterp...
  2. LayerX Security. "AI Is Now the #1 Data Exfiltration Vector in the Enterprise". October 2025. https://layerxsecurity.com/blog/ai-is-now-the-1...
  3. Harmonic Security. "GenAI Data Exposure Report Q2 2025". August 2025. https://www.harmonic.security/blog-posts/genai-...
  4. The Hacker News. "New Research: AI Is Already the #1 Data Exfiltration Channel in the Enterprise". October 2025. https://thehackernews.com/2025/10/new-research-...
  5. IBM. "2025 Cost of a Data Breach Report: Navigating the AI Rush". 2025. https://www.ibm.com/think/x-force/2025-cost-of-...
  6. Kiteworks. "The 2025 AI Security Gap: Why 83% of Organizations Are Flying Blind". 2025. https://www.kiteworks.com/cybersecurity-risk-ma...
  7. TechCrunch. "Samsung bans use of generative AI tools like ChatGPT after April internal data leak". May 2023. https://techcrunch.com/2023/05/02/samsung-bans-...
  8. Cybernews. "Lessons learned from ChatGPT's Samsung leak". 2023. https://cybernews.com/security/chatgpt-samsung-...
  9. The Hacker News. "Over 225,000 Compromised ChatGPT Credentials Up for Sale on Dark Web Markets". March 2024. https://thehackernews.com/2024/03/over-225000-c...
  10. SC Media. "ChatGPT credentials snagged by infostealers on 225K infected devices". 2024. https://www.scworld.com/news/chatgpt-credential...
  11. Euronews. "Italy's privacy watchdog fines OpenAI EUR 15 million after probe into ChatGPT data collection". December 2024. https://www.euronews.com/next/2024/12/20/italys...
  12. Norton Rose Fulbright. "Australian Privacy Alert: Parliament passes major and meaningful privacy law reform". December 2024. https://www.nortonrosefulbright.com/en/knowledg...
  13. Law Society Journal. "The countdown is on for automated decision making transparency requirements". 2025. https://lsj.com.au/articles/the-countdown-is-on...
  14. MinterEllison. "Privacy and Other Legislation Amendment Act 2024 now in effect". 2024. https://www.minterellison.com/articles/privacy-...
  15. Hall & Wilcox. "Privacy penalties: the beginning of a new era". 2024. https://hallandwilcox.com.au/news/privacy-penal...
  16. Clyde & Co. "Cyber and privacy law update: accountability gets real". October 2025. https://www.clydeco.com/en/insights/2025/10/cyb...
  17. DataCamp. "What Is Meta's Llama 3.3 70B? How It Works, Use Cases & More". December 2024. https://www.datacamp.com/blog/llama-3-3-70b
  18. Artificial Analysis. "Llama 3.3 70B: Intelligence, Performance & Price Analysis". 2024. https://artificialanalysis.ai/models/llama-3-3-...
  19. TechCrunch. "Meta releases Llama 4, a new crop of flagship AI models". April 2025. https://techcrunch.com/2025/04/05/meta-releases...
  20. Meta AI. "The Llama 4 herd: The beginning of a new era of natively multimodal AI innovation". April 2025. https://ai.meta.com/blog/llama-4-multimodal-int...
  21. Mistral AI. "Introducing Mistral 3". December 2025. https://mistral.ai/news/mistral-3
  22. NVIDIA. "GeForce RTX 5090 Graphics Cards". January 2025. https://www.nvidia.com/en-us/geforce/graphics-c...
  23. Tom's Hardware. "Nvidia announces RTX 5090 for $1,999, 5070 for $549". January 2025. https://www.tomshardware.com/pc-components/gpus...
  24. Local AI Master. "AI Hardware 2025: RTX 5090 vs 4090 Setup Guide". 2025. https://localaimaster.com/blog/ai-hardware-requ...
  25. Introl Blog. "Local LLM Hardware Guide 2025: Pricing & Specifications". 2025. https://introl.com/blog/local-llm-hardware-pric...
  26. LocalLLM.in. "Ollama VRAM Requirements: Complete 2025 Guide to GPU Memory for Local LLMs". 2025. https://localllm.in/blog/ollama-vram-requiremen...
  27. LocalLLM.in. "How to Run a Local LLM: A Comprehensive Guide for 2025". 2025. https://localllm.in/blog/how-to-run-local-llm-g...
  28. Markaicode. "Cloud vs Local AI: Total Cost of Ownership Comparison 2025". 2025. https://markaicode.com/cloud-vs-local-ai-cost-c...
  29. Tom's Hardware. "The RAM pricing crisis has only just started, Team Group GM warns". December 2025. https://www.tomshardware.com/pc-components/dram...
  30. TrendForce. "64GB DDR5 RAM reportedly now pricier than a PlayStation 5 amid soaring memory costs". November 2025. https://www.trendforce.com/news/2025/11/27/news...
  31. G.SKILL. "G.SKILL addresses sharp DDR5 RAM price increases for 2025-2026". 2025. https://www.neowin.net/news/gskill-addresses-sh...
  32. Zhipu AI. "GLM-4.7: Advancing the Coding Capability". December 2025. https://z.ai/blog/glm-4.7
  33. DeepSeek AI. "DeepSeek-V3 GitHub Repository". 2025. https://github.com/deepseek-ai/DeepSeek-V3
  34. Alibaba Cloud. "Qwen3 GitHub Repository". 2025. https://github.com/QwenLM/Qwen3