A malicious Hugging Face repository managed to hit the platform's top trending position by impersonating OpenAI's recently released Privacy Filter model, delivering a credential-stealing payload to Windows users before the platform pulled it down.

Researchers at HiddenLayer discovered the campaign on May 7 after noticing a repository called Open-OSS/privacy-filter sitting among the platform's most downloaded projects. The fake repo had copied OpenAI's legitimate model card nearly verbatim and shipped a Python loader that, once executed, fetched and ran an information stealer on Windows machines.

Before Hugging Face removed it, the repository had accumulated approximately 244,000 downloads and 667 likes within 18 hours. HiddenLayer suspects both numbers were artificially inflated using bot accounts to create an illusion of legitimacy.

What the Malware Actually Does

The attack chain shows real craft. The repository's README differed from OpenAI's original in one key way: it instructed users to clone the repo and run a batch file on Windows or execute a Python script directly. The Python loader disguised its malicious code behind decoy functions that mimicked a real model loader, complete with a DummyModel class and fake training output.

Hidden inside was a function that disabled SSL verification, decoded a Base64-encoded URL pointing to JSON Keeper, a public paste service, and pulled down a command to execute via PowerShell. Using a paste service as the command-and-control channel allowed the attackers to swap payloads without modifying the repository itself.
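
The combination described above, SSL verification disabled next to a long Base64 literal that decodes to a URL, is a useful static indicator when triaging a suspect repository. A minimal sketch of that kind of check; the regex patterns here are illustrative heuristics, not HiddenLayer's actual detection signatures:

```python
import base64
import re

# Heuristic indicators loosely based on the loader behavior described above:
# SSL verification bypasses and long Base64 string literals.
INDICATORS = [
    re.compile(r"_create_unverified_context"),              # ssl bypass
    re.compile(r"verify\s*=\s*False"),                      # requests verify=False
    re.compile(r"b64decode\(\s*['\"][A-Za-z0-9+/=]{24,}"),  # long Base64 literal
]

def suspicious_score(source: str) -> list[str]:
    """Return the indicator patterns that match a Python source string."""
    return [p.pattern for p in INDICATORS if p.search(source)]

def decode_embedded_urls(source: str) -> list[str]:
    """Decode long Base64 literals and keep any that turn out to be URLs."""
    urls = []
    for blob in re.findall(r"['\"]([A-Za-z0-9+/=]{24,})['\"]", source):
        try:
            text = base64.b64decode(blob).decode("utf-8", errors="ignore")
        except Exception:
            continue
        if text.startswith(("http://", "https://")):
            urls.append(text)
    return urls
```

Decoding the literal rather than just flagging it matters here: it surfaces the hidden C2 endpoint for blocklisting even when the visible code looks like a harmless model loader.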

The PowerShell command downloaded a batch script that elevated its privileges through a UAC prompt, configured Microsoft Defender exclusions, and set up a scheduled task to launch the final payload. The infostealer harvested data from Discord, cryptocurrency wallets and extensions, system metadata, browser data from Chromium and Gecko-based browsers, FileZilla configurations, and wallet seed phrases. It also checked for debuggers, sandboxes, and virtual machines to avoid analysis, and attempted to disable Windows security telemetry.
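
Defender exclusions under user-writable paths are a common tell for this style of persistence, and one that incident responders can check cheaply. A hedged sketch of that triage step; the path markers are general assumptions about user-writable locations, not documented indicators from this campaign:

```python
def flag_exclusions(exclusion_paths: list[str]) -> list[str]:
    """Flag Microsoft Defender exclusion paths that sit under user-writable
    locations, where malware typically stages its payload."""
    writable_markers = ("\\appdata\\", "\\temp\\", "\\downloads\\", "\\users\\public\\")
    flagged = []
    for path in exclusion_paths:
        low = path.lower()
        if any(marker in low for marker in writable_markers):
            flagged.append(path)
    return flagged
```

Legitimate software occasionally requests exclusions too, so flagged paths are leads for review rather than verdicts.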

A Broader Campaign

This wasn't an isolated incident. HiddenLayer identified six additional repositories under the same account, all uploaded on April 24, that used nearly identical loader infrastructure and the same command-retrieval URL. The researchers also found infrastructure overlaps with an npm typosquatting campaign that had previously distributed the WinOS 4.0 implant, suggesting the attacks are part of a broader supply chain operation targeting open-source ecosystems.

The attackers reportedly promoted the malicious repositories through LinkedIn, Reddit, and SEO manipulation to ensure searches for OpenAI tools surfaced the fake projects.

AI Platforms as Attack Surfaces

Hugging Face has become critical infrastructure for local AI deployment. Ollama pulls from it. Research pipelines reference it. Custom inference stacks frequently download models directly from Hub URLs in deployment scripts. That makes it an obvious target.
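
One mitigation for deployment scripts that pull models straight from Hub URLs is to pin a digest when the model is first vetted and verify it on every subsequent fetch. A minimal sketch of that idea; the filename and digest variable are placeholders, not values tied to the repositories discussed here:

```python
import hashlib
from pathlib import Path

def verify_artifact(path: Path, expected_sha256: str) -> bool:
    """Stream-hash a downloaded model file and compare it against a pinned
    digest recorded when the model was originally reviewed."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256

# Usage sketch: refuse to load anything whose bytes have changed.
# if not verify_artifact(Path("model.safetensors"), PINNED_DIGEST):
#     raise RuntimeError("model file does not match pinned hash")
```

Pinning a revision or digest does not vet the model itself, but it does stop a repository that was clean at review time from silently swapping in a malicious payload later.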

This isn't the first time. Security firms JFrog and ReversingLabs identified models containing hidden backdoors on the platform as early as 2024. Acronis recently documented active malware distribution campaigns abusing both Hugging Face and the ClawHub AI agent registry, finding hundreds of malicious skills across both platforms. Protect AI, which has partnered with Hugging Face to scan its model library, has examined more than four million models and identified approximately 352,000 unsafe or suspicious issues.

The underlying problem is structural. Model hubs operate on the same trust assumptions as npm and PyPI before those ecosystems hardened against supply chain attacks. Users download files, run install scripts, and assume the repository host has performed baseline verification. In practice, Hugging Face scans for known malware signatures but does not comprehensively sandbox arbitrary Python in model repositories.
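
Absent comprehensive sandboxing, one practical defense is to refuse formats that can execute code at load time, pickle-backed weight files in particular, in favor of safetensors. A minimal sketch; the suffix list is an assumption about commonly pickle-backed formats, not Hugging Face's actual scanning policy:

```python
from pathlib import PurePosixPath

# Extensions that commonly wrap Python pickle and can therefore run arbitrary
# code when loaded. The safetensors format stores raw tensors and cannot.
RISKY_SUFFIXES = {".bin", ".pt", ".pth", ".ckpt", ".pkl"}

def flag_risky_files(repo_files: list[str]) -> list[str]:
    """Return repo file paths whose format can execute code at load time."""
    return [f for f in repo_files
            if PurePosixPath(f).suffix.lower() in RISKY_SUFFIXES]
```

A pipeline could run this over a repository's file listing before download and require a manual review for anything flagged, mirroring the allowlist-by-format approach package managers adopted after their own supply chain incidents.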

OpenAI's legitimate Privacy Filter launched on April 22 under an Apache 2.0 license. It's a 1.5-billion-parameter model designed to detect and redact personally identifiable information in text. The irony of attackers using a privacy tool as cover for credential theft is not lost on anyone.

Anyone who cloned Open-OSS/privacy-filter and executed start.bat or loader.py on Windows should treat the system as fully compromised. HiddenLayer recommends reimaging over cleanup, rotating all stored credentials, invalidating browser sessions and tokens, and replacing cryptocurrency wallets and seed phrases. Do not log into anything from the affected machine before wiping it.

The incident points to a gap between how fast the AI industry is building and how slowly security infrastructure is catching up. Model provenance, digital signatures, and sandboxed execution are table stakes for package managers. They remain optional for most of the repositories powering production AI systems.