Hugging Face Transformers & Hub: Supply-Chain Risks and Real Advisories

The Hugging Face Hub is the default distribution channel for models and datasets, and the transformers library is the loader that ties them together. That makes the pair the supply chain for a large share of production ML, and supply chains are where attackers go. The verified CVEs below sort into two main buckets, "loading something runs code" and "processing something hangs the service," plus a third category covering the scanners meant to catch malicious models. Knowing which bucket an advisory falls into tells you how worried to be. Every CVE here was checked against its NVD record before being cited.

Bucket one: deserialization RCE in Transformers

The most dangerous Transformers advisories are a cluster fixed in version 4.48.0 that all share one root cause: deserialization of untrusted data that reaches code execution.

CVE-2024-11392 (CVSS 8.8): deserialization of untrusted data in configuration file handling, caused by missing validation of user-supplied data, allowing code execution in the context of the current user. Affects versions prior to 4.48.0.
CVE-2024-11393 (CVSS 8.8): deserialization RCE in MaskFormer model handling, allowing remote attackers to execute arbitrary code on affected installations. Fixed in 4.48.0.
CVE-2024-11394 (CVSS 8.8): deserialization RCE in Trax model handling, with the same shape and the same fix version.

There is also an earlier entry in the same family: CVE-2023-6730 (CVSS 8.8), deserialization of untrusted data in huggingface/transformers prior to 4.36. That this exact bug class was fixed in 4.36 and then again in 4.48 is the supply-chain lesson in miniature: the library keeps finding new model and config types whose loading path trusts data it should not.

The thing that makes these worse than a generic library bug is the delivery mechanism. The Hub is built around from_pretrained pointing at a repository you did not author. A config or model artifact is exactly the “untrusted data” these CVEs deserialize. The exploit does not need to compromise your network — it needs you to download a model, which is the entire point of the platform.

The newest entry in the same lineage

The pattern is still producing CVEs. CVE-2026-1839 (CVSS 7.8, published April 2026) is a code-execution vulnerability in the Transformers Trainer class: its _load_rng_state() method calls torch.load() without weights_only=True, so a malicious checkpoint file (an rng_state.pth dropped into a resumed training run) can execute code on load. It is resolved in v5.0.0rc3. This is the PyTorch torch.load problem wearing a Transformers hat, and it shows that the two ecosystems' deserialization risks compound rather than stay in their lanes.
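The mechanism behind this whole bucket can be shown with the standard library alone. The sketch below is an illustration, not Transformers or PyTorch code: Payload, AllowlistUnpickler, and restricted_loads are hypothetical names, and the payload only evaluates 1 + 1. It shows that plain pickle.loads runs a callable chosen by whoever wrote the file, while an unpickler limited to an allowlist refuses it. That is the general idea behind passing weights_only=True to torch.load().

```python
import io
import pickle


class Payload:
    """Stand-in for a malicious checkpoint object (harmless here)."""

    def __reduce__(self):
        # On load, pickle calls eval("1 + 1"). A real attacker would pick
        # a far less friendly callable and argument.
        return (eval, ("1 + 1",))


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that only resolves globals named in ALLOWED."""

    # Assumption for this sketch: the only non-primitive type we expect.
    ALLOWED = {("collections", "OrderedDict")}

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data: bytes):
    """Deserialize data, refusing any global outside the allowlist."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


# Bytes an attacker could ship as a "checkpoint".
blob = pickle.dumps(Payload())
```

Calling pickle.loads(blob) evaluates the expression, whereas restricted_loads(blob) raises pickle.UnpicklingError before anything runs. Plain containers such as dicts and lists still load through the restricted path because they never reference a global.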

Bucket two: ReDoS — denial of service, not code execution

A second, less alarming cluster is regular-expression denial of service (ReDoS) in tokenizers and test utilities. These hang a worker on a crafted input; they do not run code.

CVE-2024-12720 (CVSS 7.5): ReDoS in tokenization_nougat_fast.py, fixed in 4.48.0.
CVE-2025-1194 (CVSS 6.5): ReDoS in tokenization_gpt_neox_japanese.py, fixed in 4.50.0.
CVE-2025-2099 (CVSS 7.5): ReDoS in the preprocess_string() function of transformers.testing_utils, affecting 4.48.3 and earlier.

Why separate them out? Because triage differs sharply. A ReDoS in a tokenizer you do not load (most teams use a handful of model families, not the Nougat or GPT-NeoX-Japanese tokenizers) is not applicable to your deployment, even at CVSS 7.5. The deserialization CVEs above apply the moment you from_pretrained an untrusted repo; the ReDoS CVEs apply only if your service feeds attacker-controlled text through that specific tokenizer. The score does not capture that gap — your model inventory does.
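That triage rule can be written down as a small lookup. The following is a minimal sketch under stated assumptions, not an official tool: the Advisory record, the triage function, and the tokenizer labels ("nougat", "gpt_neox_japanese") are hypothetical names chosen for illustration, and the fix versions are the ones quoted in this post. CVE-2025-2099 and CVE-2026-1839 are left out because the post gives no plain release number to compare against for them.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Parse the leading 'X.Y[.Z]' of a version string.

    Simplification: suffixes such as 'rc1' or '.dev0' are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


@dataclass(frozen=True)
class Advisory:
    cve: str
    fixed_in: Version
    # None: deserialization bucket, relevant whenever untrusted repos are loaded.
    # str:  ReDoS bucket, relevant only if that tokenizer is in use.
    tokenizer: Optional[str] = None


ADVISORIES = [
    Advisory("CVE-2023-6730", (4, 36, 0)),
    Advisory("CVE-2024-11392", (4, 48, 0)),
    Advisory("CVE-2024-11393", (4, 48, 0)),
    Advisory("CVE-2024-11394", (4, 48, 0)),
    Advisory("CVE-2024-12720", (4, 48, 0), tokenizer="nougat"),
    Advisory("CVE-2025-1194", (4, 50, 0), tokenizer="gpt_neox_japanese"),
]


def triage(installed: str,
           tokenizers_in_use: Iterable[str],
           loads_untrusted_repos: bool) -> List[str]:
    """Return the CVE ids that are both unpatched and applicable."""
    version = parse_version(installed)
    in_use = set(tokenizers_in_use)
    hits = []
    for adv in ADVISORIES:
        if version >= adv.fixed_in:
            continue  # already patched
        if adv.tokenizer is None:
            applies = loads_untrusted_repos
        else:
            applies = adv.tokenizer in in_use
        if applies:
            hits.append(adv.cve)
    return hits
```

For example, triage("4.47.1", ["bert"], True) returns only the three 4.48.0 deserialization CVEs, while the same version with ["nougat"] and no untrusted repos returns only CVE-2024-12720. The CVSS score never enters the decision; the inventory does.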

The protection layer can fail too

The uncomfortable third category: the tools you deploy to detect malicious models have their own CVEs. CVE-2025-1944 (CVSS 6.5) describes how the model scanner picklescan before 0.0.23 is vulnerable to a ZIP archive manipulation attack. By altering the filename in the ZIP header while keeping the original in the directory listing, an attacker makes the scanner raise a BadZipFile error and crash, yet PyTorch's more forgiving ZIP implementation still loads the model, letting a malicious payload bypass detection entirely. The weakness is classified as CWE-345 (Insufficient Verification of Data Authenticity).
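One defensive habit follows directly from that failure mode: treat a scanner crash as a rejection, not as a pass. The sketch below assumes a hypothetical wrapper, fail_closed_scan, around any scanner callable that returns True for a clean file. The toy_scanner shown is only a stand-in built on the standard zipfile module and does not reproduce picklescan's behavior.

```python
import zipfile
from typing import Callable


def toy_scanner(path: str) -> bool:
    """Stand-in scanner: 'clean' means the archive opens and passes CRC checks."""
    with zipfile.ZipFile(path) as archive:
        return archive.testzip() is None


def fail_closed_scan(path: str, scanner: Callable[[str], bool]) -> bool:
    """Return True only if the scanner finished and reported the file clean.

    Any exception (BadZipFile, a missing file, a scanner bug) counts as
    'not verified', so a file the scanner cannot parse is never loaded.
    """
    try:
        return bool(scanner(path))
    except Exception:
        return False
```

The design choice is that "the scanner could not read it" and "the scanner found something" lead to the same outcome. A loader that parses more leniently than the scanner then has no gap to slip through at this layer.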

This is the most important entry in the post for a defender, because it punctures the comfortable assumption that “the Hub scans uploads, so I’m covered.” Scanners are heuristic and have bypasses. They are one layer, not the layer.

What this means for trusting the Hub

The Hugging Face ecosystem’s risk is not that it is uniquely insecure — it is that its core workflow is download and execute artifacts authored by strangers, which is precisely the action every deserialization CVE above exploits. Practical posture:

Upgrade Transformers past the fix lines — 4.48.0 clears the 11392/11393/11394/12720 cluster; 4.50.0+ clears CVE-2025-1194; the 4.36 line is far too old. The Trainer checkpoint fix (CVE-2026-1839) lands in the v5 line.
Prefer safetensors for weights so the model side of from_pretrained cannot carry a code-execution payload. This is the structural fix for the whole RCE bucket.
Pull models by digest, not floating tags, and re-scan on every pull. Record the hash you scanned and reject mismatches at load time.
Do not treat scanning as a guarantee. CVE-2025-1944 shows the scanner itself can be bypassed; keep scanners current and pair them with provenance pinning and sandboxed loading.
Inventory which tokenizers and model families you actually load. It turns the ReDoS CVEs from a recurring fire drill into a quick “not applicable” most of the time.
Map these to OWASP LLM Supply Chain. Tracking them under a single OWASP LLM supply-chain label keeps the recurring pattern visible instead of looking like unrelated one-offs.
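The "record the hash you scanned and reject mismatches" item in the list above can be sketched with the standard library. The names sha256_file, verify_artifact, and DigestMismatch are hypothetical, and where the expected digest is stored (a lockfile, a database row) is left open. On the download side, from_pretrained accepts a revision argument that can be a commit hash, which is the Hub's equivalent of pulling by digest and not by floating tag.

```python
import hashlib
import hmac


class DigestMismatch(Exception):
    """Raised when a file's SHA-256 differs from the recorded value."""


def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 so large weights need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: str, expected_sha256: str) -> None:
    """Raise DigestMismatch unless the file matches the hash recorded at scan time."""
    actual = sha256_file(path)
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise DigestMismatch(
            f"{path}: expected {expected_sha256}, got {actual}"
        )
```

Calling verify_artifact immediately before the load step means the bytes that were scanned are the bytes that get loaded. A file swapped or modified after scanning fails the check and never reaches the deserializer.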

The Hub made model sharing frictionless, and the friction it removed was exactly the friction that used to make running a stranger’s code hard. The CVEs are the bill for that convenience. They are payable — with current versions, safe formats, and a healthy distrust of artifacts you did not build.
