Threat surface
The attack surface on ingested ML data is wide and growing. Below are the families Agentiks blocks at the gate — not a closed list, but an honest taxonomy. After the families come real incidents where the data reached production because no one was blocking in real time.
Attack families
The full space is larger and evolving. These are the recurring shapes. We add new families to the stack as they're observed in the wild.
Wrong labels on clean samples
Flipped, mis-mapped, or backdoor-triggered labels shift the decision boundary. The sample looks valid; the class assignment is poisoned.
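As a sketch of the kind of audit that catches this family: flag samples whose out-of-fold prediction confidently disagrees with the assigned label, so no sample ever votes on itself. The model, threshold, and helper name below are illustrative, not Agentiks internals.

```python
# Minimal sketch of an out-of-fold label audit: flag samples where the
# model is confident in a class other than the one assigned.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

def suspect_label_flips(X, y, threshold=0.9):
    """Indices where the out-of-fold prediction confidently (>= threshold)
    disagrees with the assigned label."""
    proba = cross_val_predict(
        LogisticRegression(max_iter=1000), X, y, cv=5, method="predict_proba"
    )
    predicted = proba.argmax(axis=1)
    confidence = proba.max(axis=1)
    return np.where((predicted != y) & (confidence >= threshold))[0]

# Toy data: two Gaussian blobs with 5% of labels flipped.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (500, 2)), rng.normal(2, 1, (500, 2))])
y = np.array([0] * 500 + [1] * 500)
flipped = rng.choice(1000, size=50, replace=False)
y[flipped] ^= 1
print(len(set(suspect_label_flips(X, y)) & set(flipped)), "of 50 flips flagged")
```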
Clean-label attacks
Adversarially crafted inputs with correct-looking labels that still steer the model. The most insidious family — standard label audits miss them entirely.
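Because the labels genuinely agree with the content, screening has to happen in feature space instead. A minimal sketch, assuming you already have an embedding per sample (for example a penultimate-layer activation); the helper and cutoff are hypothetical.

```python
# Sketch: clean-label poisons pass label audits, so screen in feature
# space instead -- flag samples unusually far from their own class
# centroid. `embeddings` is assumed to come from some feature extractor.
import numpy as np

def feature_space_outliers(embeddings, labels, z_cut=3.0):
    """Flag samples whose distance to their class centroid exceeds
    z_cut standard deviations of the within-class distances."""
    flagged = []
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        centroid = embeddings[idx].mean(axis=0)
        d = np.linalg.norm(embeddings[idx] - centroid, axis=1)
        flagged.extend(idx[d > d.mean() + z_cut * d.std()])
    return np.array(sorted(flagged))
```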
Backdoor triggers
A hidden pattern that activates only in production, causing controlled misclassification. Models trained on triggered data test fine in dev.
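A toy reproduction of the failure mode, using a rare trigger feature in synthetic data: held-out accuracy looks normal, yet any input carrying the trigger lands in the attacker's class. The setup is illustrative, not a real attack payload.

```python
# Sketch of why triggered data "tests green": a model trained on a
# corpus with a rare trigger feature scores normally on held-out data,
# yet inputs stamped with the trigger flip to the attacker's class.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(0, 1, (2000, 20))
y = (X[:, 0] > 0).astype(int)           # true signal lives in feature 0
poison = rng.choice(2000, 60, replace=False)
X[poison, 19] = 8.0                     # feature 19 is the hidden trigger
y[poison] = 1                           # trigger always maps to class 1

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("held-out accuracy:", model.score(X_te, y_te))

stamped = rng.normal(0, 1, (200, 20))
stamped[:, 19] = 8.0                    # attacker stamps the trigger
print("fraction of triggered inputs sent to class 1:",
      model.predict(stamped).mean())
```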
Sybil attacks
One actor, many sources. Per-source rate limits do nothing when every source is the same attacker wearing different hats.
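The countermeasure is to key checks on content rather than identity. A minimal sketch using an exact content hash as the fingerprint; a production gate would use near-duplicate hashing, but the grouping logic is the point.

```python
# Sketch: per-source rate limits miss Sybils, so cluster submissions by
# content fingerprint -- the same payload arriving from many
# "independent" sources is one actor.
import hashlib
from collections import defaultdict

def sybil_clusters(batches, min_sources=3):
    """batches: iterable of (source_id, payload_bytes). Returns
    fingerprints submitted by at least min_sources distinct sources."""
    sources_by_fp = defaultdict(set)
    for source_id, payload in batches:
        fp = hashlib.sha256(payload).hexdigest()
        sources_by_fp[fp].add(source_id)
    return {fp: srcs for fp, srcs in sources_by_fp.items()
            if len(srcs) >= min_sources}
```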
Trust-building attacks
A source submits clean data for weeks or months to build trust, then ships poisoned batches that skate past trust-gated thresholds.
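One mitigation is an audit-rate floor: reputation can lower scrutiny, but never to zero, so a freshly poisoned batch from a "trusted" source still gets sampled. The decay curve and floor below are illustrative parameters, not Agentiks defaults.

```python
# Sketch: trust-building attacks exploit gates where a long clean
# history buys ever-lighter inspection. A sampling floor caps how much
# inspection reputation can buy off.
import random

def audit_rate(clean_batches, base=1.0, decay=0.97, floor=0.15):
    """Fraction of a source's batch to deep-inspect: decays with the
    count of past clean batches, but never below the floor."""
    return max(base * (decay ** clean_batches), floor)

def sample_for_audit(batch, clean_batches, rng=random.Random(0)):
    rate = audit_rate(clean_batches)
    return [item for item in batch if rng.random() < rate]

# A brand-new source is fully inspected; a veteran still hits the floor.
print(audit_rate(0), audit_rate(120))   # 1.0  0.15
```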
Split-view poisoning
Clean samples served to validation and staging, poisoned samples served to training. Everything tests green. Production ships the poison.
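The defense is to record a digest of what each stage actually consumed and fail closed on divergence. A minimal sketch; the ledger shape and stage names are assumptions.

```python
# Sketch: split-view poisoning depends on training and validation
# reading different bytes for the "same" item. Digest what each stage
# actually read, keyed by item id, and flag any mismatch.
import hashlib

def record(ledger, stage, item_id, payload):
    ledger.setdefault(item_id, {})[stage] = hashlib.sha256(payload).hexdigest()

def split_view_violations(ledger):
    """Item ids whose bytes differed between any two stages."""
    return [item_id for item_id, views in ledger.items()
            if len(set(views.values())) > 1]

ledger = {}
record(ledger, "validation", "sample-001", b"clean bytes")
record(ledger, "training", "sample-001", b"poisoned bytes")
print(split_view_violations(ledger))   # ['sample-001']
```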
Gradient-informed poisoning
Attackers who see gradients (or approximate them) craft inputs designed to move specific parameters. Small volume, high impact.
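Near convergence, inputs built to move specific parameters tend to carry outsized per-sample gradients. A sketch for a logistic model, where the per-sample gradient is (sigmoid(w.x) - y) * x; the cutoff is illustrative.

```python
# Sketch: score each sample by the norm of its loss gradient under the
# current weights and flag the heavy tail. For logistic loss the
# per-sample gradient is (sigmoid(w @ x) - y) * x.
import numpy as np

def gradient_norm_outliers(X, y, w, z_cut=4.0):
    """Indices of samples whose gradient norm is z_cut standard
    deviations above the mean under current weights w."""
    residual = 1.0 / (1.0 + np.exp(-X @ w)) - y
    norms = np.abs(residual) * np.linalg.norm(X, axis=1)
    return np.where(norms > norms.mean() + z_cut * norms.std())[0]
```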
Provenance tampering
Reused credentials, signed manifests that don't match content, hash mismatches between pipeline stages. Attacks the trail itself.
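Countering this family means verifying the trail rather than trusting it: check the manifest's own signature, then recompute every content hash before any stage consumes the batch. In the sketch below, an HMAC over a shared secret stands in for real manifest signing (asymmetric signatures in practice), and the manifest shape is an assumption.

```python
# Sketch: verify, don't trust -- check the manifest signature, then
# recompute content hashes against the manifest before ingestion.
import hashlib, hmac, json

def verify_batch(manifest: dict, payloads: dict, key: bytes) -> list:
    """Return human-readable integrity failures (empty list = pass).
    manifest = {"files": {name: sha256_hex}, "sig": hex};
    payloads maps name -> bytes actually received."""
    failures = []
    body = json.dumps(manifest["files"], sort_keys=True).encode()
    expected = hmac.new(key, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, manifest["sig"]):
        failures.append("manifest signature mismatch")
    for name, claimed in manifest["files"].items():
        got = hashlib.sha256(payloads.get(name, b"")).hexdigest()
        if got != claimed:
            failures.append(f"hash mismatch: {name}")
    return failures
```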
Real-world incidents
A selection of public incidents whose outcomes a pipeline with verifiable provenance, orchestrated defense, and continuous verdict learning would have materially changed.
Malicious models on Hugging Face (2024)
Security researchers found over 100 malicious models uploaded to the public hub. Model weights carried executable payloads that ran on anyone who loaded them.
CSAM in LAION-5B (2023)
Stanford researchers found CSAM and unlicensed content inside the dataset that trained Stable Diffusion and Imagen. The dataset was pulled; downstream models were already shipped.
Sleeper agents (2024)
Backdoor triggers embedded in training data survived safety fine-tuning and RLHF. The model behaved correctly in evaluation and maliciously in production.
PoisonGPT (2023)
Mithril Security uploaded a targeted-falsehood model to Hugging Face under a look-alike org. Tens of thousands of downloads before takedown.
ChatGPT training-data extraction (2023)
The divergence attack extracted gigabytes of raw training data from a production model via crafted prompts. The training corpus was unknowable by audit.
Nightshade (2023)
Coordinated adversarial pixel-level poisoning released as a public tool. Designed to look clean to humans and corrupt generative image models at scale.
Google AI Overviews (2024)
Unvetted Reddit content and other low-trust sources surfaced as authoritative answers. Glue-on-pizza went global; trust in ML-surfaced content cratered overnight.
Not every incident is a poisoning attack in the strict sense — some are provenance failures, supply-chain lapses, or coordinated manipulation. All are cases where Agentiks-style controls at the ingestion boundary would have narrowed or closed the exposure.