Pro feature

You have the same llama3 in three places

aiclean dedupe finds duplicate ML weights across every major tool's cache and reclaims the wasted space with hardlinks. Nothing breaks. Your tools don't notice. A typical first run frees 15–50 GB.

Same weight, three caches

You pulled llama3 with Ollama, then the Transformers library pulled it again, then LM Studio did too. Three copies of 4.7 GB.

Content-hashed, not path-based

We compare model files over 100 MB by SHA-256 hash. If two files have identical hashes, they contain the same bytes and are safe to share via hardlink.

Hardlinks are invisible

After linking, every tool still sees its expected file at its expected path. They just point at the same inode on disk.
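
To see why linked files stay invisible to the tools that read them, here is a minimal Python sketch, not aiclean's actual code. It writes a file, hardlinks it at a second path, and checks that both paths resolve to the same inode. The file names and the helper `share_inode_demo` are hypothetical and chosen only for illustration.

```python
import os
import tempfile


def share_inode_demo():
    """Create one file, hardlink it at a second path, and report
    (link count, whether both paths are the same file on disk)."""
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "cache_a_model.bin")   # hypothetical path
        second = os.path.join(tmp, "cache_b_model.bin")  # hypothetical path
        with open(first, "wb") as f:
            f.write(b"weights" * 1024)

        # Add a second directory entry for the same data; no bytes are copied.
        os.link(first, second)

        link_count = os.stat(first).st_nlink
        same_file = os.path.samefile(first, second)  # compares device and inode
        return link_count, same_file


if __name__ == "__main__":
    print(share_inode_demo())
```

Both paths open and read like ordinary files. Note that hardlinks only work within a single filesystem, so this sketch assumes both caches live on the same volume.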

How it works

  1. aiclean dedupe

    Walks ~/.ollama, ~/.cache/huggingface, and the LM Studio, torch.hub, and Diffusers caches. Files over 100 MB are candidates.

  2. Two-stage hashing

    Stage 1: group by exact byte size (cheap). Stage 2: SHA-256 only the candidates with a size collision. Avoids hashing 50 GB of unique files.

  3. Report first, act later

    Without --hardlink, dedupe just shows you the duplicates and potential savings. You decide whether to reclaim.

  4. Atomic replacement

    We rename the original, create the hardlink, then delete the renamed copy, so a failure never leaves you with a missing file.
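
The steps above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not aiclean's implementation: the function names, the `.aiclean-bak` suffix, and reading the 100 MB threshold as 100 × 1024 × 1024 bytes are all hypothetical.

```python
import hashlib
import os
from collections import defaultdict

MIN_SIZE = 100 * 1024 * 1024  # assumed reading of the "100 MB" threshold


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large weights never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(roots, min_size=MIN_SIZE):
    """Return groups of paths whose contents are byte-identical."""
    # Stage 1: group candidates by exact size (one stat call per file).
    by_size = defaultdict(list)
    for root in roots:
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                size = os.path.getsize(path)
                if size > min_size:
                    by_size[size].append(path)

    # Stage 2: hash only files that share a size with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # unique size, so it cannot be a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[sha256_of(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups


def replace_with_hardlink(keep, duplicate):
    """Swap `duplicate` for a hardlink to `keep`, restoring it on failure."""
    if os.path.samefile(keep, duplicate):
        return  # already the same inode
    backup = duplicate + ".aiclean-bak"  # hypothetical temporary name
    os.rename(duplicate, backup)         # original stays on disk under a new name
    try:
        os.link(keep, duplicate)
    except OSError:
        os.rename(backup, duplicate)     # put the original back
        raise
    os.remove(backup)                    # only now is the extra copy released
```

Calling `find_duplicates` alone corresponds to the report-only mode; `replace_with_hardlink` would run only once you opt in to reclaiming space.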

Try aiclean Pro

$7/mo — cancel anytime