How it works

aur-scan does one thing and refuses to do another: it reads a package and everything it pulls in, and it never runs any of it. Every stage below holds to that line — the scan cannot become the thing that executes the payload.

The pipeline

Fetch → Parse → Analyze → Resolve the tree → Report (and optional SBOM).

1. Fetch — hardened, read-only

Package metadata comes from the AUR RPC over a locked-down HTTP client: a 30s timeout, redirects refused, HTTPS-only, and the JSON body size-capped (16 MiB) so a hostile or MITM’d response can’t stream you out of memory. RPC URLs are built from validated, percent-encoded path segments — a package name can’t inject into the request.
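As a rough illustration of the URL-building step, the sketch below validates a name and then percent-encodes it as a single path segment. The base URL, the name pattern, and the function name build_rpc_url are assumptions made for this example, not aur-scan's actual code.

```python
import re
from urllib.parse import quote

# Assumed endpoint for illustration; the real client's base URL is not shown here.
RPC_BASE = "https://aur.archlinux.org/rpc/v5/info"

# Assumed allow-list, loosely modelled on Arch package naming rules:
# no leading "." or "-", and no "/", "?", "#", whitespace or control characters.
_NAME_RE = re.compile(r"[a-z0-9@_+][a-z0-9@._+-]*")


def build_rpc_url(pkgname: str) -> str:
    """Validate the name, then percent-encode it as one path segment."""
    if len(pkgname) > 255 or not _NAME_RE.fullmatch(pkgname):
        raise ValueError(f"invalid package name: {pkgname!r}")
    # safe="" makes quote() encode every reserved character, including "/".
    return f"{RPC_BASE}/{quote(pkgname, safe='')}"
```

Validation and encoding overlap on purpose: even if the allow-list were loosened later, the encoded segment still could not introduce a new path component or a query string.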

The PKGBUILD itself is retrieved with a deliberately defanged git clone:

git -c core.hooksPath=/dev/null \
    -c protocol.file.allow=never \
    -c protocol.ext.allow=never \
    -c core.symlinks=false \
    clone --depth=1 --no-tags --no-recurse-submodules -- <url> .

No hooks can fire, the file:// and ext:: protocols are blocked, symlinks are written as plain files (no directory escape), submodules and tags are never fetched, and -- means the URL can’t be read as a flag. The package name is validated before it’s ever used as a path. Nothing is built, sourced, or evaluated.
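One way to drive that command from code is to build it as an argument list, so no shell ever parses the package name. This is a minimal sketch; the helper name clone_argv, the name pattern, and the base URL are assumptions for illustration.

```python
import re

# Assumed name rule and base URL, for illustration only.
_NAME_RE = re.compile(r"[a-z0-9@_+][a-z0-9@._+-]*")
AUR_GIT_BASE = "https://aur.archlinux.org"


def clone_argv(pkgname: str) -> list[str]:
    """Return the hardened clone command as an argv list (no shell involved)."""
    if not _NAME_RE.fullmatch(pkgname):
        raise ValueError(f"invalid package name: {pkgname!r}")
    url = f"{AUR_GIT_BASE}/{pkgname}.git"
    return [
        "git",
        "-c", "core.hooksPath=/dev/null",   # no hook scripts can run
        "-c", "protocol.file.allow=never",  # block file:// transports
        "-c", "protocol.ext.allow=never",   # block ext:: transports
        "-c", "core.symlinks=false",        # check symlinks out as plain files
        "clone", "--depth=1", "--no-tags", "--no-recurse-submodules",
        "--", url, ".",                     # "--" ends option parsing
    ]
```

The list would typically be handed to subprocess.run(argv, cwd=empty_dir, check=True) with a timeout and without shell=True, so the name is only ever one argv element.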

2. Parse — text, not execution

A static parser reads the PKGBUILD and any .install script as text — pure pattern/AST analysis, no bash evaluation. It extracts the fields, arrays (depends, source, the checksum arrays), the source+=() appends, and the function bodies. The brace scanner is quote- and comment-aware, so echo "}" or # } can’t truncate a function early, and backslash-newline continuations are spliced back together so curl evil \⏎| sh can’t slip past a single-line rule.
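The two tricks mentioned above can be sketched in a few lines. This is a simplified illustration, not the project's parser: the function names are hypothetical, and heredocs and $'...' quoting are deliberately left out.

```python
import re
from typing import Optional


def splice_continuations(text: str) -> str:
    """Join backslash-newline continuations so rules see one logical line."""
    return text.replace("\\\n", "")


def function_body(text: str, name: str) -> Optional[str]:
    """Return the text between the braces of `name() { ... }`, or None."""
    m = re.search(rf"^[ \t]*{re.escape(name)}[ \t]*\(\)", text, re.M)
    open_idx = text.find("{", m.end()) if m else -1
    if open_idx < 0:
        return None
    depth, quote, i = 0, None, open_idx
    while i < len(text):
        ch = text[i]
        if quote == "'":                 # single quotes: everything is literal
            if ch == "'":
                quote = None
        elif quote == '"':               # double quotes: backslash escapes
            if ch == "\\":
                i += 1
            elif ch == '"':
                quote = None
        elif ch == "\\":                 # escaped character outside quotes
            i += 1
        elif ch in "'\"":
            quote = ch
        elif ch == "#" and text[i - 1].isspace():
            nl = text.find("\n", i)      # comment: skip to end of line
            i = len(text) if nl < 0 else nl
            continue
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return text[open_idx + 1:i]
        i += 1
    return None                          # unbalanced braces
```

Braces inside quotes or comments never touch the depth counter, which is why echo "}" and # } leave the body intact.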

3. Analyze — the catalog

The parsed package is run against the authoritative detection catalog — 118 codes across 13 categories (see Detection Codes). That’s pattern rules plus structural analyzers: privilege escalation, source and transport integrity, checksum laundering, a remote-exec analyzer, a multi-line decode-and-execute pass, and an IOC match against known campaigns. You can add your own in Custom Rules.

One analyzer is opt-in and networked: with threat intelligence enabled and your own key, declared sha256sums are checked against VirusTotal and source= URLs against abuse.ch/URLhaus. It is off by default — a default scan stays fully offline and static — and even when on, every lookup fails open and only data already public in the PKGBUILD ever leaves the machine.
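To make "fails open" concrete, here is a small sketch under one reading of the term: a lookup that errors or times out yields no verdict and never aborts the scan. The function names are hypothetical, and query stands in for whatever client actually talks to the remote service.

```python
import re
from typing import Callable, Iterable, Optional

_SHA256_RE = re.compile(r"[0-9a-f]{64}")


def lookup_candidates(sha256sums: Iterable[str]) -> list[str]:
    """Keep only well-formed digests; 'SKIP' and malformed entries are never sent."""
    return [s.lower() for s in sha256sums if _SHA256_RE.fullmatch(s.lower())]


def check_fail_open(query: Callable[[str], bool], indicator: str) -> Optional[bool]:
    """Return the service's verdict, or None if the lookup could not complete."""
    try:
        return query(indicator)
    except Exception:
        # Fail open: a network error, timeout, or bad response produces
        # "no verdict" instead of a failed scan.
        return None
```

Only values that already appear in the PKGBUILD (digests here, source URLs in the same way) are passed to query, which matches the constraint that nothing private leaves the machine.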

4. Resolve the dependency tree

The package you name is rarely the tampered one — it’s usually something a few levels down. aur-scan resolves the full transitive AUR dependency tree by breadth-first walk, and a critical rule governs it:

Resolution follows only static, declared metadata (depends/makedepends/…). It never fetches a source= artifact, follows a URL found in a PKGBUILD, or executes anything.

Every AUR package in the closure is scanned; official-repo and virtual packages are trusted leaves. Depth and node caps stop runaway expansion. When a package fetches and runs external code, that’s an opaque boundary — it’s flagged (EXEC-REMOTE, with the URL) and the tree stops there. The scanner won’t chase the link, because chasing it is exactly how it would run the payload.
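The walk itself can be pictured as an ordinary breadth-first search over declared names. The sketch below is illustrative only: declared_deps stands in for the parsed depends/makedepends arrays, is_aur for the official-repo check, and the cap values are made-up defaults.

```python
from collections import deque
from typing import Callable, Mapping, Sequence


def resolve_closure(
    root: str,
    declared_deps: Mapping[str, Sequence[str]],
    is_aur: Callable[[str], bool],
    max_depth: int = 8,      # hypothetical default
    max_nodes: int = 500,    # hypothetical default
) -> list[str]:
    """Breadth-first walk over declared dependency names only."""
    seen = {root}
    order = []
    queue = deque([(root, 0)])
    while queue:
        name, depth = queue.popleft()
        order.append(name)               # every AUR node reached gets scanned
        if depth >= max_depth:
            continue                     # depth cap: do not expand further
        for dep in declared_deps.get(name, ()):
            if dep in seen or not is_aur(dep):
                continue                 # already queued, or a trusted leaf
            if len(seen) >= max_nodes:
                continue                 # node cap: stop growing the tree
            seen.add(dep)
            queue.append((dep, depth + 1))
    return order
```

Nothing in the loop touches a source= entry or a URL; the only inputs are names taken from static metadata, so a cycle or an oversized tree costs bounded work at most.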

The static-only invariant

The whole design rests on one provable property: the scan can’t compromise the machine doing the scanning. Across the entire core, the only subprocesses ever spawned are:

the hardened git clone above, to fetch a PKGBUILD;
pacman -Si, to check whether a name is in the official repos;
pacman -Qm, to list the foreign (AUR) packages that are installed.

It never invokes makepkg, never bash -c, never eval, never sources a PKGBUILD. The PKGBUILD is read with a plain capped file read and treated as data. That’s the line that makes pointing it at a malicious package safe — see Security for the full threat model.
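A "plain capped file read" can be as small as the sketch below. The cap value and the function name are assumptions for illustration; the point is that the file is opened as bytes, bounded, and returned as text for the parser, with no interpreter involved.

```python
from pathlib import Path

# Hypothetical cap; the real limit is not stated in this document.
MAX_PKGBUILD_BYTES = 1024 * 1024


def read_pkgbuild(path: Path, limit: int = MAX_PKGBUILD_BYTES) -> str:
    """Read a PKGBUILD as inert text, refusing files larger than `limit` bytes."""
    with open(path, "rb") as fh:
        data = fh.read(limit + 1)        # one extra byte reveals an oversize file
    if len(data) > limit:
        raise ValueError(f"{path} exceeds {limit} bytes")
    # Undecodable bytes are replaced, not raised on: the content is data to
    # inspect and is never handed to a shell.
    return data.decode("utf-8", errors="replace")
```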