CLI
anneal audit <file> runs the whole pipeline from the terminal: static analysis, the multi-LLM cascade, the Arsia gas profile, the corpus match, and (optionally) an on-chain attestation.
Run an audit
Each run opens with an animated TRYANNEAL banner, then prints an is_this_safe() → SAFE / UNSAFE verdict line with the 0–100 score, the list of deduplicated findings (each tagged with the engines that flagged it), and the Arsia gas profile.
A single-contract audit runs the full critic cascade by default: a ChainGPT pre-screen, then two architecturally distinct Stage-2 critics that fan out in parallel and cross-validate each other, Groq Llama-3.3-70B and OpenAI GPT-OSS-120B (both served on Groq). Gemini 2.5 Pro is an optional third critic, off by default because its key is rate-limited. Pass --quick for a pre-screen-only pass. The cascade is resilient: a ChainGPT pre-screen failure is non-fatal and the critics still run, and if nothing could analyze the contract the verdict is flagged analysisIncomplete and is never reported as safe.
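The resilience rule above can be sketched as follows. This is a minimal illustration, not TryAnneal's implementation: `run_cascade`, `Verdict`, and the `Engine` callable type are hypothetical names, the critics run sequentially here rather than in parallel, and "safe means no findings" is a simplification of the real 0–100 score.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical engine signature: takes source code, returns finding names,
# and raises an exception when the engine is unavailable.
Engine = Callable[[str], List[str]]


@dataclass
class Verdict:
    findings: List[str] = field(default_factory=list)
    analysis_incomplete: bool = False
    safe: bool = False


def run_cascade(source: str, prescreen: Engine, critics: List[Engine]) -> Verdict:
    findings: List[str] = []
    engines_ok = 0

    # A pre-screen failure is non-fatal: skip it and still run the critics.
    for engine in [prescreen, *critics]:
        try:
            findings += engine(source)
            engines_ok += 1
        except Exception:
            continue

    # If nothing could analyze the contract, flag it and never report safe.
    if engines_ok == 0:
        return Verdict(analysis_incomplete=True, safe=False)

    unique = sorted(set(findings))
    # Simplification (assumption): safe only when no engine reported anything.
    return Verdict(findings=unique, safe=not unique)
```

The key design point is that an empty findings list is only trusted when at least one engine actually ran; a total outage yields `analysis_incomplete` rather than a clean result.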
Deterministic, reproducible audits. The same contract always returns the same verdict, which is TryAnneal's answer to “AI audits are non-deterministic.” Four mechanisms back this up. Every model decodes at temperature 0 (greedy, seeded). When the full panel runs, a corroboration rule requires every reported finding to have ≥2 independent sources (≥2 models, or a model plus Slither), so no single-model hunch drives the verdict. Scoring is confidence-weighted. And the Telegram bot and hosted MCP memoize by code hash (keccak/sha3 of the source), so identical source returns the identical audit.
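Two of these mechanisms, corroboration and memoization by code hash, can be sketched in a few lines. Everything named here (`corroborated`, `audit_memoized`, the engine labels) is hypothetical. The sketch counts distinct engine names without distinguishing models from static analyzers, and it uses `hashlib.sha3_256` as a stand-in for whichever keccak/sha3 variant the real services use.

```python
import hashlib
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical raw finding: (finding id, name of the engine that flagged it).
Finding = Tuple[str, str]


def corroborated(raw: List[Finding], min_sources: int = 2) -> Dict[str, Set[str]]:
    """Keep only findings flagged by at least `min_sources` distinct engines."""
    by_id: Dict[str, Set[str]] = {}
    for finding_id, engine in raw:
        by_id.setdefault(finding_id, set()).add(engine)
    return {fid: engines for fid, engines in by_id.items() if len(engines) >= min_sources}


# In-memory cache keyed by the hash of the contract source (assumption:
# the real bot and MCP server may persist this differently).
_cache: Dict[str, dict] = {}


def audit_memoized(source: str, run_audit: Callable[[str], dict]) -> dict:
    """Return the cached audit for identical source; run the audit only once."""
    key = hashlib.sha3_256(source.encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = run_audit(source)
    return _cache[key]
```

A finding reported twice by the same engine still counts as one source, which is what keeps a single model from corroborating itself.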
Set HUNYUAN_API_KEY to translate the finished verdict and findings into the reader's language (zh, es, ja, ko, fr, and more) — the audit runs in English, then Tencent Hunyuan renders the multilingual report.
Flags
| Flag | Effect |
|---|---|
| --threshold <score> | Fail (exit 1) if the verdict scores below N; 0 = severity-only, fails on any high/critical |
| --quick | ChainGPT pre-screen only; skip the critic cascade |
| --no-llm | Static only (Slither + Aderyn + corpus), no API calls |
| --no-aderyn | Skip the Aderyn (Rust) static-analysis layer |
| --gas-only | Skip the security audit; only profile gas |
| --attest | Post the verdict on-chain via AnnealValidation |
| --no-encrypt | Skip AES-GCM encryption / report storage |
| --detectors <mode> | all · builtin · tryanneal |
| -n, --network | mantle (mainnet) or mantle-sepolia |
Exit code is non-zero when a high/critical finding is present, or — with --threshold <score> — when the verdict scores below N (use --threshold 0 for severity-only gating). That single exit code is the whole CI story.
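The gating rule reads naturally as a small function. This is a sketch of the rule as stated, not the CLI's source: `gate_exit_code` and its parameters are hypothetical, and returning 1 for the high/critical case is an assumption (the text only says non-zero).

```python
from typing import Iterable, Optional

# Severities that fail the run regardless of score.
BLOCKING = {"high", "critical"}


def gate_exit_code(score: int, severities: Iterable[str], threshold: Optional[int] = None) -> int:
    """Return the process exit code for one audited contract."""
    # Any high/critical finding fails the run, with or without --threshold.
    if any(s.lower() in BLOCKING for s in severities):
        return 1
    # With --threshold, a score below the threshold also fails.
    # A threshold of 0 can never trip this check, which makes it severity-only.
    if threshold is not None and score < threshold:
        return 1
    return 0
```

For example, `gate_exit_code(70, ["low"], threshold=80)` fails on score alone, while the same contract with `threshold=0` passes because only severity is considered.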
Use it in CI — a GitHub Action PR gate
.github/workflows/anneal-audit.yml runs the deterministic audit (Slither + 16 TryAnneal detectors + the 98-pattern corpus — no LLM keys, no chain calls) on every PR that changes a .sol file. It does three things:
1. Audits each changed contract with --threshold $ANNEAL_THRESHOLD.
2. Posts a ✅ TryAnneal — PASSED / ❌ TryAnneal — BLOCKED comment on the PR with the full per-contract verdict.
3. Emits a red/green check-run that fails when a contract has a high/critical finding or scores below the threshold, so branch protection can block the merge.

The threshold defaults to 80 and is set per-repo via the ANNEAL_THRESHOLD repository variable.
Make anneal audit (Slither + corpus) a required status check and a PR can't merge while a changed contract is high/critical or below threshold: code review and CI gate in one.
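How the per-contract results roll up into one check-run can be sketched like this. The names `ContractResult`, `changed_contracts`, and `pr_gate` are hypothetical and the workflow's actual logic may differ; `"success"` and `"failure"` are standard GitHub check-run conclusion values.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ContractResult:
    path: str
    score: int
    has_high_or_critical: bool


def changed_contracts(changed_paths: List[str]) -> List[str]:
    """Select the Solidity files from the PR's changed paths."""
    return [p for p in changed_paths if p.endswith(".sol")]


def pr_gate(results: List[ContractResult], threshold: int = 80) -> Tuple[str, List[str]]:
    """Return the check-run conclusion and the contracts that block the merge."""
    blocked = [
        r.path
        for r in results
        if r.has_high_or_critical or r.score < threshold
    ]
    return ("failure" if blocked else "success", blocked)
```

One blocking contract is enough to fail the whole check, which is the behaviour branch protection relies on.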