Benchmarks

Black-box claims score lower. TryAnneal ships a reproducible benchmark: anyone can run it and get the same precision and recall.

Results

| Contract | Exploit analog | Losses | Detected |
| --- | --- | --- | --- |
| MinterestVuln.sol | Minterest Jul 2024 (Mantle) | $1.4M | ✅ HIGH |
| EulerDonation.sol | Euler Mar 2023 | $197M | ✅ HIGH |
| NomadInit.sol | Nomad Aug 2022 | $190M | ✅ HIGH |
| LayerZeroDVN.sol | KelpDAO Apr 2026 | $292M | ✅ HIGH |
| Clean1.sol | | | ✅ CLEAN |
| Clean2.sol | | | ✅ CLEAN |

Precision 100% · Recall 100% · F1 1.00 · (TP=4, FN=0, FP=0, TN=2)
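
For readers who want to check the summary line by hand, here is a minimal sketch of how the confusion matrix and the three scores follow from the six rows above. The `FixtureResult` shape and the `score` helper are illustrative assumptions, not the engine's actual schema or API; the rows are transcribed from the table.

```typescript
// Hypothetical record shape: these field names are for illustration only.
interface FixtureResult {
  contract: string;
  vulnerable: boolean; // ground truth: the fixture models a real exploit
  flagged: boolean;    // the audit reported a HIGH finding
}

// The six rows of the results table, transcribed by hand.
const results: FixtureResult[] = [
  { contract: "MinterestVuln.sol", vulnerable: true, flagged: true },
  { contract: "EulerDonation.sol", vulnerable: true, flagged: true },
  { contract: "NomadInit.sol", vulnerable: true, flagged: true },
  { contract: "LayerZeroDVN.sol", vulnerable: true, flagged: true },
  { contract: "Clean1.sol", vulnerable: false, flagged: false },
  { contract: "Clean2.sol", vulnerable: false, flagged: false },
];

interface Scores {
  tp: number;
  fn: number;
  fp: number;
  tn: number;
  precision: number;
  recall: number;
  f1: number;
}

function score(rows: FixtureResult[]): Scores {
  let tp = 0;
  let fn = 0;
  let fp = 0;
  let tn = 0;
  for (const r of rows) {
    if (r.vulnerable && r.flagged) tp++;
    else if (r.vulnerable) fn++;
    else if (r.flagged) fp++;
    else tn++;
  }
  // Guard the denominators so an empty or all-clean corpus yields 0, not NaN.
  const precision = tp + fp === 0 ? 0 : tp / (tp + fp);
  const recall = tp + fn === 0 ? 0 : tp / (tp + fn);
  const f1 =
    precision + recall === 0
      ? 0
      : (2 * precision * recall) / (precision + recall);
  return { tp, fn, fp, tn, precision, recall, f1 };
}

const summary = score(results);
console.log(summary);
```

With only six fixtures, a single missed detection or false alarm would move these scores substantially, so the counts are worth reading alongside the percentages.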

Gas optimization — measured before/after

The gas profiler’s saving estimates are no longer hand-waved. Each Mantle technique it advertises has a real naive/optimized contract pair that we compile with solc 0.8.24 and run through the engine’s own computeFee Arsia model. The saving is measured on the L1-data fee — the FastLZ-driven, size-dependent component these techniques actually move on Mantle. L2 execution and operator fees are held fixed across both sides so the comparison isolates the size win.

| Technique | L1 before (MNT) | L1 after (MNT) | Measured saving |
| --- | --- | --- | --- |
| calldata_packing | 0.000000000540 | 0.000000000506 | 6.3% |
| batch_operations | 0.000000002052 | 0.000000000205 | 90% |
| storage_layout | 0.000000000971 | 0.000000000901 | 7.2% |

batch_operations collapses ten separately floored L1 minimums into one, which makes it the largest and most honest win. calldata_packing strips the ABI's 32-byte word alignment (2,372 → 580 bytes); storage_layout uses compile-time constants to drop the constructor SSTOREs from the deploy init code (608 → 568 bytes). Numbers are the verbatim output of pnpm --filter @tryanneal/engine benchmark:gas.
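
The "Measured saving" column can be recomputed from the before/after fees as (before − after) / before. The sketch below does that arithmetic for the three rows and, for comparison, for the two byte counts quoted above. The `GasRow` type and helper names are assumptions made for this example; none of this is the engine's computeFee code.

```typescript
// Hypothetical row shape for the gas table; values are copied from the table.
interface GasRow {
  technique: string;
  beforeMnt: number;
  afterMnt: number;
}

const rows: GasRow[] = [
  { technique: "calldata_packing", beforeMnt: 0.00000000054, afterMnt: 0.000000000506 },
  { technique: "batch_operations", beforeMnt: 0.000000002052, afterMnt: 0.000000000205 },
  { technique: "storage_layout", beforeMnt: 0.000000000971, afterMnt: 0.000000000901 },
];

// Relative reduction, expressed as a percentage of the "before" value.
function savingPercent(before: number, after: number): number {
  if (before <= 0) throw new RangeError("before must be positive");
  return ((before - after) / before) * 100;
}

// Round to a fixed number of decimal places for display.
function roundTo(value: number, digits: number): number {
  const scale = Math.pow(10, digits);
  return Math.round(value * scale) / scale;
}

const savings = rows.map((r) => ({
  technique: r.technique,
  percent: roundTo(savingPercent(r.beforeMnt, r.afterMnt), 1),
}));

// The same formula applied to the byte counts quoted in the text.
const calldataByteReduction = roundTo(savingPercent(2372, 580), 1);
const initCodeByteReduction = roundTo(savingPercent(608, 568), 1);

console.log(savings, calldataByteReduction, initCodeByteReduction);
```

Note that the byte reduction and the fee saving for a technique are different quantities: calldata_packing removes roughly three quarters of the raw bytes but saves 6.3% of the L1-data fee, so the raw byte counts alone should not be read as the fee impact.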

Reproduce it

Every fixture runs runAudit({ noLlm: true }) — Slither + Aderyn + corpus only, no API keys, deterministic across runs. That’s the point: the verdict isn’t a black box.

```bash
pnpm --filter @tryanneal/engine benchmark
# writes packages/engine/benchmarks/results/latest.json

pnpm --filter @tryanneal/engine benchmark:gas
# writes packages/engine/benchmarks/results/gas-latest.json
```

Methodology + the committed results live in packages/engine/benchmarks.
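
One way to confirm the "deterministic across runs" claim locally is to compare a fresh results file against the committed one. The sketch below is a minimal, assumption-laden version of that check: it compares two parsed JSON values after sorting object keys, so key order cannot cause a false mismatch. The payloads shown are made up for illustration; the real layout of latest.json is not documented on this page, and any run-specific fields such as timestamps (if present) would need to be removed before comparing.

```typescript
type Json =
  | null
  | boolean
  | number
  | string
  | Json[]
  | { [key: string]: Json };

// Serialize with object keys sorted, so two equal results always
// produce the same string regardless of key insertion order.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    return "[" + value.map((item) => canonicalize(item)).join(",") + "]";
  }
  const keys = Object.keys(value).sort();
  const fields = keys.map(
    (key) => JSON.stringify(key) + ":" + canonicalize(value[key]),
  );
  return "{" + fields.join(",") + "}";
}

function sameResults(a: Json, b: Json): boolean {
  return canonicalize(a) === canonicalize(b);
}

// Hypothetical payloads: field names are assumptions, not the real schema.
const runA: Json = {
  precision: 1,
  recall: 1,
  fixtures: [{ contract: "Clean1.sol", verdict: "CLEAN" }],
};
const runB: Json = {
  fixtures: [{ verdict: "CLEAN", contract: "Clean1.sol" }],
  recall: 1,
  precision: 1,
};
const runC: Json = { precision: 1, recall: 0.75, fixtures: [] };

const identicalRuns = sameResults(runA, runB);
const differingRuns = sameResults(runA, runC);
console.log(identicalRuns, differingRuns);
```

In practice the two inputs would come from `JSON.parse` on the freshly written file and on the committed copy; array order is deliberately treated as significant here, since a reordered fixture list would itself be a change worth noticing.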