Benchmarks

Unverifiable, black-box claims carry less weight. TryAnneal ships a reproducible benchmark: anyone can run it and get the same precision and recall.

Results

Contract	Exploit analog	Losses	Detected
MinterestVuln.sol	Minterest Jul 2024 (Mantle)	$1.4M	✅ HIGH
EulerDonation.sol	Euler Mar 2023	$197M	✅ HIGH
NomadInit.sol	Nomad Aug 2022	$190M	✅ HIGH
LayerZeroDVN.sol	KelpDAO Apr 2026	$292M	✅ HIGH
Clean1.sol	—	—	✅ CLEAN
Clean2.sol	—	—	✅ CLEAN

Precision 100% · Recall 100% · F1 1.00 · (TP=4, FN=0, FP=0, TN=2)
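The summary line follows from the confusion counts by the standard definitions. As a minimal sketch (the `Confusion` and `Scores` types and the `score` helper are illustrative names, not part of the TryAnneal engine), the arithmetic can be checked like this:

```typescript
// Confusion-matrix counts, as reported in the results summary line.
interface Confusion {
  tp: number; // vulnerable fixtures flagged
  fn: number; // vulnerable fixtures missed
  fp: number; // clean fixtures flagged
  tn: number; // clean fixtures passed
}

interface Scores {
  precision: number;
  recall: number;
  f1: number;
}

// Standard definitions; a metric with a zero denominator is reported as 0.
function score({ tp, fn, fp }: Confusion): Scores {
  const precision = tp + fp === 0 ? 0 : tp / (tp + fp);
  const recall = tp + fn === 0 ? 0 : tp / (tp + fn);
  const f1 =
    precision + recall === 0
      ? 0
      : (2 * precision * recall) / (precision + recall);
  return { precision, recall, f1 };
}

const reported: Confusion = { tp: 4, fn: 0, fp: 0, tn: 2 };
const scores = score(reported);
console.log(scores); // { precision: 1, recall: 1, f1: 1 }
```

With only six fixtures, one missed exploit or one false alarm would move these figures substantially, so the counts are worth reading alongside the percentages.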

Gas optimization — measured before/after

The gas profiler’s saving estimates are now measured rather than hand-waved. Each Mantle technique it advertises has a real naive/optimized contract pair, which we compile with solc 0.8.24 and run through the engine’s own computeFee Arsia model. The saving is measured on the L1-data fee: the FastLZ-driven, size-dependent component that these techniques actually move on Mantle. L2 execution and operator fees are held fixed across both sides, so the comparison isolates the size win.

Technique	L1 before (MNT)	L1 after (MNT)	Measured saving
calldata_packing	0.000000000540	0.000000000506	6.3%
batch_operations	0.000000002052	0.000000000205	90%
storage_layout	0.000000000971	0.000000000901	7.2%
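The "Measured saving" column is the relative drop in the L1-data fee. A minimal sketch of that calculation, using the fees from the table rescaled to integer units of 1e-12 MNT (the `rows` array and `savingPercent` helper are illustrative and not taken from the engine):

```typescript
// L1-data fees from the table, in units of 1e-12 MNT
// (0.000000000540 MNT becomes 540) to avoid tiny float literals.
const rows = [
  { technique: "calldata_packing", before: 540, after: 506 },
  { technique: "batch_operations", before: 2052, after: 205 },
  { technique: "storage_layout", before: 971, after: 901 },
];

// Relative saving, as a percentage of the "before" fee.
function savingPercent(before: number, after: number): number {
  if (before <= 0) throw new RangeError("before must be positive");
  return ((before - after) / before) * 100;
}

const savings = rows.map((r) => ({
  technique: r.technique,
  percent: savingPercent(r.before, r.after).toFixed(1),
}));

for (const s of savings) {
  console.log(`${s.technique}: ${s.percent}%`);
}
// calldata_packing: 6.3%
// batch_operations: 90.0%
// storage_layout: 7.2%
```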

batch_operations collapses ten separately floored L1 minimums into one, which makes it the largest and most honest win. calldata_packing strips the ABI’s 32-byte word alignment (2,372→580 bytes); storage_layout uses compile-time constants to drop the constructor SSTOREs from the deploy init code (608→568 bytes). The numbers are the verbatim output of pnpm --filter @tryanneal/engine benchmark:gas.
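To see why batching can dominate when each transaction pays a minimum L1-data fee, here is a toy model. It is not the engine's computeFee Arsia model: `FLOOR_FEE`, `FEE_PER_BYTE`, the payload sizes, and `l1DataFee` are hypothetical values and names chosen only to make the arithmetic easy to follow.

```typescript
// Hypothetical constants in arbitrary toy units (not Mantle parameters).
const FLOOR_FEE = 100; // assumed minimum L1-data fee per transaction
const FEE_PER_BYTE = 1; // assumed size-dependent fee per byte

// Toy fee model: every transaction pays at least the floor.
function l1DataFee(sizeBytes: number): number {
  return Math.max(FLOOR_FEE, FEE_PER_BYTE * sizeBytes);
}

const opBytes = 8; // assumed payload of one small operation
const opCount = 10;

// Ten separate transactions: each one is rounded up to the floor.
const separateFee = opCount * l1DataFee(opBytes);

// One batched transaction: the combined payload pays the floor once.
const batchedFee = l1DataFee(opCount * opBytes);

const savingPct = ((separateFee - batchedFee) / separateFee) * 100;
console.log({ separateFee, batchedFee, savingPct });
// { separateFee: 1000, batchedFee: 100, savingPct: 90 }
```

In this sketch the saving shrinks once the batched payload grows past the floor, because the size-dependent term then takes over.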

Reproduce it

Every fixture runs runAudit({ noLlm: true }), which uses Slither, Aderyn, and the corpus only: no API keys, and deterministic across runs. That’s the point: the verdict isn’t a black box.

```bash
pnpm --filter @tryanneal/engine benchmark
# writes packages/engine/benchmarks/results/latest.json

pnpm --filter @tryanneal/engine benchmark:gas
# writes packages/engine/benchmarks/results/gas-latest.json
```

The methodology and the committed results live in packages/engine/benchmarks.
