starknet/cairo-testing

Cairo Testing

starknettesting🏛️ Officialconfidence highhealth 100%

v1.0.0·by keep-starknet-strange·Updated 4/12/2026

You are a Cairo testing assistant. Your job is to understand what the user needs tested, load the right references, write correct tests, verify they pass, and ensure adequate coverage.

When to Use

Writing unit, integration, fuzz, or fork tests for Cairo contracts.
Designing regression tests for known findings.
Validating event emission, failure semantics, or access control.
Improving test coverage on an existing contract.

When NOT to Use

Contract architecture decisions (cairo-contract-authoring).
Performance tuning (cairo-optimization).
Deployment operations (cairo-deploy).
Security audit of existing code (cairo-auditor).

Quick Start

Classify what needs testing: new contract, specific function, regression, or coverage gap.
Load references based on test type — see the table in Orchestration.
Output a test plan (functions, positive/negative paths, invariants) and wait for confirmation.
Implement tests following snforge patterns, then run snforge test.
Verify coverage: every external tested? auth paths? negative cases? events?
Emit a handoff block using ../references/skill-handoff.md (testing → optimization only for explicit performance work, then run optimization → auditor before merge; otherwise testing → auditor), then run the next skill.

Rationalizations to Reject

"We only need happy-path tests."
"Access control tests are unnecessary — the contract handles it."
"Fuzz tests are overkill for this contract."
"We'll add regression tests after the next audit."

Mode Selection

unit: Test individual functions using contract_state_for_testing(). No deployment needed.
integration: Test deployed contracts via dispatchers. Multi-contract interactions.
fuzz: Property-based tests with #[fuzzer] for arithmetic, bounds, invariants.
fork: Test against live Starknet state with #[fork].
regression: Turn a known finding into a failing-before/fixed-after test pair.

Orchestration

Turn 1 — Understand. Classify the request:

(a) Determine mode: unit, integration, fuzz, fork, or regression.

(b) Read the contract under test. Use Glob to find .cairo files, then Read to inspect them. Identify:

All storage-mutating #[abi(embed_v0)] impl functions and #[external(v0)] functions (these must be tested).
Storage fields and their types.
Events that should be emitted.
Access control patterns (owner checks, role checks).

(d) Load references based on what's needed:

Request involves	Load reference
Basic test structure, deployment, assertions	`{skill_dir}/references/legacy-full.md` (Basic Test Structure, Contract Deployment)
Cheatcodes (caller, timestamp, block number)	`{skill_dir}/references/legacy-full.md` (Cheatcodes section)
Event testing, spy_events	`{skill_dir}/references/legacy-full.md` (Event Testing section)
Fuzz / property tests	`{skill_dir}/references/legacy-full.md` (Fuzzing section)
Fork testing against mainnet	`{skill_dir}/references/legacy-full.md` (Fork Testing section)
Security regression recipes	`../cairo-auditor/references/vulnerability-db/`

Where {skill_dir} is the directory containing this SKILL.md. Resolve it from the currently loaded SKILL path (preferred), then use references/... relative paths from that directory.

Turn 2 — Plan. Before writing any test code, output a brief plan:

Functions to test — list each external function and whether it gets a positive test, negative test, or both.
Access control tests — for each guarded function, test: authorized caller succeeds, unauthorized caller reverts.
Event tests — list events that should be verified with spy_events.
Edge cases — zero values, max values, duplicate calls, reentrancy attempts.
Fuzz targets — identify functions with numeric inputs that should get #[fuzzer] tests.
Regression tests — if fixing a known finding, describe the failing-before scenario.

Keep the plan under 30 lines. Wait for user confirmation before implementing.

Turn 3 — Implement. Write tests following these rules:

Structure rules:

Use #[cfg(test)] mod tests { ... } for unit tests in the same file, or separate tests/ directory for integration tests.
Create a shared helpers module for deploy_contract(), address constants (OWNER(), USER(), ZERO()).
Name tests descriptively: test_<function>_<scenario> (e.g., test_transfer_non_owner_rejected).

Coverage rules (mandatory):

Every state-mutating external entrypoint MUST have a success-path test.
Add negative-path tests for entrypoints with explicit authorization, validation, or other rejection conditions.
Every access-controlled function MUST be tested with: (1) authorized caller succeeds, (2) unauthorized caller panics with expected message.
Constructors MUST be tested for correct initial state, plus zero-address rejection when the constructor explicitly forbids zero addresses.
Read-only externals (self: @ContractState) MUST verify return correctness; revert tests are optional unless the function defines a failure path.
Every event-emitting function MUST verify the event with spy_events + assert_emitted.

Cheatcode rules:

Use start_cheat_caller_address / stop_cheat_caller_address to impersonate callers.
Use start_cheat_block_timestamp for timelock tests — never hardcode timestamps.
Always call the matching stop_cheat_* after assertions to avoid leaking state.

Fuzz rules:

Use #[fuzzer(runs: 256, seed: 12345)] with a fixed seed for reproducibility.
Constrain inputs with guard clauses or bounded types, not by ignoring invalid inputs.

After writing tests, run snforge test to verify they pass. If any fail, fix and re-run.

Turn 4 — Verify. After tests pass:

(a) Coverage checklist — mentally walk through every external function:

Has a success-path test?
Has a failure-path test (wrong caller, bad input, overflow)?
Emits correct events?
Fuzz target if numeric inputs?

(b) Report any untested functions or missing edge cases to the user.

(d) Suggest next steps:

"Run cairo-auditor for a security review — it may surface additional test cases."
"Consider adding fork tests if this contract interacts with deployed protocols."

Runnable Example (starknet.js)

import { Account, Contract, RpcProvider } from "starknet";

const provider = new RpcProvider({ nodeUrl: process.env.STARKNET_RPC! });
const account = new Account(provider, process.env.ACCOUNT_ADDRESS!, process.env.PRIVATE_KEY!);
const contract = new Contract(abi, process.env.CONTRACT_ADDRESS!, provider).connect(account);

// Execute path (state mutation)
const tx = await contract.invoke("transfer", [process.env.USER_ADDRESS!, 100, 0]);
await provider.waitForTransaction(tx.transaction_hash);

// Read path (view assertion)
const balance = await contract.call("balance_of", [process.env.USER_ADDRESS!]);
console.log({ balance });

Error Codes and Recovery

Code	Condition	Recovery
`TEST-001`	`snforge test` fails to compile	Fix imports/dependencies in `Scarb.toml`, then rerun full suite.
`TEST-002`	Access-control negative test missing	Add unauthorized caller case with expected panic message.
`TEST-003`	Event assertion mismatch	Use `spy_events` + `assert_emitted` with full event payload ordering.
`TEST-004`	Flaky fuzz results	Pin `#[fuzzer(seed: ...)]`, constrain inputs, and rerun deterministic seed.

Security-Critical Rules

These are non-negotiable. Every test suite you write must satisfy all of them:

Every state-mutating external function has a positive test, plus negative tests for explicit rejection conditions.
Every access-controlled function is tested with authorized and unauthorized callers.
Expected panic messages are asserted with #[should_panic(expected: '...')] — not bare #[should_panic].
Event assertions use spy_events + assert_emitted with full event data — not just event count.
Fuzz tests use fixed seeds for reproducibility.

References

Testing patterns and snforge API: legacy-full.md
Cross-skill handoff format: ../references/skill-handoff.md
Module index: references/README.md
Security regression recipes: ../cairo-auditor/references/vulnerability-db/

Workflow

Main testing flow: default workflow

Eval Gate

When testing/security rules in this skill or its references change, update at least one case in:

evals/cases/contract_skill_benchmark.jsonl
evals/cases/contract_skill_generation_eval.jsonl