II. Formal Properties
Verification Status
A machine-checked account of Clawq’s proven properties.
Last verification refresh: 2026-03-07 · Coq 8.19 · 153 theorem/lemma statements tracked across public phases F1–F12
Planning note. The current proof-expansion plan lives in Formal Verification Roadmap - 2026-03-07.
Coq-Verified Core
Statements Tracked
Modules Extracted
Lines of Coq
Tracked Theorem/Lemma Statements
Phase Status
Verification Ledger
ConfigProofs.vPathSafety.vAuditChain.vRateLimiter.vConfig.v + ConfigProofs.vQuoteParsing.v + ShellSafety.vSecretStore.vChannelAuth.vAuditRetention.vAgentLoop.vSessionIsolation.vLandlockPolicy.vModule Breakdown
F1: Configuration Validation
Source: coq/theories/Clawq/ConfigProofs.v
For all p, valid_port p = true implies 1 <= p <= 65535.
Proof: ConfigProofs.v
validate_config default_config = true. The built-in defaults always pass validation.
Proof: ConfigProofs.v
Provider weights sum to a consistent total; no weight is negative.
Proof: ConfigProofs.v
For all t, valid_temperature t = true implies 0.0 <= t <= 2.0.
Proof: ConfigProofs.v
The default security configuration enables sandboxing, rate limiting, and audit logging.
Proof: ConfigProofs.v
F2: Directory Traversal Immunity
Source: coq/theories/Clawq/PathSafety.v — Extracted to OCaml and integrated into tools_builtin.ml
Path normalization eliminates all ’..’ segments. The output of normalize never contains a dotdot component.
Proof: PathSafety.v
Normalizing an already-normalized path yields the same path: normalize (normalize p) = normalize p.
Proof: PathSafety.v
A path whose segments are all safe is itself safe.
Proof: PathSafety.v
Any path that passes the safety check is a prefix of the allowed workspace root. No file access escapes the sandbox.
Proof: PathSafety.v
F3: HMAC Audit Chain Integrity
Source: coq/theories/Clawq/AuditChain.v
Appending a correctly-linked entry to a valid chain preserves chain validity.
Proof: AuditChain.v
A chain produced by build_chain always verifies. Construction implies validity.
Proof: AuditChain.v
Any suffix of a valid chain is itself valid. Truncation preserves integrity.
Proof: AuditChain.v
The last signature of a concatenated chain equals the last signature of the appended segment.
Proof: AuditChain.v
F4: Token Bucket Rate Limiter
Source: coq/theories/Clawq/RateLimiter.v
After refill, tokens never exceed the bucket capacity. The bucket cannot overflow.
Proof: RateLimiter.v
Refill is monotonically non-decreasing: tokens after refill >= tokens before refill.
Proof: RateLimiter.v
Properties P1–P6: consume decrements by exactly the requested amount when sufficient tokens exist; returns the bucket unchanged when insufficient; the allowed/denied decision is consistent with the token count.
Proof: RateLimiter.v
F5: Extended Configuration Validation
Source: coq/theories/Clawq/Config.v + ConfigProofs.v
Port validation is both sound and complete with respect to the 1–65535 range.
Proof: ConfigProofs.v
Temperature validation enforces the 0.0–2.0 range with correct boundary behavior.
Proof: ConfigProofs.v
Full configuration validation (P8–P15) checks all fields: port, temperature, max_tokens, history depth, provider weights, security flags, and rate limits.
Proof: ConfigProofs.v
F6: Shell Safety
Source: coq/theories/Clawq/QuoteParsing.v + ShellSafety.v — Extracted to OCaml
split_command_words produces a list of tokens with no embedded shell metacharacters when the input passes is_shell_safe.
Proof: QuoteParsing.v
The character blacklist (;&|${">"}{"<"}!) is complete for the threat model: no injection is possible via unlisted characters in a POSIX shell without expansion.
Proof: ShellSafety.v
is_allowed cmd allowlist = true implies is_allowed cmd (allowlist ++ extra) = true. Extending the allowlist never revokes existing permissions.
Proof: ShellSafety.v
Trust Boundary
Verification Roadmap
The remaining phase work (F7—F12) is prioritized by runtime-risk reduction and model/runtime alignment. Extraction is selective and deferred unless a pure kernel is stable and demonstrably worth replacing in shipped OCaml.
FAQ: Should Every Formalization Be Extracted?
No. We aim to extract the right pure kernels, not every formalization by default.
Extract when the Coq artifact is a small pure computation, matches shipped runtime behavior closely, and is actually worth making the runtime’s single source of truth.
Keep a theory spec-only when it mostly models trusted boundaries or effects — like crypto, databases, locking, OS policy, or higher-level control flow — or when the proof model still diverges from the OCaml implementation.
Today the best extraction candidates are things like Cli, Config, PathSafety, QuoteParsing, selective ShellSafety, and eventually some AuditChain or RateLimiter helpers. Modules like SecretStore, SessionIsolation, AgentLoop, and most of AuditRetention are higher-value as proofs first, extraction later.
Priority Matrix
| Phase | Module | Extract? | Security ROI | Difficulty | Order |
|---|---|---|---|---|---|
| F7 | SecretStore.v | No (spec) | High | Medium | 1st |
| F8 | ChannelAuth.v | Candidate later (allowlist kernels only) | High | Low | 2nd |
| F9 | AuditRetention.v | Candidate later (pure purge helpers only) | Medium-High | Low | 3rd |
| F10 | AgentLoop.v | No (spec) | Medium | Low | 4th |
| F11 | SessionIsolation.v | No (spec) | Medium-High | Medium | 5th |
| F12 | LandlockPolicy.v | No (spec) | Medium | Medium-High | 6th |
Planned Phases
F7: Encryption Correctness — Spec-only proofs for secret_store.ml. Prove encrypt/decrypt is identity (modelling AES-GCM as abstract parameters), nonce uniqueness birthday-bound, and resolve_secret completeness across encrypted, env-var, and plaintext cases.
F8: Channel Authentication — Keep formal/runtime alignment current for shared allowlist, replay/freshness semantics, and Telegram pairing validity. Treat extraction as optional follow-through once kernels are stable.
F9: Audit Retention Safety — Maintain anchored-retention correctness under runtime assumptions (suffix-preserving age purge + count purge). Extraction remains optional and limited to pure helper kernels.
F10: Agent Loop Termination — Spec-only proof that the agent turn loop stays within max_tool_iterations, preserves the tool-call/tool-result history shape, and keeps newest-first ordering under trim_history and emergency compaction. Models the loop as a Coq Fixpoint with fuel.
F11: Session Isolation — Spec-only proof using Coq FMap for cross-key non-interference plus same-key reset/create stability and idempotence; runtime lock behavior remains modeled with explicit assumptions and review notes.
F12: Landlock Sandbox Policy — Spec-only proof of least-privilege access flags, path set closure under extra_paths extension, and policy monotonicity. Assumes C FFI stubs implement Landlock semantics correctly.
Data sourced from the F1–F12 slice of docs/src/data/formal_verification.yml (with F0 tracked separately as internal CLI work) · Theorem counts checked by make update-fv