Clawq follows a Coq-first architecture: core properties are machine-checked in Coq, extracted to OCaml, and wrapped by a runtime that provides I/O, networking, and channel integrations.

Build Pipeline

Coq theories → extracted OCaml → runtime libraries → clawq executable

The extracted OCaml is tracked in git so the project builds without Coq installed. Run make extract to regenerate from Coq sources, and make extract-check to verify no drift.

Coq Theories

FileRole
Interfaces.v7 record-based interface definitions: Provider, Channel, Tool, Memory, RuntimeAdapter, Tunnel, Security
Config.vConfiguration records (GatewayConfig, MemoryConfig, SecurityConfig, ClawqConfig) with defaults and validation
Cli.vCommand ADT, parse_command, dispatch, and usage string for 18 commands
Extract.vExtraction directives: type mappings and function list

Extracted OCaml

FileRole
clawq_core.mlAuto-generated OCaml from Coq. Contains command parsing, dispatch, config validation.
clawq_core.mliAuto-generated interface. Both files are tracked in git.

Runtime Module Map

ModuleRole
main.mlEntry point; Cmdliner CLI argument parsing
command_bridge.mlBridges CLI to extracted Coq dispatch; handles runtime-only commands
runtime_config.mlConfiguration loading, validation, and defaults for all subsystems
config_loader.mlFile-based config loading: reads JSON, merges with env vars
agent.mlAgent loop: prompt assembly, provider calls, tool dispatch, conversation management
session.mlSession lifecycle: create, resume, persist conversation state
provider.mlLLM provider abstraction: OpenAI-compatible HTTP calls, streaming, model selection
memory.mlKey-value memory backend: SQLite-backed store/recall/forget with namespace support
vector.mlLocal vector index: embedding storage, cosine similarity, hybrid FTS+vector merge
tool.mlTool type definitions and invocation framework with risk-level enforcement
tool_registry.mlDynamic tool registration: register, lookup, list by name or category
tools_builtin.mlBuilt-in tool implementations: file I/O, shell exec, web fetch, search
mcp_server.mlMCP server: exposes tools over JSON-RPC with configurable filtering
skills.mlSkill loader: discovers and loads skill definitions from the filesystem
http_server.mlHTTP server: Cohttp-lwt endpoint for web channel and health checks
http_client.mlHTTP client: shared Cohttp-lwt client for provider and API calls
telegram.mlTelegram channel: bot API integration via long polling
discord.mlDiscord channel: WS gateway mode, REST rate limit buckets, reconnect loop
discord_gateway.mlDiscord gateway protocol state machine (Hello, Identify, Resume, Heartbeat, Dispatch)
slack.mlSlack Events API channel with HMAC verification
slack_socket.mlSlack Socket Mode: WSS-based event receiving via app_token
ws_client.mlShared TLS WebSocket client (httpun-ws, gluten, tls-lwt, ca-certs)
daemon.mlSupervisor: gateway + telegram + discord + slack_socket, signal handling, periodic cleanup
service.mlService orchestrator: starts/stops subsystems (server, tunnel, scheduler)
scheduler.mlScheduled tasks: cron-like recurring job execution
audit.mlAudit logging: append-only log of tool invocations and security events
secret_store.mlSecret encryption at rest: AES-256-GCM, PBKDF2 key derivation, $ENC: prefix format
migrate.mlDatabase migrations: versioned schema upgrades for SQLite stores
resilience.mlReliability policies: timeout, retry (exponential backoff), fallback
runtime_native.mlNative runtime adapter: wraps daemon/service for start/stop/status/health
runtime_docker.mlDocker runtime adapter: manages clawq in containers via docker CLI
tunnel_cloudflare.mlCloudflare tunnel: manages cloudflared process, extracts assigned URL
rate_limiter.mlToken bucket rate limiter (IP, session, chat)
landlock.mlLandlock OS sandboxing (C FFI)
stt.mlSpeech-to-text: audio transcription via Whisper-compatible API

Dune Libraries

LibraryKey ModulesDependencies
clawq_extractedclawq_coreNone (unwrapped, -w -39 for extraction artifacts)
clawq_runtime_coreConfig, agent, session, provider, memory, tools, audit, security, rate limiter, scheduleryojson, sqlite3, lwt, cohttp-lwt-unix, mirage-crypto, digestif, clawq_extracted
clawq_runtime_integrationsHTTP server, telegram, discord, slack, daemon, MCP, WS client, tunnels, runtimesclawq_runtime_core + httpun-ws-lwt-unix, gluten-lwt-unix, ca-certs
clawq (executable)mainclawq_runtime_core, clawq_runtime_integrations, cmdliner

Both libraries use (wrapped false) so modules are accessible directly (e.g., Clawq_core.dispatch rather than Clawq_extracted.Clawq_core.dispatch).

Interface Inventory

From Interfaces.v, these 7 records define the contract surface:

InterfaceFieldsPurpose
Providername, complete, healthLLM provider abstraction
Channelname, start, stop, sendCommunication channel (web, telegram, etc.)
Toolname, invoke, risk_levelAgent tool with risk classification
Memorystore, recall, forgetKey-value memory backend
RuntimeAdaptername, start, stopRuntime lifecycle management
Tunnelname, start, statusNetwork tunnel (e.g., Cloudflare)
Securityworkspace_only, audit_enabled, encrypt_secretsSecurity policy flags

Dependency Direction

Coq theory dependency order
OCaml library dependency order

Runtime Split

The runtime is split into core and integrations to support a minimal build (clawq-min) that excludes network dependencies:

  1. Optional integrations stay out of clawq_runtime_core
  2. Network/server features belong in clawq_runtime_integrations
  3. Integration-only commands return “disabled in minimal build” messages in command_bridge_min.ml
  4. New dependencies are evaluated for core vs integration placement before linking