Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
97 changes: 97 additions & 0 deletions doc/llm/CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -181,6 +181,45 @@ SEARCH (fdom _)
SEARCH (_ %/ _)
```

### When the agent can't hold a persistent pipe

Some LLM front-ends (Claude Code, Codex CLI, etc.) expose the shell as
one-shot command execution — the agent *cannot* keep stdin/stdout open
to `easycrypt llm` across turns. Naively, every tactic would then
require re-`LOAD`-ing the file prefix.

The `scripts/llm/` directory contains a reference fifo-backed daemon
that lets many short-lived shell invocations drive one persistent EC
REPL. Start it once at the beginning of a session, then issue each
tactic as a separate `ec_send.sh` call. See `scripts/llm/README.md` for
the full pattern.

### Iterate interactively, then codify

For long proofs — especially program-logic bodies with 5+ chained
`ecall`s, or big algebraic derivations — write tactics one at a time
in the REPL, inspecting `GOALS` after each, and only commit the final
sequence back to the file once it works. Two reasons:

- **SMT budget.** Committing a long batch and issuing one LOAD
compounds SMT cost across every intermediate side-condition at once.
The same tactics often pass individually with a shorter per-call
timeout.
- **Cheap backtracking.** Use `CHECKPOINT` at every branching point
and `REVERT` liberally — it's O(1), whereas re-LOAD re-compiles the
whole prefix.

Pattern:

```
LOAD "file.ec" 100 ← note uuid:N
CHECKPOINT before_body
<try tactic>
GOALS ← inspect
<try tactic>
REVERT before_body ← if the branch fails
```

## EasyCrypt proof strategy

### General approach
Expand Down Expand Up @@ -246,6 +285,64 @@ SEARCH (_ %/ _)
- `rewrite lemma in H` modifies hypothesis `H` in place (it does
not consume it). If you need to preserve the original, copy it
first: `have H' := H; rewrite lemma in H'`.
- **`have H := lemma args` may leave unevaluated lambdas.** Lemma
applications that produce `init`/`map`/`filter` terms often carry
β-redexes into the introduced hypothesis. Later `rewrite H` can then
hang indefinitely while the kernel tries to unify. Force
normalization at introduction with `have /= H := lemma args`.
- **Tactic-repetition (`!tac`) over-applies on nested structures.**
`!mapiE`, `!initiE`, `!allP` descend into inner layers at different
array sizes, generate side-conditions that can't be closed by `/#`,
and either leave stale goals or hang. Use the exact count needed
(usually 1 or 2 for an equality: `rewrite mapiE 1:/# /= mapiE 1:/#`).
- **`1:/#` vs `1:smt(hints)` in rewrite chains.** Inside a
`rewrite lemma 1:...` position only `/#` parses; `1:smt(hints)` is
a parse error. Break the chain:
`rewrite lemma 1:/#; first smt(hints). rewrite next_lemma.`
- **`suff` not `suffices`.** The surface syntax uses `suff H : P.`;
`suffices` triggers a parse error.
- **`rewrite allP` only touches the first `all`.** When unfolding
range predicates on both sides of an implication
(`all P x => all Q x`), use `!allP` — otherwise the subsequent
`=> H j Hj` fails with "nothing to introduce" because the conclusion
is still wrapped in `all`.
- **`congr` vs `congr 1`.** `congr` walks down an array equality all
the way to list/element equality; `congr 1` takes one structural
step. Reach for `congr 1` when `congr` descends too far and exposes
stale `init`/`mkseq` structure.

### Reconciling syntactically distinct constants

When a word-type modulus (`W13.modulus`, `W8.modulus`, …) appears on
one side of a goal and its numeric literal (`8192`, `256`, …) on the
other — for instance after one subterm was lowered through a circuit
lemma while the other came from the spec — `smt()` treats them as
distinct atoms and won't bridge. Normalize explicitly:

```easycrypt
have W13_eq : W13.modulus = 8192 by smt(W13.ge2_modulus).
rewrite W13_eq.
```

Same pattern for `modulus_W`, cardinalities of finite types, and other
constants that have both a symbolic and a numeric form.

### Ring/field commutativity: use the fully-qualified path

Bare `mulrC`/`addrC` may not resolve against a concrete ring clone's
theory. When working in a cloned ring (`Zq`, `Fq`, polynomial rings
over Zq, …), the fully-qualified name of the `ComRing`/`ZModule` lemma
usually succeeds where the short form fails:

```easycrypt
by rewrite Zq.ComRing.mulrC.
```

Use `SEARCH` to find the right path:

```
SEARCH (_ * _) (commutative _)
```

## EasyCrypt language overview

Expand Down
107 changes: 107 additions & 0 deletions scripts/llm/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,107 @@
# LLM-oriented tooling for `easycrypt llm`

This directory contains two example clients for the interactive REPL
exposed by `easycrypt llm` (see `doc/llm/CLAUDE.md` for the wire
protocol). They are layered: each can be useful independently, depending
on how your driver (human, script, or LLM agent) is structured.

| Layer | File | Use when |
|-------|------|----------|
| Python binding | `ec_llm.py` | Writing a Python program that drives one EasyCrypt REPL. |
| Fifo daemon | `daemon.py` + `ec_send.sh` | Driving one persistent REPL from *many* short-lived shell invocations — notably LLM agents that issue one bash tool call per tactic. |

## Why a fifo daemon?

`easycrypt llm` is a stateful, long-running process. Python programs can
keep its stdin/stdout pipe open across commands, but many LLM-agent
front-ends (Claude Code, Codex CLI, etc.) only expose the shell via
one-shot command execution. Without a persistent intermediary, every
tactic would require re-LOADing the file prefix — minutes wasted per
step on a large project.

The daemon opens `easycrypt llm` once, listens on a named pipe for JSON
requests, and writes JSON responses to a second named pipe. The
companion `ec_send.sh` is the shell-side client: it takes a single JSON
request, forwards it through the pipes, and prints the structured
response. Combined, they let an agent drive a persistent REPL with one
bash call per step.

## Python binding (Layer 1)

```python
from ec_llm import ECLLM

with ECLLM(cwd="/path/to/project", extra_args=["-p", "Z3"]) as ec:
ok, resp = ec.load("proofs/myfile.ec", 42, nosmt=True)
print(ec.goals())
ok, resp = ec.tactic("smt().")
print(resp)
```

Everything in the `easycrypt llm` protocol is exposed as a method:
`load`, `checkpoint`, `revert`, `undo`, `goals`, `goals_all`, `tactic`,
`tactic_multiline`, `search`, `quiet`, `close`. Status parsing tracks
the running uuid in `ec.uuid`.

CLI mode provides a quick "LOAD and report" check:

```
python3 ec_llm.py proofs/myfile.ec 42 --cwd /path/to/project --nosmt
```

## Fifo daemon (Layer 2)

Start the daemon once (most easily in the background):

```
# Create the fifos if needed and start the daemon
mkfifo /tmp/ec_in /tmp/ec_out 2>/dev/null || true
python3 scripts/llm/daemon.py \
--cwd /path/to/project \
--prover Z3 --timeout 5 &
```

Drive it with `ec_send.sh`:

```
./ec_send.sh '{"op":"load","arg":"proofs/myfile.ec","arg2":"42","nosmt":true}'
./ec_send.sh '{"op":"goals"}'
./ec_send.sh '{"op":"tactic","arg":"smt()."}'
./ec_send.sh '{"op":"checkpoint","arg":"before_split"}'
./ec_send.sh '{"op":"tactic","arg":"split."}'
./ec_send.sh '{"op":"revert","arg":"before_split"}'
```

For payloads with shell-hostile content (EC escapes like `/\\`),
write the JSON to a file and use `--file`:

```
cat > /tmp/req.json <<'EOF'
{"op":"tactic","arg":"rewrite /wpoly_srng !allP => H j Hj; smt(hint)."}
EOF
./ec_send.sh --file /tmp/req.json
```

## Operational tips

- **Per-session setup.** Create fifos once per session. The daemon keeps
them; don't delete them while it's running.
- **Recovering from a dead daemon.** If the EasyCrypt process crashes
(stack overflow, OOM on SMT), `ec_send.sh` will hang. Kill the daemon
process, recreate the fifos (`rm -f /tmp/ec_in /tmp/ec_out; mkfifo
/tmp/ec_in /tmp/ec_out`), and restart.
- **Fast SMT feedback.** Pass `--timeout 1 --prover Z3` to the daemon
during exploration; fragile SMT calls surface immediately. Raise the
timeout when codifying a final proof.
- **Parallel sessions.** To run multiple daemons side-by-side, give each
a distinct pair of fifo paths and set `EC_IN_FIFO` / `EC_OUT_FIFO`
for the `ec_send.sh` caller.
- **Logging.** Exceptions inside the daemon are written to `/tmp/ec_daemon.log`
(or `--log`). Check there first if a request seems silently dropped.

## Security note

The daemon trusts whatever shows up on its input fifo — it forwards the
`arg` payload straight to EasyCrypt as tactic text. Don't expose the
fifos on a shared filesystem, and don't run the daemon as a privileged
user.
142 changes: 142 additions & 0 deletions scripts/llm/daemon.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,142 @@
#!/usr/bin/env python3
"""
Fifo-backed daemon exposing one persistent `easycrypt llm` instance to many
shell callers. Typical use case: an LLM agent (Claude Code, etc.) that
invokes a bash tool *once per tactic* — the agent can't hold stdin/stdout
open across turns, so without a daemon every tactic would re-LOAD the
whole file prefix.

Protocol (matches the JSON line format produced by `ec_send.sh`):

request (one line on stdin-fifo):
{"op": "<op>", "arg": "...", "arg2": "...", "nosmt": bool}

response (one line on stdout-fifo):
{"ok": bool, "uuid": N, "resp": "..."}

Supported ops:
load arg=path, arg2=line (string or int), optional nosmt: bool
tactic arg="proc."
tactic_multiline arg="<multi-line body>" (wraps in <BEGIN>/<DONE>)
undo
revert arg="<uuid-or-name>"
checkpoint arg="<name>"
goals
goals_all
search arg="<pattern>"
quiet arg="on"|"off"
close

Usage:
# start daemon (in background)
python3 scripts/llm/daemon.py --cwd /path/to/project \\
--in-fifo /tmp/ec_in --out-fifo /tmp/ec_out &

# then drive with scripts/llm/ec_send.sh (see that file for details)

The fifos are created if they don't exist. Only one connected writer and
one connected reader are expected at a time — the client (`ec_send.sh`)
enforces this by opening them sequentially.
"""

import argparse
import json
import os
import sys
import traceback

HERE = os.path.dirname(os.path.abspath(__file__))
sys.path.insert(0, HERE)
from ec_llm import ECLLM # noqa: E402


def _ensure_fifo(path):
if not os.path.exists(path):
os.mkfifo(path)
elif not os.path.exists(path): # raced out; recreate
os.mkfifo(path)


def _handle(req, ec):
op = req.get("op")
arg = req.get("arg", "")
arg2 = req.get("arg2", "")

if op == "load":
ok, resp = ec.load(arg, int(arg2), nosmt=bool(req.get("nosmt", False)))
elif op == "tactic":
ok, resp = ec.tactic(arg)
elif op == "tactic_multiline":
ok, resp = ec.tactic_multiline(arg)
elif op == "undo":
ok, resp = ec.undo()
elif op == "revert":
ok, resp = ec.revert(arg)
elif op == "checkpoint":
ok, resp = ec.checkpoint(arg)
elif op == "goals":
resp, ok = ec.goals(), True
elif op == "goals_all":
resp, ok = ec.goals_all(), True
elif op == "search":
resp, ok = ec.search(arg), True
elif op == "quiet":
ok, resp = ec.quiet(on=(arg.lower() == "on"))
elif op == "close":
ec.close()
return {"ok": True, "uuid": ec.uuid, "resp": "bye"}, True
else:
return {"ok": False, "uuid": ec.uuid, "resp": f"unknown op: {op}"}, False

return {"ok": ok, "uuid": ec.uuid, "resp": resp}, False


def main():
p = argparse.ArgumentParser(description=__doc__.splitlines()[1])
p.add_argument("--cwd", default=None, help="Project root (default: cwd)")
p.add_argument("--in-fifo", default="/tmp/ec_in", help="Request fifo path")
p.add_argument("--out-fifo", default="/tmp/ec_out", help="Response fifo path")
p.add_argument("--log", default="/tmp/ec_daemon.log", help="Log file")
p.add_argument("--prover", default=None, help="SMT prover (e.g. Z3)")
p.add_argument("--timeout", type=int, default=None, help="Per-goal SMT timeout (s)")
args = p.parse_args()

_ensure_fifo(args.in_fifo)
_ensure_fifo(args.out_fifo)

extra = []
if args.prover:
extra += ["-p", args.prover]
if args.timeout is not None:
extra += ["-timeout", str(args.timeout)]

ec = ECLLM(cwd=args.cwd, extra_args=extra)
with open(args.log, "w") as lf:
lf.write(f"EC started. uuid={ec.uuid}\n")

while True:
try:
# Opening a fifo for reading blocks until a writer opens it.
with open(args.in_fifo, "r") as fi:
line = fi.readline()
if not line.strip():
continue
req = json.loads(line)
result, should_exit = _handle(req, ec)
with open(args.out_fifo, "w") as fo:
fo.write(json.dumps(result) + "\n")
if should_exit:
break
except Exception:
with open(args.log, "a") as lf:
lf.write("EXCEPTION: " + traceback.format_exc() + "\n")
try:
with open(args.out_fifo, "w") as fo:
fo.write(json.dumps({"ok": False, "uuid": -1,
"resp": traceback.format_exc()}) + "\n")
except Exception:
pass


if __name__ == "__main__":
main()
Loading
Loading