From a0632d5263556277b1206b1ef6df0da4c8b99be6 Mon Sep 17 00:00:00 2001
From: Yuri Tatishchev
Date: Sun, 3 May 2026 17:54:25 -0700
Subject: [PATCH] add AGENTS.md

---
 AGENTS.md | 166 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 166 insertions(+)
 create mode 100644 AGENTS.md

diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000..531c7a8
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,166 @@
+# AGENTS.md
+
+FWL (Firewall Language) is a Haskell DSL that compiles to nftables JSON.
+Stack: GHC 9.10.3, Cabal, Parsec 3.x, Aeson 2.x, Tasty/HUnit for tests.
+
+---
+
+## Key Commands
+
+```bash
+cabal build                                    # build everything
+cabal test                                     # run all test suites
+cabal run fwlc -- check examples/router.fwl    # parse + type-check a source file
+cabal run fwlc -- compile examples/router.fwl  # emit nftables JSON to stdout
+cabal run fwlc -- pretty examples/router.fwl   # pretty-print the parsed AST
+```
+
+Run tests before marking any task complete. The test suite is `cabal test`.
+
+---
+
+## Project Structure
+
+```
+fwl/
+├── AGENTS.md
+├── doc/
+│   ├── proposal.md          ← initial design document and exploration
+│   ├── fwl_grammar.md       ← authoritative grammar reference; keep in sync with Parser.hs
+│   └── ref/
+│       ├── ruleset.nft      ← example nftables ruleset
+│       └── ruleset.json     ← the same example nftables ruleset in JSON format
+├── examples/
+│   └── router.fwl           ← canonical example; must parse and compile cleanly
+├── src/FWL/
+│   ├── AST.hs               ← all data types; source of truth for the AST
+│   ├── Lexer.hs             ← Parsec TokenParser, reservedNames, reservedOpNames
+│   ├── Parser.hs            ← top-level parser, all sub-parsers
+│   ├── Pretty.hs            ← AST → FWL source (round-trip printer)
+│   ├── TypeCheck.hs         ← effect row checker, exhaustiveness, CIDR intervals
+│   ├── Interpret.hs         ← evaluator + effect dispatch
+│   ├── Compile.hs           ← AST → nftables JSON (Aeson Value)
+│   └── Util.hs              ← shared helpers
+└── test/
+    ├── Main.hs
+    ├── ParserTests.hs
+    ├── TypeCheckTests.hs
+    └── CompileTests.hs
+```
+
+The grammar document at `doc/fwl_grammar.md` must stay in sync with `Parser.hs` and `Lexer.hs`.
+When changing the parser, update the grammar doc in the same commit.
+
+---
+
+## Architecture
+
+The pipeline is strictly linear with no back-edges:
+
+```
+source text
+  → Lexer      (Text.Parsec.Token)
+  → Parser     → [Decl]  (AST.hs)
+  → TypeCheck  → TypedDecl
+  → Compile    → Aeson Value (nftables JSON)
+```
+
+The interpreter (`Interpret.hs`) runs the policy against a mock packet environment
+and is separate from the compiler. It uses the same typed AST.
+
+---
+
+## Reserved Words Rule
+
+**Only syntactic keywords belong in `reservedNames` in `Lexer.hs`.**
+A word is a syntactic keyword if and only if `Parser.hs` uses `reserved "word"` for it.
+
+Semantic values — action constructors (`Allow`, `Drop`, `Masquerade`),
+effect labels (`Log`, `Warn`, `Error`), result constructors (`Matched`, `Unmatched`),
+and type names (`Frame`, `FlowPattern`, `Action`) — must NOT be in `reservedNames`.
+They are parsed as plain identifiers so they can appear in type, pattern,
+and expression positions without causing parse errors.
+
+If you add a new keyword: add it to both `reservedNames` in `Lexer.hs`
+AND use `reserved "word"` in `Parser.hs`. Never add a word to only one place.
+
+---
+
+## IP Address Representation
+
+IP addresses are stored as plain `Integer` in the AST (see `AST.hs`):
+
+- **IPv4**: 32-bit value in the low 32 bits of `Integer`.
+- **IPv6**: 128-bit value. All standard notations are supported including `::` compression
+  and embedded IPv4 (e.g. `::ffff:192.168.1.1`).
+- **CIDR**: `(Literal, Int)` — base address literal + prefix length.
+- **Validation**: host bits must be zero: `(addr .&. hostMask prefix bits) == 0`.
+
+Use `ipv4Lit a b c d` from `AST.hs` to construct IPv4 literals in tests.
+Never use tuple `(Word8, Word8, Word8, Word8)` — that type is gone.
+
+---
+
+## Priority
+
+`Priority` is `newtype Priority = Priority { priorityValue :: Int }`.
+Named constants are resolved at parse time in `priorityP`:
+
+| Name        | Value |
+|-------------|-------|
+| `Raw`       | -300  |
+| `ConnTrack` | -200  |
+| `Mangle`    | -150  |
+| `DstNat`    | -100  |
+| `Filter`    | 0     |
+| `SrcNat`    | 100   |
+
+The compiler always emits `"prio"` as an integer in the nftables JSON,
+never as a string. Do not use the old `priorityStr` function (deleted).
+
+---
+
+## Parser Conventions
+
+- All blocks use explicit `{ }` delimiters with trailing `;` on each item.
+  `endBy p semi` (not `semiSep`) is used wherever trailing semicolons are expected.
+- `mapLit` must be tried **before** `setLit` in `atom` — both start with `{`
+  and `mapLit` consumes `{ expr -> expr }` which `setLit` would misparse.
+- `framePat` must be wrapped in `try` in the `pat` alternatives — it is a
+  reserved-word-prefixed parser that can fail after consuming input.
+- Port literals (`:22`, `:8080`) in record field patterns use `fieldLiteral`,
+  not `literal` — the base `literal` parser does not handle `:N` syntax.
+- `Frame` and `FlowPattern` are NOT in `reservedNames`; they appear as type
+  names and must be accepted by `identifier`.
+
+---
+
+## Testing Conventions
+
+- Test files use `{-# LANGUAGE OverloadedStrings #-}` — required because
+  `A.String` expects `Data.Text.Text`, not `String`.
+- IP address assertions use `LIP IPv4 n` / `LIP IPv6 n`, not the old
+  `LIPv4 (a,b,c,d)` tuple constructors.
+- Priority assertions use `Priority n` directly, e.g. `Priority 0`, `Priority (-100)`.
+- All parse tests must compile and pass before any PR is merged.
+
+---
+
+## Boundaries
+
+### ✅ Safe to do without asking
+- Read any file, list directories
+- Run `cabal build`, `cabal test`, `cabal run fwlc`
+- Edit `src/`, `test/`, `examples/`, `doc/`
+- Add new test cases to existing test files
+
+### ⚠️ Ask first
+- Add or remove Cabal dependencies (`fwl.cabal`)
+- Rename or delete source modules
+- Change the nftables JSON schema emitted by `Compile.hs`
+- Modify `examples/router.fwl` in ways that change its semantics
+
+### 🚫 Never
+- Add semantic value names (`Allow`, `Drop`, `Log`, etc.) to `reservedNames`
+- Break the `cabal test` suite
+- Emit nftables `"prio"` as a string — it must always be an integer
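
A note on the host-bit rule from the IP Address Representation section: the check `(addr .&. hostMask prefix bits) == 0` can be illustrated with a small standalone sketch. Everything here is an assumption for illustration, not the code in `AST.hs` or `TypeCheck.hs`: the definition and argument order of `hostMask` are guessed from the call quoted in that section, and `validCidr` and `packIPv4` are hypothetical helpers (`packIPv4` stands in for the arithmetic behind `ipv4Lit`, whose real return type is not shown in this patch).

```haskell
import Data.Bits (shiftL, (.&.))

-- Assumed definition: a mask covering the host part of an address,
-- i.e. the low (bits - prefix) bits. The real hostMask may differ.
hostMask :: Int -> Int -> Integer
hostMask prefix bits = (1 `shiftL` (bits - prefix)) - 1

-- Hypothetical helper: a CIDR base address is valid when the prefix
-- length is in range and every host bit is zero.
validCidr :: Int -> Integer -> Int -> Bool
validCidr bits addr prefix =
  prefix >= 0 && prefix <= bits && (addr .&. hostMask prefix bits) == 0

-- Hypothetical helper: pack four octets into the low 32 bits of an Integer.
packIPv4 :: Integer -> Integer -> Integer -> Integer -> Integer
packIPv4 a b c d =
  (a `shiftL` 24) + (b `shiftL` 16) + (c `shiftL` 8) + d

main :: IO ()
main = do
  print (validCidr 32 (packIPv4 192 168 1 0) 24)  -- True: host bits are clear
  print (validCidr 32 (packIPv4 192 168 1 1) 24)  -- False: lowest host bit is set
  print (validCidr 128 0 0)                       -- True: ::/0
```

Passing the address width (`32` or `128`) explicitly lets one function serve both IPv4 and IPv6, which fits the plain `Integer` representation described above.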