Merged
3 changes: 3 additions & 0 deletions CLAUDE.md
@@ -309,6 +309,9 @@ The project uses Hedgehog for property-based testing:

### Known Pitfalls

See also `docs/QUIRKS.md` for the catalogue of compiler/Lua-target quirks
(parser nesting limit, `Char` byte-string shapes, Lua 5.1 floor, …).

- **`unit` must not be `nil`**: Lua tables cannot hold `nil` values, so
`Array Unit` silently collapses to an empty table if the prelude defines
`unit = nil`. Requires `purescript-lua/purescript-lua-prelude` ≥ v7.2.0, where
139 changes: 139 additions & 0 deletions docs/QUIRKS.md
@@ -0,0 +1,139 @@
# Compiling to Lua with pslua — behaviors & gotchas

A guide for people writing PureScript and compiling it to Lua with `pslua`:
how PureScript values map onto Lua, what you need to know to write FFI, and the
handful of surprises worth knowing up front.

> This page is for **users** of the compiler. Compiler-internal pitfalls live
> in `CLAUDE.md` and in inline source Notes.

## Target: Lua 5.1

`pslua` targets **stock Lua 5.1**. Code it emits — and any FFI you write — must
run there. The things most likely to trip you up, because they exist in later
Lua but not 5.1:

- **Missing:** `table.unpack` (5.1 has the global `unpack` instead) /
`table.pack` / `table.move`, the `bit32` library, the `utf8` library, the `//`
floor-division operator.
- **Present in 5.1 (but deprecated or removed in later versions):** `math.pow`,
`math.atan2`.
- Tables are **1-indexed**.
- Relational operators (`<`, `>`, …) **error** on booleans instead of coercing.
- `error(msg)` at level ≥1 prepends `chunk:line:` to your message. Use
`error(msg, 0)` to raise the raw string unchanged.

If your FFI uses a 5.2+ builtin it will work under a newer interpreter and fail
under 5.1 — test on 5.1.
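
The substitutions implied by the list above can be sketched as follows. This is a minimal illustration, not code from the pslua runtime; the helper name `floordiv` is hypothetical.

```lua
-- `unpack` is a global in Lua 5.1; it became `table.unpack` in 5.2+.
-- This alias picks whichever the running interpreter provides.
local unpack = table.unpack or unpack

-- Floor division without the 5.3+ `//` operator (hypothetical helper name).
local function floordiv(a, b)
  return math.floor(a / b)
end

-- error(msg) prepends position information to a string message;
-- error(msg, 0) raises the string unchanged.
local _, positioned = pcall(function() error("boom") end)
local _, raw = pcall(function() error("boom", 0) end)

print(floordiv(7, 2), floordiv(-7, 2))
print(positioned)
print(raw)
print(unpack({ "a", "b", "c" }))
```

The `table.unpack or unpack` alias keeps one FFI file loadable on both 5.1 and newer interpreters, which helps if you develop on a newer Lua but ship to 5.1.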

## How PureScript values look in Lua

| PureScript | Lua |
|---|---|
| `Int`, `Number` | `number` (5.1 numbers are doubles) |
| `String`, `Char` | `string` (a **byte** string — see below) |
| `Boolean` | `boolean` |
| `Array a` | table, **1-indexed** |
| record `{ a, b }` | table keyed by field name (`t.a`, `t.b`) |
| `Unit` | `{}` (empty table) — **never `nil`** |
| function `a -> b -> c` | **curried**: call as `f(a)(b)` |
| `Effect a` | a **nullary thunk** `function() … return a end`; run it by calling `m()` |
| data constructor | table carrying a tag plus its fields |
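
A few rows of the table, written out as plain Lua values. This is a hand-written sketch of the shapes described above, not actual compiler output; every name in it (`xs`, `person`, `add`, `bump`) is hypothetical.

```lua
-- Array Int: a 1-indexed table.
local xs = { 10, 20, 30 }

-- Record { name :: String, age :: Int }: a table keyed by field name.
local person = { name = "Ada", age = 36 }

-- Int -> Int -> Int: curried, one argument per call.
local add = function(a)
  return function(b)
    return a + b
  end
end

-- Unit: an empty table, never nil.
local unit = {}

-- Int -> Effect Unit: applying the argument only builds the thunk;
-- the side effect happens when the thunk itself is called.
local counter = 0
local bump = function(n)
  return function()
    counter = counter + n
    return unit
  end
end

local action = bump(5) -- nothing has happened yet
action()               -- runs the effect: counter is now 5

print(xs[1], person.name, add(2)(3), counter)
```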

Notes that bite in practice:

- **`Unit` is `{}`, not `nil`** — because Lua tables can't store `nil` (a `nil`
field just doesn't exist). The same rule applies to your own FFI: never put
`nil` into an array or record you build, or those elements silently vanish.
- **Strings/Chars are byte strings.** A non-ASCII code point is *several* bytes.
Code-point-aware operations go through `Data.String.CodePoints`;
`Data.String.CodeUnits` slices **byte-wise**. If you write FFI that consumes a
`Char`, it may receive either a single byte (from `CodeUnits`) or a full
multibyte sequence (from a literal) — handle both, e.g. guard on `#c` before
reading `c:byte(2)`.
- **`Effect` is a thunk.** FFI that produces an `Effect a` must return a
*function* that, when called, performs the action and returns the `a`. FFI
that produces a plain `a` returns the value directly.
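
Two of the notes above, sketched in Lua. Both snippets are illustrations under stated assumptions: `leadingBytes` is a hypothetical FFI helper, and the string `"\195\169"` is the two-byte UTF-8 encoding of `é`.

```lua
-- Assigning nil never creates an array slot, so the element goes missing.
local out = {}
for i = 1, 3 do
  local v = (i ~= 2) and i or nil -- pretend the middle element is "absent"
  out[#out + 1] = v
end
print(#out, out[1], out[2]) -- only two elements survive: 1 and 3

-- Hypothetical FFI helper: return the first two byte values of a Char,
-- guarding on length so the single-byte case is handled explicitly.
local function leadingBytes(c)
  local b1 = c:byte(1)
  local b2 = nil
  if #c >= 2 then
    b2 = c:byte(2)
  end
  return b1, b2
end

print(leadingBytes("a"))        -- single byte, as produced by CodeUnits
print(leadingBytes("\195\169")) -- multibyte sequence, as from a literal
```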

## Writing FFI: the foreign module shape

A foreign `.lua` file is split into a **header** and an **exports table**:

- Everything **before the first line that starts with `return` at column 0** is
a header of shared top-level `local` helpers, in scope for the exported
values. `return`s *inside* header functions must be indented (only the
exports `return` sits at column 0).
- The exports are a single returned table of fields. **Each export value must be
wrapped in parentheses** — `key = (<lua expression>)`. The FFI parser requires
the `(...)`; a bare `key = function … end` will not parse.
- **Do not put `--` comments between table fields** — the parser does not accept
them there. Put comments inside function bodies or in the header.

```lua
-- header: shared helpers (this comment is fine: it's in the header)
local function helper(x)
  return x + 1 -- indented: ok
end

return {
  foo = (function(a) return helper(a) end),
  bar = (function(a) return function(b) return a + b end end),
}
```

## Very long `do` blocks in non-`Effect`/`ST` monads

A straight-line `do` block compiles to a chain of `bind`/`discard` whose
continuations nest lexically, and Lua's parser caps how deeply expressions may
nest (~200 levels). A long enough chain fails to **load** (before any code
runs) with:

```
lua: yourfile.lua:NNN: chunk has too many syntax levels
```

`Effect` and `ST` blocks are exempt: the compiler flattens them into a plain
statement sequence, so they have no practical length limit. The limit only
bites **other** monads — `Maybe`, `Either`, `State`, a custom parser/decoder,
or a large applicative (`ado`) constructor — and only at ~200+ straight-line
statements in a single block. That is rare outside machine-generated code and
very wide record decoders. Recursion does **not** trigger it (a recursive
function is a normal call, not nested).

**If you hit it:** split the block into smaller named pieces (extract
sub-computations into separate functions and sequence them), or break a wide
record decode into chunks. A general (monad-agnostic) fix is tracked in
[#104](https://github.com/purescript-lua/purescript-lua/issues/104).
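
The load-time failure can be reproduced without the compiler. The sketch below builds a chunk of function expressions nested inside one another; it is a hand-written stand-in for a long continuation chain, not actual pslua output, and `nestedChunk` is a hypothetical helper.

```lua
-- Lua 5.1 compiles strings with `loadstring`; 5.2+ folded that into `load`.
local compile = loadstring or load

-- Build a chunk whose body is `depth` function expressions nested inside
-- one another.
local function nestedChunk(depth)
  return "return "
    .. string.rep("(function() return ", depth)
    .. "1"
    .. string.rep(" end)()", depth)
end

local shallow = compile(nestedChunk(10))
local deep, err = compile(nestedChunk(300))

print(shallow ~= nil and shallow()) -- loads and runs
print(deep, err)                    -- nil plus a load-time error message
```

The deep chunk is rejected while compiling, before any of it runs, which matches the failure mode described above. The exact wording of the error differs between Lua versions.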

## Lua's per-function size limits (locals & upvalues)

Separate from the parser-nesting limit above, Lua 5.1 caps the *contents* of a
single function at load time:

- **200 local variables** per function (`LUAI_MAXVARS`). Exceeding it fails to
load with `function at line N has more than 200 local variables`.
- **60 upvalues** per function (`LUAI_MAXUPVALUES`), that is, variables a
closure captures from an enclosing scope. Exceeding it fails with
`function at line N has more than 60 upvalues`. (Lua ≥5.2 raises this to 255;
5.1 is the floor pslua targets.)

Both have bitten the compiler before: long `Effect` chains used to build deeply
nested closures, each capturing many top-level references as upvalues
([#19](https://github.com/purescript-lua/purescript-lua/issues/19)). The
compiler now keeps its own output within these limits: Effect/ST `do` blocks
are flattened into a plain statement sequence (no capturing closure per step),
and that sequence is chunked so no generated function exceeds the
local-variable limit.

If you write your own FFI, the same caps apply to *your* Lua: a single function
with hundreds of locals, or one that captures more than ~60 distinct outer
variables, will not load under Lua 5.1 — split it into smaller functions.
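
A quick way to see the local-variable cap for yourself is to generate source with a known number of locals and try to load it. This is a sketch only; `manyLocals` is a hypothetical generator and has nothing to do with how pslua chunks its output.

```lua
-- Lua 5.1 compiles strings with `loadstring`; 5.2+ folded that into `load`.
local compile = loadstring or load

-- Source for a chunk that declares `n` locals, one per line.
local function manyLocals(n)
  local lines = {}
  for i = 1, n do
    lines[i] = "local v" .. i .. " = " .. i
  end
  lines[n + 1] = "return v1"
  return table.concat(lines, "\n")
end

local fits = compile(manyLocals(150))
local tooMany, err = compile(manyLocals(250))

print(fits ~= nil and fits()) -- under the cap: loads and runs
print(tooMany, err)           -- over the cap: nil plus a load-time error
```

Splitting the 250 declarations across two functions would keep each one under the cap, which is the fix suggested above for hand-written FFI.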

## Deep recursion & stack safety

Lua has proper tail calls, so a function whose recursive call is in tail
position (`return go(next)`) loops in constant stack. But non-tail recursion —
especially building up a monadic computation by recursing — can overflow the Lua
stack at runtime, just as on other backends.

For stack-safe monadic loops use `Control.Monad.Rec.Class` (`tailRecM`,
`tailRecM2`, `forever`) instead of open recursion. Its `Effect` instance runs as
a real loop, so it stays in constant stack regardless of iteration count.
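
The difference between tail and non-tail recursion, shown in plain Lua. This is an illustrative sketch with hypothetical function names, not code generated by pslua or taken from `Control.Monad.Rec.Class`.

```lua
-- Tail-recursive: the recursive call is the whole `return` expression,
-- so Lua reuses the current stack frame.
local function countTail(n, acc)
  if n == 0 then
    return acc
  end
  return countTail(n - 1, acc + 1)
end

-- Not tail-recursive: the addition happens after the recursive call
-- returns, so every level keeps its frame alive.
local function countNaive(n)
  if n == 0 then
    return 0
  end
  return 1 + countNaive(n - 1)
end

print(countTail(1000000, 0))
-- Wrapped in pcall because this depth is expected to exhaust the Lua stack
-- on a default build.
print(pcall(countNaive, 1000000))
```

Both functions compute the same result for small inputs; only the shape of the recursive call decides whether the stack grows.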