Skip to content

Switch the IR pipeline to globally unique local names#150

Merged
Unisay merged 14 commits into
mainfrom
issue-139/guc-unique-names
Jul 3, 2026
Merged

Switch the IR pipeline to globally unique local names#150
Unisay merged 14 commits into
mainfrom
issue-139/guc-unique-names

Conversation

@Unisay

@Unisay Unisay commented Jul 3, 2026

Copy link
Copy Markdown
Collaborator

Closes #139.

The IR used to address locals by name plus a per-name De Bruijn index, and capture avoidance was index arithmetic spread over every traversal: substitute shifted replacements under same-name binders, passes that drop binders had to unshift, and each walker re-implemented the sequential Let scoping convention. Seven independent scoping bugs (#37, #56, #133, #134, plus the recurse-escape one fixed in #149) traced back to that one distributed invariant, and all of them stayed silent until a shadowed name reached the affected pass.

This PR moves the pipeline to the global-uniqueness condition, the discipline the Plutus compiler uses. A uniquify entry pass renames every local binder to be unique within its top-level binding (both shadowing and parallel duplicates). Everything downstream resolves a reference by name alone, and the index machinery is deleted rather than maintained, up to and including the index field of the Ref node itself: after the final commit a nonzero index is unrepresentable, not merely linted against.

The pieces, in commit order:

  • alphaEq, an index-aware alpha-equivalence for IR terms. removeIfWithEqualBranches uses it instead of structural equality, which would otherwise stop firing once freshening makes branch binder names diverge.
  • A new invariant in the pass vocabulary, UniqueBinders, with its own linter check. The checked runner (test suite always, CLI behind --lint-ir) verifies the declared contract of every pass at every boundary, including each fixpoint iteration. The magic-do discard binder _ is exempt from uniqueness and the linter flags any reference to it, which is what keeps the exemption sound.
  • uniquifyNames itself, with unit tests and three properties: output satisfies the invariants, output is alpha-equivalent to input, and the pass is idempotent.
  • The switch. Term-duplicating rewrites freshen the binders of inserted copies through the shared supply: substituteCopyM freshens every copy (local inlining and cross-binding inlining, where the source or the host may keep same-named binders alive), substituteMoveM keeps the first occurrence verbatim when the rewrite consumes the source binder (beta reduction, magic-do spine classification). A substitution that matches nothing draws no supply names, which keeps the optimize fixpoint convergent and golden numbering stable. shift, unshift, overFreeIndex and the capture-avoiding substitute are gone; DCE resolves references through a name-keyed scope and drops dead binders with no index repair; the old end-of-pipeline renameShadowedNames pass is deleted.
  • The checked runner caught one latent assumption during the switch: FlattenDeepBinds reused chain binder names as helper parameters, which was fine with indices and is a UniqueBinders violation now. Helper parameters are minted fresh instead.

A follow-up group of test commits widens the property coverage: the scoped-expression generator now produces Let bindings too, so the uniquify, freshening and full-pipeline properties exercise sequential Let scoping rather than just lambda binders; a DCE property pins the GUC contract on both sides (uniquified input, invariants intact on output); the --lint-ir failure rendering moved into the library and its exact text is unit-tested; and alphaEq gets a symmetry property on top of reflexivity.

The final commit removes the De Bruijn index from the Ref node altogether. Every producer of a nonzero index was already gone: the CoreFn translation emits references that resolve to the innermost binder of their name (which is what lexical scoping means), and the rewrites freshen binders instead of shifting indices. References carry a name only, scopes become name sets, and the IndicesZero invariant, its linter check, and the index arithmetic in alphaEq, countFreeRefs, qualifyTopRefs and the well-scopedness lint are deleted. The uniquify pass keeps its per-name rename stack (its depth seeds the digit-suffix numbering), so minted names are unchanged and golden.lua files are untouched by that commit; golden.ir files lose only the printed index field.

Generated Lua changes only in local variable names (one wave of structural golden churn, 76 files). All 23 eval/golden.txt oracles pass byte-identical, which is the semantic evidence that behaviour did not move. This unblocks #136 (float-in as a plain Pass requiring UniqueBinders) and #144.

Also in this branch: a chore commit adding tricorder to the dev shell, and a style commit paying off fourmolu debt in two modules this PR does not otherwise touch.

Unisay added 12 commits July 3, 2026 10:45
Structural Eq on branches misses pairs that differ only in binder
names; after the GUC switch (#139) freshening makes such pairs the
norm, silently disabling the rule. alphaEq compares references by
resolved binder position (lockstep levels), following
Note [Sequential scoping of Let bindings], so it is correct both
before the switch (shadowing, nonzero indices) and after.
The two halves of the global-uniqueness condition (GUC, #139) become
checkable: UniqueBinders (no binder name bound twice per top-level
site, the magic-do discard binder _ exempt while unreferenced) and
IndicesZero (every local ref at De Bruijn index 0). The checked runner
dispatches each declared invariant to its own linter. No pass declares
the new invariants yet, so behaviour is unchanged.

lintUberModule is renamed to lintWellScoped now that it is one check
of three; discardName moves to IR.Names so the linter and magic-do
share one definition.
A strengthened renameShadowedNamesInExpr: a used-names accumulator is
threaded sequentially through the whole top-level site (on top of the
scope map that resolves (name, index) references), so parallel
duplicate binders are renamed as well as shadowing ones — full
per-site uniqueness with every reference at index 0. Foreigns are
covered alongside bindings and exports, and the runtime lazy factory
name seeds the accumulator so it is never minted.

Not wired into the pipeline yet: that switch, with its golden churn,
comes separately.
uniquifyNames becomes the entry pass, establishing UniqueBinders +
IndicesZero (GUC) which every later pass declares and preserves; the
checked runner lints the contract at every pass boundary. Term
duplication now freshens binders through the shared supply:
substituteCopyM freshens every inserted copy (local and cross-entry
inlining), substituteMoveM keeps the first occurrence verbatim when
the source binder is consumed (beta reduction, magic-do spine
classification). Zero matching occurrences draw no supply names, which
keeps the optimize fixpoint convergent and golden numbering stable.

The De Bruijn index arithmetic is dismantled: shift/unshift/
overFreeIndex and the capture-avoiding substitute are gone, DCE
resolves references by name without index-keyed scopes and drops dead
binders with no unshift, magicDo and renameShadowedNames-era ordering
constraints disappear (the rename pass itself is deleted; uniquify
subsumes it). FlattenDeepBinds mints fresh helper parameters instead
of reusing chain binder names, which UniqueBinders now forbids.

One module-wide wave of structural golden churn (76 files); all 23
eval oracles pass unchanged.
The GHCi-daemon build/test monitor (github:atelier-hub/tricorder),
plus its cachix binary cache, for a faster feedback loop than
re-running cabal build.
freshenBinders and substituteCopyM/substituteMoveM had only indirect
coverage through the pipeline and goldens. Pin their contracts:
copy-vs-move freshening, no descent into insertions, free references
untouched, alpha-equivalence on GUC-shaped input, and the zero-match ⇒
zero-supply-draws guarantee the optimize fixpoint depends on. The
whole-pipeline property now asserts all three invariants, not just
well-scopedness.
Note [Sequential scoping of Let bindings] now also lists alphaEq and
the linter's unboundLocals among the walkers that must implement the
pre-GUC convention.
The scopedExp generator deliberately excluded Let while the sequential
scoping convention was each pass's job to re-implement; now that the
GUC properties (uniquify, freshenBinders, the full optimizer pipeline,
the linters) all ride on this generator, Let is exactly the binding
form they must see. Groupings thread the scope sequentially per Note
[Sequential scoping of Let bindings]: a Standalone RHS is generated
under the pre-binding scope, recursive-group members (distinct names)
see the whole group. Names come from the shared pool, so shadowing and
parallel duplicates arise naturally.
DCE resolves references by name alone now and requires GUC input, but
the spec neither supplied it nor checked that DCE keeps it. Add the
unit-level twin of the dcePass ensures-contract: on uniquified
scopedExp input (Let forms included since the previous commit), the
output satisfies all three invariants. Make the unused-parameter test
honest about the precondition by uniquifying the body and keeping the
minted binder out of its bound names, and drop the let*-resolution
test: two same-named Let siblings cannot reach DCE anymore, and the
sequential scoping it exercised is covered by Uniquify.Spec.
The --lint-ir failure message was rendered inline in the executable,
unreachable from the test suite. Move the rendering (verbatim) next to
the PassCheckFailure type and pin the exact user-facing text with a
unit test; the CLI handler now just dies with the rendered message.
Reflexivity was the only algebraic law pinned. Independent Gen.exp
pairs cover the mismatch branches; the flipped uniquify direction
covers the equivalent case, which the Uniquify.Spec property checks
only one way around.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Switches the PureScript→Lua backend’s IR pipeline from “(name, De Bruijn index)” locals to a global-uniqueness condition (GUC): locals are uniquely renamed once up-front and all local references become index 0, allowing downstream passes to resolve locals by name only and removing the need for index arithmetic across the optimizer/DCE.

Changes:

  • Add uniquifyNames entry pass + alphaEq, and introduce/validate new invariants (UniqueBinders, IndicesZero) via the checked pass runner and linter.
  • Update IR passes and tests to run under / preserve GUC (including FlattenDeepBinds helper parameter freshening).
  • Regenerate structural golden outputs (IR + Lua) reflecting only renamed locals; eval oracles remain unchanged per PR description.

Reviewed changes

Copilot reviewed 99 out of 102 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
test/ps/output/Golden.TailRecM2Shadow.Test/golden.lua Golden Lua churn from uniquified local names.
test/ps/output/Golden.RecursiveBindings.Test/golden.lua Golden Lua churn from uniquified local names.
test/ps/output/Golden.RecordsUpdate.Test/golden.lua Golden Lua churn from uniquified local names / formatting.
test/ps/output/Golden.RecordsUpdate.Test/golden.ir Golden IR churn from uniquified local names.
test/ps/output/Golden.RecordsAccess.Test/golden.lua Golden Lua churn from uniquified local names.
test/ps/output/Golden.RecordsAccess.Test/golden.ir Golden IR churn from uniquified local names.
test/ps/output/Golden.RecGroupOrder.Test/golden.lua Golden Lua churn from uniquified local names.
test/ps/output/Golden.RecGroupOrder.Test/golden.ir Golden IR churn from uniquified local names.
test/ps/output/Golden.ProfunctorDictLens.Test/golden.lua Golden Lua churn from uniquified local names / formatting.
test/ps/output/Golden.ProfunctorDictLens.Test/golden.ir Golden IR churn from uniquified local names.
test/ps/output/Golden.PatternMatching.Test2/golden.lua Golden Lua churn from uniquified local names.
test/ps/output/Golden.PatternMatching.Test2/golden.ir Golden IR churn from uniquified local names.
test/ps/output/Golden.PatternMatching.Test1/golden.lua Golden Lua churn from uniquified local names.
test/ps/output/Golden.PatternMatching.Test1/golden.ir Golden IR churn from uniquified local names.
test/ps/output/Golden.Newtype.Test/golden.lua Golden Lua churn from uniquified local names.
test/ps/output/Golden.Newtype.Test/golden.ir Golden IR churn from uniquified local names.
test/ps/output/Golden.NameShadowing.Test/golden.lua Golden Lua churn from uniquified local names.
test/ps/output/Golden.NameShadowing.Test/golden.ir Golden IR churn from uniquified local names.
test/ps/output/Golden.MaybeChainModule.Test/golden.lua Golden Lua churn from uniquified local names.
test/ps/output/Golden.MaybeChainModule.Test/golden.ir Golden IR churn from uniquified local names.
test/ps/output/Golden.MaybeChain.Test/golden.lua Golden Lua churn from uniquified local names.
test/ps/output/Golden.MaybeChain.Test/golden.ir Golden IR churn from uniquified local names.
test/ps/output/Golden.LongWriterBind.Test/golden.lua Golden Lua churn from uniquified local names (incl. kont numbering changes).
test/ps/output/Golden.LongReaderBind.Test/golden.lua Golden Lua churn from uniquified local names (incl. kont numbering changes).
test/ps/output/Golden.LongReaderBind.Test/golden.ir Golden IR churn from uniquified local names (incl. kont numbering changes).
test/ps/output/Golden.LongDoBlock.Test/golden.lua Golden Lua churn from uniquified local names.
test/ps/output/Golden.LongDoBlock.Test/golden.ir Golden IR churn from uniquified local names.
test/ps/output/Golden.LongCallbackChain.Test/golden.lua Golden Lua churn from uniquified local names (incl. kont numbering changes).
test/ps/output/Golden.LongCallbackChain.Test/golden.ir Golden IR churn from uniquified local names (incl. kont numbering changes).
test/ps/output/Golden.Issue37.Test/golden.lua Golden Lua churn from uniquified local names (regression coverage).
test/ps/output/Golden.HelloPrelude.Test/golden.lua Golden Lua churn from uniquified local names.
test/ps/output/Golden.HelloPrelude.Test/golden.ir Golden IR churn from uniquified local names.
test/ps/output/Golden.GenericEqTwoTypes.Test/golden.lua Golden Lua churn from uniquified local names.
test/ps/output/Golden.DerivedFunctor.Test/golden.lua Golden Lua churn from uniquified local names.
test/ps/output/Golden.DerivedFunctor.Test/golden.ir Golden IR churn from uniquified local names.
test/ps/output/Golden.Currying.Test/golden.lua Golden Lua churn from uniquified local names.
test/ps/output/Golden.Currying.Test/golden.ir Golden IR churn from uniquified local names.
test/ps/output/Golden.CharLiterals.Test/golden.lua Golden Lua churn from uniquified local names.
test/ps/output/Golden.CharLiterals.Test/golden.ir Golden IR churn from uniquified local names.
test/ps/output/Golden.CaseStatements.Test/golden.lua Golden Lua churn from uniquified local names / formatting.
test/ps/output/Golden.CaseStatements.Test/golden.ir Golden IR churn from uniquified local names / formatting.
test/ps/output/Golden.BugListGenericEq.Test/golden.lua Golden Lua churn from uniquified local names.
test/ps/output/Golden.Beta.Test/golden.lua Golden Lua churn from uniquified local names / formatting.
test/ps/output/Golden.Beta.Test/golden.ir Golden IR churn from uniquified local names.
test/ps/output/Golden.ArrayPatternMatch.Test/golden.lua Golden Lua churn from uniquified local names + IR-level sharing changes reflected in output.
test/ps/output/Golden.ArrayPatternMatch.Test/golden.ir Golden IR churn from uniquified local names + lifted Data.Ring.ringInt binding.
test/ps/output/Golden.ArrayOfUnits.Test/golden.lua Golden Lua churn from uniquified local names.
test/ps/output/Golden.ArrayOfUnits.Test/golden.ir Golden IR churn from uniquified local names.
test/ps/output/Golden.Annotations.M2/golden.lua Golden Lua churn from uniquified local names / formatting.
test/ps/output/Golden.Annotations.M2/golden.ir Golden IR churn from uniquified local names / formatting.
test/ps/output/Golden.Annotations.M1/golden.lua Golden Lua churn from uniquified local names / formatting.
test/ps/output/Golden.Annotations.M1/golden.ir Golden IR churn from uniquified local names / formatting.
test/Main.hs Wires the new Uniquify spec into the test suite.
test/Language/PureScript/Backend/IR/Uniquify/Spec.hs Adds unit + property tests for uniquification invariants and alpha-equivalence.
test/Language/PureScript/Backend/IR/Pass/Spec.hs Extends pass-runner specs to assert invariant-lint dispatch + CLI rendering.
test/Language/PureScript/Backend/IR/Optimizer/Spec.hs Updates optimizer specs to assert post-pipeline GUC invariants and alpha-equal branch removal.
test/Language/PureScript/Backend/IR/Linter/Spec.hs Adds focused specs for UniqueBinders/IndicesZero and discard-binder behavior.
test/Language/PureScript/Backend/IR/Gen.hs Extends generator to produce Let bindings and recursive groups (sequential scoping coverage).
test/Language/PureScript/Backend/IR/DCE/Spec.hs Updates DCE specs/properties to operate under (and preserve) GUC.
pslua.cabal Exposes the new Language.PureScript.Backend.IR.Uniquify module and test.
lib/Language/PureScript/CoreFn/Laziness.hs Minor import/style adjustment.
lib/Language/PureScript/Backend/Lua/Types.hs Minor type-signature style fix (, ).
lib/Language/PureScript/Backend/Lua.hs Updates local-ref comment to reflect GUC/uniquify entry pass.
lib/Language/PureScript/Backend/IR/Uniquify.hs Implements the uniquify entry pass establishing GUC.
lib/Language/PureScript/Backend/IR/Supply.hs Updates shared-supply documentation for new freshening usage sites.
lib/Language/PureScript/Backend/IR/Pass.hs Adds invariants vocabulary, per-invariant linting, and renderPassCheckFailure.
lib/Language/PureScript/Backend/IR/Names.hs Introduces discardName (_) semantics for UniqueBinders linting.
lib/Language/PureScript/Backend/IR/Linter.hs Splits linting into lintWellScoped/lintUniqueBinders/lintIndicesZero and adds new violations.
lib/Language/PureScript/Backend/IR/FlattenDeepBinds.hs Updates pass to freshen helper parameters under UniqueBinders (GUC).
exe/Main.hs Switches CLI error rendering to library renderPassCheckFailure.
flake.nix Adds tricorder to devshell; adds atelier.cachix.org substituter + trusted key.
changelog.d/20260703_180000_unisay_guc_unique_names.md Changelog entry describing the GUC switch and new --lint-ir checks.

Comment thread lib/Language/PureScript/Backend/IR/Linter.hs Outdated
Comment thread flake.nix
@Unisay Unisay self-assigned this Jul 3, 2026
@Unisay Unisay marked this pull request as ready for review July 3, 2026 12:16
Unisay added 2 commits July 3, 2026 14:18
- Linter.hs:88 — emit at most one RefToDiscard per site: occurrences
  are indistinguishable (the violation carries no location), so
  per-occurrence entries were pure noise
  (#150 (comment))
Every producer of a nonzero index is gone: the CoreFn translation
emits references that resolve to the innermost binder of their name
(which is what lexical scoping means), and the optimizer rewrites
freshen binders instead of shifting indices. The Ref node therefore
carries a name only, and the IndicesZero invariant, its linter check,
and the index arithmetic in alphaEq, countFreeRefs, qualifyTopRefs,
uniquify and the well-scopedness lint are deleted; scopes become name
sets. GUC is UniqueBinders alone: a nonzero index is unrepresentable
rather than linted against.

The uniquify pass keeps its per-name rename stack (the stack depth
seeds the digit-suffix numbering), so minted names are unchanged:
golden.lua files are untouched and eval oracles pass byte-identical;
golden.ir files lose only the printed index field.

The Lua backend renders every local reference as a plain variable,
which retires the UnexpectedRefBound error. The inliner test that
expected a verbatim inserted copy now compares up to alpha-equivalence
(substituteCopyM freshens the binders of every insertion) — the
verbatim expectation was latently flaky whenever the generated inlinee
contained a binder.
@Unisay Unisay merged commit c6fe8e6 into main Jul 3, 2026
2 checks passed
@Unisay Unisay deleted the issue-139/guc-unique-names branch July 3, 2026 13:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Design: globally unique local names (GUC) in the IR pipeline

2 participants