Conversation
SJrX
pushed a commit
that referenced
this pull request
Aug 3, 2018
SJrX
pushed a commit
that referenced
this pull request
Jun 25, 2026
…iant When a value is well-formed but invalid, an ambiguous grammar can yield several full parses that tokenize the same string differently, each with a different first-invalid token. validate() used to keep whichever the lazy stream yielded first, so the reported squiggle/quick-fix depended on incidental combinator iteration order and nothing pinned it. Report instead the bad token from the parse that stayed valid the longest (max start offset), mirroring how SyntaxError reports the furthest offset reached. That rule is invariant under iteration order; the only residual tie — two parses whose first-invalid token starts at the same offset (two enums over the same shape) — is broken by stream order, i.e. the earlier-declared alternative, which an author can steer. Adds ParseTest coverage: order-invariance across both branch orderings, and the declaration-order tie-break. Refs review finding #3 on PR #490. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
SJrX
added a commit
that referenced
this pull request
Jun 25, 2026
#490) * feat: add opt-in list-of-successes grammar engine + dual-engine test infra Grows a second matching method, Combinator.parse(), alongside the existing SyntacticMatch/SemanticMatch on every combinator. Where the old engine commits to one greedy result over two near-identical passes, parse() returns every way a combinator can proceed (lazily) and folds the strict semantic check into a `valid` flag per token, so a single lenient pass answers both "could this be colored?" and "is this actually valid?". Failure is a value (Stuck) rather than an empty result, so error localization (and, later, completion) falls out of the return value with no side channel. Nothing in the 200+ production grammars changes. The new path is gated behind an ExperimentalSettings project flag (surfaced as a checkbox on the settings page), so the original engine remains the default and no user-visible behaviour changes. To prove parity, GrammarOptionValue.FORCE_PARSE_ENGINE lets the whole unit-test suite run against the new engine via -Dsystemd.unit.grammarParseEngine=true: - build.gradle.kts forwards the system property to the test task - GitHub Actions gains a grammarParseEngine: [false, true] matrix leg - ci/release.Jenkinsfile runs the suite a second time in a parallel pod with the engine forced on, so every release build validates the experimental path Only validation is forced (not cosmetics), so problem counts are identical and only exact error spans differ; the one such case (SocketBind) is made mode-aware. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * docs: clarify LiteralChoiceTerminal.parse comment; drop unused fullParses Replace the ":"/"::" comment example (those are two separate terminals in the grammar, not choices of one) with CollectMode's ("inactive", "inactive-or-failed"), a real prefix-overlap case, and explain why offering every match avoids greedy dead-ends. Remove Combinator.fullParses, which was never referenced (validate() inlines the equivalent full-parse check). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test: port GrammarTest to the new engine (GrammarParseTest) While both engines ship side by side we want full parity coverage of parse() on every combinator, so GrammarParseTest mirrors GrammarTest method-for-method, driving the new list-of-successes engine instead of SyntacticMatch/SemanticMatch. A small View bridge recovers the MatchResult shape so each port reads like the original (syntactic = longest consumption ignoring validity; semantic = longest all-valid consumption). Two intentional differences are asserted and commented rather than hidden: - FlexibleLiteralChoiceTerminal: the old engine prefers an exact choice over the lenient shape regex; parse() currently runs only the regex, so "abcxx" against ("abc","defg") matches the greedy "abcx" (invalid) instead of "abc". Worth resolving before the engine becomes default. - Repeat zero-rep matches: the old engine reported longestMatch = 0; the new one reports the actual end offset of the empty match. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix: make SemanticError token selection deterministic and order-invariant When a value is well-formed but invalid, an ambiguous grammar can yield several full parses that tokenize the same string differently, each with a different first-invalid token. validate() used to keep whichever the lazy stream yielded first, so the reported squiggle/quick-fix depended on incidental combinator iteration order and nothing pinned it. Report instead the bad token from the parse that stayed valid the longest (max start offset), mirroring how SyntaxError reports the furthest offset reached. That rule is invariant under iteration order; the only residual tie — two parses whose first-invalid token starts at the same offset (two enums over the same shape) — is broken by stream order, i.e. the earlier-declared alternative, which an author can steer. Adds ParseTest coverage: order-invariance across both branch orderings, and the declaration-order tie-break. Refs review finding #3 on PR #490. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * docs: correct the SemanticError tie-break note in validate() The tie-break is "first in stream order," which only equals "earlier-declared alternative" when the tie comes from an AlternativeCombinator. Ties from other combinators follow their own order (LiteralChoiceTerminal longest-first, repetitions shorter-count-first), so the previous wording overclaimed that an author can always steer it by ordering. Clarified both the doc block and the inline note. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Steve Ramage <gitcommits@sjrx.net> Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
…sts.