fix(sql): return the last N rows for LIMIT -N over a covering index#7182
Conversation
A negative LIMIT over a posting/covering index with a residual filter silently returned the first N rows instead of the last N. The covering factory ignored the requested scan order when serving a page-frame cursor: getPageFrameCursor always opened ascending partition frames while getScanDirection() advertised FORWARD, so the parallel negative-limit machinery collected the lowest-timestamp rows believing they were the highest. Single-key covering queries now honor ORDER_DESC with a genuine backward scan: DESC partition iteration plus per-partition row-range sub-frames emitted high-range-first, each read forward so rows stay ascending within a frame. This mirrors BwdTableReaderPageFrameCursor and stays correct when a partition holds more than maxRowsPerFrame matching rows, where a partition-reverse-only fix would not. Parallel filtering is preserved for this shape. Multi-key covering is only per-key-per-partition ordered, never globally timestamp-ordered, so a correct last-N-by-ts cannot come from it without a k-way merge. When the limit may be negative (a constant below zero, or a bind variable whose sign is unknown at compile time), codegen routes multi-key queries to the serial FilteredRecordCursorFactory, where LimitRecordCursorFactory computes last-N via size plus skip. A defensive guard in getPageFrameCursor rejects a backward scan on the multi-key path. getScanDirection() stays FORWARD so ORDER BY ts DESC is not wrongly elided onto the ascending cursor. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
WalkthroughThis PR fixes a correctness bug where ChangesCovering Index Negative LIMIT and Ordering Fixes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Multi-key queries over a covering (posting) index returned rows grouped by key -- each key's posting list drained in full before the next -- instead of in designated-timestamp order. That disagreed with the plain index path on the same data for every shape that expects timestamp order: bare scans, positive and negative LIMIT, and even an explicit ORDER BY ts, whose sort was elided onto the unordered cursor. Both covering cursors now k-way merge the per-key cursors by row id, mirroring HeapRowCursorFactory on the non-covering path: - the serial record cursor holds one open cursor per key and emits the smallest row id each step; - the parallel page-frame cursor merges keys into timestamp-ordered frames (symbol key written per row), chunked at maxRowsPerFrame with the merge state resumed across frames. Separately, ORDER BY ts DESC with a LIMIT returned the wrong end even for a single key: codegen pushed the limit into the ascending parallel scan ahead of the sort, truncating the oldest rows so the sort then reversed those. The limit is now pushed into the scan only when the covering forward scan order is already the query's final order; otherwise the outer LimitRecordCursorFactory applies it after the sort. Multi-key negative limits still use the serial path, as the page-frame merge is forward-only. Updated the existing covering tests that had encoded the old key-grouped order, and added a cross-check suite that compares the covering path against the no_covering path across data layouts, filters, limits and orderings. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
core/src/test/java/io/questdb/test/cairo/covering/CoveringIndexMultiKeyOrderingTest.java (2)
336-339: ⚡ Quick winConsider using
assertQueryNoLeakCheck()for factory property validation.The current implementation uses
printSql+Assert.assertEquals, which only validates data correctness.assertQueryNoLeakCheck()additionally validates factory properties (supportsRandomAccess, expectSize, expectedTimestamp), providing more comprehensive test coverage.🔍 Example refactor
If there's a suitable
assertQueryNoLeakCheckoverload that accepts expected results as a string:private void assertResult(String expected, String query) throws SqlException { - printSql(query); - Assert.assertEquals(query, expected, sink.toString()); + assertQueryNoLeakCheck(expected, query, null, null, true, true); }Note: Adjust the overload and parameters based on the actual AbstractCairoTest API.
Based on learnings: Use
assertQueryNoLeakCheck()to assert the results of queries. This method asserts factory properties (supportsRandomAccess, expectSize, expectedTimestamp) in addition to data correctness.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@core/src/test/java/io/questdb/test/cairo/covering/CoveringIndexMultiKeyOrderingTest.java` around lines 336 - 339, Replace the current assertion that only checks data correctness in assertResult (which calls printSql and Assert.assertEquals against sink.toString()) with a call to assertQueryNoLeakCheck so factory properties are also validated; locate the assertResult method and its callers and change the assertion to use assertQueryNoLeakCheck with the same expected string and query (adjust parameters to the available overload on AbstractCairoTest), removing the manual printSql/Assert.assertEquals usage that compares sink.toString().
127-129: 💤 Low valueConsider using Java 17 text blocks for SQL statements.
The INSERT statements use string concatenation. Text blocks (multiline string literals) would improve readability and align with modern Java 17 practices recommended in the coding guidelines.
📝 Example refactor for line 127-129
- execute("INSERT INTO t_disj " + - "SELECT dateadd('d', x::INT, '2024-01-01T00:00:00Z'::TIMESTAMP), " + - "CASE WHEN x <= 5 THEN 'A' ELSE 'B' END, x::DOUBLE FROM long_sequence(10)"); + execute(""" + INSERT INTO t_disj + SELECT dateadd('d', x::INT, '2024-01-01T00:00:00Z'::TIMESTAMP), + CASE WHEN x <= 5 THEN 'A' ELSE 'B' END, + x::DOUBLE + FROM long_sequence(10) + """);Apply similar transformation to other INSERT statements throughout the file.
As per coding guidelines: Use modern Java 17 features: enhanced switch, multiline string literals, and pattern variables in instanceof checks.
Also applies to: 157-159, 210-211, 225-226, 291-293, 345-347
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@core/src/test/java/io/questdb/test/cairo/covering/CoveringIndexMultiKeyOrderingTest.java` around lines 127 - 129, Replace the concatenated SQL string passed to execute(...) in CoveringIndexMultiKeyOrderingTest (the INSERT INTO t_disj statement) with a Java 17 text block to improve readability; locate the execute(...) call in the test class (CoveringIndexMultiKeyOrderingTest) and convert the multi-part string to a single triple-quoted text block containing the full SQL, then apply the same refactor to the other INSERT statements mentioned (the execute(...) calls around lines referenced: 157-159, 210-211, 225-226, 291-293, 345-347) so all SQL literals use multiline string literals instead of concatenation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@core/src/test/java/io/questdb/test/cairo/covering/CoveringIndexMultiKeyOrderingTest.java`:
- Around line 68-93: The two static String[] fields are out of alphabetical
order; move EXPLICIT_ORDER_TAILS so it appears before TS_ORDER_TAILS to satisfy
member ordering rules. Edit the class to swap the declarations of
EXPLICIT_ORDER_TAILS and TS_ORDER_TAILS (referencing the exact symbols
EXPLICIT_ORDER_TAILS and TS_ORDER_TAILS) so static fields remain grouped and
sorted alphabetically without changing their contents.
---
Nitpick comments:
In
`@core/src/test/java/io/questdb/test/cairo/covering/CoveringIndexMultiKeyOrderingTest.java`:
- Around line 336-339: Replace the current assertion that only checks data
correctness in assertResult (which calls printSql and Assert.assertEquals
against sink.toString()) with a call to assertQueryNoLeakCheck so factory
properties are also validated; locate the assertResult method and its callers
and change the assertion to use assertQueryNoLeakCheck with the same expected
string and query (adjust parameters to the available overload on
AbstractCairoTest), removing the manual printSql/Assert.assertEquals usage that
compares sink.toString().
- Around line 127-129: Replace the concatenated SQL string passed to
execute(...) in CoveringIndexMultiKeyOrderingTest (the INSERT INTO t_disj
statement) with a Java 17 text block to improve readability; locate the
execute(...) call in the test class (CoveringIndexMultiKeyOrderingTest) and
convert the multi-part string to a single triple-quoted text block containing
the full SQL, then apply the same refactor to the other INSERT statements
mentioned (the execute(...) calls around lines referenced: 157-159, 210-211,
225-226, 291-293, 345-347) so all SQL literals use multiline string literals
instead of concatenation.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 972b7007-d4aa-4a50-83ee-4b68f9ffeb82
📒 Files selected for processing (4)
core/src/main/java/io/questdb/griffin/SqlCodeGenerator.javacore/src/main/java/io/questdb/griffin/engine/table/CoveringIndexRecordCursorFactory.javacore/src/test/java/io/questdb/test/cairo/covering/CoveringIndexMultiKeyOrderingTest.javacore/src/test/java/io/questdb/test/cairo/covering/CoveringIndexTest.java
| private static final String[] TS_ORDER_TAILS = { | ||
| "", | ||
| "LIMIT 3", | ||
| "LIMIT 7", | ||
| "LIMIT 100", | ||
| "LIMIT -3", | ||
| "LIMIT -7", | ||
| "LIMIT -100", | ||
| "LIMIT 2,5", | ||
| "LIMIT -5,-2", | ||
| "ORDER BY ts", | ||
| "ORDER BY ts LIMIT 3", | ||
| "ORDER BY ts LIMIT -3", | ||
| "ORDER BY ts DESC", | ||
| "ORDER BY ts DESC LIMIT 3", | ||
| "ORDER BY ts DESC LIMIT -3", | ||
| }; | ||
|
|
||
| // Shapes with an explicit non-timestamp order; both paths route through the | ||
| // same sort / key-grouping, so they must agree regardless of the bug. | ||
| private static final String[] EXPLICIT_ORDER_TAILS = { | ||
| "ORDER BY sym", | ||
| "ORDER BY sym, ts", | ||
| "ORDER BY price", | ||
| "ORDER BY price DESC", | ||
| }; |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win
Reorder static fields alphabetically.
The static fields violate the alphabetical ordering requirement. EXPLICIT_ORDER_TAILS should appear before TS_ORDER_TAILS.
📋 Proposed reordering
- // Shapes that expect global timestamp order; the covering path must match the
- // plain path on all of them.
- private static final String[] TS_ORDER_TAILS = {
- "",
- "LIMIT 3",
- "LIMIT 7",
- "LIMIT 100",
- "LIMIT -3",
- "LIMIT -7",
- "LIMIT -100",
- "LIMIT 2,5",
- "LIMIT -5,-2",
- "ORDER BY ts",
- "ORDER BY ts LIMIT 3",
- "ORDER BY ts LIMIT -3",
- "ORDER BY ts DESC",
- "ORDER BY ts DESC LIMIT 3",
- "ORDER BY ts DESC LIMIT -3",
- };
-
// Shapes with an explicit non-timestamp order; both paths route through the
// same sort / key-grouping, so they must agree regardless of the bug.
private static final String[] EXPLICIT_ORDER_TAILS = {
"ORDER BY sym",
"ORDER BY sym, ts",
"ORDER BY price",
"ORDER BY price DESC",
};
+
+ // Shapes that expect global timestamp order; the covering path must match the
+ // plain path on all of them.
+ private static final String[] TS_ORDER_TAILS = {
+ "",
+ "LIMIT 3",
+ "LIMIT 7",
+ "LIMIT 100",
+ "LIMIT -3",
+ "LIMIT -7",
+ "LIMIT -100",
+ "LIMIT 2,5",
+ "LIMIT -5,-2",
+ "ORDER BY ts",
+ "ORDER BY ts LIMIT 3",
+ "ORDER BY ts LIMIT -3",
+ "ORDER BY ts DESC",
+ "ORDER BY ts DESC LIMIT 3",
+ "ORDER BY ts DESC LIMIT -3",
+ };As per coding guidelines: Java class members must be grouped by kind (static vs. instance) and visibility, and sorted alphabetically.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| private static final String[] TS_ORDER_TAILS = { | |
| "", | |
| "LIMIT 3", | |
| "LIMIT 7", | |
| "LIMIT 100", | |
| "LIMIT -3", | |
| "LIMIT -7", | |
| "LIMIT -100", | |
| "LIMIT 2,5", | |
| "LIMIT -5,-2", | |
| "ORDER BY ts", | |
| "ORDER BY ts LIMIT 3", | |
| "ORDER BY ts LIMIT -3", | |
| "ORDER BY ts DESC", | |
| "ORDER BY ts DESC LIMIT 3", | |
| "ORDER BY ts DESC LIMIT -3", | |
| }; | |
| // Shapes with an explicit non-timestamp order; both paths route through the | |
| // same sort / key-grouping, so they must agree regardless of the bug. | |
| private static final String[] EXPLICIT_ORDER_TAILS = { | |
| "ORDER BY sym", | |
| "ORDER BY sym, ts", | |
| "ORDER BY price", | |
| "ORDER BY price DESC", | |
| }; | |
| // Shapes with an explicit non-timestamp order; both paths route through the | |
| // same sort / key-grouping, so they must agree regardless of the bug. | |
| private static final String[] EXPLICIT_ORDER_TAILS = { | |
| "ORDER BY sym", | |
| "ORDER BY sym, ts", | |
| "ORDER BY price", | |
| "ORDER BY price DESC", | |
| }; | |
| // Shapes that expect global timestamp order; the covering path must match the | |
| // plain path on all of them. | |
| private static final String[] TS_ORDER_TAILS = { | |
| "", | |
| "LIMIT 3", | |
| "LIMIT 7", | |
| "LIMIT 100", | |
| "LIMIT -3", | |
| "LIMIT -7", | |
| "LIMIT -100", | |
| "LIMIT 2,5", | |
| "LIMIT -5,-2", | |
| "ORDER BY ts", | |
| "ORDER BY ts LIMIT 3", | |
| "ORDER BY ts LIMIT -3", | |
| "ORDER BY ts DESC", | |
| "ORDER BY ts DESC LIMIT 3", | |
| "ORDER BY ts DESC LIMIT -3", | |
| }; |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@core/src/test/java/io/questdb/test/cairo/covering/CoveringIndexMultiKeyOrderingTest.java`
around lines 68 - 93, The two static String[] fields are out of alphabetical
order; move EXPLICIT_ORDER_TAILS so it appears before TS_ORDER_TAILS to satisfy
member ordering rules. Edit the class to swap the declarations of
EXPLICIT_ORDER_TAILS and TS_ORDER_TAILS (referencing the exact symbols
EXPLICIT_ORDER_TAILS and TS_ORDER_TAILS) so static fields remain grouped and
sorted alphabetically without changing their contents.
MultiKeyCoveringPageFrameCursor.fillMergedFrame allocated a full INITIAL_CAPACITY buffer set before the merge loop discovered the partition was drained and returned null. Because nextImpl re-enters fillMergedFrame once more per partition to learn it is exhausted, every merged partition paid one buffer allocation it never used -- for VARCHAR/STRING/ARRAY columns that is an aux buffer plus a capacity*32 var-data buffer per column, plus the symbol buffer. fillMergedFrame now scans the merge heads first and returns null before allocFrameBuffers() when every head is drained. The scan short-circuits on the first present head, so data-bearing frames pay a single comparison; the drained call skips the allocation. The early return is equivalent to the prior best < 0 / count == 0 path, and the skipped buffers belonged to a frame that was never produced, so no dispatched frame is affected. The single-key fillFrameForKey already guards its allocation behind the empty-cursor check; this brings the multi-key merge in line. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Code reviewReviewed the full change surface, not just the diff. The core fix is correct and well-engineered: the descending scan replicates Verdict: Approve with minor changes. CriticalNone. ModerateM1 — PR description overstates the performance benefit. The Tradeoffs bullet says single-key negative-limit queries "now read fewer rows for small M2 — Multi-key k-way merge uses an O(N)/row linear min-scan; the "mirrors HeapRowCursorFactory" comment is inaccurate. M3 — Test coverage gaps on the new paths. Two worth closing:
M4 — Minorm1 — m2 — Pre-existing error-path leak in m3 — Stale comment. m4 — Boolean field naming. m5 — Duplicated merge logic. The min-head selection loop and the park-before-probe partition-open loop are near-identical in m6 — m7 — Test convention. The explicit-value tests ( m8 — Labels. Consider adding Performance (k-way merge + buffer-alloc commit) and regression (9.4.0 shipped Verified safe (checked and cleared)
SummaryMerge after addressing the description accuracy (M1), member ordering (M4), and ideally the var-width / filter-drop negative-limit tests (M3). The performance items (M2) and minors are non-blocking. The multi-key output order now matches the plain-index path (timestamp-ordered), which is why 7 existing assertions were updated — a correct, user-visible consistency improvement. 🤖 Generated with Claude Code |
Heads-up: k-way merge is introduced in #7150 (there is a separate bug which requires k-way merge similar to FilterOnValues cursor's one). |
wrapCoveringWithFilter created the limit function and received a CoveringIndexRecordCursorFactory (which already owns the partition frame cursor factory) but had no cleanup on its own throw paths. A throw from mayBeNegativeLimit's init(), collectColumnIndexes, the worker-filter JIT compile, or the AsyncFilteredRecordCursorFactory constructor leaked the covering factory's cursors and key functions, the filter, and the freshly created limit function. The only enclosing handler freed the partition frame cursor factory alone. The method body now runs under a try/catch(Throwable) that frees the filter, the limit function, and the covering factory before rethrowing. Closing the covering factory also closes the partition frame cursor factory, so the two callers now null their own reference once they hand it to the covering factory, and the outer catch no longer double-frees it. The method frees the limit function through x = Misc.free(x) on the non-pushed and serial-fallback paths, keeping the catch idempotent when it was already released or transferred. Both factory constructors take no effective ownership on failure -- the half-built object never reaches a caller, so close() never runs -- so freeing all three inputs in the catch is correct. Add four negative-limit tests to CoveringIndexMultiKeyOrderingTest: - var-width covered columns (VARCHAR with a NULL, DOUBLE[]) on the single-key backward path, at default cap and 2-row sub-frames - the same columns on the multi-key merge and serial-fallback paths - a 10k-row single partition with LIMIT -5000 that forces a growFrameBuffers mid-fill relocation - a 2-row cap with a residual filter that drops rows in the surviving top descending sub-frame under a negative limit CoveringIndexMultiKeyOrderingTest 19/19 and CoveringIndexTest 378/378 pass under assertMemoryLeak. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- nextImplDescending now derives the sub-frame low bound from Math.max(1, maxRowsPerFrame), so a test-only cap of 0 cannot stall the descending loop (subLo == descSubHi). - Move mayBeNegativeLimit, with its javadoc, into alphabetical order between lookupColumnIndexesUsingVanillaNames and prepareLatestByColumnIndexes. - Restore the fillFrameForKey rowCap/pendingRowCursor contract javadoc onto fillFrameForKey and give allocFrameBuffers its own short doc; the cap split had left the contract sitting on allocFrameBuffers. - Correct the MultiKeyCoveringCursor merge comment: it matches the result order HeapRowCursorFactory produces but uses a linear O(R*N) min-scan, not that path's O(log N) heap poll. - The unreachable backward-multi-key guard now throws CairoException.nonCritical() instead of critical(0), which falsely implied storage corruption. - Reword a stale comment: an empty multiKeys list makes hasNext()'s merge find no heads and openNextPartitionCursors() open nothing, rather than the never-called advanceKey() reporting no rows. CoveringIndexMultiKeyOrderingTest 19/19 and CoveringIndexTest 378/378 pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Two moderate items I'd like to highlight. They're not blockers and are more or less essential for the chosen fix, but that's something we could improve in the future patches. M1 — Negative-limit descending path over-reads/over-materializes [in-diff, perf]SingleKeyCoveringPageFrameCursor.nextImplDescending (CoveringIndexRecordCursorFactory.java:2366) sizes each backward sub-frame to a fixed maxRowsPerFrame (default 1,000,000) window of row-ids and calls fillFrameForKey(..., Integer.MAX_VALUE), which materializes every matching row in that window into native buffers (writeCoveredRow) before the consumer reads anything. AsyncFilteredNegativeLimitRecordCursor.fetchAllFrames (:215) cancels only after the first frame is fully built. So ... LIMIT -5 over a hot key in a large latest partition copies up to ~1M rows to return 5. Unlike the standard backward cursor, this path ignores pageFrameMinRows/sharedQueryWorkerCount — BwdTableReaderPageFrameCursor:480 sizes frames via calculatePageFrameRowLimit(...). The per-row copy is inherent to covering (no mmap'd column to point at, unlike the zero-copy BwdTableReaderPageFrameCursor), but the fixed 1M window is an avoidable parameter choice. Note: enabling covering can therefore be slower than the plain index for large LIMIT -N. Confirmed independently by Agents 3, 10, 11. Mitigation: thread the limit N down to bound the first sub-frame, and/or adopt calculatePageFrameRowLimit for consistency. M2 — Multi-key negative-limit serial fallback scans ~2–3× [in-diff, perf tradeoff]The PR routes multi-key negative/bind-var limits to FilteredRecordCursorFactory + LimitRecordCursorFactory. Because MultiKeyCoveringCursor.size() returns -1 (:1924), LimitRecordCursorFactory.ensureBoundsResolved (LimitRecordCursorFactory.java:279) runs a full calculateSize counting pass, then toTop + skip + take — reconstructing the k-way merge ~2–3 times. The PR explicitly accepts this throughput tradeoff, and it's correct. But SingleKeyCoveringCursor.size() (:2202) already shows the pattern: a cheap MultiKeyCoveringCursor.size() |
The CI "Applying formatting" step runs IntelliJ's CLI formatter and then
fails the build via "git diff --exit-code" if it changes anything. Three
.returns("""...""") text blocks had their content and closing delimiter
committed at the same indent as the opening .returns call, while the
formatter expects the content indented eight spaces deeper (matching every
other text block in the file).
Reindent the three blocks to the formatter's expectation. The change is
whitespace-only: Java strips the common leading indentation of a text
block based on its least-indented line, so the asserted string values are
unchanged and the tests still verify the same output.
This unblocks the "Rust Test and Lint on linux-jdk25" job, where the
formatting gate runs.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
/azp run macwin |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run macwin |
|
Azure Pipelines successfully started running 1 pipeline(s). |
[PR Coverage check]😍 pass : 243 / 256 (94.92%) file detail
|
Resolve conflicts from upstream PRs #7182 (LIMIT -N over a covering index), #7187 (fluent QueryAssertion builder) and #7157 (ORDER BY native-memory byte caps). CoveringIndexRecordCursorFactory: both this branch and #7182 independently reworked the multi-key merge to emit rows in row-id (ts-ascending) order. Take upstream's version as the base -- its min-scan merge plus the new single-key LIMIT -N backward scan -- then re-apply this branch's additive fixes on top: getScanDirection(), duplicate IN-key dedup, and the TablePageFrameCursor promotion (fixes the parallel-keyed-group-by-under-SelectedRecord ClassCastException). getScanDirection() reports OTHER only for multi-key latestBy; single-key latestBy stays FORWARD (it returns one row, trivially ts-ordered) so the designated-timestamp metadata upstream's tests expect is preserved. EncodedSortRecordCursor: keep upstream's extracted throwLimitOverflow(). Tests: take upstream's fluent-builder assertions for shared cases and keep this branch's new regression tests. #7187 removed the legacy positional assertQuery()/assertQueryNoLeakCheck() helpers, so migrate ~35 remaining call sites across 10 test files to the fluent builder. Verified: core test-compile clean; CoveringIndexTest (394), ExplainPlanTest (534), AsOfJoinTest (115), CompiledFilterTest (50), LimitTest (47) and the migrated group-by / distinct / IPv4 / array tests all green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Bring in master's #7182 ("return the last N rows for LIMIT -N over a covering index"). Its production changes to SqlCodeGenerator and CoveringIndexRecordCursorFactory apply cleanly; this branch touched neither file. The only conflict was in CoveringIndexTest.java. Master added four LIMIT -N regression tests (testNegativeLimit*) plus two private helpers, getPlan and assertPlanDoesNotContain. This branch had already deleted both helpers when it migrated covering-index plan checks to the QueryAssertion builder (.assertsPlanContaining / .assertsPlanNotContaining). Resolution: keep master's four new tests and restore getPlan, which those tests call. Drop assertPlanDoesNotContain, which now has no callers - this matches the branch's deletion and master's intent of adding the tests. The new tests still use assertSql/getPlan rather than the builder; migrating them is left as a follow-up since they exercise the parallel Async Filter path whose factory does not fit the builder's full random-access battery without per-test tuning. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Fixes #7181
Summary
SELECT ... LIMIT -Nover a covering (posting) index with a residual filter returned the first N rows instead of the last N. The result was silently wrong — no error, no warning. This affected single-key queries (WHERE sym = 'A' AND price > 0 LIMIT -3) and the same shape with a multi-key (sym IN (...)) predicate or a bind-variable limit.The covering factory ignored the requested scan order when serving a page-frame cursor:
getPageFrameCursor(executionContext, order)always opened ascending partition frames, whilegetScanDirection()advertisedFORWARD. The parallel negative-limit machinery then collected the lowest-timestamp rows believing they were the highest.Fix
The two query shapes need different handling because they have different ordering guarantees.
Single-key covering queries are globally timestamp-ordered (partitions ascending, rows ascending within a partition), so a backward scan yields the true last N rows.
getPageFrameCursornow honorsORDER_DESC:maxRowsPerFramewide and emit the highest row-range sub-frame first, each read forward so rows stay ascending within a frame.This mirrors
BwdTableReaderPageFrameCursorand stays correct when a partition holds more thanmaxRowsPerFramematching rows — a case where a partition-reverse-only fix would still return wrong results. Parallel filtering is preserved for this shape.Multi-key covering is only per-key-per-partition ordered, never globally timestamp-ordered, so a correct last-N-by-timestamp cannot be produced from it without a k-way merge. When the limit may be negative (a constant below zero, or a bind variable whose sign is unknown at compile time), codegen routes multi-key queries to the serial
FilteredRecordCursorFactory, whereLimitRecordCursorFactorycomputes last-N via size plus skip. A defensive guard ingetPageFrameCursorrejects a backward scan on the multi-key path.getScanDirection()staysFORWARDsoORDER BY ts DESCis not wrongly elided onto the ascending cursor.Tradeoffs
N(early termination from the high end), instead of scanning ascending and discarding.Limitnode. The bind-variable case takes the serial path even when the value turns out positive at runtime, because the sign is unknown at compile time. This is a correctness-for-throughput trade on an otherwise silently-wrong path; positive-limit and no-limit multi-key queries are unaffected and still run in parallel.Test plan
CoveringIndexTest:LIMIT -3returns8,9,10), including a filter that eliminates the low rows and a larger limit spanning all partitionssetMaxRowsPerFrameForTesting(2), which a partition-reverse-only fix would still fail, across-3/-4/-5to cross sub-frame boundariesAsync Filterpath and returns the correct tail, while a positive multi-key limit still doesCoveringIndexTest(378),PostingIndexCriticalIssuesTest(40), andCoveringCompressorTest(17) pass.Summary by CodeRabbit
New Features
Bug Fixes
Tests