Overhaul test integration in build system#7851
Conversation
c5d6f7c to
1a56762
Compare
c79e655 to
5d67be5
Compare
|
Awesome! Thanks for tackling this! I'm very very happy to have real isolated tests and also the ability to run individual tests. @mqudsi can you say a bit about why you chose CTest? I don't know much about it - is it actually good, or just convenient? I see the configure-time test discovery step, I worry about that since it means you have to reconfigure to pick up new tests. |
c1b9c35 to
1d21329
Compare
|
The primary reason for going with ctest is to avoid any additional dependencies or further complications to our build system (I feel like they're all ugly, so might as well stick with what we have). CTest has integration for Google Test if we ever want to formally adopt a framework. Looking through the code for that, I have an idea for how we can avoid the configure-time discovery for the low-level tests suite that I might try my hand at (all other tests discovered via globbing do not require a manual reconfigure to be picked up, as the addition of any files matching the glob pattern for the little check and pexpect tests will automatically trigger a CMake run). On a more positive note, as of last night all the tests are passing. The new test driver was fine, but some of the tests had shared state via the |
krobelus
left a comment
There was a problem hiding this comment.
Running individual littlecheck and pexpect tests is a great improvement.
With fish_tests that already works fairly well, so if test discovery causes problems we could get by without that. Though a better integration is surely nice here as well.
A detail: the individual tests are named just by the basename, like abbr.fish. I'd prefer something like test/abbr.fish so they are all grouped. I noticed this in the tab-completion results of ninja that are sorted alphabetically by fish, which makes sense since ninja -t targets does not give a consistent order.
| # dynamically discover tests, our options are either this or resorting to sed/awk to parse the low | ||
| # level tests source file to get the list of individual tests. Or to split each test into its own | ||
| # source file. | ||
| execute_process( |
There was a problem hiding this comment.
I get this on linux:
/usr/bin/ld: /tmp/ccPMN3np.o: warning: relocation against `_ZN10features_t15global_featuresE' in read-only section `.text._Z21mutable_fish_featuresv[_Z21mutable_fish_featuresv]'
/usr/bin/ld: warning: creating DT_TEXTREL in a PIE
./fish_tests_list: error while loading shared libraries: unexpected PLT reloc type 0x00
A quick fix would be to use conditional compilation:
#ifdef FISH_LIST_LOW_LEVEL_TESTS
int main(int argc, char **argv) {
list_tests();
}
# else
// normal fish_tests.cpp goes here
#endifAnother idea is to keep a separate file like
TEST("utility_functions", "test_utility")
TEST("wcstring_tok, "test_wcstring_tok")Define TEST as macro and include it from fish_tests.cpp, and the build system. One advantage of the current setup is that we get nicer stacktraces, that point to failing test cases, though that's not really a problem.
However, I don't think that's worth the split to a separate file (unless it greatly improves on test discovery).
There was a problem hiding this comment.
I developed this branch on Linux, compiling with gcc but linking with lldb. What’s your toolchain?
There was a problem hiding this comment.
Linux + gcc (same with clang) + GNU ld
If little check returns a specific error code, we can program that to be a “skipped” marker and have ctest act accordingly. I’m guessing it’s just returning 0 here. |
Ok, let's use exit code 125 since git bisect uses the same? Anything other than 0 or 1 should work. We can change littlecheck's behavior here since the |
|
The "wip (do not merge)" label seems superfluous, that's what PR drafts are for - they block merging in the UI, so I've taken the liberty of converting this to one. |
Unfortunately, CMake target names may not contain a slash: |
| -Wl,-undefined,dynamic_lookup,--unresolved-symbols=ignore-all | ||
| WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR} | ||
| ) | ||
| execute_process( |
There was a problem hiding this comment.
I think we should not try to make each low-level test a separate CTest test. This adds a lot of complexity at configuration time, and it is easy to run individual low-level tests with the fish_tests binary itself.
There was a problem hiding this comment.
The hack has been removed and CMake now uses text preprocessing combined with a no-op macro in fish_tests.cpp to detect low-level tests. Complexity is considerably reduced and imho now at an acceptable level. I am strongly in favor of making this new system holistic, and believe that all test atomics should be runnable via a unified harness/interface.
Okay, then maybe |
|
@mqudsi do you plan to pick this up again near term? If not I'd be eager to take it over, I'm very excited for this improvement. |
|
@ridiculousfish, thanks for the ping. I needed some time away from things and when I started dipping my toes back in the water I was too worried about how long I'd abandoned this and how out-of-date it must be by now, which wasn't exactly the kind of motivation I needed. Your enthusiasm, however, is quite inspiring. Let me blow the dust off this and see what the deal is; it was already very close to ready iirc, so hopefully it won't be that bad. |
4c13b36 to
f8f1630
Compare
It depends on `mktemp -d TAG` returning a relative path, which isn't guaranteed to be the case (and isn't the case when run by our test driver).
The default matching logic for fish_tests was prefix based, so when we were running `history` we were also running all history tests. This causes the test to fail for an unknown reason.
test_driver.sh is guaranteed to take care of them.
Aside from the fact that the shared state could cause problems, tests were randomly assuming it would be created where that wasn't the case. In particular, `redirect.fish` and `basic.fish` were failing on only macOS because `../test/temp` didn't exist yet - it would be created by other tests later.
Instead of trying to assert that there are no zombies when the test starts (which often fails) and to prevent conflating existing or irrelevant zombies with the ones we are interested in checking for, have `ps` also emit the parent process id and filter its output to include only children of the current fish instance.
This is in preparation for adding skipped test support to our ctest integration.
This prevents tests that were skipped (e.g. because of a missing REQUIRES) from being reported as successes in the CTest overall run results list.
`test:foo` is not allowed by CMake ("reserved name") and `test/foo`
won't work since CMake doesn't allow targets to have a directory
separator in their name.
Tests are now executed in a test-specific temporary directory, so test output on failure should be reproducible/reusable as-is without needing to have TMPDIR defined (as it only exists by default under macOS).
If called within a temporary directory that had an X in the path, it would fail. This caused sporadic CI test failures.
krobelus
left a comment
There was a problem hiding this comment.
Sorry for the delay - this looks awesome, let's merge.
I was confused for a bit because ninja test_basic.fish would take 4 seconds but that's just because I had BUILD_DOCS=ON which runs sphinx
This was requested by a team member who would like for some tests to remain invokable (in thier own $HOME) directly via littlecheck without relying on the test driver to prep the environment. A comment explaining the rationale is also added so this doesn't get passed down as folklore "you need to include this for tests to run" even though no one understands why.
On request of a team member, this patches `basic.fish` to no longer depend on being invoked by the test driver and started up in a $PWD that points to a clean temporary directory.
|
🎉 thank you, all. |
|
There are two bits of fallout in the package builders from this merge, neither of which are showstoppers. It looks like the overrides from 4c2a0de have been moved into the test driver, which means the low-level tests go back to running in the user's home directory. This is a bit of an issue on Debian pbuilders which use a non-existent HOME - see d518b01 as well. Second, old CMake versions call skipped tests "failed". I can't see that this was changed explicitly - it's broken in 3.7 but working in 3.10. This only affects the package builds on LTS version of Debian, so I'll try and chase it down at some stage but it's not a blocker by any means. |
[ 327s]
The following tests did not run: |
Lovely. The alternative here would be to "pass" these tests on our end, maybe even just if cmake < 3.10. It's not like skipped tests are super visible anyway. |
The overrides have indeed been moved into the test driver but at the same time the low-level tests are now executed via the test driver. Perhaps I missed a target in the CMake file and they're being executed both independent of the test driver and then again under it?
I thought maybe the feature to support skipping a test based on the return code was only introduced post CMake 3.7 but it seems to have been supported since at least version 3.0! (Side note: I can't believing viewing the documentation for a CMake API/feature in the latest documentation doesn't include a "introduced in CMake version xxx" or something). Maybe the API for setting the skip code changed or something? |
No, the low-tests are run directly - the test driver always invokes fish first. foreach(LTEST ${LOW_LEVEL_TESTS})
add_test(
NAME ${LTEST}
COMMAND ${CMAKE_BINARY_DIR}/fish_tests ${LTEST}
WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}
)
set_tests_properties(${LTEST} PROPERTIES SKIP_RETURN_CODE ${SKIP_RETURN_CODE})
add_test_target("${LTEST}")
endforeach(LTEST)
They were counted as failures prior to 3.9 (see https://gitlab.kitware.com/cmake/cmake/-/issues/16822).
Heh. fish's documentation isn't great in that sense either. |
|
@zanchey can you please try the ctest_redux branch from mqudsi/fish-shell? It should fix the debian problem. |
|
@mqudsi, it works great! |
|
Thanks for merging that branch in 93aaa18. Unfortunately that uncovered a new issue in Ubuntu Xenial, which uses CMake 3.5 (which I didn't test until after the merge). The |
CMake 3.5 (shipped in Ubuntu Xenial) doesn't generate the test target with appropriate dependencies. Build them in dh_auto_build; it's too hard to convince any of the other steps to do it. See #7851.
|
You're most welcome. I saw your patch - that's certainly one way to work around it! |

This is my second attempt at overhauling our varied test suites and unifying them under the ctest umbrella. Due to the limitations of the way ctest was hacked on to CMake,
testremains a separate custom target, but nevertheless utilizesctestunder the hood.The biggest change is that each individual test is run in a clean environment, created just for that test. And "test" here no longer means one of three (low level, pexpect, little check) but rather each test run by those test drivers. (Never again will you have to account for an alias that was added but not removed in a previous test).
This change also made it possible to run the tests in parallel (well, mostly), which lets me run the entire fish test suite, including the overhead of creating a new environment for each test, in 13 seconds. Tests continue to be run sequentially under CI.
An additional improvement is the addition of top-level targets for each test, meaning if test
foofailed and you would like to retry just that test,ninja -C build test_foowill retry that test (regardless of which test suite it belongs to).That said, the ability to run tests in parallel has shown some tests that are designed with global system state in mind - they need to be changed in order to pass when executed in parallel (or else can be forced to execute serially, but I think it's a good idea to clean up the test suite to make everything less brittle) - this applies mainly to
jobs.fish.The history merge test suite has issues that I believe might indicate a bug in either the creation of a new environment or what constitutes a unique fish session, the history race condition test continues to be extremely flaky and reports items dropped from history when the system is under heavy load when the test is executed - at this point, I suspect an actual issue.Edit: later commits have changed the serialization level for history tests to prevent multiple history tests being run (serially) in one process (inadvertently in the case of this fork, by design inmaster); this seems to have resolved their flakiness.The actual changes are fairly free of any hacks with one notable exception: cmake/ctest doesn't have a "discover tests" stage so as part of the "configure" step,
a bastardized version offish_testsis built (without any dependencies on fish itself) and invoked to emit a list of tests - the alternative to this is for each low level test to be explicitly named in the CMake files (ugly) or each low level test to be moved to a separate source file (cringe).