Comparing array subset checks#125
Conversation
|
Your code above is massively dependent to used sample arrays. Using this samples ARRAY1 = [*(400..500)].freeze
ARRAY2 = [*(1..500)].freezewill change your results completely. I would prefer to use a more real-world sample for the test to avoid misleadings… |
|
Maybe |
|
@mblumtritt What you suggest as a real world example? It would be nice if I generate a random-filled array of integers and let it static in the test or something like that? @Arcovion Nice, I'll add both. Thank you guys. (: |
|
@gabteles Do you mean something like this? ARRAY1 = [*(400..500)].shuffle.freeze
ARRAY2 = [*(1..500)].shuffle.freezeThen we have to other problem: how is the weight between the set and the subset 🤔 Or do you prefer a complete random set to compare?… In "real world" I would check my chances to be faster with a complete different algorithm. For sample what about sorting first descending? This might be faster if both sets already nearly sorted… I'm unsure to give a "general advice" for this subset problem. In "real world" it always depends… ;) |
|
@mblumtritt Yeah, I agree that the weight between set/subset is a big problem to benchmark. I'm not sure how to handle it. Before I meant something like this:
Using it statically grants that we won't have performance improved/degraded by randomness effect of shuffle or rand in runtime. But it also suffers of the weight problem. |
There was a problem hiding this comment.
Pull request overview
Adds a new benchmark to compare different ways of checking whether one Ruby array is a subset of another, and documents the results in the main README alongside existing performance comparisons.
Changes:
- Added a new benchmark script for array subset detection methods (
code/array/subset-checking.rb). - Added a README section documenting the benchmark command and sample results (
README.md).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| README.md | Adds a new documented benchmark section and sample output for array subset detection. |
| code/array/subset-checking.rb | Introduces the benchmark implementation comparing multiple subset-check approaches. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ARRAY2 = [*1..100] | ||
|
|
||
| def slow_set | ||
| ARRAY2.to_set.subset?(ARRAY1.to_set) |
| def slow_interception | ||
| (ARRAY1 & ARRAY2) == ARRAY1 | ||
| end | ||
|
|
||
| def slow_interception_size | ||
| (ARRAY1 & ARRAY2).size == ARRAY1.size | ||
| end | ||
|
|
||
| def slow_minus_empty | ||
| (ARRAY1 - ARRAY2).empty? | ||
| end | ||
|
|
||
| def fast | ||
| ARRAY1.all?{|element| ARRAY2.include?(element) } | ||
| end | ||
|
|
||
| Benchmark.ips do |x| | ||
| x.report("(a1 - a2).empty?") { slow_minus_empty } | ||
| x.report("(a1 & a2) == a1") { slow_interception } | ||
| x.report("Array#all?#include?") { fast } | ||
| x.report("Array#&#size") { slow_interception_size } |
| end | ||
|
|
||
| def fast | ||
| ARRAY1.all?{|element| ARRAY2.include?(element) } |
| ##### Subset detection with various methods [code](code/array/subset-checking.rb) | ||
|
|
||
| ``` | ||
| $ ruby -v code/array/subset-checking.rb | ||
| ruby 2.4.1p111 (2017-03-22 revision 58053) [x86_64-linux] |
|
Thanks for this, @gabteles, and thanks to @mblumtritt and @Arcovion for the thoughtful review back in 2017. I revisited this on Ruby 4.0.0 to see whether it still holds up, and unfortunately the concern @mblumtritt raised is exactly what sinks it. Going to close it, but I want to lay out the reasoning so it's useful for anyone who finds this later. 1. The conclusion no longer holds on Ruby 4.0Running the benchmark verbatim, the ranking has flipped and the gaps have collapsed:
2. The
|
No description provided.