ENH: Improve np.kron performance #21354
Merged
mattip (Member) reviewed on Apr 17, 2022:
Nice speed up. Could you add a comment in the code explaining what is going on?
Author (Member):
Yeah sure, will add comments in the code 👍. Added a TODO.
* Changed product computing logic for kron to use broadcasting
06d3494 to e18e312
mattip (Member) approved these changes on Apr 18, 2022:
LGTM. I will wait to merge in case anyone else wants to take a look.
Member:
Thanks @ganesh-k13
Improve np.kron performance
Boost amount
Compare with bb811f4 (main)
Compare with latest release (v1.22.3)
Total speedup of about 70-80% compared to the current release
Explanation
Ok let me try my best to explain the current flow:
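1. Take two arrays a and b such that their shapes have different lengths, for example a with shape (2, 0, 2) and b with shape (2, 2).
2. Equalise the number of dimensions by prepending length-1 dims to the smaller array (b in this case), hence a's shape stays (2, 0, 2) while b becomes (1, 2, 2). We prepend, in case you were wondering. From my searching this choice is arbitrary, as a few people prefer to append instead.
3. Expand dims at the odd positions for a and at the even positions for b. This sets up the product for the required sub-parts. The product itself uses broadcasting, which is what helps performance.
4. The shape of a is now (2, 1, 0, 1, 2, 1) and b will be (1, 1, 1, 2, 1, 2).
5. Reshape the product to (2, 0, 4). We get this shape by multiplying the shapes of a and b.

TODO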
Part of #21257
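
The flow described in the explanation above can be sketched roughly as follows. This is only an illustrative reconstruction, not the code merged in this PR: the function name kron_broadcast_sketch is hypothetical, and it ignores details a real implementation has to handle, such as matrix subclasses and memory layout. It assumes a NumPy version where np.expand_dims accepts a tuple of axes (1.18 or later).

```python
import numpy as np


def kron_broadcast_sketch(a, b):
    """Illustrative Kronecker product via broadcasting (hypothetical helper)."""
    a = np.asarray(a)
    b = np.asarray(b)
    nd = max(a.ndim, b.ndim)

    # Equalise ndim by prepending length-1 axes to the smaller array.
    a = a.reshape((1,) * (nd - a.ndim) + a.shape)
    b = b.reshape((1,) * (nd - b.ndim) + b.shape)

    # The final shape is the elementwise product of the two shapes.
    out_shape = tuple(sa * sb for sa, sb in zip(a.shape, b.shape))

    # Insert new axes at odd positions for a and at even positions for b,
    # so that a's and b's original axes interleave.
    a_exp = np.expand_dims(a, axis=tuple(range(1, 2 * nd, 2)))
    b_exp = np.expand_dims(b, axis=tuple(range(0, 2 * nd, 2)))

    # Broadcasting computes every pairwise product in a single multiply;
    # the reshape then merges each (a axis, b axis) pair into one axis.
    return (a_exp * b_exp).reshape(out_shape)


a = np.ones((2, 0, 2))
b = np.ones((2, 2))
print(kron_broadcast_sketch(a, b).shape)  # (2, 0, 4)
```

For the shapes used in the explanation, the intermediate arrays have shapes (2, 1, 0, 1, 2, 1) and (1, 1, 1, 2, 1, 2), and the broadcast product is reshaped to (2, 0, 4).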