Merge pull request #49 from codellm-devkit/feat/jedi-shard-planner#51
Merged
PyCG sharding: Jedi planner + adaptive decomposition of runaways
Replaces CodeQL with PyCG as the level 2 call graph backend, and adds coupling-aware sharding so PyCG scales to large apps.
Motivation and Context
PyCG does not scale past a few hundred files. A flat file-count shard forces every shard to be small just to tame the few shards that diverge, which severs many call edges and hurts recall. This PR shards by Jedi module coupling instead, and recovers diverging shards by re-sharding only those shards.
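To illustrate the idea of coupling-aware sharding, here is a minimal sketch, not the planner in this PR. It assumes the module coupling has already been extracted (in this PR that information comes from Jedi) and is passed in as a plain dict. The names `plan_shards`, `imports`, and `max_shard_size` are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def plan_shards(imports, max_shard_size):
    """Group modules into shards so that coupled modules stay together.

    imports: dict mapping a module name to the set of module names it imports.
             (Hypothetical input shape; names outside the dict are treated as
             external dependencies and ignored.)
    max_shard_size: upper bound on the number of modules per shard.
    """
    # Build an undirected coupling graph restricted to in-project modules.
    neighbours = defaultdict(set)
    for mod, deps in imports.items():
        neighbours[mod]  # make sure isolated modules get an entry too
        for dep in deps:
            if dep in imports:
                neighbours[mod].add(dep)
                neighbours[dep].add(mod)

    unassigned = set(neighbours)
    shards = []
    # Seed shards from the most-coupled modules first, then grow each shard
    # breadth-first along coupling edges until the size cap is reached.
    for seed in sorted(neighbours, key=lambda m: (-len(neighbours[m]), m)):
        if seed not in unassigned:
            continue
        shard = [seed]
        unassigned.discard(seed)
        frontier = [seed]
        while frontier and len(shard) < max_shard_size:
            current = frontier.pop(0)
            for nb in sorted(neighbours[current]):
                if nb in unassigned and len(shard) < max_shard_size:
                    shard.append(nb)
                    unassigned.discard(nb)
                    frontier.append(nb)
        shards.append(shard)
    return shards
```

Compared with cutting a sorted file list every N files, a plan like this keeps modules that import each other in the same shard, so fewer call edges cross a shard boundary.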
How Has This Been Tested?
Unit tests for the planner, dep exclusion, max_iter, and the adaptive loop (14 tests, all pass).
End to end on a real app. Benchmark app: Odoo, 1028 modules, level 2, Ray. PyCG edges recovered:
Adaptive recovers 22210 edges (+30% over the best uniform run), losing only 20 of 1028 files instead of a whole 100-file shard.
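The reason only a small remainder is lost can be sketched as follows. This is an illustration under stated assumptions, not the loop implemented in this PR: a diverging shard is assumed to be split by simple bisection, `analyse` is a hypothetical callable standing in for a PyCG run on one shard, and `TimeoutError` stands in for whatever signal marks a shard as diverging.

```python
def analyse_adaptively(shards, analyse, min_shard_size=1):
    """Analyse each shard; split and retry only the shards that diverge.

    shards: list of lists of file names.
    analyse: hypothetical callable taking a shard and returning a set of
             call edges, or raising TimeoutError if the shard diverges.
    min_shard_size: shards this small are given up on instead of being split.

    Returns (edges, lost_files).
    """
    edges = set()
    lost = []
    queue = list(shards)
    while queue:
        shard = queue.pop()
        try:
            edges |= analyse(shard)
        except TimeoutError:
            if len(shard) <= min_shard_size:
                # Cannot split further: only these files are lost.
                lost.extend(shard)
            else:
                # Assumption: bisect the runaway shard and retry both halves.
                mid = len(shard) // 2
                queue.append(shard[:mid])
                queue.append(shard[mid:])
    return edges, sorted(lost)


def fake_analyse(shard):
    """Toy stand-in for a PyCG run: diverges whenever 'bad' is in the shard."""
    if "bad" in shard:
        raise TimeoutError("shard diverged")
    return set(zip(shard, shard[1:]))


edges, lost = analyse_adaptively([["a", "b"], ["c", "bad", "d", "e"]], fake_analyse)
```

In this toy run the healthy shard is analysed once and untouched, while the diverging shard is narrowed down until only the offending file is dropped, rather than discarding every file that shared a shard with it.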
Breaking Changes
Yes.
Types of changes
Checklist
Additional context
Sharding algorithm:
New flags:
Caveats: