fix: Reject filesystem paths in HTTP_CALLS route extraction so client routes can join server routes #485

Open
mvanhorn wants to merge 1 commit into DeusData:main from mvanhorn:fix/455-http-route-filesystem-path-misclassification
@mvanhorn

Summary

On a large Python repo, HTTP route / HTTP_CALLS extraction emits incorrect data. 1062 of 1213 Route nodes are empty URL-string stubs with no method or route_path. Filesystem paths such as /root/.aws/credentials and /etc/crio/crio.conf, and delimiter literals such as the one in str.split('/locations/'), are misclassified as HTTP routes, so calls like os.remove and os.path.join are recorded as HTTP_CALLS. As a result, the client HTTP_CALLS set and the server HANDLES set are fully disjoint: zero client calls resolve to a handled endpoint.

Changes

In internal/cbm/service_patterns.c, the HTTP client heuristic matches callee names by method suffix (.get/.post/...) and library prefix (requests, etc.), then lifts the call's first string argument into url_path. This change adds a URL-vs-filesystem guard that runs before an argument string is classified as a route. The guard rejects absolute filesystem-looking paths: a leading / followed by a known fs root such as etc/, root/, var/, usr/, home/, or tmp/, or a path segment carrying an on-disk file extension such as .conf, .credentials, or .json. It also rejects str.split/str.join delimiter literals. Only HTTP-shaped paths now produce HTTP_CALLS edges, each with a populated url_path.
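For reviewers, the guard described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names (`looks_like_fs_path`, `callee_is_split_or_join`, `is_http_route_candidate`) are hypothetical, the root and extension lists are copied from the description, and the extension test is simplified to a suffix check on the whole string rather than a per-segment check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers; names are illustrative, not taken from service_patterns.c. */
static bool has_prefix(const char *s, const char *prefix) {
    return strncmp(s, prefix, strlen(prefix)) == 0;
}

static bool has_suffix(const char *s, const char *suffix) {
    size_t n = strlen(s), m = strlen(suffix);
    return n >= m && strcmp(s + (n - m), suffix) == 0;
}

/* True when the literal starts with a known filesystem root or ends with an
 * on-disk file extension. Both lists are assumptions based on the PR text. */
static bool looks_like_fs_path(const char *arg) {
    static const char *const fs_roots[] = {
        "/etc/", "/root/", "/var/", "/usr/", "/home/", "/tmp/"
    };
    static const char *const file_exts[] = {".conf", ".credentials", ".json"};

    for (size_t i = 0; i < sizeof fs_roots / sizeof fs_roots[0]; i++) {
        if (has_prefix(arg, fs_roots[i])) return true;
    }
    for (size_t i = 0; i < sizeof file_exts / sizeof file_exts[0]; i++) {
        if (has_suffix(arg, file_exts[i])) return true;
    }
    return false;
}

/* True when the callee is a split/join call, whose string argument is a
 * delimiter or path component rather than a URL (this also covers os.path.join). */
static bool callee_is_split_or_join(const char *callee) {
    return has_suffix(callee, ".split") || has_suffix(callee, ".join");
}

/* Decide whether (callee, first string argument) may yield an HTTP_CALLS edge. */
bool is_http_route_candidate(const char *callee, const char *arg) {
    if (callee == NULL || arg == NULL || arg[0] == '\0') return false;
    if (callee_is_split_or_join(callee)) return false;
    if (looks_like_fs_path(arg)) return false;
    /* Accept only HTTP-shaped values: a rooted route or an absolute URL. */
    return arg[0] == '/' || has_prefix(arg, "http://") || has_prefix(arg, "https://");
}
```

Under these assumptions, `is_http_route_candidate("requests.get", "/api/v1/locations")` is accepted, while `("os.path.join", "/root/.aws/credentials")`, `("requests.get", "/etc/crio/crio.conf")`, and `("name.split", "/locations/")` are rejected. One trade-off of the simplified suffix check is that a genuine endpoint ending in .json would also be rejected.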

Fixes #455

… routes can join server routes

Signed-off-by: mvanhorn <mvanhorn@gmail.com>
@mvanhorn force-pushed the fix/455-http-route-filesystem-path-misclassification branch from 16662d1 to 3ee3129 on June 17, 2026 15:58
Development

Successfully merging this pull request may close these issues.

bug: HTTP route heuristic misclassifies filesystem paths; client/server routes never join