Our GCC LTO flags can be improved #132257
Closed
Labels
build, performance, type-feature
Feature or enhancement
Proposal:
@thesamesam pointed out to me that our GCC LTO configuration builds serially and as a single translation unit, IIUC. This is the slowest configuration possible. On GCC 15, the LTO build currently takes 10m14.972s; with my first PR, it takes 2m28.287s. That is roughly a 4.1x reduction in build time.
Benchmarks show basically no change in runtime performance: 1.004x slower on one machine and 1.000x faster on another. This is well within the realm of noise.
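For anyone who wants to check the build-time ratio, here is a minimal sketch in Python. The two durations are the ones quoted above; the `parse_time` helper is a hypothetical name introduced only for this illustration and is not part of the CPython build tooling.

```python
import re


def parse_time(text: str) -> float:
    """Convert a `time`-style duration such as '10m14.972s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Durations copied from the issue text: current flags vs. the first PR.
before = parse_time("10m14.972s")
after = parse_time("2m28.287s")

speedup = before / after
print(f"{before:.3f}s -> {after:.3f}s, {speedup:.2f}x faster")
# prints "614.972s -> 148.287s, 4.15x faster"
```

These are single wall-clock measurements from one machine, so the exact ratio will vary with core count and hardware.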
https://github.com/faster-cpython/benchmarking-public/tree/main/results/bm-20250407-3.14.0a6+-8891cd2
Has this already been discussed elsewhere?
No response given
Links to previous discussion of this feature:
No response
Linked PRs