This directory contains scripts for parsing and plotting timing data from
gpuRefactor, kluRefactor, gluRefactor, and sysRefactor.
The scripts process timing logs emitted by ReSolve refactor examples. They do not change solver behavior.
The GridKit workflow below is one benchmark use case for these utilities. Other benchmark data can be used if the logs follow the same timing row format.
- parse_refactor_logs.py: parses raw benchmark logs into CSV.
- plot_refactor_results.py: generates timing and residual plots from the parsed CSV.
parse_refactor_logs.py uses only the Python standard library.
plot_refactor_results.py requires matplotlib:
python3 -m pip install matplotlib
A temporary virtual environment can be used if needed:
python3 -m venv /tmp/resolve-plot-venv
source /tmp/resolve-plot-venv/bin/activate
python -m pip install --upgrade pip
python -m pip install matplotlib
Do not commit virtual environments, logs, CSV files, or generated plots.
The example GridKit benchmark workflow is split between Frontier and a local CUDA machine.
Frontier is used for:
- CPU/KLU timing logs from kluRefactor and kluRefactor -i
- HIP timing logs from gpuRefactor, gpuRefactor -i, sysRefactor, and sysRefactor -i
The local CUDA machine is used for:
- CUDA timing logs from gpuRefactor, gpuRefactor -i, gluRefactor, sysRefactor, and sysRefactor -i
- combining CPU, HIP, and CUDA timing logs into one CSV
- generating plots from the combined CSV
Recommended workflow:
- Configure and build ReSolve on Frontier with HIP and KLU support.
- Run the CPU/KLU timing cases with kluRefactor and kluRefactor -i on Frontier.
- Run the HIP timing cases with gpuRefactor, gpuRefactor -i, sysRefactor, and sysRefactor -i on Frontier.
- Copy the CPU/KLU and HIP logs from Frontier to the local machine.
- Copy the GridKit matrices from Frontier to the local machine if they are not already present.
- Configure and build ReSolve locally with CUDA support.
- Run the CUDA timing cases with gpuRefactor, gpuRefactor -i, sysRefactor, sysRefactor -i, and gluRefactor locally.
- Parse all collected kluRefactor, gpuRefactor, sysRefactor, and gluRefactor logs into one CSV.
- Generate the final plots from the combined CSV.
Only the benchmark executables need the appropriate compute environment. Parsing, CSV generation, and plotting can be done locally after the logs are collected.
Run the examples with -t or --time to emit timing rows:
TIMING,example,backend,ir_enabled,system,time_ms
Example:
TIMING,gpuRefactor,CUDA,0,2,2.4852799099999999e+02
Column meanings:
TIMING      Marker used to identify timing rows
example     Example executable name
backend     Hardware/backend label
ir_enabled  0 for no iterative refinement, 1 for iterative refinement
system      Linear system index
time_ms     Solve time in milliseconds
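The row format above can be split on commas into its six fields. The following is a minimal sketch of that step, not the code in parse_refactor_logs.py; `TimingRow` and `parse_timing_line` are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple, Optional


class TimingRow(NamedTuple):
    """One parsed TIMING row (hypothetical helper type, not part of ReSolve)."""
    example: str
    backend: str
    ir_enabled: bool
    system: int
    time_ms: float


def parse_timing_line(line: str) -> Optional[TimingRow]:
    """Return a TimingRow for a TIMING line, or None for any other line."""
    fields = line.strip().split(",")
    # A timing row is the TIMING marker followed by exactly five values.
    if len(fields) != 6 or fields[0] != "TIMING":
        return None
    _, example, backend, ir_enabled, system, time_ms = fields
    try:
        return TimingRow(example, backend, ir_enabled == "1",
                         int(system), float(time_ms))
    except ValueError:
        # Non-numeric fields, such as a header row, are skipped.
        return None


row = parse_timing_line("TIMING,gpuRefactor,CUDA,0,2,2.4852799099999999e+02")
print(row)
```

Lines that do not start with the TIMING marker, such as solver progress output, return None and can be ignored by the caller.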
The example workflow uses the GridKit cases on Frontier for:
N=125
N=250
N=500
N=1000
For each N, collect logs for:
kluRefactor
kluRefactor -i
gpuRefactor
gpuRefactor -i
gluRefactor
sysRefactor
sysRefactor -i
Run the benchmark cases for the supported backends. gpuRefactor and sysRefactor support CUDA and HIP builds while gluRefactor is only CUDA. kluRefactor provides the CPU/KLU timing baseline.
Use an output directory outside tracked source files:
mkdir -p timing_outputs/logs
mkdir -p timing_outputs/plots
To keep generated files out of local git status, add the output directory to the local exclude file:
echo "timing_outputs/" >> .git/info/exclude
Before collecting Frontier logs, configure a Frontier environment with HIP, ROCm, KLU, and the AMD GPU target available. The exact module stack is site-specific and may change, so verify the environment before configuring ReSolve. See the OLCF Frontier User Guide for the maintained documentation on Frontier programming environments, compiler wrappers, ROCm modules, and compiling on Frontier.
Check that the HIP compiler is visible:
which hipcc
hipcc --version
echo "$ROCM_PATH"
Then configure a HIP/KLU build from the ReSolve source directory:
cd /ccs/home/$USER/resolve/source
cmake -S . -B ../build-hip-klu \
-DCMAKE_BUILD_TYPE=Release \
-DRESOLVE_USE_HIP=ON \
-DRESOLVE_USE_CUDA=OFF \
-DRESOLVE_USE_KLU=ON
cmake --build ../build-hip-klu -j
Confirm that the expected benchmark executables were built:
find ../build-hip-klu \
-path '*gpuRefactor.exe' -o \
-path '*kluRefactor.exe' -o \
-path '*sysRefactor.exe' | sort
Run HIP benchmark commands from an allocated Frontier compute node. Frontier uses Slurm for scheduled compute resources. See the OLCF Frontier User Guide for the maintained documentation on Frontier batch scripts, interactive jobs, common Slurm options, and monitoring/modifying jobs.
For a one-node interactive allocation, use the project account and walltime appropriate for the benchmark run:
salloc -A <project> -N 1 -t 02:00:00
Useful queue checks are:
squeue -u $USER
squeue --start -j <job_id>
scontrol show job <job_id> | grep -E 'JobState|Reason|StartTime|EndTime|RunTime|TimeLimit|Partition|NumNodes'
After the allocation starts, run the Frontier benchmark commands from the ReSolve source directory. If the benchmark commands are saved in a script, they can be launched on the active allocation with:
srun --jobid=<job_id> -N1 -n1 bash -lc ./run_final_benchmarks.sh
The exact account, queue time, module stack, and preferred batch/interactive workflow are site-specific. This section only provides the ReSolve-specific information needed to benchmark the examples with the timing scripts. Use the OLCF documentation as the source of truth for Frontier allocation details.
The Python scripts do not run the benchmark executables. They only process logs after the benchmark runs are complete.
Create logs by running the examples and saving output with tee.
On Frontier, collect the CPU/KLU logs with kluRefactor and the HIP logs with
gpuRefactor and sysRefactor. Run HIP benchmark commands from an allocated
compute node.
Example CPU/KLU run without iterative refinement:
../build-hip-klu/examples/kluRefactor.exe \
-m <matrix_prefix> \
-r <rhs_prefix> \
-n <num_systems> \
-t | tee timing_outputs/logs/cpu_N125_klu.log
Example CPU/KLU run with iterative refinement:
../build-hip-klu/examples/kluRefactor.exe \
-m <matrix_prefix> \
-r <rhs_prefix> \
-n <num_systems> \
-i \
-t | tee timing_outputs/logs/cpu_N125_klu_ir.log
Example HIP gpuRefactor run without iterative refinement:
../build-hip-klu/examples/gpuRefactor.exe \
-m <matrix_prefix> \
-r <rhs_prefix> \
-n <num_systems> \
-t | tee timing_outputs/logs/hip_N125_gpu.log
Example HIP gpuRefactor run with iterative refinement:
../build-hip-klu/examples/gpuRefactor.exe \
-m <matrix_prefix> \
-r <rhs_prefix> \
-n <num_systems> \
-i \
-t | tee timing_outputs/logs/hip_N125_gpu_ir.log
Example HIP sysRefactor run without iterative refinement:
../build-hip-klu/examples/sysRefactor.exe \
-m <matrix_prefix> \
-r <rhs_prefix> \
-n <num_systems> \
-t | tee timing_outputs/logs/hip_N125_sys.log
Example HIP sysRefactor run with iterative refinement:
../build-hip-klu/examples/sysRefactor.exe \
-m <matrix_prefix> \
-r <rhs_prefix> \
-n <num_systems> \
-i \
-t | tee timing_outputs/logs/hip_N125_sys_ir.log
Use log names that include the backend, problem size, and method:
cpu_N125_klu.log
cpu_N125_klu_ir.log
hip_N125_gpu.log
hip_N125_gpu_ir.log
hip_N125_sys.log
hip_N125_sys_ir.log
Repeat the same naming pattern for:
N250
N500
N1000
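The Frontier commands above differ only in the executable, the optional -i flag, and the log name. As a rough sketch, the argument list for one run could be assembled as follows. `benchmark_command` is a hypothetical helper, not part of ReSolve, and the value 10 for the number of systems is a placeholder; only the flags shown in the examples above (-m, -r, -n, -i, -t) are used.

```python
import shlex


def benchmark_command(build_dir, example, matrix_prefix, rhs_prefix,
                      num_systems, use_ir=False):
    """Build the argument list for one timing run (hypothetical helper)."""
    argv = [
        f"{build_dir}/examples/{example}.exe",
        "-m", matrix_prefix,
        "-r", rhs_prefix,
        "-n", str(num_systems),
    ]
    if use_ir:
        argv.append("-i")  # enable iterative refinement
    argv.append("-t")      # emit TIMING rows
    return argv


# Placeholder arguments: substitute the real prefixes and system count.
argv = benchmark_command("../build-hip-klu", "gpuRefactor",
                         "<matrix_prefix>", "<rhs_prefix>", 10, use_ir=True)
print(shlex.join(argv))
```

The list can be passed to subprocess.run with stdout redirected to the log file, as an alternative to the tee pipeline.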
After the Frontier CPU/KLU and HIP runs are complete, copy the logs back to the local CUDA machine before parsing and plotting. Use the transfer method appropriate for the amount of data being moved. See the OLCF documentation for maintained guidance on Frontier storage and data transfer:
- Frontier User Guide: Data and Storage
- Frontier User Guide: Using Globus to Move Data to and from Orion
For small log files, scp may be sufficient. For larger matrix/output
transfers, use the OLCF-recommended transfer workflow.
After the Frontier logs are copied locally, collect or add the local CUDA logs before parsing all logs into the combined CSV.
Collect CUDA logs on a local CUDA machine using the same GridKit matrices.
Example CUDA gpuRefactor run without iterative refinement:
./build-cuda/examples/gpuRefactor.exe \
-m <matrix_prefix> \
-r <rhs_prefix> \
-n <num_systems> \
-t | tee timing_outputs/logs/cuda_N125_gpu.log
Example CUDA gpuRefactor run with iterative refinement:
./build-cuda/examples/gpuRefactor.exe \
-m <matrix_prefix> \
-r <rhs_prefix> \
-n <num_systems> \
-i \
-t | tee timing_outputs/logs/cuda_N125_gpu_ir.log
Example CUDA gluRefactor run without iterative refinement:
./build-cuda/examples/gluRefactor.exe \
-m <matrix_prefix> \
-r <rhs_prefix> \
-n <num_systems> \
-t | tee timing_outputs/logs/cuda_N125_glu.log
Example CUDA sysRefactor run without iterative refinement:
./build-cuda/examples/sysRefactor.exe \
-m <matrix_prefix> \
-r <rhs_prefix> \
-n <num_systems> \
-t | tee timing_outputs/logs/cuda_N125_sys.log
Example CUDA sysRefactor run with iterative refinement:
./build-cuda/examples/sysRefactor.exe \
-m <matrix_prefix> \
-r <rhs_prefix> \
-n <num_systems> \
-i \
-t | tee timing_outputs/logs/cuda_N125_sys_ir.log
Use log names that include the backend, problem size, and method:
cuda_N125_gpu.log
cuda_N125_gpu_ir.log
cuda_N125_glu.log
cuda_N125_sys.log
cuda_N125_sys_ir.log
Repeat the same naming pattern for:
N250
N500
N1000
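Before parsing, it can help to check that every expected log is present. The sketch below enumerates the log names implied by the naming pattern above and reports the ones missing from a directory. `expected_log_names` and `missing_logs` are hypothetical helpers, not part of the timing scripts.

```python
from pathlib import Path

# Backend prefix -> method suffixes, taken from the log names listed above.
LOG_SUFFIXES = {
    "cpu": ["klu", "klu_ir"],
    "hip": ["gpu", "gpu_ir", "sys", "sys_ir"],
    "cuda": ["gpu", "gpu_ir", "glu", "sys", "sys_ir"],
}


def expected_log_names(sizes=(125, 250, 500, 1000)):
    """List every log name the example GridKit workflow is expected to produce."""
    return [
        f"{backend}_N{n}_{suffix}.log"
        for n in sizes
        for backend, suffixes in LOG_SUFFIXES.items()
        for suffix in suffixes
    ]


def missing_logs(log_dir, sizes=(125, 250, 500, 1000)):
    """Return the expected log names that are not files in log_dir."""
    log_dir = Path(log_dir)
    return [name for name in expected_log_names(sizes)
            if not (log_dir / name).is_file()]


names = expected_log_names()
print(len(names))  # 11 logs per size, 4 sizes: 44
print(missing_logs("timing_outputs/logs"))
```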
For GridKit logs, the N value in the log filename allows parse_refactor_logs.py to infer the
GridKit problem size automatically.
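One way such inference could work is a regular expression on the file name. This is a sketch under that assumption, not the actual logic in parse_refactor_logs.py; `infer_problem_size` is a hypothetical name.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed pattern: "N" plus digits, delimited by "_", ".", or the ends of the name.
_N_PATTERN = re.compile(r"(?:^|_)N(\d+)(?=_|\.|$)")


def infer_problem_size(log_path: str) -> Optional[int]:
    """Return the N value embedded in a log file name, or None if absent."""
    match = _N_PATTERN.search(Path(log_path).name)
    return int(match.group(1)) if match else None


print(infer_problem_size("timing_outputs/logs/cuda_N125_gpu.log"))  # 125
print(infer_problem_size("/tmp/klu_timing_test.log"))               # None
```

A name without an N token, like the second example, is the case where the size has to be supplied explicitly.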
After CPU/KLU, HIP, and CUDA logs are collected, parse them into one CSV file:
python3 scripts/timing/parse_refactor_logs.py \
timing_outputs/logs/*.log \
-o timing_outputs/refactor_timings.csv
If the GridKit problem size cannot be inferred from the log filename, pass it explicitly:
python3 scripts/timing/parse_refactor_logs.py \
timing_outputs/logs/*.log \
--N 125 \
-o timing_outputs/refactor_timings.csv
The parser writes:
source_log,N,example,backend,method,ir_enabled,system,time_ms,residual
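The CSV can also be inspected directly with the standard library. The sketch below averages time_ms per (N, example, backend, ir_enabled) group. `average_times` is a hypothetical helper, and the sample rows are made-up illustrative values with the method and residual columns left blank, since their exact contents are not described here.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def average_times(csv_file):
    """Average time_ms per (N, example, backend, ir_enabled) group."""
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        key = (int(row["N"]), row["example"], row["backend"], row["ir_enabled"])
        groups[key].append(float(row["time_ms"]))
    return {key: mean(times) for key, times in groups.items()}


# Illustrative rows only: the timing values are made up.
sample = io.StringIO(
    "source_log,N,example,backend,method,ir_enabled,system,time_ms,residual\n"
    "cuda_N125_gpu.log,125,gpuRefactor,CUDA,,0,1,200.0,\n"
    "cuda_N125_gpu.log,125,gpuRefactor,CUDA,,0,2,300.0,\n"
)
averages = average_times(sample)
print(averages)
```

For real data, open the parsed file with open("timing_outputs/refactor_timings.csv", newline="") and pass the file object to the function.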
Generate plots from the parsed CSV:
python3 scripts/timing/plot_refactor_results.py \
timing_outputs/refactor_timings.csv \
-o timing_outputs/plots
With all four GridKit sizes present in the parsed CSV, the script generates nine plots:
N125_solve_time.png
N125_residual.png
N250_solve_time.png
N250_residual.png
N500_solve_time.png
N500_residual.png
N1000_solve_time.png
N1000_residual.png
average_solve_time_scaling.png
Open the plot directory in VS Code:
code timing_outputs/plots
Click each .png file in the VS Code Explorer to preview it.
If working through VS Code Remote SSH on Frontier, open the generated plot directory there. Otherwise, copy the output directory back to a local machine and open the plots locally.
A local smoke test can be run using existing ReSolve test logs and a placeholder
N value:
python3 scripts/timing/parse_refactor_logs.py \
/tmp/klu_timing_test.log \
/tmp/klu_timing_ir_test.log \
/tmp/gpu_timing_test.log \
/tmp/gpu_timing_ir_test.log \
/tmp/glu_timing_test.log \
/tmp/sys_timing_test.log \
/tmp/sys_timing_ir_test.log \
--N 2000 \
-o /tmp/refactor_local_timings.csv
python3 scripts/timing/plot_refactor_results.py \
/tmp/refactor_local_timings.csv \
-o /tmp/refactor_local_plots
To view temporary local smoke-test plots in VS Code:
rm -rf refactor_local_plots
mkdir -p refactor_local_plots
cp /tmp/refactor_local_plots/*.png refactor_local_plots/
code refactor_local_plots
After checking the images, remove the temporary folder:
rm -rf refactor_local_plots
The local smoke test only verifies that parsing and plotting work. The GridKit benchmark workflow should use the GridKit data on Frontier.