*multiThreadedpow2AlignedAlloc/disjoint_w_params* tests fail sporadically #1488

@ldorau

Description

*multiThreadedpow2AlignedAlloc/disjoint_w_params* tests:

  • mallocPoolTest/umfPoolTest.multiThreadedpow2AlignedAlloc/disjoint_w_params_2_umf_ba_global (test_memoryPool) and
  • disjointPoolTests/umfPoolTest.multiThreadedpow2AlignedAlloc/disjoint_w_params_0_umf_ba_global (test_disjoint_pool)

fail sporadically in the following way:
https://github.com/oneapi-src/unified-memory-framework/actions/runs/16843892970/job/47720079405

[ RUN      ] mallocPoolTest/umfPoolTest.multiThreadedpow2AlignedAlloc/disjoint_w_params_2_umf_ba_global
/home/runner/work/unified-memory-framework/unified-memory-framework/test/poolFixtures.hpp:221: Failure
Expected: (ptr) != (nullptr), actual: NULL vs (nullptr)

or:
https://github.com/ldorau/unified-memory-framework/actions/runs/16845396177/job/47724161570

[ RUN      ] disjointPoolTests/umfPoolTest.multiThreadedpow2AlignedAlloc/disjoint_w_params_0_umf_ba_global
/home/testuser/test/poolFixtures.hpp:221: Failure
Expected: (ptr) != (nullptr), actual: NULL vs (nullptr)

Environment Information

  • UMF version (hash commit or a tag): cc0565d
  • OS(es) version(s): Linux

Please provide a reproduction of the bug:

$ while ./test/test_memoryPool --gtest_filter="*multiThreadedpow2AlignedAlloc/disjoint_w_params*" > ./log.txt 2>&1 && ./test/test_disjoint_pool --gtest_filter="*multiThreadedpow2AlignedAlloc/disjoint_w_params*" > ./log.txt 2>&1 ; do date ; done

How often bug is revealed:

rare

Details

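The immediate cause is that pool_register_slab fails because the slab's start address is already registered. The logs below suggest a race: one thread removes a memory region from the tracker before pool_unregister_slab has run for the old slab, so another thread can receive the same address and try to register a new slab while the stale registration still exists. The reported error is: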

[PID:1835396 TID:1835401 ERROR UMF] pool_register_slab: register failed because the address is already registered!
[PID:1835396 TID:1835401 ERROR UMF] bucket_create_slab: slab_reg failed!

More logs:

$ grep -e "ERROR UMF" -e Failure -e 0x7fd07f81e008 ./log.txt
[PID:1835396 TID:1835401 DEBUG UMF] umfMemoryTrackerAddAtLevel: memory region is added, tracker=0x7fd07fe40068, level=0, pool=0x7fd07fe40268, ptr=0x7fd07f81e008, size=4096
[PID:1835396 TID:1835401 DEBUG UMF] pool_register_slab: slab: 0x7fd07fe493e8, start: 0x7fd07f81e008
[PID:1835396 TID:1835400 DEBUG UMF] umfMemoryTrackerRemove: memory region removed: tracker=0x7fd07fe40068, level=0, pool=0x7fd07fe40268, ptr=0x7fd07f81e008, size=4096
[PID:1835396 TID:1835401 DEBUG UMF] umfMemoryTrackerAddAtLevel: memory region is added, tracker=0x7fd07fe40068, level=0, pool=0x7fd07fe40268, ptr=0x7fd07f81e008, size=4096
[PID:1835396 TID:1835401 DEBUG UMF] pool_register_slab: slab: 0x7fd07fe496e8, start: 0x7fd07f81e008
[PID:1835396 TID:1835400 DEBUG UMF] pool_unregister_slab: slab: 0x7fd07fe493e8, start: 0x7fd07f81e008
[PID:1835396 TID:1835401 ERROR UMF] pool_register_slab: register failed because the address is already registered! (slab: 0x7fd07fe496e8, start: 0x7fd07f81e008)
[PID:1835396 TID:1835401 ERROR UMF] bucket_create_slab: slab_reg failed!
[PID:1835396 TID:1835401 DEBUG UMF] umfMemoryTrackerRemove: memory region removed: tracker=0x7fd07fe40068, level=0, pool=0x7fd07fe40268, ptr=0x7fd07f81e008, size=4096
[PID:1835396 TID:1835399 DEBUG UMF] umfMemoryTrackerAddAtLevel: memory region is added, tracker=0x7fd07fe40068, level=0, pool=0x7fd07fe40268, ptr=0x7fd07f81e008, size=4096
[PID:1835396 TID:1835399 DEBUG UMF] pool_register_slab: slab: 0x7fd07fe495e8, start: 0x7fd07f81e008
/home/ldorau/work/unified-memory-framework/test/poolFixtures.hpp:221: Failure

and:

$ grep -e "ERROR UMF" -e Failure -e 0x7f15c647f008 ./log.txt
[PID:772    TID:776    DEBUG UMF] umfMemoryTrackerAddAtLevel: memory region is added, tracker=0x7f15c64b8068, level=0, pool=0x7f15c64b8268, ptr=0x7f15c647f008, size=4096
[PID:772    TID:776    DEBUG UMF] pool_register_slab: slab: 0x7f15c64c1468, start: 0x7f15c647f008
[PID:772    TID:776    DEBUG UMF] umfMemoryTrackerRemove: memory region removed: tracker=0x7f15c64b8068, level=0, pool=0x7f15c64b8268, ptr=0x7f15c647f008, size=4096
[PID:772    TID:773    DEBUG UMF] umfMemoryTrackerAddAtLevel: memory region is added, tracker=0x7f15c64b8068, level=0, pool=0x7f15c64b8268, ptr=0x7f15c647f008, size=4096
[PID:772    TID:773    DEBUG UMF] pool_register_slab: slab: 0x7f15c64c17e8, start: 0x7f15c647f008
[PID:772    TID:773    ERROR UMF] pool_register_slab: register failed because the address is already registered! (slab: 0x7f15c64c17e8, start: 0x7f15c647f008)
[PID:772    TID:773    ERROR UMF] bucket_create_slab: slab_reg failed!
[PID:772    TID:773    DEBUG UMF] umfMemoryTrackerRemove: memory region removed: tracker=0x7f15c64b8068, level=0, pool=0x7f15c64b8268, ptr=0x7f15c647f008, size=4096
[PID:772    TID:776    DEBUG UMF] pool_unregister_slab: slab: 0x7f15c64c1468, start: 0x7f15c647f008
[PID:772    TID:775    DEBUG UMF] umfMemoryTrackerAddAtLevel: memory region is added, tracker=0x7f15c64b8068, level=0, pool=0x7f15c64b8268, ptr=0x7f15c647f008, size=4096
[PID:772    TID:775    DEBUG UMF] pool_register_slab: slab: 0x7f15c64c1668, start: 0x7f15c647f008
/home/ldorau/work/unified-memory-framework/test/poolFixtures.hpp:221: Failure
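
The ordering visible in both log excerpts above can be replayed deterministically in a toy model. The sketch below is not UMF code: `ToyPool` and all of its members are hypothetical names, the single address `0x1000` is made up, and the two "threads" are simulated by sequencing the calls by hand. It only illustrates why releasing a region before unregistering its slab may leave a window in which a second registration of the same address fails.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>

// Hypothetical toy model of the ordering seen in the logs.
// None of these names are UMF APIs.
struct ToyPool {
    std::set<std::uintptr_t> free_regions{0x1000}; // one reusable region
    std::map<std::uintptr_t, int> slab_by_start;   // start address -> slab id

    // Hand out a free region (stands in for "memory region is added").
    std::uintptr_t acquire_region() {
        assert(!free_regions.empty());
        std::uintptr_t addr = *free_regions.begin();
        free_regions.erase(free_regions.begin());
        return addr;
    }

    // Return a region for reuse (stands in for "memory region removed").
    void release_region(std::uintptr_t addr) { free_regions.insert(addr); }

    // Fails (returns false) if the start address is already registered.
    bool register_slab(std::uintptr_t start, int slab_id) {
        return slab_by_start.emplace(start, slab_id).second;
    }

    void unregister_slab(std::uintptr_t start) { slab_by_start.erase(start); }
};

// Interleaving as in the logs: "thread A" releases the region before it
// unregisters its slab, and "thread B" reuses the address in between.
bool register_when_released_first() {
    ToyPool pool;
    std::uintptr_t a = pool.acquire_region();
    pool.register_slab(a, 1);                  // A: slab 1 registered
    pool.release_region(a);                    // A: region released first
    std::uintptr_t b = pool.acquire_region();  // B: gets the same address
    bool ok = pool.register_slab(b, 2);        // B: stale entry still present
    pool.unregister_slab(a);                   // A: unregisters too late
    return ok;
}

// Same steps, but A unregisters its slab before releasing the region.
bool register_when_unregistered_first() {
    ToyPool pool;
    std::uintptr_t a = pool.acquire_region();
    pool.register_slab(a, 1);                  // A: slab 1 registered
    pool.unregister_slab(a);                   // A: unregister first
    pool.release_region(a);                    // A: then release the region
    std::uintptr_t b = pool.acquire_region();  // B: gets the same address
    return pool.register_slab(b, 2);           // B: no stale entry
}
```

In this model `register_when_released_first()` returns false and `register_when_unregistered_first()` returns true. Whether reordering the two steps (or holding a lock across both) is the right fix in UMF itself is an assumption that would need to be checked against the actual slab lifetime code.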
