Matrix/joint_matrix_bf16_fill_k_cache_SLM.cpp fails on Arc Linux

### Describe the bug

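The `Matrix/joint_matrix_bf16_fill_k_cache_SLM.cpp` end-to-end test crashed in CI with exit code -11, which most likely means the process was killed by signal 11 (SIGSEGV). The three runs of the `tmp_gpu_vnni.out` binary (`level_zero:gpu`, `opencl:gpu`, and `level_zero:gpu` with `UR_LOADER_USE_LEVEL_ZERO_V2=1`) completed both test sizes. The crash occurred in the `tmp_gpu.out` binary (RUN line 16, `level_zero:gpu`) during the 32 x 32 x 16 case, after the 8 x 8 x 16 case had finished.

CI job: https://github.com/intel/llvm/actions/runs/19151375733/job/54743282506?pr=20542

Relevant output: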

```
  -- Testing: 1911 of 2495 tests, 24 workers --
  FAIL: SYCL :: Matrix/joint_matrix_bf16_fill_k_cache_SLM.cpp (415 of 1911)
  ******************** TEST 'SYCL :: Matrix/joint_matrix_bf16_fill_k_cache_SLM.cpp' FAILED ********************
  Exit Code: -11
  
  Command Output (stdout):
  --
  # RUN: at line 13
  env ONEAPI_DEVICE_SELECTOR=level_zero:gpu  /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_bf16_fill_k_cache_SLM.cpp.tmp_gpu_vnni.out
  # executed command: env ONEAPI_DEVICE_SELECTOR=level_zero:gpu /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_bf16_fill_k_cache_SLM.cpp.tmp_gpu_vnni.out
  # .---command stdout------------
  # | Testing: 8 x 8 x 16 [TM x TN x TK]
  # | DONE for size 256
  # | GOPS is 135.409 Gop/s
  # | Testing: 32 x 32 x 16 [TM x TN x TK]
  # | DONE for size 256
  # | GOPS is 75.8815 Gop/s
  # `-----------------------------
  # RUN: at line 13
  env ONEAPI_DEVICE_SELECTOR=opencl:gpu  /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_bf16_fill_k_cache_SLM.cpp.tmp_gpu_vnni.out
  # executed command: env ONEAPI_DEVICE_SELECTOR=opencl:gpu /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_bf16_fill_k_cache_SLM.cpp.tmp_gpu_vnni.out
  # .---command stdout------------
  # | Testing: 8 x 8 x 16 [TM x TN x TK]
  # | DONE for size 256
  # | GOPS is 980.834 Gop/s
  # | Testing: 32 x 32 x 16 [TM x TN x TK]
  # | DONE for size 256
  # | GOPS is 611.607 Gop/s
  # `-----------------------------
  # RUN: at line 13
  env env UR_LOADER_USE_LEVEL_ZERO_V2=1 ONEAPI_DEVICE_SELECTOR=level_zero:gpu  /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_bf16_fill_k_cache_SLM.cpp.tmp_gpu_vnni.out
  # executed command: env env UR_LOADER_USE_LEVEL_ZERO_V2=1 ONEAPI_DEVICE_SELECTOR=level_zero:gpu /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_bf16_fill_k_cache_SLM.cpp.tmp_gpu_vnni.out
  # .---command stdout------------
  # | Testing: 8 x 8 x 16 [TM x TN x TK]
  # | DONE for size 256
  # | GOPS is 138.971 Gop/s
  # | Testing: 32 x 32 x 16 [TM x TN x TK]
  # | DONE for size 256
  # | GOPS is 39.9461 Gop/s
  # `-----------------------------
  # RUN: at line 16
  env ONEAPI_DEVICE_SELECTOR=level_zero:gpu  /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_bf16_fill_k_cache_SLM.cpp.tmp_gpu.out
  # executed command: env ONEAPI_DEVICE_SELECTOR=level_zero:gpu /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_bf16_fill_k_cache_SLM.cpp.tmp_gpu.out
  # .---command stdout------------
  # | Testing: 8 x 8 x 16 [TM x TN x TK]
  # | DONE for size 256
  # | GOPS is 112.432 Gop/s
  # | Testing: 32 x 32 x 16 [TM x TN x TK]
  # `-----------------------------
  # error: command failed with exit status: -11
  
  --
```
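For readers unfamiliar with the negative exit code: the sketch below assumes the test runner reports exit codes using the Python `subprocess` convention, where a value of `-N` means the child was terminated by signal `N`. The helper name `describe_exit_code` is hypothetical and only illustrates how `-11` maps to a signal name.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Describe a subprocess-style exit code (negative means killed by a signal)."""
    if code < 0:
        signum = -code
        try:
            # signal.Signals maps a signal number to its symbolic name.
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown"
        return f"killed by signal {signum} ({name})"
    return f"exited with status {code}"


# The exit code reported for the failing RUN line in the log above.
print(describe_exit_code(-11))
```

On Linux, signal 11 is SIGSEGV, so the failing run most likely hit a segmentation fault rather than a test assertion.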

### To reproduce

_No response_
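Although no reproduction steps were provided, the log shows that the failing step ran the `tmp_gpu.out` binary with `ONEAPI_DEVICE_SELECTOR=level_zero:gpu`. A minimal sketch for rerunning a locally built binary several times under the same selector might look like this. The helper names `run_with_selector` and `count_failures` are hypothetical, and the path to the binary is whatever your own build produces; this has not been verified against the failing machine.

```python
import os
import subprocess
from typing import Sequence


def run_with_selector(cmd: Sequence[str], selector: str = "level_zero:gpu") -> int:
    """Run cmd with ONEAPI_DEVICE_SELECTOR set and return its exit code.

    A negative return value means the process was killed by a signal.
    """
    env = dict(os.environ, ONEAPI_DEVICE_SELECTOR=selector)
    return subprocess.run(list(cmd), env=env).returncode


def count_failures(cmd: Sequence[str], runs: int = 10,
                   selector: str = "level_zero:gpu") -> int:
    """Run cmd `runs` times and count how many runs exit with a non-zero code."""
    return sum(run_with_selector(cmd, selector) != 0 for _ in range(runs))
```

For example, `count_failures(["./joint_matrix_bf16_fill_k_cache_SLM.cpp.tmp_gpu.out"], runs=20)` would indicate whether the crash is deterministic or intermittent on a given machine, assuming the binary sits in the current directory.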

### Environment

- OS: Linux
- Target device and vendor: Intel Arc GPU
- DPC++ version: [e.g. commit hash or output of `clang++ --version`]
- Dependencies version: [e.g. the output of `sycl-ls --verbose`]


### Additional context

_No response_

Matrix/joint_matrix_bf16_fill_k_cache_SLM.cpp fail on Arc Linux #20595
