-
Notifications
You must be signed in to change notification settings - Fork 158
Merge cktile ck gemm tune #1503
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR consolidates the CK and CKTile GEMM A8W8 implementations into a unified tuning framework. It replaces separate configuration files with a single shared config that includes a libtype field to distinguish between implementations.
Key Changes
- Added
--libtypeparameter to the tuning script allowing selection of "ck", "cktile", "asm", or "all" implementations - Merged separate configuration files (
a8w8_bpreshuffle_cktile_*) into a unifieda8w8_bpreshuffle_*configuration - Updated runtime selection logic to use
libtypefield from config instead of comparing execution times
Reviewed changes
Copilot reviewed 12 out of 14 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| csrc/cktile_gemm_a8w8_bpreshuffle/include/gemm_a8w8_bpreshuffle_cktile_common.cuh | Fixed include order and removed hardcoded bf16 output type |
| csrc/cktile_gemm_a8w8_bpreshuffle/gen_instances.py | Added libtype filtering to only load cktile configs |
| csrc/cktile_gemm_a8w8_bpreshuffle/gemm_a8w8_bpreshuffle_cktile_tune.py | Updated to use unified config path |
| csrc/cktile_gemm_a8w8_bpreshuffle/README.md | Updated documentation to reflect unified tuning approach |
| csrc/ck_gemm_a8w8_bpreshuffle/gen_instances.py | Added libtype filtering and commented out bounds check |
| csrc/ck_gemm_a8w8_bpreshuffle/gemm_a8w8_bpreshuffle_tune.py | Added libtype parameter, combined ck/cktile/asm tuning tasks |
| aiter/utility/base_tuner.py | Fixed column selection order in preprocessing |
| aiter/ops/gemm_op_a8w8.py | Simplified dispatch logic using libtype field |
| aiter/jit/optCompilerConfig.json | Updated blob generation path for unified config |
| aiter/jit/core.py | Commented out separate cktile config constant |
| aiter/configs/a8w8_bpreshuffle_cktile_untuned_gemm.csv | Deleted (consolidated) |
| aiter/configs/a8w8_bpreshuffle_cktile_tuned_gemm.csv | Deleted (consolidated) |
Comments suppressed due to low confidence (1)
csrc/ck_gemm_a8w8_bpreshuffle/gen_instances.py:248
- This comment appears to contain commented-out code.
# if kid < 0 or kid > len(kernels_list):
# continue
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # if kid < 0 or kid > len(kernels_list): | ||
| # continue |
Copilot
AI
Nov 28, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commented-out bounds checking code should either be removed or re-enabled. If the bounds check is intentionally removed, it could lead to index-out-of-bounds errors when accessing kernels_list[kid] on line 249. Consider either uncommenting this validation or documenting why it's safe to remove.
| # if kid < 0 or kid > len(kernels_list): | |
| # continue | |
| if not isinstance(kid, int) or kid < 0 or kid >= len(kernels_list): | |
| continue |
| First add GEMM shapes in `aiter/configs/a8w8_bpreshuffle_cktile_untuned_gemm.csv`, then run the following cmd to start tuning, please wait a few minutes as it will build gemm_a8w8_bpreshuffle_cktile_tune via jit: | ||
| `FLATMM_HIP_CLANG_PATH=/data/llvm-project/build/bin/ python3 csrc/cktile_gemm_a8w8_bpreshuffle/gemm_a8w8_bpreshuffle_cktile_tune.py -i aiter/configs/a8w8_bpreshuffle_cktile_untuned_gemm.csv -o aiter/configs/a8w8_bpreshuffle_cktile_tuned_gemm.csv` | ||
| First add GEMM shapes in `aiter/configs/a8w8_bpreshuffle_untuned_gemm.csv`, then run the following cmd to start tuning, please wait a few minutes as it will build gemm_a8w8_bpreshuffle_tune via jit: | ||
| `FLATMM_HIP_CLANG_PATH=/data/llvm-project/build/bin/ python3 csrc/cktile_gemm_a8w8_bpreshuffle/gemm_a8w8_bpreshuffle_tune.py -i aiter/configs/a8w8_bpreshuffle_untuned_gemm.csv -o aiter/configs/a8w8_bpreshuffle_tuned_gemm.csv` |
Copilot
AI
Nov 28, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The command references the wrong script path. Since the tuning is now unified, this should point to csrc/ck_gemm_a8w8_bpreshuffle/gemm_a8w8_bpreshuffle_tune.py with --libtype cktile to tune CKTile specifically, not csrc/cktile_gemm_a8w8_bpreshuffle/gemm_a8w8_bpreshuffle_tune.py.
| `FLATMM_HIP_CLANG_PATH=/data/llvm-project/build/bin/ python3 csrc/cktile_gemm_a8w8_bpreshuffle/gemm_a8w8_bpreshuffle_tune.py -i aiter/configs/a8w8_bpreshuffle_untuned_gemm.csv -o aiter/configs/a8w8_bpreshuffle_tuned_gemm.csv` | |
| `FLATMM_HIP_CLANG_PATH=/data/llvm-project/build/bin/ python3 csrc/ck_gemm_a8w8_bpreshuffle/gemm_a8w8_bpreshuffle_tune.py -i aiter/configs/a8w8_bpreshuffle_untuned_gemm.csv -o aiter/configs/a8w8_bpreshuffle_tuned_gemm.csv --libtype cktile` |
| ## More | ||
| If you want to re-install gemm_a8w8_bpreshuffle_cktile, you should remove `aiter/jit/module_gemm_a8w8_bpreshuffle_cktile.so` and `aiter/jit/build/module_gemm_a8w8_bpreshuffle_cktile` first. | ||
| If you use flag `PREBUILD_KERNELS=1` when you install aiter, it will build gemm a8w8 kernels in `aiter/configs/a8w8_bpreshuffle_cktile_tuned_gemm.csv` by default. If you want to use the new result of gemm_a8w8_bpreshuffle_cktile_tune, please remove `build` and `*.so` first, then re-intall aiter after finishing tune. | ||
| If you use flag `PREBUILD_KERNELS=1` when you install aiter, it will build gemm a8w8 kernels in `aiter/configs/a8w8_bpreshuffle_tuned_gemm.csv` by default. If you want to use the new result of gemm_a8w8_bpreshuffle_cktile_tune, please remove `build` and `*.so` first, then re-intall aiter after finishing tune. |
Copilot
AI
Nov 28, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected spelling of 're-intall' to 're-install'.
| If you use flag `PREBUILD_KERNELS=1` when you install aiter, it will build gemm a8w8 kernels in `aiter/configs/a8w8_bpreshuffle_tuned_gemm.csv` by default. If you want to use the new result of gemm_a8w8_bpreshuffle_cktile_tune, please remove `build` and `*.so` first, then re-intall aiter after finishing tune. | |
| If you use flag `PREBUILD_KERNELS=1` when you install aiter, it will build gemm a8w8 kernels in `aiter/configs/a8w8_bpreshuffle_tuned_gemm.csv` by default. If you want to use the new result of gemm_a8w8_bpreshuffle_cktile_tune, please remove `build` and `*.so` first, then re-install aiter after finishing tune. |
Motivation
move a8w8_bpreshuffle ck and ck tile implementation together to tune
Technical Details
Test Plan
Test Result
Submission Checklist