[WIP] Feat(tests): build test infrastructure #144

Open
chen2021673 wants to merge 7 commits into master from CTest-clean

chen2021673 commented Apr 14, 2026

Summary

This PR refactors InfiniTrain’s test infrastructure around CTest and GoogleTest.

It consolidates the old test/ and tests/ layout into a single tests/ directory, introduces shared CMake utilities for test registration, and migrates applicable tests to device-parameterized TEST_P so CPU/CUDA cases can share the same test logic where appropriate.

Closes #120.

Changes

  • merge the old test/ directory into tests/
  • add shared CMake/GTest utilities under tests/common/
  • reduce repeated test registration boilerplate in per-suite CMakeLists.txt
  • migrate applicable tests from fixed-device TEST_F to device-parameterized TEST_P
  • replace hardcoded device selection with shared helpers such as GetDevice()
  • improve label-based selection for CPU/CUDA-related tests
  • refactor registration for all tests
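
The move from fixed-device TEST_F to device-parameterized TEST_P can be pictured with a plain C++ sketch that does not depend on GoogleTest. It is not the PR's code: DeviceType, IsDeviceAvailable, RunOnDevices, and AddsTwoNumbers are hypothetical names, and availability is hardcoded to "CPU only" for illustration. The idea it shows is that one test body is written once, executed once per device, and skipped (not failed) where the device is missing.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the project's Device::DeviceType.
enum class DeviceType { kCPU, kCUDA };

// Assumption: a real build would query the CUDA runtime. This sketch pretends
// that only the CPU is present.
inline bool IsDeviceAvailable(DeviceType type) { return type == DeviceType::kCPU; }

inline std::string DeviceName(DeviceType type) {
    return type == DeviceType::kCPU ? "CPU" : "CUDA";
}

struct RunSummary {
    std::vector<std::string> passed;
    std::vector<std::string> skipped;
    std::vector<std::string> failed;
};

// Runs one shared test body once per device, similar in spirit to a TEST_P
// body running once per instantiated parameter. Devices that are not
// available are recorded as skipped rather than failed.
inline RunSummary RunOnDevices(const std::vector<DeviceType>& devices,
                               const std::function<bool(DeviceType)>& body) {
    assert(static_cast<bool>(body));  // the test body must be callable
    RunSummary summary;
    for (DeviceType device : devices) {
        const std::string name = DeviceName(device);
        if (!IsDeviceAvailable(device)) {
            summary.skipped.push_back(name);
        } else if (body(device)) {
            summary.passed.push_back(name);
        } else {
            summary.failed.push_back(name);
        }
    }
    return summary;
}

// Placeholder test logic, written once and parameterized by device.
inline bool AddsTwoNumbers(DeviceType /*device*/) { return 1 + 2 == 3; }
```

Calling RunOnDevices({DeviceType::kCPU, DeviceType::kCUDA}, AddsTwoNumbers) runs the body on CPU and records CUDA as skipped. In GoogleTest the same roles are played by TEST_P, GetParam(), INSTANTIATE_TEST_SUITE_P, and GTEST_SKIP().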

How to run

ctest --output-on-failure
ctest -L cpu --output-on-failure
ctest -L cuda --output-on-failure

Impact

This is mainly a test infrastructure refactor. It is not intended to change training/runtime behavior, but it does change how tests are organized and registered.

Result

ctest --output-on-failure -j1 (parallel runs may contend for resources, so run serially for now)

image

luoyueyuguang and others added 5 commits April 28, 2026 08:28
- Add infini_train_add_test CMake macro for simplified test registration
- Integrate gtest_discover_tests for automatic test case discovery
- Refactor all test directories to use unified macro (autograd, optimizer, hook, slow, lora)
- Reduce test CMakeLists.txt code by 68%
- Add LoRA tests (12 test cases)
- Delete TEST_REPORT.md
- Test labels: cpu/cuda/distributed/slow for flexible test execution
- Add shared test_macros.cmake in tests/common/

BREAKING CHANGE: Test registration now uses macro instead of manual add_test()

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Replace TEST_F with TEST_P across all test suites so each suite runs on
both CPU and CUDA without duplicating test logic. Adds InfiniTrainTestP,
TensorTestBaseP, AutogradTestBaseP, and DistributedInfiniTrainTestP base
classes with automatic CUDA/NCCL skip guards. Introduces
INFINI_TRAIN_REGISTER_TEST* C++ macros and infini_train_add_test_suite
CMake macro to eliminate repetitive INSTANTIATE_TEST_SUITE_P /
infini_train_add_test boilerplate. Removes deprecated test/, slow/, and
split optimizer test files; consolidates optimizer tests into a single
binary with creation + step suites.
- Simplify CMakeLists: single CTest target per suite, remove label splitting
- Migrate old test/ directory into tests/ and delete test/
Comment thread CMakeLists.txt
"Run: git submodule update --init third_party/googletest")
endif()
set(gtest_force_shared_crt ON CACHE BOOL "" FORCE)
add_subdirectory(third_party/googletest)
Contributor

googletest is added as a subdirectory here, but it is not registered in .gitmodules.

Contributor Author

Registered it now.

Comment thread CMakeLists.txt
link_infini_train_exe(llama3)

# Tools
add_subdirectory(tools/infini_run)
Contributor

Was infini_run removed by mistake?

Contributor Author

Yes, I'll add it back.

Comment thread CMakeLists.txt Outdated
add_subdirectory(third_party/glog)
# add_compile_definitions(GLOG_USE_GLOG_EXPORT=1)
include_directories(${glog_SOURCE_DIR}/src)
# include_directories(${glog_BINARY_DIR}/glog)
Contributor

Please just delete the commented-out lines.

Contributor Author

done

- Add docs/test_usage_guide.md with build/run/write instructions
- Rename hook_mechanism.md → hook_mechanism_design.md
- Rename lora_usage.md → lora_usage_guide.md
- Add googletest as submodule in .gitmodules
- Add infini_run tool target in CMakeLists.txt, remove stale comments
Comment thread tests/common/test_utils.h Outdated

class InfiniTrainTest : public ::testing::TestWithParam<Device::DeviceType> {
protected:
static void SetUpTestSuite() { nn::parallel::global::GlobalEnv::Instance().Init(1, 1, false, 1, 1); }
Contributor

GlobalEnv::Instance().Init checks whether it has already been initialized (CHECK(!initialized_) << "Repeated initialization of GlobalEnv!";), so the process crashes if two test cases initialize it in the same process. test_transformer_architecture.cc registers two test classes, so running ./build/test_transformer_architecture_cpu on its own spuriously triggers this crash.

Contributor Author

Done. Added an IsInitialized check: if GlobalEnv has already been initialized, it is not initialized again.
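
The guard described in this thread can be sketched as follows. This is a simplified stand-in, not the project's class: the real GlobalEnv::Init takes more arguments and uses glog's CHECK, which aborts the process, whereas this sketch throws so the failure is observable. EnsureGlobalEnvInitialized is a hypothetical helper name for what a fixture's SetUpTestSuite might call.

```cpp
#include <cassert>
#include <stdexcept>

// Simplified stand-in for nn::parallel::global::GlobalEnv (assumption: the
// real class has a richer Init signature and aborts via CHECK).
class GlobalEnv {
public:
    static GlobalEnv& Instance() {
        static GlobalEnv env;  // one instance per process
        return env;
    }

    // Mirrors the "no repeated initialization" rule from the review comment.
    void Init(int world_size) {
        if (initialized_) {
            throw std::logic_error("Repeated initialization of GlobalEnv!");
        }
        world_size_ = world_size;
        initialized_ = true;
    }

    bool IsInitialized() const { return initialized_; }
    int world_size() const { return world_size_; }

private:
    GlobalEnv() = default;
    bool initialized_ = false;
    int world_size_ = 0;
};

// Hypothetical helper: safe to call from several test classes that share one
// process. Returns true only if this call performed the initialization.
inline bool EnsureGlobalEnvInitialized() {
    GlobalEnv& env = GlobalEnv::Instance();
    if (env.IsInitialized()) {
        return false;  // another test class already initialized it
    }
    env.Init(1);
    assert(env.IsInitialized());
    return true;
}
```

With this shape, the first test class to run initializes the environment and every later one in the same binary becomes a no-op, while a direct second call to Init still fails loudly.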

"-DCMAKE_CXX_STANDARD_REQUIRED=ON"
"-DCMAKE_CXX_EXTENSIONS=OFF"
"-DCMAKE_CXX_FLAGS=-I${PROJECT_SOURCE_DIR}"
OUTPUT_VARIABLE DTYPE_DISPATCH_TRY_COMPILE_OUTPUT
Contributor JYMiracle305 commented Apr 30, 2026

The compilation here fails because the header cannot be found, so the type check is never actually verified. The earlier test case was probably already broken.
Suggest printing the failure output below: message(STATUS "compile-fail output:\n${DTYPE_DISPATCH_TRY_COMPILE_OUTPUT}")

Contributor Author

done
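
A compile-fail test passes for any compile error, including a missing header. That is the false pass described above. A different technique from the try_compile check in this PR is to assert inside C++ that an expression is ill-formed, using the detection idiom. In that setup a missing header breaks the build instead of silently satisfying the test. The sketch below assumes C++17. Scale and CanScale are hypothetical names and are not part of the project's dtype dispatch code.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical dispatcher that accepts only floating-point types.
template <typename T, typename = std::enable_if_t<std::is_floating_point_v<T>>>
T Scale(T value) {
    return value * T(2);
}

// Detection idiom: CanScale<T>::value is true only when Scale(T) is a
// well-formed call. The primary template is the "does not compile" case.
template <typename T, typename = void>
struct CanScale : std::false_type {};

template <typename T>
struct CanScale<T, std::void_t<decltype(Scale(std::declval<T>()))>>
    : std::true_type {};

// These checks can only succeed if this file itself compiles, so a
// header-not-found error cannot be mistaken for a rejected type.
static_assert(CanScale<float>::value, "float should be accepted");
static_assert(!CanScale<int>::value, "int should be rejected by the type check");
```

This only covers rejections that happen during overload resolution (SFINAE-friendly constraints). A static_assert inside a function body is a hard error and would still need an out-of-process check such as try_compile.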

Add IsInitialized() to GlobalEnv and guard SetUpTestSuite so a second
test class in the same process skips re-initialization instead of
hitting CHECK(!initialized_). Also print try_compile output on
compile-fail test to surface header-not-found vs real type errors.