Add GPT-J architecture adapter tests #1314
Conversation
Hi! Just flagging that the Othello_GPT notebook check failure seems unrelated to this PR. Cell 4 only does

This looks great @along-l! The Othello failure comes from HuggingFace rate limits, which are hit when too many PRs run unit tests at the same time. I restarted the test once the queue was clear and it passed as expected. Merging this, thank you for taking it on! We do have an open PR that begins to address this (issue #1291, PR #1296), but it does not cover everything that could be done to reduce our overall HF calls. If you are interested in digging into it, feel free; just take care not to conflict with the work done in #1296.
Description
Add unit tests for GptjArchitectureAdapter, contributing to the architecture adapter test coverage effort in #1302.

Coverage includes:
pos_embed, no top-level rotary_emb, no ln2, no MLP gate, exactly QKVO conversion keys)
GPTJForCausalLM → GptjArchitectureAdapter)

49 tests, all passing locally via uv run pytest.

Related to #1302
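
For readers unfamiliar with this kind of coverage, here is a minimal sketch of what structural checks like the ones listed above could look like. It is not the test file from this PR: StubGptjAdapter, ADAPTERS, select_adapter, the component_mapping layout, and the conversion_keys tuple are all hypothetical stand-ins invented for illustration, and the real GptjArchitectureAdapter API may differ. The sketch also assumes the truncated list above describes absence checks (no learned position embedding, no second layer norm, no gated MLP).

```python
from dataclasses import dataclass, field


def _default_mapping() -> dict:
    # Hypothetical component layout; module paths are illustrative only.
    return {
        "embed": "transformer.wte",
        "blocks": {
            "ln1": "ln_1",
            "attn": {
                "q": "attn.q_proj",
                "k": "attn.k_proj",
                "v": "attn.v_proj",
                "o": "attn.out_proj",
            },
            "mlp": {"in": "mlp.fc_in", "out": "mlp.fc_out"},
        },
        "ln_final": "transformer.ln_f",
        "unembed": "lm_head",
    }


@dataclass
class StubGptjAdapter:
    """Hypothetical stand-in for the adapter under test (not the real class)."""

    component_mapping: dict = field(default_factory=_default_mapping)
    conversion_keys: tuple = ("q", "k", "v", "o")


# Hypothetical factory table: architecture name -> adapter class.
ADAPTERS = {"GPTJForCausalLM": StubGptjAdapter}


def select_adapter(architecture: str) -> StubGptjAdapter:
    """Instantiate the adapter registered for an architecture name."""
    return ADAPTERS[architecture]()


def test_no_pos_embed_or_top_level_rotary():
    mapping = select_adapter("GPTJForCausalLM").component_mapping
    assert "pos_embed" not in mapping
    assert "rotary_emb" not in mapping


def test_block_has_no_ln2_and_no_mlp_gate():
    blocks = select_adapter("GPTJForCausalLM").component_mapping["blocks"]
    assert "ln2" not in blocks
    assert "gate" not in blocks["mlp"]


def test_conversion_keys_are_exactly_qkvo():
    adapter = select_adapter("GPTJForCausalLM")
    assert set(adapter.conversion_keys) == {"q", "k", "v", "o"}


def test_factory_maps_architecture_to_adapter():
    assert isinstance(select_adapter("GPTJForCausalLM"), StubGptjAdapter)
```

Saved as a test module, the functions above would be collected by pytest in the usual way; asserting on absent keys as well as present ones is what catches an adapter that silently inherits components from a different architecture.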