0xtoward (Contributor) commented Nov 3, 2025

Summary

Fix #920
Note: The test runs in a subprocess because the cross_entropy patch replaces a global function (transformers.loss.loss_utils.nn.functional.cross_entropy), so the patch would otherwise leak into every other test in the same process. Is there a more elegant way to isolate it?
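
For readers unfamiliar with the pattern, here is a minimal sketch of subprocess isolation. It is not the test from this PR: `run_isolated` is a hypothetical helper, and `math.sqrt` stands in for the globally patched cross_entropy function so the snippet runs with the standard library alone.

```python
import math
import subprocess
import sys
import textwrap


def run_isolated(snippet: str) -> subprocess.CompletedProcess:
    """Run a Python snippet in a fresh interpreter (hypothetical helper)."""
    return subprocess.run(
        [sys.executable, "-c", textwrap.dedent(snippet)],
        capture_output=True,
        text=True,
        timeout=60,
    )


# Stand-in for a global monkey patch: the child process replaces math.sqrt.
PATCH_SNIPPET = """
    import math
    math.sqrt = lambda x: "patched"
    print(math.sqrt(4))
"""

result = run_isolated(PATCH_SNIPPET)

# The patch took effect inside the child process only.
child_output = result.stdout.strip()
# The parent interpreter still sees the original function.
parent_value = math.sqrt(4)
```

The cost is one interpreter start-up per isolated test, which is the overhead the note is asking how to avoid.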

Testing Done

  • Hardware Type:
  • run make test to ensure correctness
  • run make checkstyle to ensure code style
  • run make test-convergence to ensure convergence

Tcc0403 (Collaborator) commented Nov 4, 2025

I'm thinking about just patching the forward method, as flce does, with some modifications to liger's loss_utils, so that we only need to toggle an option with functools.partial to switch between flce and ce. What do you think?
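
One possible reading of this suggestion, sketched in plain Python so it runs without torch. All names here (`causal_lm_loss`, `flce_loss`, `ce_loss`, the `fused` keyword) are hypothetical and are not liger's actual loss_utils API; the point is only that a single loss entry point with one keyword can be pre-configured with functools.partial.

```python
import math
from functools import partial


def _row_loss(logits_row, label):
    # Numerically stable -log softmax(logits_row)[label].
    m = max(logits_row)
    lse = m + math.log(sum(math.exp(x - m) for x in logits_row))
    return lse - logits_row[label]


def _project(h, weight):
    # One hidden vector times the (vocab x dim) weight matrix.
    return [sum(hi * wi for hi, wi in zip(h, w_row)) for w_row in weight]


def ce_loss(hidden, weight, labels):
    # "ce" path: materialize every row of logits, then reduce.
    logits = [_project(h, weight) for h in hidden]
    return sum(_row_loss(r, y) for r, y in zip(logits, labels)) / len(labels)


def flce_loss(hidden, weight, labels):
    # "flce"-style path: project and reduce one row at a time,
    # so the full logits matrix is never held in memory.
    total = 0.0
    for h, y in zip(hidden, labels):
        total += _row_loss(_project(h, weight), y)
    return total / len(labels)


def causal_lm_loss(hidden, weight, labels, *, fused=True):
    # Hypothetical single entry point; `fused` is the only switch.
    loss_fn = flce_loss if fused else ce_loss
    return loss_fn(hidden, weight, labels)


# Two pre-configured callables built from the same function.
loss_with_flce = partial(causal_lm_loss, fused=True)
loss_with_ce = partial(causal_lm_loss, fused=False)

hidden = [[1.0, 0.0], [0.5, -1.0]]
weight = [[0.2, 0.1], [-0.3, 0.4], [0.0, 1.0]]
labels = [0, 2]

flce_value = loss_with_flce(hidden, weight, labels)
ce_value = loss_with_ce(hidden, weight, labels)
```

Under this assumption the patched forward would always call the same loss entry point, and the monkey patch would only decide which partial to install.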

Development

Successfully merging this pull request may close these issues.

Check if cross entropy is applied correctly in all monkey patch functions