-
Notifications
You must be signed in to change notification settings - Fork 19.7k
Fix: Remove deprecated .path access in Muon optimizer for TF 2.16+ compatibility #21797
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Fix: Remove deprecated .path access in Muon optimizer for TF 2.16+ compatibility #21797
Conversation
…andling and tests
Summary of ChangesHello @utsab345, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a critical compatibility issue within the Keras Muon optimizer, specifically targeting an AttributeError that arises in TensorFlow 2.16+ due to the removal of the .path attribute from ResourceVariable objects. The core change involves updating the optimizer's internal logic to use the .name attribute for variable identification, ensuring seamless operation with modern TensorFlow versions. Additionally, the PR enhances the _should_use_adamw method with improved variable dimension checks and more resilient handling of layer exclusion patterns. The changes are thoroughly validated with a suite of new, targeted unit tests, and a minor refinement to image saving utilities is also included. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request effectively addresses the AttributeError for variable.path in recent TensorFlow versions by migrating to variable.name in the Muon optimizer. The changes are well-implemented and include several improvements:
- The logic in
_should_use_adamwis corrected to properly handle 4D tensors as intended for the Muon update step. - Robustness is improved by adding error handling for invalid regex patterns in
exclude_layers. - The test suite for the Muon optimizer has been significantly improved by refactoring a large test into smaller, more focused tests with clear purposes, and by adding a specific test to prevent regressions of the
.pathattribute issue.
The changes look solid. I have one suggestion to further improve the robustness of the error handling in _should_use_adamw.
keras/src/optimizers/muon.py
Outdated
| except re.error: | ||
| # Skip invalid regex patterns in exclude_layers | ||
| continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The try...except block is a great addition for robustness against invalid regex patterns. However, re.search can also raise a TypeError if an element in self.exclude_layers is not a string or a compiled regex pattern (e.g., a number). To make this even more robust, consider catching TypeError as well.
| except re.error: | |
| # Skip invalid regex patterns in exclude_layers | |
| continue | |
| except (re.error, TypeError): | |
| # Skip invalid regex patterns in exclude_layers | |
| continue |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #21797 +/- ##
===========================================
- Coverage 82.66% 34.34% -48.33%
===========================================
Files 577 577
Lines 59477 59478 +1
Branches 9329 9327 -2
===========================================
- Hits 49167 20426 -28741
- Misses 7907 38143 +30236
+ Partials 2403 909 -1494
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
586b5c0 to
4a299d3
Compare
keras/src/optimizers/muon.py
Outdated
|
|
||
| def _muon_update_step(self, gradient, variable, lr): | ||
| m = self.adam_momentums[variable.path] | ||
| m = self.adam_momentums[variable.name] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This issue is that variable names are not unique. For instance, all Dense layers have a "kernel" variable. So multiple variables will clobber one another.
hertschuh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First, I want to clarify one thing. It's not the .path is deprecated. .path is a concept that exists in keras.Variable but doesn't exist in tf.Variable.
Second, this issue happens because there is a limitation with TensorFlow whereby the variable that is passed to update_step is not the keras.Variable but the tf.Variable.
However, there is a mechanism in place to workaround this limitation. Please take a look at how Adam is implemented. Everything has to be based on self._get_variable_index(var), which is guaranteed to work with both keras.Variable and tf.Variable.
So instead of what was done:
- you should inspect
.pathinbuild, but only inbuild - you should only use
self._get_variable_index(variable)inupdate_step.
All the momentums and velocities should be lists, not dicts. You want to initialize them all with [None] * len(var_list).
Then do:
for var in var_list:
if not self._overwrite_variable_with_gradient(var):
self.adam_momentums[self._get_variable_index(var)] = (
self.add_variable_from_reference(
reference_variable=var, name="momentum"
)
)
if self._should_use_adamw(var):
self.adam_velocities[self._get_variable_index(var)] = (
self.add_variable_from_reference(
reference_variable=var, name="velocity"
)
)And the same for the muon momentums / velocities.
Then in update_step, you'll check:
self.adam_velocities[self._get_variable_index(variable)]
If it's None, you call _muon_update_step, else _adamw_update_step.
Does that make sense? Sorry, this is probably a bit more complicated than what you had in mind.
| if isinstance(x, np.ndarray): | ||
| return np.array(img) | ||
| return img | ||
| return img |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revert this file
| loaded_img = load_img(path) | ||
| loaded_array = img_to_array(loaded_img) | ||
| self.assertEqual(loaded_array.shape, (50, 50, 3)) | ||
| self.assertEqual(loaded_array.shape, (50, 50, 3)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revert this file
keras/src/optimizers/muon.py
Outdated
| # Exclude any user-specified layer patterns | ||
| for pattern in self.exclude_layers: | ||
| try: | ||
| if re.search(pattern, var_identifier): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't work because the layer name is not part of the variable name, only the variable path.
…mpatibility Fixes keras.optimizers.Muon failing with AttributeError: 'ResourceVariable' object has no attribute 'path' in Keras 3 / TF 2.16-2.20. Changes: - Replaced deprecated .path references with _get_variable_index() for variable identification - Updated build() to use lists instead of dicts, initialized with [None] * len(var_list) - Updated _should_use_adamw() logic to safely check .path only during build - Updated update_step(), _muon_update_step(), and _adamw_update_step() to use _get_variable_index() - Added robust error handling for invalid regex patterns in exclude_layers - Reverted image_utils.py changes as requested by reviewer Result: All tests pass. Compatible with TensorFlow 2.16+. Closes keras-team#21793
7bca156 to
ea5cd09
Compare
|
Please let me know when this is ready for re-review. Thanks! |
Summary
Fixes keras.optimizers.Muon failing with AttributeError: 'ResourceVariable' object has no attribute 'path' in Keras 3 / TF 2.16–2.20.
Changes
.pathreferences with.namefor variable identification._should_use_adamw()logic to match modern Keras 3 variable handling..pathattribute accessResult
All 11 tests passed locally on TensorFlow 2.16+ (Windows).
Closes #21793