Bug Issue
The docstring of keras.losses.Huber() describes delta as follows:
keras/keras/src/losses/losses.py
Lines 294 to 295 in 6d06085
    delta: A float, the point where the Huber loss function changes from a
        quadratic to linear.
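For reference, this description matches the usual piecewise form of the Huber loss. The formula below is a sketch written from the implementation quoted further down in this issue; the symbols $\ell_\delta$ and $e$ are notation introduced here, not names from the Keras source.

```latex
\ell_\delta(e) =
\begin{cases}
  \tfrac{1}{2}\, e^{2}, & |e| \le \delta \\[4pt]
  \delta\, |e| - \tfrac{1}{2}\, \delta^{2}, & |e| > \delta
\end{cases}
\qquad e = y_{\text{pred}} - y_{\text{true}}
```

The two branches agree at $|e| = \delta$ (both give $\tfrac{1}{2}\delta^{2}$), and the formula has no special case for $\delta = 0.5$.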
In the repros below, delta=0.5 stands out: the reported GPU memory usage differs from the usage for every other delta value I tried. Tested with TensorFlow 2.19.0 and the latest Keras:
Repro 1 (delta == 0.5)
import tensorflow as tf
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
# Main Code -->
import numpy as np
import keras
x = np.random.rand(1000, 1)
y = (((3 * x) + 2) + np.random.randn(1000, 1))
huber_loss = keras.losses.Huber(delta=0.5)
loss = huber_loss(y, x)
print('Huber loss:', loss.numpy())
# Main Code <--
memory = 0
for i in range(len(gpus)):
    memory += tf.config.experimental.get_memory_usage('GPU:%d' % i)
print(memory)

Output 1
Huber loss: 1.3573407
1792
Repro 2 (delta == 0.1 / 0.3 / 1.0 / 10000000.0)
import tensorflow as tf
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
# Main Code -->
import numpy as np
import keras
x = np.random.rand(1000, 1)
y = (((3 * x) + 2) + np.random.randn(1000, 1))
huber_loss = keras.losses.Huber(delta=0.1) # choices: 0.1 / 0.3 / 1.0 / 10000000.0
loss = huber_loss(y, x)
print('Huber loss:', loss.numpy())
# Main Code <--
memory = 0
for i in range(len(gpus)):
    memory += tf.config.experimental.get_memory_usage('GPU:%d' % i)
print(memory)

Output 2
For each choice, the output is shown below; the memory usage is the same in every case:
Huber loss: 0.29216948
2048
Huber loss: 0.8585064
2048
Huber loss: 2.5695057
2048
Huber loss: 5.2090187
2048
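To put the two readings side by side, here is a small arithmetic sketch using only the numbers reported above. It shows that both readings are whole multiples of 256 bytes and differ by exactly one such unit. Treating 256 bytes as an allocation granularity is an assumption on my part, not something confirmed by the outputs.

```python
# Readings copied from Output 1 and Output 2 above, in bytes.
usage_delta_half = 1792   # delta == 0.5
usage_other = 2048        # delta == 0.1 / 0.3 / 1.0 / 10000000.0

# Assumption: 256 bytes is the allocator's smallest unit on this setup.
UNIT = 256

difference = usage_other - usage_delta_half
units_half, rem_half = divmod(usage_delta_half, UNIT)
units_other, rem_other = divmod(usage_other, UNIT)

print("difference in bytes:", difference)
print("units (delta == 0.5):", units_half, "remainder:", rem_half)
print("units (other deltas):", units_other, "remainder:", rem_other)
```

If that assumption holds, the delta == 0.5 run ends with one fewer small allocation than the other runs, rather than with a smaller data buffer.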
The related code is here:
keras/keras/src/losses/losses.py
Lines 319 to 325 in 6d06085
        super().__init__(
            huber,
            name=name,
            reduction=reduction,
            dtype=dtype,
            delta=delta,
        )
keras/keras/src/losses/losses.py
Lines 1969 to 1983 in 6d06085
    y_pred = ops.convert_to_tensor(y_pred)
    y_true = ops.convert_to_tensor(y_true, dtype=y_pred.dtype)
    y_true, y_pred = squeeze_or_expand_to_same_rank(y_true, y_pred)
    delta = ops.convert_to_tensor(delta, dtype=y_pred.dtype)
    error = ops.subtract(y_pred, y_true)
    abs_error = ops.abs(error)
    half = ops.convert_to_tensor(0.5, dtype=abs_error.dtype)
    return ops.mean(
        ops.where(
            abs_error <= delta,
            half * ops.square(error),
            delta * abs_error - half * ops.square(delta),
        ),
        axis=-1,
    )
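As a sanity check on the numerics, the snippet above can be mirrored in plain Python. This is a minimal sketch, not the Keras implementation: the function name `huber_reference` and the sample values are hypothetical, it works on flat lists of scalars, and it takes a single mean over all samples (which should match the reduced loss for inputs of shape (N, 1), as in the repros).

```python
def huber_reference(y_true, y_pred, delta=1.0):
    """Mean Huber loss over paired scalar samples (pure-Python sketch)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        error = p - t
        abs_error = abs(error)
        if abs_error <= delta:
            # Quadratic branch: 0.5 * error^2
            total += 0.5 * error * error
        else:
            # Linear branch: delta * |error| - 0.5 * delta^2
            total += delta * abs_error - 0.5 * delta * delta
    return total / len(y_true)


# Hypothetical sample data, chosen only for illustration.
y_true = [2.0, 3.5, 5.0, 1.0]
y_pred = [0.1, 0.4, 0.9, 0.8]

deltas = (0.1, 0.3, 0.5, 1.0)
values = [huber_reference(y_true, y_pred, d) for d in deltas]
for d, v in zip(deltas, values):
    print("delta =", d, "loss =", v)
```

In this sketch the loss value grows smoothly with delta and 0.5 takes the same code path as any other value, so the arithmetic alone does not single it out.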
However, I could not find anything in this code that treats 0.5 specially, and I don't know whether the different result for delta == 0.5 is expected.
Thanks for noting!