
Is 0.5 special for delta in keras.losses.Huber() or not? #21804

@ILCSFNO

Description

Bug Issue

The documentation for keras.losses.Huber() describes delta as follows:

delta: A float, the point where the Huber loss function changes from a
quadratic to linear.
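
For reference, my reading of that definition (the standard Huber formula, with error e = y_pred - y_true; this paraphrase is mine, not a quote from the docs):

loss(e) = 0.5 * e^2                      if |e| <= delta
loss(e) = delta * |e| - 0.5 * delta^2    otherwise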

In the repros below, using TF 2.19.0 and the latest Keras, delta=0.5 appears to be special in terms of the reported GPU memory usage:

Repro 1 (delta == 0.5)

import tensorflow as tf
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
# Main Code -->
import numpy as np
import keras
x = np.random.rand(1000, 1)
y = (((3 * x) + 2) + np.random.randn(1000, 1))
huber_loss = keras.losses.Huber(delta=0.5)
loss = huber_loss(y, x)
print('Huber loss:', loss.numpy())
# Main Code <--
memory = 0
for i in range(len(gpus)):
    memory += tf.config.experimental.get_memory_usage('GPU:%d' % i)
print(memory)

Output 1

Huber loss: 1.3573407
1792

Repro 2 (delta == 0.1 / 0.3 / 1.0 / 10000000.0)

import tensorflow as tf
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
# Main Code -->
import numpy as np
import keras
x = np.random.rand(1000, 1)
y = (((3 * x) + 2) + np.random.randn(1000, 1))
huber_loss = keras.losses.Huber(delta=0.1) # choices: 0.1 / 0.3 / 1.0 / 10000000.0
loss = huber_loss(y, x)
print('Huber loss:', loss.numpy())
# Main Code <--
memory = 0
for i in range(len(gpus)):
    memory += tf.config.experimental.get_memory_usage('GPU:%d' % i)
print(memory)

Output 2

For each of these choices, the outputs are below; the reported memory usage is the same in every case:

Huber loss: 0.29216948
2048
Huber loss: 0.8585064
2048
Huber loss: 2.5695057
2048
Huber loss: 5.2090187
2048
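
As a side note on how I measured memory: I summed tf.config.experimental.get_memory_usage over all GPUs. If I read the TF docs correctly, tf.config.experimental.get_memory_info reports both current and peak bytes per device, which might be a more robust way to compare the two repros (sketch below; assumes this API is available in TF 2.19.0):

import tensorflow as tf

# Report current and peak allocator bytes for each visible GPU.
for i, _ in enumerate(tf.config.list_physical_devices('GPU')):
    info = tf.config.experimental.get_memory_info('GPU:%d' % i)
    print('GPU:%d current=%d peak=%d' % (i, info['current'], info['peak']))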

The related code is here:

super().__init__(
    huber,
    name=name,
    reduction=reduction,
    dtype=dtype,
    delta=delta,
)

y_pred = ops.convert_to_tensor(y_pred)
y_true = ops.convert_to_tensor(y_true, dtype=y_pred.dtype)
y_true, y_pred = squeeze_or_expand_to_same_rank(y_true, y_pred)
delta = ops.convert_to_tensor(delta, dtype=y_pred.dtype)
error = ops.subtract(y_pred, y_true)
abs_error = ops.abs(error)
half = ops.convert_to_tensor(0.5, dtype=abs_error.dtype)
return ops.mean(
    ops.where(
        abs_error <= delta,
        half * ops.square(error),
        delta * abs_error - half * ops.square(delta),
    ),
    axis=-1,
)

But I couldn't find anything in that code that relates specifically to the value 0.5, and I don't know whether it is expected that the result for delta == 0.5 is special.
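
As a cross-check on the math (my own sketch, separate from the Keras source above), a plain NumPy version of the same piecewise formula can be compared against keras.losses.Huber for several delta values, to see whether the loss values themselves do anything unusual around 0.5:

import numpy as np
import keras

def huber_np(y_true, y_pred, delta):
    # Same piecewise formula as the Keras snippet above, reduced with a plain mean.
    error = y_pred - y_true
    abs_error = np.abs(error)
    return np.mean(np.where(abs_error <= delta,
                            0.5 * np.square(error),
                            delta * abs_error - 0.5 * np.square(delta)))

x = np.random.rand(1000, 1)
y = 3 * x + 2 + np.random.randn(1000, 1)
for d in (0.1, 0.3, 0.5, 1.0):
    print(d, float(keras.losses.Huber(delta=d)(y, x)), huber_np(y, x, d))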

Thanks for taking a look!
