Conversation

@MythicArrow

This implements the Parametric ReLU (PReLU) from the well-known paper "Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification" (ICCV 2015). The function introduces a learnable parameter "a" that lets the activation adapt during training, potentially improving model accuracy and convergence compared to the standard ReLU or Leaky ReLU.
Defined as: f(x) = x if x >= 0, and f(x) = a*x if x < 0
ArXiv link: https://arxiv.org/abs/1502.01852#
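
For reference, a minimal sketch of that formula in JAX (illustrative only, not the PR's final code):

import jax.numpy as jnp

def prelu(x, a):
  # f(x) = x for x >= 0, a * x for x < 0
  return jnp.where(x >= 0, x, a * x)

print(prelu(jnp.array([-2.0, -0.5, 1.0, 3.0]), 0.25))  # [-0.5 -0.125 1. 3.]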

@MythicArrow
Author

Could you please provide feedback on my implementation to confirm that it has been implemented correctly?

@jakevdp jakevdp self-assigned this Jun 2, 2025
@jakevdp
Collaborator

jakevdp commented Jun 2, 2025

Hi - thanks for the contribution! It looks like this would be a good addition to jax.nn, but there are a number of changes we'd have to make: mainly, the a value should be an explicit parameter of the function; otherwise we wouldn't be able to differentiate with respect to it. With this in mind, the function should not take init, rng, or num_parameters as arguments. Also, the implementation should probably be modeled after that of the existing relu. We would also need to add tests for the new function in tests/nn_test.py. Is this something you'd like to work on?
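
For illustration, a hedged sketch of what the suggested signature enables (the code below is an assumption, not the final implementation):

import jax
import jax.numpy as jnp

def prelu(x, a):
  return jnp.where(x >= 0, x, a * x)

x = jnp.array([-2.0, -0.5, 1.0, 3.0])
a = jnp.float32(0.25)

# With a as an explicit argument, JAX can differentiate with respect to it,
# so no init/rng/num_parameters machinery is needed inside the function:
print(jax.grad(lambda a: prelu(x, a).sum())(a))  # -2.5, the sum of the negative inputs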

@MythicArrow
Author

Yes of course sir, I would.

@MythicArrow
Author

So would you be willing to give me some time to work on it? I am a bit busy at the moment.

Made the function differentiable, defined "a" as an explicit argument, deleted unnecessary args; will add tests later
@MythicArrow
Author

Would like to hear your feedback!

@MythicArrow
Author

I will also add tests later.

Adding grad and value tests
@MythicArrow
Author

OK, the changes have been combined in the PR.

Args:
x (jnp.ndarray): Input tensor.
a (jnp.ndarray): Learnable parameter.
Collaborator

@jakevdp jakevdp Jun 3, 2025

Please use two-space indentations. Also "learnable parameter" is not particularly descriptive – maybe say "slope parameter" or something similar?

Additionally: we should not list the type in parentheses after the argument name (see other docstrings in this file for examples of proper formatting).

Returns:
jnp.ndarray: Output tensor with the PReLU activation applied.
Collaborator

Remove the jnp.ndarray:

Returns:
jnp.ndarray: Output tensor with the PReLU activation applied.
"""
Collaborator

We should add an example or two here.
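
Putting the docstring suggestions together, a hypothetical sketch of what the Args/Returns/Examples sections could look like, assuming the function lands as jax.nn.prelu (wording and values are assumptions, not the PR's code):

Args:
  x: input array.
  a: array of slope parameters for the negative elements of x; must be
    broadcast-compatible with x.

Returns:
  An array containing the element-wise PReLU activation of x.

Examples:
  >>> jax.nn.prelu(jnp.array([-2.0, -0.5, 1.0, 3.0]), 0.25)
  Array([-0.5  , -0.125,  1.   ,  3.   ], dtype=float32)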

jnp.ndarray: Output tensor with the PReLU activation applied.
"""

x = jnp.asarray(x)
Collaborator

Use check_arraylike before passing these to asarray

Author

I had actually used it at the beginning of the function, in the parentheses.

Author

@MythicArrow MythicArrow Jun 3, 2025

def prelu(x: ArrayLike, a: ArrayLike) -> Array:

Here I had used it.
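
For clarity: the ArrayLike annotation in the signature is only a static type hint, while check_arraylike is a runtime input check used by other jax.nn functions. A hedged sketch of the requested pattern (check_arraylike is an internal utility, so the import path shown is an assumption):

import jax.numpy as jnp
from jax import Array
from jax.typing import ArrayLike
from jax._src.numpy.util import check_arraylike  # internal helper; exact path is an assumption

def prelu(x: ArrayLike, a: ArrayLike) -> Array:
  check_arraylike("prelu", x, a)  # raises TypeError for non-array inputs such as Python lists
  x = jnp.asarray(x)
  a = jnp.asarray(a)
  return jnp.where(x >= 0, x, a * x)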


x = jnp.asarray(x)
a = jnp.asarray(a)
return jnp.where(x >= 0, x, a * x)
Collaborator

I'm not sure that this will be sufficient under autodiff: for example, relu has a custom JVP rule and some notes about its gradient behavior in the docstring. We should probably do similar here, but I'm not entirely sure what the best form to use is. a * relu(x) may be sufficient, but in other places it seems to be implemented as max(0, x) + a * min(0, x).

It would probably require constructing a few examples with gradients to see which form works in practice.
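
For example, a quick hedged experiment along those lines (illustrative only), comparing the gradients of the two candidate forms at the kink x = 0:

import jax
import jax.numpy as jnp

def prelu_where(x, a):
  return jnp.where(x >= 0, x, a * x)

def prelu_maxmin(x, a):
  return jnp.maximum(0.0, x) + a * jnp.minimum(0.0, x)

for f in (prelu_where, prelu_maxmin):
  # gradients with respect to x and a, evaluated at x = 0, a = 0.25
  print(f.__name__, jax.grad(f, argnums=(0, 1))(0.0, 0.25))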

Author

@MythicArrow MythicArrow Jun 3, 2025

The a * relu(x) form may be incorrect: I checked the official paper, and PyTorch's implementation follows it. Should I do the same here?

Author

By the way, I checked the derivative of PReLU: it is 1 for positive inputs and a (the learned slope) for negative inputs.
Should I implement it according to that gradient?
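
If an explicit gradient rule turned out to be needed, a hedged sketch of encoding exactly that derivative with jax.custom_jvp, similar in spirit to relu's custom rule (an illustration, not the PR's code): df/dx = 1 for x >= 0 and a for x < 0, and df/da = 0 for x >= 0 and x for x < 0.

import jax
import jax.numpy as jnp

@jax.custom_jvp
def prelu(x, a):
  return jnp.where(x >= 0, x, a * x)

prelu.defjvps(
  lambda x_dot, ans, x, a: jnp.where(x >= 0, 1.0, a) * x_dot,  # tangent rule for x
  lambda a_dot, ans, x, a: jnp.where(x >= 0, 0.0, x) * a_dot,  # tangent rule for a
)

print(jax.grad(prelu, argnums=(0, 1))(-2.0, 0.25))  # (0.25, -2.0)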

tests/nn_test.py Outdated

def testPreluGrad(x, a):
return jnp.sum(nn.prelu(x, a))
check_grads(testprelugrad, (x, a), order=1)
Collaborator

This test looks strange – copy-paste problem?

Author

Yeah, after seeing the test file I realized that the tests for different purposes like grad and value were separated so I did the same.
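
For reference, a hedged sketch of what a separate gradient test could look like (nn.prelu refers to the function proposed in this PR; names and values are assumptions):

import jax.numpy as jnp
from jax import nn
from jax.test_util import check_grads

def test_prelu_grad():
  x = jnp.array([-2.0, -0.5, 1.0, 3.0])  # keep test points away from the kink at 0
  a = jnp.float32(0.25)
  check_grads(nn.prelu, (x, a), order=1, modes=["fwd", "rev"])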

@MythicArrow
Author

Hello sir, have you been busy recently?

@DanisNone
Contributor

@MythicArrow It seems that prelu is equivalent to leaky_relu?

@MythicArrow
Author

Yeah, it looks similar, but there is a significant difference between them: in Leaky ReLU, "a" is a fixed constant, whereas in PReLU "a" is a learnable slope parameter trained through backpropagation.

@DanisNone
Contributor

You can pass a JAX array as the second argument to leaky_relu, and JAX will have no issues computing gradients through it.

@jakevdp
Collaborator

jakevdp commented Jun 10, 2025

Yeah indeed @DanisNone – it looks like this is the same as leaky_relu. Sorry @MythicArrow, I should have noticed that earlier.

@jakevdp
Collaborator

jakevdp commented Jun 10, 2025

I think given that, this PR can probably be closed.

@MythicArrow
Author

MythicArrow commented Jun 10, 2025

But PReLU's "a" is learnable, not a constant.

@MythicArrow
Author

I can make it trainable through backpropagation.

@jakevdp
Collaborator

jakevdp commented Jun 10, 2025

But PReLU's "a" is learnable, not a constant.

That is true of leaky_relu as well.

@MythicArrow
Author

But PReLU's "a" is learnable, not a constant.

That is true of leaky_relu as well.

Oh ok

@MythicArrow
Author

Yeah indeed @DanisNone – it looks like this is the same as leaky_relu. Sorry @MythicArrow, I should have noticed that earlier.

No problem sir.

@MythicArrow
Author

OK, then I will close this PR.

@MythicArrow
Author

I checked Leaky ReLU's definition again, and I see that it uses a fixed value of "a", unlike the learnable one in PReLU. Would you merge the PR if I edited the implementation so that "a" is learnable and trainable through backpropagation, instead of fixed as in Leaky ReLU?

@jakevdp
Collaborator

jakevdp commented Jun 16, 2025

You mention LeakyReLu – are you talking about the function in the stax example library?

LeakyRelu = elementwise(leaky_relu)

If so, then we likely wouldn't accept such a contribution, because we are not adding new features to stax.

If you're thinking of jax.nn.leaky_relu, then I can't say I understand your request here. The negative_slope is a parameter just like any other, and you can take the gradient with respect to it, which means its value can be optimized given an appropriate loss function. That makes me think it qualifies as a "learnable parameter" – is there something more you're looking for?
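
For example, a hedged sketch of that point with the existing API (illustrative only):

import jax
import jax.numpy as jnp

x = jnp.array([-2.0, -0.5, 1.0, 3.0])

def loss(negative_slope):
  # any differentiable loss involving the activation works here
  return jnp.sum(jax.nn.leaky_relu(x, negative_slope) ** 2)

print(jax.grad(loss)(jnp.float32(0.25)))  # nonzero gradient w.r.t. the slope, usable by an optimizer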

@MythicArrow
Author

I checked the original Leaky ReLU paper, and its "a" was a fixed constant like 0.01 that wasn't learned. But JAX's leaky_relu behaves like PReLU in this respect, so I think there is no need to implement PReLU.

@MythicArrow
Author

Sir, I have researched PReLU and found that its learnable parameter is indeed included in the model's weights. So if I define it as a parameter, it will be learned automatically, unlike Leaky ReLU's fixed slope. Does JAX allow defining 'a' as a model parameter?
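
On that question, a hedged sketch of how a slope could live in a model's parameter pytree and be updated alongside the weights (everything below, including the per-channel shape, is an assumption for illustration):

import jax
import jax.numpy as jnp

params = {
  "w": jnp.eye(4),            # stand-in for ordinary weights
  "a": jnp.full((4,), 0.25),  # per-channel slope, initialized to 0.25 as in the PReLU paper
}

def model(params, x):
  h = x @ params["w"]
  return jax.nn.leaky_relu(h, params["a"])  # slope taken from the params pytree

def loss(params, x):
  return jnp.sum(model(params, x) ** 2)

x = jnp.array([[-2.0, -0.5, 1.0, 3.0]])
grads = jax.grad(loss)(params, x)
params = jax.tree_util.tree_map(lambda p, g: p - 0.1 * g, params, grads)  # one SGD step updates "a" too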

@MythicArrow MythicArrow reopened this Dec 11, 2025
Refactor PReLU implementation to use jnp.maximum and jnp.minimum for better idiomatic expression.