Skip to content

Conversation

@Bougeant
Copy link

@Bougeant Bougeant commented Mar 19, 2024

This fix is intended to fix the bug found here: #1463

@MaartenGr
Copy link
Owner

Thanks for the PR! Just to be sure I understand correctly, why is this specific change a solution to the problem? Also, which version of cuML are you using? I believe there were some changes to recent versions that might have fixed this.

@Bougeant
Copy link
Author

Bougeant commented Mar 20, 2024

So I think when func == "membership_vector", we should calculate probabilities using membership_vector(model, embeddings), not using approximate_predict(model, embeddings) which merely calculates the most relevant cluster and the associated probability.

membership_vector returns a 2D of size (n_sample, n_topics), while approximate_predict returns a tuple of two 1D arrays of size (n_sample) each. This explains why probabilities is a tuple, not an array in #1463 (AttributeError: 'tuple' object has no attribute 'shape'). I stumbled upon the same issue and was able to get past it with this fix.

I believe that this was just a typo (you probably meant membership_vector but ended up writing approximate_predict)

@MaartenGr
Copy link
Owner

@Bougeant I believe this relates to #1324 where approximate_predict was used for earlier versions of cuML compared to newer versions which do use membership_vector. So this was not a typo but merely the functionality that was in v23.03 and lower I believe. If I'm not mistaken, membership_vector was introduced in v23.04.

Perhaps that related PR should be updated and merged instead?

@Bougeant
Copy link
Author

Oh right, makes sense. In any case, the current implementation does not work since we expect a 2D array but are getting a 2-tuple. #1324 adds some unit tests as well, so that's probably the way to go.

@Bougeant Bougeant closed this by deleting the head repository Mar 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants