
Conversation

richarddushime commented Jul 25, 2025

This PR fixes a small part of #16

This PR covers tests from 3 legacy notebooks:

  • 2_polblogs_benchmark.ipynb
  • 3_verify_shapes.ipynb
  • 5_sampling.ipynb

richarddushime changed the title from "Tests misc" to "Miscellaneous tests to add (from legacy notebooks)" on Jul 27, 2025
richarddushime (Author):

@pchlenski, would you please give feedback?

pchlenski (Owner):

@richarddushime Please be patient, as I have several high-priority tasks I need to finish first. I will try to look at this within the next week.

richarddushime (Author):

> @richarddushime Please be patient, as I have several high-priority tasks I need to finish first. I will try to look at this within the next week.

Thanks 🙏

dist2_sum = sum(M.dist2(X1[:, slc], X2[:, slc]) for M, slc in zip(pm.P, slices))

assert torch.allclose(pdist2_total, pdist2_sum, atol=1e-5), "pdist2 does not match sum of component pdist2"
assert torch.allclose(dist2_total, dist2_sum, atol=1e-5), "dist2 does not match sum of component dist2"
pchlenski (Owner):
These tests would error as written because the single manifold class Manifold does not have the P attribute. This should be moved to a separate function, or possibly into test_product_manifold_methods instead.

dist2_total = pm.dist2(X1, X2)

# Compute dimension slices manually
slices = []
pchlenski (Owner):

Please use the purpose-built pm.factorize method to slice the manifold instead
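For illustration, the decomposition these asserts are after can be sketched without manify at all, using plain tensor slices in place of whatever `pm.factorize` would return. The per-factor dimensions below are made up, and Euclidean squared distance stands in for the product-manifold `dist2`:

```python
import torch

# Hypothetical per-component dimensions of a product manifold.
dims = [2, 3]
slices, start = [], 0
for d in dims:
    slices.append(slice(start, start + d))
    start += d

torch.manual_seed(0)
X1 = torch.randn(4, sum(dims))
X2 = torch.randn(4, sum(dims))

# Total squared distance should equal the sum of per-factor squared distances.
dist2_total = ((X1 - X2) ** 2).sum(dim=-1)
dist2_sum = sum(((X1[:, s] - X2[:, s]) ** 2).sum(dim=-1) for s in slices)
assert torch.allclose(dist2_total, dist2_sum, atol=1e-5)
```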

from manify.embedders._losses import dist_component_by_manifold  # type: ignore

if len(pm.P) > 1:  # Requires multiple components
    contributions = dist_component_by_manifold(pm, X1)
    assert torch.isclose(torch.tensor(sum(contributions)), torch.tensor(1.0), atol=1e-5), "Contributions do not sum to 1"
pchlenski (Owner):

Similarly to the comment above, this code will error when run with the Manifold class; please move this as well

# Compute distances to origin
distances_to_origin = m.dist(samples, m.mu0.unsqueeze(0)).squeeze()

# For hyperbolic and Euclidean manifolds, distances should be consistent
pchlenski (Owner):

Perhaps we can have something like this instead:

max_dist = float("inf") if K <= 0 else torch.pi / torch.sqrt(torch.tensor(K))

# The rest of the code then no longer needs the if K <= 0 / else control flow:

# Check that distances are reasonable (not all zero, not all infinite)
assert distances_to_origin.min() > 0, f"Distances to origin should be positive for K={K}"
assert distances_to_origin.max() < max_dist, f"Distances to origin should be less than {max_dist} for K={K}"

# Check that distances have reasonable variance (not all identical)
assert distances_to_origin.std() > 1e-6, f"Distances to origin should have variance for K={K}"

elif K > 0:
    # Spherical manifolds have bounded distances
    max_possible_distance = math.pi / math.sqrt(K)
    assert distances_to_origin.max() <= max_possible_distance, f"Distances should be bounded for spherical K={K}"
pchlenski (Owner):

In general, we have much stronger expectations here: that the distance to origin should be, on average, close to the distance of mu to the origin, and that the standard deviation of the distance should be related to Sigma in some way (need to figure out what the exact relation is)
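As a sanity check of that stronger expectation, the Euclidean (K=0) case is easy to verify in plain torch, where the wrapped normal reduces to an ordinary Gaussian: the mean distance to the origin should track the norm of the mean, and the spread of distances should track sigma. The mean, sigma, and sample count below are arbitrary illustration choices, not values from manify:

```python
import torch

torch.manual_seed(0)
mu = torch.tensor([3.0, 0.0])
sigma = 0.1

# Gaussian samples centered at mu with isotropic scale sigma.
samples = mu + sigma * torch.randn(10000, 2)
d = samples.norm(dim=-1)  # distance to origin in the Euclidean case

# Mean distance is close to the distance from mu to the origin...
assert abs(d.mean().item() - mu.norm().item()) < 0.01
# ...and the standard deviation of distances is close to sigma.
assert abs(d.std().item() - sigma) < 0.02
```

The exact relation for curved manifolds (and for anisotropic Sigma) would need to be worked out, but a loose version of this check already rules out degenerate samplers.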

assert distances_to_origin.max() <= max_possible_distance, f"Distances should be bounded for spherical K={K}"


def test_log_likelihood_differences():
pchlenski (Owner):

It seems like test_log_likelihood_differences and test_product_manifold_log_likelihood_differences can be added to the shared manifold tests instead, letting the loop over curvatures in test_manifold_methods and the loop over signatures in test_product_manifold_methods handle this variation.

assert log_likelihood_diff.std() > 0, f"Log-likelihood differences should have variance for signature={signature}"


def test_kl_divergence_equivalence():
pchlenski (Owner):

Same as this comment above, please incorporate this into shared tests rather than writing separate loops over curvatures/signatures

pchlenski (Owner):

"This" meaning KL divergence equivalence tests for single/product manifolds


curvatures = [-1.0, 0.0, 1.0]

for K in curvatures:
pchlenski (Owner):

This could be strengthened by setting random seeds specifically: in that case, we should be able to verify that setting n_samples or using stacked z_mean are equivalent, as e.g. in this code from my original _shared_tests function:

    # Sampling shapes should support a variety of inputs
    stacked_means = torch.stack([M.mu0] * 5)
    s1 = M.sample(100)
    assert s1.shape == (100, M.ambient_dim), "Sampled points should have the correct shape"
    s2 = M.sample(100, z_mean=M.mu0)
    assert s2.shape == (100, M.ambient_dim), "Sampled points should have the correct shape"
    s3 = M.sample(z_mean=stacked_means)
    assert s3.shape == (5, M.ambient_dim), "Sampled points should have the correct shape"
    s4 = M.sample(100, z_mean=stacked_means)
    assert s4.shape == (500, M.ambient_dim), "Sampled points should have the correct shape"

Try setting seeds equivalently and verifying that this behaves as intended
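The seed-equivalence idea itself is easy to verify in isolation: reseeding before each call makes sampling deterministic, so two seeded draws can be compared exactly. Plain `torch.randn` stands in for `M.sample` here:

```python
import torch

# Reseeding before each draw makes the two batches identical,
# so seeded sampler calls can be compared element-for-element.
torch.manual_seed(42)
a = torch.randn(100, 3)
torch.manual_seed(42)
b = torch.randn(100, 3)
assert torch.equal(a, b)

# A fresh draw without reseeding consumes new randomness and differs.
c = torch.randn(100, 3)
assert not torch.equal(a, c)
```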

pchlenski (Owner):

@richarddushime thanks for all of your work. I have added comments for this PR, please review if you have time. Let me know if you have any questions or if anything is unclear.

Additionally, I have a few general questions/comments:

  • No random seeds are set. This might be OK for tests that are inside the call to _shared_tests, since we set the torch random seed prior to calling it, but for any new test functions involving sampling, please make sure to set a seed.
  • Please change imports from the math library to their torch equivalents, e.g. torch.pi.
  • In the future, please run the tests and verify that they pass. I can tell you didn't do this because, for instance, Manifold objects have no P attribute.
  • I'm not convinced that the extreme/edge case tests at the end of this PR will pass. It's a nice idea, but working with these manifolds can be pretty numerically unstable. A more moderate approach may be to carve out a range of curvatures, dimensionalities, etc., which are "officially supported," and have the tests reflect the extremes of that range. Let's talk more about this once you've addressed the rest of the comments / determined which of the final tests fail.
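The math-to-torch swap in the second bullet is mechanical; a quick check that the two spellings agree (assuming K > 0 so the square root is defined):

```python
import math
import torch

K = 2.0
old = math.pi / math.sqrt(K)                  # math-library version
new = torch.pi / torch.sqrt(torch.tensor(K))  # torch equivalent

# The two agree up to float32 precision.
assert math.isclose(old, float(new), rel_tol=1e-6)
```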

…ance tests, sampling consistency, and edge cases
richarddushime (Author):

@pchlenski It's clear that I don't yet have a full grasp of the codebase and all of its functions, but I like the project a lot; it's very interesting, and I want to learn more.
